Is Marqo's text embedding multilingual? #470

Answered by jn2clark
lanyusan asked this question in Q&A

You can use a few multilingual models through the custom models API. The one below should be a good starting point. From its documentation:
"We used the following 50+ languages: ar, bg, ca, cs, da, de, el, en, es, et, fa, fi, fr, fr-ca, gl, gu, he, hi, hr, hu, hy, id, it, ja, ka, ko, ku, lt, lv, mk, mn, mr, ms, my, nb, nl, pl, pt, pt-br, ro, ru, sk, sl, sq, sr, sv, th, tr, uk, ur, vi, zh-cn, zh-tw." See https://www.sbert.net/docs/pretrained_models.html.

settings = {
  "index_defaults": {
    "treat_urls_and_pointers_as_images": False,
    "text_preprocessing": {
      "split_length": 2,
      "split_overlap": 0,
      "split_method": "sentence"
    },
    "model": 'unique-model-alias',
    "model_properties": {"name": "sen…

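To make the `text_preprocessing` values above more concrete, here is a rough sketch of what `split_method: "sentence"` with `split_length: 2` and `split_overlap: 0` might produce. This is not Marqo's actual implementation: `split_sentences` and `chunk_by_sentence` are hypothetical helpers written for illustration, and the regex-based sentence splitter is a naive assumption.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (an assumption, not Marqo's tokenizer):
    break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_by_sentence(text, split_length=2, split_overlap=0):
    """Group sentences into chunks of `split_length` sentences,
    sharing `split_overlap` sentences between neighbouring chunks."""
    if split_length <= 0 or not 0 <= split_overlap < split_length:
        raise ValueError("need split_length > 0 and 0 <= split_overlap < split_length")
    sentences = split_sentences(text)
    step = split_length - split_overlap
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + split_length]))
        # Stop once the current chunk has reached the last sentence.
        if start + split_length >= len(sentences):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_by_sentence("One. Two! Three? Four. Five."):
        print(chunk)
```

Each chunk would then be embedded separately. Note that this naive splitter relies on whitespace after punctuation, so it would not split text in languages such as Chinese or Japanese, where sentences are usually not separated by spaces.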
Answer selected by lanyusan