ERROR: RPC error: [batch_insert], <ParamError: (code=1, message=invalid input, length of string exceeds max length. length: 131239, max length: 65535)>, <Time:{'RPC start': '2024-05-11 14:17:48.677254', 'RPC error': '2024-05-11 14:17:48.678019'}> #3995
The error you're encountering is due to a string field in your documents exceeding Milvus's maximum allowed length of 65535 characters, with a string length of 131239 being reported. This issue arises when attempting to insert documents into a Milvus collection where at least one string exceeds this limit.

To resolve this, you can implement a check-and-truncate strategy before insertion. This involves modifying your document processing logic to ensure no string field exceeds the maximum length allowed by Milvus. Here's a concise example of how you might adjust your document processing to truncate longer strings:

    MAX_LENGTH = 65535  # Maximum length allowed for string fields in Milvus

    for doc in docs:
        for k, v in doc.metadata.items():
            str_v = str(v)
            # Truncate strings to MAX_LENGTH if they exceed it
            if len(str_v) > MAX_LENGTH:
                doc.metadata[k] = str_v[:MAX_LENGTH]

This code snippet should be integrated into your document preparation process, ensuring that all string fields are within the allowed length before attempting to insert them into Milvus. This approach helps avoid the "length of string exceeds max length" error by preemptively addressing the root cause.
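Before truncating anything, it may help to find out which field is actually too long. The snippet above only looks at metadata, but the oversized string could just as well be the document text itself. The following is a minimal sketch, not the project's code: `Doc` is a hypothetical stand-in for a LangChain `Document` (same `page_content` and `metadata` attributes), and `find_oversized` is an illustrative helper name.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Tuple

MAX_LENGTH = 65535  # limit reported in the error message


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (same two attributes)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def find_oversized(
    docs: Iterable[Doc], max_length: int = MAX_LENGTH
) -> List[Tuple[int, str, int]]:
    """Return (doc index, field name, length) for every value longer than max_length."""
    hits = []
    for i, doc in enumerate(docs):
        # Check the document text as well as the metadata values
        if len(doc.page_content) > max_length:
            hits.append((i, "page_content", len(doc.page_content)))
        for key, value in doc.metadata.items():
            length = len(str(value))
            if length > max_length:
                hits.append((i, f"metadata[{key!r}]", length))
    return hits


docs = [
    Doc("short text", {"source": "a.jsonl"}),
    Doc("x" * 131239, {"source": "b.jsonl"}),
]
for index, name, length in find_oversized(docs):
    print(f"doc {index}: {name} has length {length}")
```

If the report points at `page_content`, truncating metadata alone will not make the error go away; splitting the source text into smaller chunks is probably a better fix than cutting it off.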
Where in the project should I change the code? @dosu
To address the issue of string lengths exceeding Milvus's maximum allowed length in your project, you should modify the method that inserts the documents. Here's a concise example of how to integrate string truncation into it:

    def do_add_doc(self, docs: List[Document], **kwargs) -> List[Dict]:
        max_length = 65535  # Max length for string fields in Milvus
        for doc in docs:
            for k, v in doc.metadata.items():
                # Ensure string fields do not exceed max_length
                doc.metadata[k] = str(v)[:max_length]
            # Additional processing for document metadata before insertion
            # (e.g., setting default values, removing unnecessary fields)
        ids = self.milvus.add_documents(docs)
        doc_infos = [{"id": id, "metadata": doc.metadata} for id, doc in zip(ids, docs)]
        return doc_infos

This modification ensures that all string fields in your documents are truncated to the maximum length allowed by Milvus before they are inserted, effectively preventing the error you encountered.
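One caveat about slicing with `[:max_length]`: it counts characters. If the limit is enforced in bytes rather than characters (an assumption worth verifying against the documentation for your Milvus version), text containing multi-byte characters such as Chinese can still be too long after the slice. A minimal sketch of a byte-aware truncation follows; `truncate_utf8` is a hypothetical helper name, not part of Milvus or LangChain.

```python
def truncate_utf8(text: str, max_bytes: int = 65535) -> str:
    """Cut text so its UTF-8 encoding is at most max_bytes, never splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete multi-byte sequence left at the cut point
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

Under that assumption, the loop above could call `truncate_utf8(str(v))` instead of `str(v)[:max_length]`, and the same helper could be applied to the document text.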
2024-05-11 14:17:48,678 - decorators.py[line:146] - ERROR: RPC error: [batch_insert], <ParamError: (code=1, message=invalid input, length of string exceeds max length. length: 131239, max length: 65535)>, <Time:{'RPC start': '2024-05-11 14:17:48.677254', 'RPC error': '2024-05-11 14:17:48.678019'}>
2024-05-11 14:17:48,678 - milvus.py[line:595] - ERROR: Failed to insert batch starting at entity: 4000/11304
Traceback (most recent call last):
File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/init_database.py", line 113, in <module>
folder2db(kb_names=args.kb_name, mode="increment", embed_model=args.embed_model)
File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/migrate.py", line 150, in folder2db
files2vs(kb_name, kb_files)
File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/migrate.py", line 113, in files2vs
kb.add_doc(kb_file=kb_file, not_refresh_vs_cache=True)
File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/kb_service/base.py", line 131, in add_doc
doc_infos = self.do_add_doc(docs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/kb_service/milvus_kb_service.py", line 84, in do_add_doc
ids = self.milvus.add_documents(docs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 119, in add_documents
return self.add_texts(texts, metadatas, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/langchain_community/vectorstores/milvus.py", line 598, in add_texts
raise e
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/langchain_community/vectorstores/milvus.py", line 592, in add_texts
res = self.col.insert(insert_list, timeout=timeout, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/orm/collection.py", line 500, in insert
return conn.batch_insert(
^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/decorators.py", line 147, in handler
raise e from e
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/decorators.py", line 143, in handler
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/decorators.py", line 182, in handler
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/decorators.py", line 122, in handler
raise e from e
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/decorators.py", line 87, in handler
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 575, in batch_insert
raise err from err
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 558, in batch_insert
request = self._prepare_batch_insert_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 542, in _prepare_batch_insert_request
else Prepare.batch_insert_param(collection_name, entities, partition_name, fields_info)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/prepare.py", line 531, in batch_insert_param
return cls._parse_batch_request(request, entities, fields_info, location)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/prepare.py", line 507, in _parse_batch_request
field_data = entity_helper.entity_to_field_data(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/entity_helper.py", line 374, in entity_to_field_data
entity_to_str_arr(entity, field_info, CHECK_STR_ARRAY)
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/entity_helper.py", line 236, in entity_to_str_arr
return convert_to_str_array(entity.get("values", []), field_info, check=check)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/entity_helper.py", line 231, in convert_to_str_array
check_str_arr(arr, max_len)
File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/pymilvus/client/entity_helper.py", line 216, in check_str_arr
raise ParamError(
pymilvus.exceptions.ParamError: <ParamError: (code=1, message=invalid input, length of string exceeds max length. length: 131239, max length: 65535)>
My jsonl file is only 15 MB, and this error occurs when I initialize the database.