[Bug]: Bulk insert will fail if the file is in JSON format and the JSON field contains non-standard JSON data types. #32999

Open
1 task done
zhuwenxing opened this issue May 13, 2024 · 1 comment
Assignees
Labels
kind/bug Issues or changes related a bug triage/accepted Indicates an issue or PR is ready to be actively worked on.
@zhuwenxing
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:master
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:10 - DEBUG - ci_test]: (api_request)  : [Collection] args: ['bulk_insert_yuDIumhG', {'auto_id': True, 'description': '', 'fields': [{'name': 'uid', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': True}, {'name': 'int_scalar', 'description': '', 'type': <DataType.INT64: 5>}, {'name': 'float_scalar', 'description': '', 'type': <......, kwargs: {'consistency_level': 'Strong'} (api_request.py:62)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:10 - DEBUG - ci_test]: (api_response) : <Collection>:

[2024-05-11T11:55:06.080Z] -------------

[2024-05-11T11:55:06.080Z] <name>: bulk_insert_yuDIumhG

[2024-05-11T11:55:06.080Z] <description>: 

[2024-05-11T11:55:06.080Z] <schema>: {'auto_id': True, 'description': '', 'fields': [{'name': 'uid', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': True}, {'name': 'int_scalar', 'description': '', 'type': <DataType.INT64: ......  (api_request.py:37)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:10 - INFO - ci_test]: before bulk load, there are 0 working tasks (utility_wrapper.py:25)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:10 - INFO - ci_test]: files to load: ['data-fields-12-rows-1000-dim-128-file-num-0-1715428028.json'] (utility_wrapper.py:26)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:10 - DEBUG - ci_test]: (api_request)  : [do_bulk_insert] args: ['bulk_insert_yuDIumhG', ['data-fields-12-rows-1000-dim-128-file-num-0-1715428028.json'], None, None, 'default'], kwargs: {} (api_request.py:62)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:10 - DEBUG - ci_test]: (api_response) : 449688385500757637  (api_request.py:37)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:11 - INFO - ci_test]: after bulk load, there are 1 working tasks (utility_wrapper.py:34)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:11 - INFO - ci_test]: wait bulk load timeout is 300 (utility_wrapper.py:111)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:11 - INFO - ci_test]: before waiting, there are 0 pending tasks (utility_wrapper.py:113)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:13 - DEBUG - ci_test]: (api_request)  : [get_bulk_insert_state] args: [449688385500757637, 300, 'default'], kwargs: {} (api_request.py:62)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:13 - DEBUG - ci_test]: (api_response) : <Bulk insert state:

[2024-05-11T11:55:06.080Z]     - taskID          : 449688385500757637,

[2024-05-11T11:55:06.080Z]     - state           : Failed,

[2024-05-11T11:55:06.080Z]     - row_count       : 0,

[2024-05-11T11:55:06.080Z]     - infos           : {'failed_reason': "expected type 'JSON' for field 'json', got type 'json.Number' with value '1.0': importing data failed", 'progress_percent': '0'},

[2024-05-11T11:55:06.080Z]     ......  (api_request.py:37)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:13 - INFO - ci_test]: after waiting, there are 0 pending tasks (utility_wrapper.py:148)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:13 - INFO - ci_test]: task state distribution: {'success': set(), 'failed': {449688385500757637}, 'in_progress': set()} (utility_wrapper.py:149)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:13 - INFO - ci_test]: {449688385500757637: <Bulk insert state:

[2024-05-11T11:55:06.080Z]     - taskID          : 449688385500757637,

[2024-05-11T11:55:06.080Z]     - state           : Failed,

[2024-05-11T11:55:06.080Z]     - row_count       : 0,

[2024-05-11T11:55:06.080Z]     - infos           : {'failed_reason': "expected type 'JSON' for field 'json', got type 'json.Number' with value '1.0': importing data failed", 'progress_percent': '0'},

[2024-05-11T11:55:06.080Z]     - id_ranges       : [],

[2024-05-11T11:55:06.080Z]     - create_ts       : 2024-05-11 11:47:10

[2024-05-11T11:55:06.080Z] >} (utility_wrapper.py:150)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:13 - INFO - ci_test]: wait for bulk load tasks completed failed, cost time: 2.0055181980133057 (utility_wrapper.py:155)

[2024-05-11T11:55:06.080Z] [2024-05-11 11:47:13 - INFO - ci_test]: bulk insert state:False in 3.0168752670288086 with states:{449688385500757637: <Bulk insert state:

[2024-05-11T11:55:06.080Z]     - taskID          : 449688385500757637,

[2024-05-11T11:55:06.080Z]     - state           : Failed,

[2024-05-11T11:55:06.080Z]     - row_count       : 0,

[2024-05-11T11:55:06.080Z]     - infos           : {'failed_reason': "expected type 'JSON' for field 'json', got type 'json.Number' with value '1.0': importing data failed", 'progress_percent': '0'},

[2024-05-11T11:55:06.080Z]     - id_ranges       : [],

[2024-05-11T11:55:06.080Z]     - create_ts       : 2024-05-11 11:47:10

[2024-05-11T11:55:06.080Z] >} (test_bulk_insert.py:882)
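The failed_reason above shows the importer rejecting a bare number (`json.Number`, value `1.0`) in the field named `json`. Below is a minimal sketch for spotting such rows in a data file before import. It assumes the file content parses to a list of row dicts, and it assumes that only JSON objects pass the type check; the log confirms the rejection only for a bare number. The helper name `find_non_object_values` is hypothetical and is not part of pymilvus or Milvus.

```python
import json


def find_non_object_values(rows, field="json"):
    """Return (row_index, python_type_name, value) for every row whose
    `field` value is not a JSON object (i.e. not a dict after parsing).

    Assumption: only JSON objects are accepted for a JSON field; the log in
    this issue only confirms that a bare number is rejected.
    """
    offenders = []
    for index, row in enumerate(rows):
        if field not in row:
            continue  # a missing field is a separate problem, not checked here
        value = row[field]
        if not isinstance(value, dict):
            offenders.append((index, type(value).__name__, value))
    return offenders


# Illustrative data only: one row with an object, one with a bare number.
sample_rows = json.loads('[{"json": {"a": 1}}, {"json": 1.0}]')
offenders = find_non_object_values(sample_rows)
print(offenders)
```

Running this kind of check on the generated data file would show whether the test data contains top-level scalars in the `json` field, which is what the error message suggests.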

Expected Behavior

The bulk insert task should complete successfully.

Steps To Reproduce

No response

Milvus Log

failed job: https://jenkins.milvus.io:18080/blue/organizations/jenkins/Milvus%20HA%20CI/detail/PR-32975/4/pipeline/

Anything else?

No response

@zhuwenxing zhuwenxing added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 13, 2024
@zhuwenxing zhuwenxing changed the title [Bug]: Bulk insert may fail if the file is in JSON format and the JSON field contains non-standard JSON data types. [Bug]: Bulk insert will fail if the file is in JSON format and the JSON field contains non-standard JSON data types. May 13, 2024
@zhuwenxing
Contributor Author

/assign @bigsheeper

@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 14, 2024
@yanliang567 yanliang567 added this to the 2.4.2 milestone May 14, 2024
@yanliang567 yanliang567 removed their assignment May 14, 2024
@yanliang567 yanliang567 modified the milestones: 2.4.2, 2.4.3 May 24, 2024
Development

No branches or pull requests

3 participants