Optimization task panicked after a collection is recovered, causing search API timeout #4131
Comments
Hey @no7dw, thanks for reporting this. Could you please provide a bit more detail on how the snapshot was created (and on which version of the engine)?
The current version is v1.9, while the backup snapshot was created by v1.8.3. We used the following command, just following the docs:
Additional details:
Does the very same thing happen repeatedly? Or was this a one-time occurrence? That isn't entirely clear to me from the issue description.
It happens repeatedly. Once I use a new VM, after the restore one of the collections (new_text) will soon become
I do indeed expect a corrupted snapshot state somehow. We did just merge #4132, which likely fixes this panic. It'll be part of our next release. In 1.9 we also improved snapshots, including their data consistency, which would likely prevent this from happening in the future. Here's the related PR if you're interested: #3420
Current Behavior
We have a collection that reports an error:
"optimizer_status": {
"error": "Service internal error: Optimization task panicked: called `Option::unwrap()` on a `None` value"},
This seems to cause slow performance, usually a timeout on search.
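For anyone wanting to detect this state automatically, here is a minimal sketch in Python (standard library only) that reads the collection info from the REST endpoint `GET /collections/{name}` and extracts the optimizer error, if any. It assumes the response shape quoted above, where `optimizer_status` is the string "ok" when healthy and an object with an `error` field otherwise. The helper names and the base URL are illustrative assumptions, not part of the original report.

```python
import json
import urllib.request
from typing import Any, Optional


def optimizer_error(collection_info: dict[str, Any]) -> Optional[str]:
    """Return the optimizer error message, or None if no error is reported."""
    result = collection_info.get("result", {})
    status = result.get("optimizer_status")
    # A healthy collection reports the plain string "ok"; a failed optimizer
    # reports an object of the form {"error": "<message>"}.
    if isinstance(status, dict):
        return status.get("error")
    return None


def fetch_collection_info(base_url: str, name: str, timeout: float = 10.0) -> dict[str, Any]:
    """GET {base_url}/collections/{name} and decode the JSON body."""
    url = f"{base_url}/collections/{name}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `optimizer_error(fetch_collection_info("http://localhost:6333", "new_text"))` against a running instance (the address is an assumption) would return the panic message once the collection is in the failed state, and `None` while the optimizer is healthy.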
We checked the docker logs, which show:
Config for the collection named new_text:
Context (Environment)
Version: v1.9, running in Docker on Linux
This collection (new_text) was restored from a snapshot; it seems something is wrong with the snapshot.
When the snapshot recovery completed (it took 1-2 hours), the collections could be queried/searched with good performance.
When we query
and the error shows up (probably after ~6 hours or so),
the search API becomes extremely slow, usually timing out (>60 secs).
Once this issue happens, the status always turns red.
One weird thing is:
when the query has a filter (index hit), the collection is queryable without timeout.
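To make the filtered versus unfiltered difference reproducible, a rough sketch like the following could time both kinds of request against `POST /collections/{name}/points/search`. The helper names, the payload field used for filtering, and the base URL are hypothetical and only for illustration; the request body follows the documented `vector` / `limit` / `filter` shape.

```python
import json
import time
import urllib.request
from typing import Any, Optional, Sequence


def build_search_body(
    vector: Sequence[float],
    limit: int = 10,
    filter_key: Optional[str] = None,
    filter_value: Any = None,
) -> dict[str, Any]:
    """Build a search request body, optionally with a single match filter."""
    body: dict[str, Any] = {"vector": list(vector), "limit": limit}
    if filter_key is not None:
        # A "must" clause with one exact-match condition on a payload field.
        body["filter"] = {"must": [{"key": filter_key, "match": {"value": filter_value}}]}
    return body


def timed_search(base_url: str, collection: str, body: dict[str, Any], timeout: float = 60.0) -> float:
    """Send the search request and return the elapsed wall-clock seconds."""
    req = urllib.request.Request(
        f"{base_url}/collections/{collection}/points/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    # urlopen raises an exception if the server does not answer within `timeout`.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        resp.read()
    return time.perf_counter() - start
```

Running `timed_search` once with `build_search_body(vec)` and once with `build_search_body(vec, filter_key="lang", filter_value="en")` (where `lang` is a made-up indexed payload field and `vec` has the collection's vector dimension) would show whether only the unfiltered path hits the timeout.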
Detailed Description
The snapshot is probably broken due to a deletion or some other abnormal operation (probably by us).
We have other collections in this instance, and they seem to work fine.
Possible Implementation
Would it be possible to rebuild the index to work around the collection issue?
Thx in advance.