Collection with 60 points has huge size and write ahead log #4159
Comments
It seems to have collected an enormous amount of operations in the WAL, which would explain the large size. Could you elaborate on what operations you've been using lately? A delete-by-filter operation is known to cause this. But I don't think you've used 76k of them in sequence, or did you?
Thanks for the really quick response :-) We do use delete-by-filter operations. Even if we did 76k of them in sequence, is there a chance to get rid of the WAL somehow, or is this just something that must be avoided?
Any update operation affecting points should resolve that (insert or update). Could you verify whether you've indeed sent just 80 operations? If you did, something else is going on. During this whole process, did you see any errors or weird output in the logs? There may be an additional hint in there. Based on the above, I assume that you did not make any configuration changes. Of course, I'm just guessing here; it may be a different issue.
Sorry for the long delay, but I was distracted. I checked our logs and this is what I figured out: we had 81 inserts on this collection, and unfortunately I do not know exactly how many delete operations, but in the end we have just 60 points in the collection, so somewhere between 1 and 21, I guess. If I am not mistaken, there were no updates. I also checked the Qdrant logs, and there were basically only messages about creating collections, but no errors or warnings or the like. The only thing that might be special is that the collection was created and deleted a couple of times. Next I will check whether I can get rid of the WAL by inserting/updating anything.
@timvisee Could we re-open this issue please? |
That shouldn't be an issue. Recreating a collection basically lets you start from scratch, wiping all previous WAL data for that collection. |
Verified that inserting a point into this collection removed the WAL. After that, Qdrant starts up very quickly and the log indicates a WAL size of "1". Creating a snapshot is also very fast. It still uses 96 MB, but way less than the previous 1.6 GB. So for me this fixes the issue at hand. The question that remains is how Qdrant got into this state. I could provide you with the 1.6 GB snapshot file if you are interested, or try to analyze it on my side if you could give me hints on how to do this.
Yes please, that would be helpful if you still have it available. |
Current Behavior
We have a collection where Qdrant shows unexpected behavior for no obvious reason:
The following snippets use Qdrant version 1.9.0 (build b99d507), started through docker-compose.
This is the collection:
Pretty much default settings. However, it's already unclear why the collection needs 8 segments and why indexed_vectors_count is 0. The 60 points with some small payload (same structure for all points) can be confirmed:
Nonetheless, the snapshot has a size of 1.6 GB:
And restarting Qdrant shows a write-ahead log of significant size:
Steps to Reproduce
Unfortunately, I have no idea how to reproduce such a collection. I could provide the snapshot (export/import does not change the behavior), but it's 1.6 GB.
Expected Behavior
Context (Environment)
The main problem with this collection is that restarting Qdrant takes forever, and Qdrant is unavailable during the restart.