Handle Out-Of-Disk gracefully #4108

Open
generall opened this issue Apr 24, 2024 · 8 comments

@generall
Member

Is your feature request related to a problem? Please describe.

Currently, there are situations where the Qdrant service can crash if there is not enough disk space to perform an update operation.
This is sub-optimal behavior, as it should still be possible to respond to search requests in this case.

Describe the solution you'd like

Improve the handling of the situation where Qdrant runs out of disk space. Instead of crashing, it should answer 500 to the user and still be able to process incoming search requests.
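To illustrate the requested behavior, here is a minimal sketch, not Qdrant's actual code. `UpdateResponse`, `is_out_of_disk`, and `handle_update` are hypothetical names, and the errno value 28 (`ENOSPC`) assumes Linux. The idea is that a failed write is turned into an error response rather than a panic, so the process stays alive and can keep serving searches.

```rust
use std::io;

/// Linux errno for "No space left on device" (assumption: Linux target).
const ENOSPC: i32 = 28;

/// Hypothetical response type for an update request (not Qdrant's real API).
#[derive(Debug, PartialEq)]
enum UpdateResponse {
    Ok,
    InternalError { status: u16, message: String },
}

/// Returns true if the I/O error was caused by a full disk.
fn is_out_of_disk(err: &io::Error) -> bool {
    err.raw_os_error() == Some(ENOSPC)
}

/// Runs the write closure and maps any failure to a 500 response
/// instead of unwrapping (which would bring the whole service down).
fn handle_update<F>(write: F) -> UpdateResponse
where
    F: FnOnce() -> io::Result<()>,
{
    match write() {
        Ok(()) => UpdateResponse::Ok,
        Err(err) if is_out_of_disk(&err) => UpdateResponse::InternalError {
            status: 500,
            message: "out of disk space, update rejected".to_string(),
        },
        Err(err) => UpdateResponse::InternalError {
            status: 500,
            message: format!("update failed: {}", err),
        },
    }
}
```

Search handlers would not go through this path at all, so they remain available as long as the process itself does not abort.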

Describe alternatives you've considered

Block requests if the disk usage is above some threshold. This would require configuring an arbitrary threshold and is overall less desirable.

Additional context

We prepared an automated test scenario - #4105
A solution to this issue should include a PR into the test/low-disk-tests branch that makes the OOD test pass.

@generall
Member Author

/bounty $250

algora-pbc bot commented Apr 24, 2024

💎 $250 bounty • Qdrant

Steps to solve:

  1. Start working: Comment /attempt #4108 with your implementation plan
  2. Submit work: Create a pull request including /claim #4108 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to qdrant/qdrant!


| Attempt | Started (GMT+0) | Solution |
| --- | --- | --- |
| 🟢 @Rutik7066 | Apr 24, 2024, 3:27:47 PM | WIP |
| 🟢 @kemkemG0 | May 3, 2024, 9:09:45 AM | #4165 |

@Rutik7066

@generall I would like to solve this issue. Could you please assign me?

@Rutik7066

Rutik7066 commented Apr 24, 2024

/attempt #4108

@kemkemG0
Contributor

kemkemG0 commented May 3, 2024

/attempt #4108

@kemkemG0
Contributor

kemkemG0 commented May 3, 2024

@generall

I think RocksDB employs an approach similar to the one you proposed, such as this check: if (free_space < reserved_disk_buffer_)...
https://github.com/facebook/rocksdb/blob/ed01babd07ab23788f563e78c234c01d247c09b9/file/sst_file_manager_impl.cc#L272-L291

https://github.com/facebook/rocksdb/blob/ed01babd07ab23788f563e78c234c01d247c09b9/db/db_impl/db_impl_open.cc#L2241-L2248

Additionally, it appears that RocksDB allows users to set the disk buffer size as cf.options.write_buffer_size.
https://github.com/facebook/rocksdb/blob/ed01babd07ab23788f563e78c234c01d247c09b9/db/db_impl/db_impl_open.cc#L2029-L2034

https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide#flushing-options

I wonder if we can use wal_capacity_mb for this, but I'm not sure if write_buffer_size is equivalent to WAL size.

Either way, since RocksDB employs a strategy to maintain a maximum disk usage threshold, I think we should adopt a similar approach. I would love to proceed with this strategy. What do you think?
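As a rough sketch of what such a threshold check could look like (this does not reproduce RocksDB's logic or any Qdrant code): `DiskGuard`, `WriteDecision`, and `reserved_bytes` are hypothetical names, and how the free space is measured is left to the caller.

```rust
/// Hypothetical guard that keeps a reserve of free disk space.
#[derive(Debug, Clone, Copy)]
struct DiskGuard {
    /// Minimum number of bytes that must stay free after a write.
    reserved_bytes: u64,
}

/// Result of checking a pending write against the reserve.
#[derive(Debug, PartialEq)]
enum WriteDecision {
    Allow,
    Reject { free_bytes: u64, reserved_bytes: u64 },
}

impl DiskGuard {
    /// Rejects the write if it would leave less than `reserved_bytes` free.
    /// `free_bytes` is supplied by the caller (e.g. from a filesystem query).
    fn check(&self, free_bytes: u64, incoming_bytes: u64) -> WriteDecision {
        // saturating_sub avoids underflow when the write exceeds free space.
        let remaining = free_bytes.saturating_sub(incoming_bytes);
        if remaining < self.reserved_bytes {
            WriteDecision::Reject {
                free_bytes,
                reserved_bytes: self.reserved_bytes,
            }
        } else {
            WriteDecision::Allow
        }
    }
}
```

A rejected write could then be reported as a 500 to the client while reads continue to be served. Whether the reserve should be tied to wal_capacity_mb is the open question above.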

algora-pbc bot commented May 4, 2024

💡 @kemkemG0 submitted a pull request that claims the bounty. You can visit your bounty board to reward.

algora-pbc bot commented May 13, 2024

🎉🎈 @kemkemG0 has been awarded $250! 🎈🎊
