Combine (re)sharding hash rings in a single type, update split-by-shard logic #4236

Merged
merged 5 commits into from
May 15, 2024

Member

@timvisee timvisee commented May 14, 2024

Tracked in #4213.

This combines (re)sharding hash rings in a single type, replacing the single hash ring we had before. It provides a consistent interface for all scenarios, whether we have a single hash ring or multiple hash rings while resharding. It keeps most of the existing logic intact, and allows us to extend and/or tweak behavior in a single place in the future.

Here I propose to replace the initial approach introduced in #4216, which kept a separate hash ring on a shared resharding key. The new approach seems more future-proof, is easier to manage, and remains on the shard key level. When working with shards, we always namespace to a specific shard key, so it makes sense to combine all necessary state in a single type with a single lookup.

It doesn't only replace the way of storing a resharding hash ring, but also updates all point-to-shard routing to spread to two shards in case of resharding.
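To make the idea concrete, here is a minimal sketch of a combined hash ring type with a single variant and a resharding variant. It is not the code from this PR: the names `Ring`, `ShardHashRing`, `hash_of` and `point_to_shards`, the use of `Vec` and of the standard library's `DefaultHasher`, and the 16 virtual nodes per shard are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

// Hypothetical aliases, not taken from the PR.
type ShardId = u32;
type PointId = u64;

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// A plain consistent hash ring: positions on the ring map to shard IDs.
#[derive(Debug, Clone, Default)]
struct Ring {
    slots: BTreeMap<u64, ShardId>,
}

impl Ring {
    /// Place a shard on the ring at 16 positions (assumed virtual node count).
    fn add(&mut self, shard: ShardId) {
        for replica in 0..16u32 {
            self.slots.insert(hash_of(&(shard, replica)), shard);
        }
    }

    /// First slot at or after the point's hash, wrapping around to the start.
    fn get(&self, point: PointId) -> Option<ShardId> {
        let position = hash_of(&point);
        self.slots
            .range(position..)
            .next()
            .or_else(|| self.slots.iter().next())
            .map(|(_, shard)| *shard)
    }
}

/// One type for both scenarios: a single ring, or old and new rings while resharding.
enum ShardHashRing {
    Single(Ring),
    Resharding { old: Ring, new: Ring },
}

impl ShardHashRing {
    /// Convenience constructor for the common single-ring case.
    fn single(shards: impl IntoIterator<Item = ShardId>) -> Self {
        let mut ring = Ring::default();
        for shard in shards {
            ring.add(shard);
        }
        Self::Single(ring)
    }

    /// Shards a point routes to: one normally, up to two while resharding.
    fn point_to_shards(&self, point: PointId) -> Vec<ShardId> {
        match self {
            Self::Single(ring) => ring.get(point).into_iter().collect(),
            Self::Resharding { old, new } => {
                let mut shards: Vec<ShardId> = old.get(point).into_iter().collect();
                if let Some(shard) = new.get(point) {
                    // Only add the new shard if the point actually moves.
                    if !shards.contains(&shard) {
                        shards.push(shard);
                    }
                }
                shards
            }
        }
    }
}
```

In this sketch, callers go through `point_to_shards` in both cases, so the routing code does not need to know whether resharding is in progress. The old shard always comes first in the result, which keeps the previous single-shard behavior recoverable by taking only the first element.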

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --all --all-features command?

@timvisee timvisee marked this pull request as ready for review May 15, 2024 08:55
@ffuugoo ffuugoo self-requested a review May 15, 2024 08:56
@ffuugoo ffuugoo mentioned this pull request May 15, 2024
38 tasks
Contributor

@ffuugoo ffuugoo left a comment

Overall, LGTM! Thanks for taking over it! 👍

My only real comment would be: we might not want to always split-by-shard to new hashring, when doing resharding. E.g., we might want to introduce explicit stages when we do (or do not) split-by-shard to the new shard.

I'd go and add explicit .take(1) to the point_to_shards results, so that this PR won't change current behavior yet.

And then we will introduce another PR that will update split-by-shard logic, so that it would properly split only when required.
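The suggestion above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `shard_holder.rs`: the stand-in `point_to_shards` router, the `split_by_shard` signature and the `limit` parameter are hypothetical. The only detail taken from the review is applying `.take(1)` to the routing result so that each point still goes to a single shard.

```rust
use std::collections::BTreeMap;

// Hypothetical aliases, not taken from the PR.
type ShardId = u32;
type PointId = u64;

/// Stand-in router: old shard first, then the new shard for points that move.
/// Here we pretend every third point moves to the new shard 2.
fn point_to_shards(point: PointId) -> Vec<ShardId> {
    let old = (point % 2) as ShardId;
    if point % 3 == 0 {
        vec![old, 2]
    } else {
        vec![old]
    }
}

/// Group points into per-shard batches, using at most `limit` shards per point.
/// `limit = 1` corresponds to `.take(1)`: behavior stays as before resharding.
fn split_by_shard(points: &[PointId], limit: usize) -> BTreeMap<ShardId, Vec<PointId>> {
    let mut batches = BTreeMap::new();
    for &point in points {
        for shard in point_to_shards(point).into_iter().take(limit) {
            batches.entry(shard).or_insert_with(Vec::new).push(point);
        }
    }
    batches
}
```

With `limit = 1` no point reaches the new shard, while `limit = 2` also sends the moving points there. A later change could replace the fixed limit with an explicit resharding stage that decides when the split should happen.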

lib/collection/src/shards/shard_holder.rs (outdated, resolved)
@timvisee timvisee merged commit ea6a49a into dev May 15, 2024
17 checks passed
@timvisee timvisee deleted the shard-key-multi-hashring branch May 15, 2024 10:06
generall pushed a commit that referenced this pull request May 26, 2024
…rd logic (#4236)

* Add shard hash ring type, with single and resharding variant

* Use multiple shard IDs when splitting while resharding

* Use SmallVec for shard IDs to prevent heap allocation

* Add constructor to easily create single shard hashring

* Temporarily disable routing to two shards with resharding until reads work