Introduce shared query buffer for client reads #258
base: unstable
Conversation
Force-pushed from 6ce2ea4 to fc9ee81
Overall it looks good to me. Can you document the procedure you were using when you saw the 3% performance boost? Did we also test it for pipelined clients?
I would also like some protection in freeClient() to make sure that if a client gets freed, it doesn't somehow also free the shared query buffer.
Performance Benchmark

Server Setup:
Benchmark Configuration:
Server Command:
Benchmark Command:
Results (Average of 3 runs):
This change LGTM overall. Thanks @uriyage
Generally LGTM
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
I'm good with it. @uriyage Can you do the merge to get it up to date?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

@@            Coverage Diff             @@
##           unstable     #258    +/-  ##
============================================
+ Coverage     69.80%   69.82%   +0.01%
============================================
  Files           109      109
  Lines         61801    61839      +38
============================================
+ Hits          43141    43178      +37
- Misses        18660    18661       +1
@uriyage can you give more tests covering the cases where the whole command or argv exceeds
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
I checked with 24K values (since with jemalloc we actually allocate more than 16K). Commands:
I get 83,544 without a shared query buffer and 83,238 with a shared query buffer. So, essentially about the same, which is expected, since with larger values we revert to the current logic of using a private buffer per client.
@soloestoy Do you have any further followup?
@@ -24,8 +24,24 @@ start_server {tags {"querybuf slow"}} {
    # The test will run at least 2s to check if client query
    # buffer will be resized when client idle 2s.
    test "query buffer resized correctly" {
        set rd [valkey_client]
It's better to add two more cases where commands with large arguments exceed 16KB (PROTO_IOBUF_LEN) and 32KB (PROTO_MBULK_BIG_ARG), so that we cover all the special cases.
I'm wondering whether we can eliminate the limit on the shared query buffer (and resize it in cron if needed); maybe we could get more benefits.
This PR optimizes client query buffer handling in Valkey by introducing a shared query buffer that is used by default for client reads. This reduces memory usage by ~20KB per client by avoiding allocations for most clients using short (<16KB) complete commands. For larger or partial commands, the client still gets its own private buffer.
The primary changes are:
- Introduction of a shared query buffer shared_qb that clients use by default

In addition to the memory savings, this change shows a 3% improvement in latency and throughput when running with 1000 active clients.
The memory reduction may also help reduce the need to evict clients when reaching max memory limit, as the query buffer is the main memory consumer per client.