Multidense vectors quantization #4202
Conversation
pub count: PointOffsetType,
}

pub struct QuantizedMultivectorStorage<TEncodedQuery, QuantizedStorage>
Quantized multivector container: it aggregates SQ/PQ/BQ storage for all inner vectors and keeps per-point offsets.
    TEncodedQuery: Sized,
    QuantizedStorage: EncodedVectors<TEncodedQuery>,
{
    quantized_storage: QuantizedStorage,
Quantized storage for all inner vectors
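The container idea above can be sketched as a flat quantized storage plus a per-point record of where that point's inner vectors live. This is an illustrative sketch, not Qdrant's actual types; `MultivectorOffset` and `inner_range` are hypothetical names chosen for clarity.

```rust
// Hypothetical sketch: one flat storage holds every inner vector of every
// point; the offsets table says where each point's inner vectors start and
// how many there are.

#[derive(Clone, Copy, Debug)]
pub struct MultivectorOffset {
    pub start: u32, // index of the first inner vector in the flat storage
    pub count: u32, // number of inner vectors belonging to this point
}

/// Given the offsets table, return the flat-storage indices of the inner
/// vectors that belong to `point_id`.
pub fn inner_range(offsets: &[MultivectorOffset], point_id: usize) -> std::ops::Range<u32> {
    let o = offsets[point_id];
    o.start..o.start + o.count
}
```

With this layout, scoring a point only needs its offset entry and the shared dense storage.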
})
}

fn score_point_max_similarity(&self, query: &Vec<TEncodedQuery>, vector_index: u32) -> f32 {
We duplicate the MaxSim metric here because it uses a different interface for metric calculation.
maybe we can have a comment mentioning that several implementations need to be kept in sync?
added
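For reference, the MaxSim idea the comment refers to can be sketched in a few lines: for each query inner vector, take the best similarity against all stored inner vectors of the point, then sum. This is a plain-`f32` illustration with a dot-product metric; the real implementation scores quantized encodings instead.

```rust
// Plain dot product between two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim between a multivector query and a stored multivector:
/// sum over query vectors of the best similarity against any stored vector.
pub fn max_sim(query: &[Vec<f32>], stored: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| stored.iter().map(|v| dot(q, v)).fold(f32::MIN, f32::max))
        .sum()
}
```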
)
}

fn encode_query(&self, query: &[f32]) -> Vec<TEncodedQuery> {
`EncodedVectors` wasn't designed for multivectors; `query` here is flattened data. Like save/load, it requires refactoring.
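What `encode_query` has to do given that flattened interface can be sketched as follows: split the flat query into `dim`-sized chunks and encode each chunk separately. `encode_one` is a hypothetical stand-in for the underlying per-vector `encode_query` of the dense storage.

```rust
// Sketch: the multivector query arrives as one flat `&[f32]`, so it is split
// into `dim`-sized chunks and each chunk is encoded on its own.
pub fn encode_multi_query<T>(
    flat: &[f32],
    dim: usize,
    encode_one: impl Fn(&[f32]) -> T,
) -> Vec<T> {
    assert_eq!(flat.len() % dim, 0, "flattened query must be a multiple of dim");
    flat.chunks_exact(dim).map(|chunk| encode_one(chunk)).collect()
}
```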
@@ -172,4 +190,68 @@ impl<'a> QuantizedScorerBuilder<'a> {
}
}
}

fn new_multi_quantized_scorer<TElement, TMetric, TEncodedQuery>(
The same as `new_quantized_scorer` but for multivectors. It uses `QuantizedQueryScorer::new_multi` and `QuantizedCustomQueryScorer::new_multi` internally.
@@ -41,6 +48,23 @@ pub enum QuantizedVectorStorage {
PQMmap(EncodedVectorsPQ<QuantizedMmapStorage>),
BinaryRam(EncodedVectorsBin<ChunkedVectors<u8>>),
BinaryMmap(EncodedVectorsBin<QuantizedMmapStorage>),

ScalarRamMulti(
Important part: the definition of the multivector quantized storages.
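The shape of that enum extension can be illustrated in miniature: each existing dense variant gets a `...Multi` sibling that wraps the same encoded storage in the multivector container. The types below are simplified placeholders, not the real Qdrant types.

```rust
// Placeholder for a dense encoded storage such as EncodedVectorsU8<...>.
#[allow(dead_code)]
struct EncodedU8;

// Simplified multivector container: dense storage plus per-point offsets.
#[allow(dead_code)]
struct QuantizedMulti<S>(S, Vec<u32>);

#[allow(dead_code)]
enum QuantizedVectorStorage {
    ScalarRam(EncodedU8),
    ScalarRamMulti(QuantizedMulti<EncodedU8>),
    // ... the PQ and binary variants follow the same pattern
}
```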
// Config files
self.path.join(QUANTIZED_CONFIG_PATH),
// Storage file
self.path.join(QUANTIZED_DATA_PATH),
// Meta file
self.path.join(QUANTIZED_META_PATH),
]
];
if self.is_multivector() {
One more file with offsets in the multivector case.
@@ -226,6 +302,151 @@ impl QuantizedVectors {
Ok(quantized_vectors)
}

fn create_multi_impl<
Multivector version of `Self::create_impl`: it first encodes everything as regular dense vectors.
let vector_parameters =
    Self::construct_vector_parameters(distance, dim, inner_vectors_count);

let quantized_storage = match quantization_config {
First, encode all flattened vectors as regular dense storage
.collect_vec();

// convert into multivector quantized storage
let quantized_storage = match quantized_storage {
Second, convert the inner-vector storage into multivector quantized storage.
multi_vector_config,
))
}
QuantizedVectorStorage::ScalarRamMulti(_) => unreachable!(),
`unreachable!` is fine because we previously constructed regular dense quantization for all inner vectors.
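The two-phase construction discussed above can be sketched under assumed names: phase one encodes every point's flattened data into one flat dense storage, phase two records per-point offsets and wraps both into the multivector container. `DenseQuantized`, `MultiQuantized`, and the trivial byte-cast "encoding" are illustrative only.

```rust
// Placeholder dense encoding: just bytes in this sketch.
struct DenseQuantized(Vec<u8>);

// Multivector container: dense encoding plus per-point start offsets.
struct MultiQuantized {
    dense: DenseQuantized,
    offsets: Vec<u32>,
}

/// Each element of `vectors` is one point's flattened multivector data.
fn create_multi(vectors: &[Vec<f32>]) -> MultiQuantized {
    // Phase 1: flatten and "encode" every value (a trivial cast here).
    let dense = DenseQuantized(vectors.iter().flatten().map(|x| *x as u8).collect());
    // Phase 2: record where each point's data starts in the flat encoding.
    let mut offsets = Vec::with_capacity(vectors.len());
    let mut acc = 0u32;
    for v in vectors {
        offsets.push(acc);
        acc += v.len() as u32;
    }
    MultiQuantized { dense, offsets }
}
```

Because phase one always produces a plain dense variant, the `ScalarRamMulti` arm really cannot be hit during conversion.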
} else {
QuantizedVectorStorage::ScalarMmap(
EncodedVectorsU8::<QuantizedMmapStorage>::load(
let quantized_store = if let Some(multivector_config) =
Adds loading for the multivector case. The dense case wasn't changed; the multivector case is the same but with different types.
Distance::Dot,
128, // dim
32, // ef
5., // min_acc out of 100
Why are some of those `min_acc` values so low?
Binary quantization works poorly on random data. The combination of HNSW + BQ + discovery + random data gives very low but non-zero accuracy.
this is not discovery, is it?
Increased the accuracy threshold. 25 is large enough to show that it works and low enough to be sure this test won't be flaky; the test shows about 33 accuracy.
}

fn sames_count(a: &[Vec<ScoredPointOffset>], b: &[Vec<ScoredPointOffset>]) -> usize {
a[0].iter()
We ask for the top 5 in the test but then just compare the first result between plain and indexed.
I find this implementation a bit misleading given the name.
Were the results really bad for the rest of the tops?
Search with the exact flag uses the original vectors instead of the quantized ones. That's why I compare HNSW search with plain search: it's a comparison between quantized and non-quantized data.
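The accuracy measure being debated boils down to an overlap count between two top-k id lists: exact (non-quantized) search versus HNSW over quantized data. A minimal sketch of that overlap, with `overlap` as an illustrative name rather than the test's actual helper:

```rust
/// How many ids from the exact top-k also appear in the quantized top-k.
pub fn overlap(exact_top: &[u32], quantized_top: &[u32]) -> usize {
    exact_top
        .iter()
        .filter(|id| quantized_top.contains(id))
        .count()
}
```

Dividing the overlap by k and by the number of queries yields the accuracy percentage the `min_acc` thresholds are compared against.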
offsets_path: &Path,
) -> OperationResult<()> {
let offsets_serialized = bincode::serialize(&self.offsets).map_err(|_| {
OperationError::service_error("Cannot serialize quantized multivector offsets")
Maybe we can have a `log::error` with the actual error?
I don't think serialization of `Vec<_>` will produce any unexpected errors. In any case, this error message will appear in the logs.
let mut file = File::open(offsets_path)?;
let mut offsets_serialized = Vec::new();
file.read_to_end(&mut offsets_serialized)?;
let offsets = bincode::deserialize(&offsets_serialized).map_err(|_| {
same comment about logging
I guess the error message will be shown in the logs anyway. I added the specific filename to the error message.
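What the logging suggestion amounts to can be shown in a small sketch: keep the generic service error, but surface the underlying error instead of discarding it with `|_|`. Here `eprintln!` stands in for `log::error!` so the sketch has no crate dependency, and `try_serialize` is a hypothetical stand-in for `bincode::serialize`.

```rust
// Stand-in for a fallible serializer; encoding a `Vec<u32>` as little-endian
// bytes here, which (as the reply notes) effectively never fails.
fn try_serialize(offsets: &[u32]) -> Result<Vec<u8>, String> {
    Ok(offsets.iter().flat_map(|o| o.to_le_bytes()).collect())
}

fn save_offsets(offsets: &[u32]) -> Result<Vec<u8>, String> {
    try_serialize(offsets).map_err(|err| {
        // Log the actual error before mapping it into the generic message.
        eprintln!("Cannot serialize quantized multivector offsets: {err}");
        String::from("Cannot serialize quantized multivector offsets")
    })
}
```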
match query {
QueryVector::Nearest(vector) => {
let query_scorer = QuantizedQueryScorer::<TElement, TMetric, _, _>::new_multi(
vector.try_into()?,
Maybe an explicit call to `try_from` would help code navigation?
fixed
Is it possible already to write a short integration test in our
Will do as a separate PR
* quantized multivector definition: new multiquantization scorers, encode query, maxsim for quantized vectors, remove obsolete todo, reuse existing scorers, create multivector quantized storage, save/load offsets, add test, fix vectors count, temporarily disable test while debugging, fix tests, fix build, less static lifetimes
* fix build after rebase
* add persistence test
* fix codespell
* increase accuracy in tests
* review remarks
* add comment references
* are you happy codespell
* don't use bincode
Add quantization support for multivectors.

The main idea is to reuse the existing quantization integration for multivectors. Encoded vectors are a struct which implements the `EncodedVectors` trait:
https://github.com/qdrant/quantization/blob/master/quantization/src/encoded_vectors.rs#L21

There are encoded-vector implementations for scalar, PQ, and binary quantization:
https://github.com/qdrant/quantization/blob/master/quantization/src/encoded_vectors_u8.rs#L262
https://github.com/qdrant/quantization/blob/master/quantization/src/encoded_vectors_pq.rs#L497
https://github.com/qdrant/quantization/blob/master/quantization/src/encoded_vectors_binary.rs#L160

Multivector encoding aggregates one of these structures into a generic encoded type:
`struct QuantizedMultivectorStorage<TEncodedQuery, QuantizedStorage: EncodedVectors<TEncodedQuery>>`
https://github.com/qdrant/qdrant/blob/basic-multidense-vectors-quantization/lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs#L20

It implements the `EncodedVectors` trait:
https://github.com/qdrant/qdrant/blob/basic-multidense-vectors-quantization/lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs#L136

So we can reuse the quantization scorers and extend `enum QuantizedVectorStorage`:
https://github.com/qdrant/qdrant/blob/basic-multidense-vectors-quantization/lib/segment/src/vector_storage/quantized/quantized_vectors.rs#L44

The encoded query type is `Vec<TEncodedQuery>`. This may not be efficient because the data is unflattened, but we keep it as is in this PR. The integration test covers SQ, PQ, BQ, HNSW search, and persistence.
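The overall design the description outlines can be condensed into one sketch: a container generic over any dense encoded storage, so a single wrapper serves SQ, PQ, and BQ, with MaxSim scoring built on the dense scorer plus offsets. The `Encoded` trait, `MultiStorage`, and `Toy` types below are simplified stand-ins for `EncodedVectors` and its implementors, not the real API.

```rust
use std::marker::PhantomData;

// Simplified stand-in for the EncodedVectors trait: score one encoded query
// against the inner vector at flat index `idx`.
trait Encoded<Q> {
    fn score(&self, query: &Q, idx: u32) -> f32;
}

// Generic multivector wrapper: any dense storage plus per-point offsets.
struct MultiStorage<Q, S: Encoded<Q>> {
    storage: S,
    offsets: Vec<(u32, u32)>, // (start, count) per point
    _q: PhantomData<Q>,
}

impl<Q, S: Encoded<Q>> MultiStorage<Q, S> {
    // MaxSim over the point's inner vectors, reusing the dense scorer.
    fn score_point_max_similarity(&self, query: &[Q], point: usize) -> f32 {
        let (start, count) = self.offsets[point];
        query
            .iter()
            .map(|q| {
                (start..start + count)
                    .map(|i| self.storage.score(q, i))
                    .fold(f32::MIN, f32::max)
            })
            .sum()
    }
}

// Toy dense storage for demonstration: one f32 per "vector".
struct Toy(Vec<f32>);
impl Encoded<f32> for Toy {
    fn score(&self, query: &f32, idx: u32) -> f32 {
        query * self.0[idx as usize]
    }
}
```

Because `MultiStorage` only relies on the trait, swapping `Toy` for an SQ, PQ, or BQ encoding changes nothing in the scoring path, which is the reuse the PR is after.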