Float16 integration and API #4234
The cosine metric was missing; added it.
Does preprocess work for non-float32 types? I can add tests / SIMD implementations for the cosine metric if helpful.
Distance::Manhattan => <ManhattanMetric as Metric<VectorElementType>>::preprocess(v),
};
Cow::from(preprocessed_vector)
Cow::Owned(vector.iter().map(|&x| f16::to_f32(x)).collect_vec())
It was an incorrect implementation. quantization_preprocess is a trick for the byte type, where we need to apply -127 for binary quantization. Here we only need a conversion into f32.
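To illustrate the distinction made in this comment, here is a minimal sketch that uses only the standard library. It is not the code from this PR: `byte_shifted`, `half_to_f32`, and `f16_bits_to_f32` are hypothetical names, f16 values are represented as raw `u16` bit patterns instead of the `f16` type used in the diff, and the reading of "apply -127" as subtracting 127 from each byte is an assumption.

```rust
use std::borrow::Cow;

/// Decode an IEEE 754 binary16 bit pattern into f32 (hypothetical helper;
/// the PR itself calls `f16::to_f32`).
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = if bits & 0x8000 != 0 { -1.0f32 } else { 1.0 };
    let exp = ((bits >> 10) & 0x1f) as i32;
    let frac = (bits & 0x3ff) as f32;
    match exp {
        // Zero and subnormals: frac * 2^-24
        0 => sign * frac * 2f32.powi(-24),
        // Infinity or NaN
        0x1f => {
            if frac == 0.0 {
                sign * f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: (1 + frac / 2^10) * 2^(exp - 15)
        e => sign * (1.0 + frac / 1024.0) * 2f32.powi(e - 15),
    }
}

/// Assumed byte-type trick: shift each value by -127 so the data is
/// roughly centered around zero before binary quantization.
fn byte_shifted(vector: &[u8]) -> Cow<'_, [f32]> {
    Cow::Owned(vector.iter().map(|&x| x as f32 - 127.0).collect())
}

/// Float16 case: a plain widening conversion to f32, with no shift.
fn half_to_f32(vector: &[u16]) -> Cow<'_, [f32]> {
    Cow::Owned(vector.iter().map(|&b| f16_bits_to_f32(b)).collect())
}
```

For example, `half_to_f32(&[0x3800, 0xC000])` yields `[0.5, -2.0]`, while `byte_shifted(&[0, 127, 255])` yields `[-127.0, 0.0, 128.0]` under the assumed shift.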
👍
VectorStorageDatatype::Float16,
64,
20
)]
fn test_byte_storage_hnsw(
I extended the existing byte-vector test to cover HNSW.
@@ -166,6 +194,7 @@ fn sames_count(a: &[Vec<ScoredPointOffset>], b: &[Vec<ScoredPointOffset>]) -> us
)]
fn test_byte_storage_binary_quantization_hnsw(
#[case] query_variant: QueryVariant,
#[case] storage_data_type: VectorStorageDatatype,
I extended the existing byte-vector test to cover quantization.
@@ -46,40 +48,68 @@ enum QuantizationVariant {
Binary,
}

fn random_discovery_query<R: Rng + ?Sized>(rnd: &mut R, dim: usize) -> QueryVector {
fn random_vector<R>(rnd_gen: &mut R, dim: usize, data_type: VectorStorageDatatype) -> DenseVector
Because we test binary quantization, we need to generate vectors that exercise the binary condition. For u8 we generate vectors in the range [0; 255]; for f16 we generate vectors in the range [-0.5; 0.5].
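The per-type ranges described here can be sketched roughly as follows. This is an illustration only, not the test helper from the PR: `Datatype`, `XorShift`, and `sketch_random_vector` are hypothetical names, and a small xorshift generator stands in for the `Rng` used in the real test so that the block needs no external crates.

```rust
/// Hypothetical stand-in for the storage datatype enum used in the test.
#[derive(Clone, Copy)]
enum Datatype {
    Uint8,
    Float16,
}

/// Tiny xorshift64 generator (assumption: any uniform RNG would do).
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    /// Returns a value in [0, 1) built from the top 24 bits of the state.
    fn next_unit(&mut self) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 40) as f32 / (1u64 << 24) as f32
    }
}

/// Generate a vector whose value range depends on the storage datatype:
/// whole numbers in [0; 255] for bytes, values in [-0.5; 0.5] for f16.
fn sketch_random_vector(rng: &mut XorShift, dim: usize, data_type: Datatype) -> Vec<f32> {
    (0..dim)
        .map(|_| {
            let u = rng.next_unit();
            match data_type {
                Datatype::Uint8 => (u * 256.0).floor().min(255.0),
                Datatype::Float16 => u - 0.5,
            }
        })
        .collect()
}
```

Calling `sketch_random_vector(&mut XorShift(42), 64, Datatype::Float16)` returns 64 values centered around zero, so both signs are expected to occur, which is presumably what lets binary quantization separate the components.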
c129559 to b014ed2
* f16 integration tests api fix test are you happy clippy
* fix build
Float16 REST and gRPC, storages, scorers, and constructors
All Submissions:
dev branch. Did you create your branch from dev?
New Feature Submissions:
cargo +nightly fmt --all command prior to submission?
cargo clippy --all --all-features command?
Changes to Core Features: