doc(js): remove duplicate legacy require("vectordb") for installation #1004
Open: changhiskhan wants to merge 382 commits into main from changhiskhan/update-example
Use pathlib for local paths so that the correct separator is used on Windows. Closes #703

Co-authored-by: Will Jones <willjones127@gmail.com>
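As an illustration of the pathlib change described above, here is a minimal sketch. The helper name `table_path` and the `.lance` suffix are assumptions made for the example, not taken from the commit. It uses the pure path classes so both separator styles can be shown on any platform.

```python
from pathlib import PurePosixPath, PureWindowsPath


def table_path(base, name, windows=False):
    """Join a database directory and a table name (hypothetical helper).

    The ".lance" suffix is an assumption for illustration only.
    """
    # The path class, not the caller, decides which separator to use.
    flavour = PureWindowsPath if windows else PurePosixPath
    return str(flavour(base) / f"{name}.lance")


print(table_path("data/db", "my_table"))
print(table_path(r"C:\data\db", "my_table", windows=True))
```

In real code `pathlib.Path` would pick the flavour for the running platform automatically, so the `windows` flag would not be needed.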
This is a pretty direct binding to the underlying lance capability
This command hasn't been run for a while...
Fix some grammar, punctuation, and spelling errors.
For object detection, each row may correspond to an image and each image can have multiple bounding boxes of x-y coordinates. This means that a `bbox` field is potentially "list of list of float". This adds support in our pydantic-pyarrow conversion for nested lists.
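To make the "list of list of float" case concrete, here is a rough sketch of how a recursive annotation walk can handle arbitrary nesting. It is not the actual pydantic-pyarrow conversion: `arrow_type_name` and the scalar mapping are hypothetical, and the function returns a type description string rather than a real pyarrow type.

```python
from typing import List, get_args, get_origin

# Hypothetical scalar mapping; a real conversion would return pyarrow types.
_SCALARS = {float: "double", int: "int64", str: "string", bool: "bool"}


def arrow_type_name(annotation):
    """Describe a (possibly nested) list annotation as an Arrow-style type string."""
    if get_origin(annotation) is list:
        # Recurse into the item type so List[List[float]] nests correctly.
        (item,) = get_args(annotation)
        return f"list<{arrow_type_name(item)}>"
    try:
        return _SCALARS[annotation]
    except KeyError:
        raise TypeError(f"unsupported annotation: {annotation!r}") from None


# A bbox field holding several boxes per image.
print(arrow_type_name(List[List[float]]))
```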
Closes #721

FTS returns results as a pyarrow table. Pyarrow tables have a `filter` method, but it does not take SQL filter strings (only pyarrow compute expressions). Instead, we do one of two things to support `tbl.search("keywords").where("foo=5").limit(10).to_arrow()`:

* Default path: if duckdb is available, use duckdb to execute the SQL filter string on the pyarrow table.
* Backup path: otherwise, write the pyarrow table to a lance dataset and then do `to_table(filter=<filter>)`.

Neither is ideal. The default path has two issues:

1. It requires installing an extra library (duckdb).
2. duckdb mangles some fields (like fixed size list => list).

The backup path incurs a latency penalty (~20ms on SSD) to write the result set to disk.

In the short term, once #676 is addressed, we can write the dataset to "memory://" instead of disk, which makes the post-filter evaluate much more quickly (ETA next week).

In the longer term, we'd like to be able to evaluate the filter string on the pyarrow Table directly. One possibility is to use Substrait to generate pyarrow compute expressions from the SQL string. Or, if there's enough progress on pyarrow, it could support Substrait expressions directly (no ETA).

Co-authored-by: Will Jones <willjones127@gmail.com>
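The choice between the default and backup paths can be sketched as an optional-dependency check. This is only an assumption about how the dispatch might look, and `pick_filter_backend` is a hypothetical name. It covers just the selection step, not the filtering itself.

```python
from importlib.util import find_spec


def pick_filter_backend(find=find_spec):
    """Return "duckdb" if the optional dependency is importable, else "lance".

    `find` is injectable so the choice can be exercised without installing
    or uninstalling duckdb.
    """
    return "duckdb" if find("duckdb") is not None else "lance"


print(pick_filter_backend())
```

Checking with `find_spec` avoids importing duckdb just to learn whether it is installed.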
If you add timezone information in the Field annotation for a datetime, it will now be passed to the pyarrow data type. I'm not sure how pyarrow enforces timezones; right now, it silently coerces to the timezone given in the column regardless of whether the input had a matching timezone. This is probably not the right behavior, though we could require the user to do the validation in the pydantic model instead of at the pyarrow conversion layer.
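One way the model-level validation mentioned above could look is sketched below with the standard library only. `require_timezone` is a hypothetical helper, not part of lancedb or pydantic. It rejects mismatched input instead of coercing it, and could be called from a pydantic validator.

```python
from datetime import datetime, timedelta, timezone


def require_timezone(value, expected):
    """Reject naive datetimes and datetimes whose UTC offset differs from `expected`."""
    if value.tzinfo is None:
        raise ValueError("naive datetime; an explicit timezone is required")
    # Compare offsets at the same instant so DST-aware zones are handled too.
    if value.utcoffset() != value.astimezone(expected).utcoffset():
        raise ValueError(f"expected timezone {expected}, got {value.tzinfo}")
    return value


utc = timezone.utc
minus_five = timezone(timedelta(hours=-5))
require_timezone(datetime(2024, 1, 1, tzinfo=utc), utc)  # passes
# require_timezone(datetime(2024, 1, 1, tzinfo=minus_five), utc) raises ValueError
```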
The OpenAI API has changed significantly; notably, `openai.Embedding.create` no longer exists (openai/openai-python#742). Update the OpenAI embedding function and set a minimum openai SDK version.
Issue separate requests under the hood and concatenate the results.
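The commit message does not say which operation is split, so the following is only a generic sketch of the pattern. `fetch_all`, `fetch_batch`, and `batch_size` are hypothetical names, and `fetch_batch` stands in for whatever single request is being issued.

```python
def fetch_all(items, fetch_batch, batch_size=100):
    """Split `items` into batches, call `fetch_batch` once per batch, and
    concatenate the per-batch results in order."""
    results = []
    for start in range(0, len(items), batch_size):
        results.extend(fetch_batch(items[start:start + batch_size]))
    return results


# Stand-in for a real request: doubles each input.
print(fetch_all([1, 2, 3, 4, 5], lambda batch: [x * 2 for x in batch], batch_size=2))
```

The caller sees a single list, so the batching stays an implementation detail.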
Co-authored-by: Aidan <64613310+aidangomar@users.noreply.github.com>
I found it cumbersome to read through the documentation and have to search for which submodule each class should be imported from. For example, you have to navigate to another documentation page to find out that `EmbeddingFunctionRegistry` is from `lancedb.embeddings`.
If the input text is None, Tantivy raises an error complaining that it cannot add a NoneType. We handle this upstream so that None values are not added to the document. If all of the indexed fields are None, we skip the document.
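The upstream handling described above might look roughly like the sketch below. It does not call Tantivy. `indexable_fields` and `documents_to_index` are hypothetical names, and rows are assumed to be plain dicts.

```python
def indexable_fields(row, fields):
    """Keep only the indexed fields whose value is not None."""
    return {name: row[name] for name in fields if row.get(name) is not None}


def documents_to_index(rows, fields):
    """Yield one field dict per row, skipping rows where every indexed field is None."""
    for row in rows:
        doc = indexable_fields(row, fields)
        if doc:  # empty dict means all indexed fields were None
            yield doc


rows = [
    {"title": "a", "body": None},
    {"title": None, "body": None},
    {"title": "b", "body": "c"},
]
print(list(documents_to_index(rows, ["title", "body"])))
```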
This also renames the new experimental node package to lancedb. The classic node package remains named vectordb. The goal here is to avoid introducing piecemeal breaking changes to the vectordb crate. Instead, once the new API is stabilized, we will officially release the lancedb crate and deprecate the vectordb crate. The same pattern will eventually happen with the npm package vectordb.
eddyxu approved these changes on Feb 23, 2024
westonpace approved these changes on Feb 23, 2024
…984) BREAKING CHANGE: users will now need to npm install `apache-arrow` and `@apache-arrow/ts` themselves.
This changes `lancedb` from a "pure python" setuptools project to a maturin project and adds a rust lancedb dependency. The async python client is extremely minimal (only `connect` and `Connection.table_names` are supported). The purpose of this PR is to get the infrastructure in place for building out the rest of the async client. Although this is not technically a breaking change (no APIs are changing) it is still a considerable change in the way the wheels are built because they now include the native shared library.
…table (#1022) This will work after upgrading lance once lancedb/lance#1995 is merged. See #884 for details.

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
upgrade to lance 0.10.1 and update doc string to reflect dynamic projection options
#1036) A simple basic usage example that installs the dependencies necessary to use FTS and hybrid search.

Co-authored-by: Nat Roth <natroth@Nats-MacBook-Pro.local>
Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
This will eventually replace the remote table implementations in python and node.
…PI (#1031) I've also started `ASYNC_MIGRATION.MD` to keep track of the breaking changes from sync to async python.
…1049) small QoL improvement
typo and broken table
changhiskhan force-pushed the changhiskhan/update-example branch from b7a83a2 to f718e0e on March 3, 2024 23:23
The renaming of `vectordb` to `lancedb` broke the [quick start docs](https://lancedb.github.io/lancedb/basic/#__tabbed_5_3) (it's pointing to a non-existent directory). This PR fixes the code snippets and the paths in the docs page. Additionally, more fixes related to indexing docs below 👇🏽.
changhiskhan force-pushed the changhiskhan/update-example branch from f718e0e to 97bca0c on March 4, 2024 01:08
alexkohler pushed a commit to alexkohler/lancedb that referenced this pull request on Apr 20, 2024

* quick documentation for merge method
* format and check
No description provided.