Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(java): add table names java api #1279

Merged
merged 15 commits into from
May 24, 2024
Merged

Conversation

LuQQiu
Copy link
Collaborator

@LuQQiu LuQQiu commented May 8, 2024

Add lancedb-jni and table names API

@LuQQiu
Copy link
Collaborator Author

LuQQiu commented May 8, 2024

@QianZhu @eddyxu @walterddr PTAL the table names
I am trying to save the connection on JNI side so that the connection can be reused for create_table, open_table and other funcs
https://github.com/LuQQiu/lancedb/tree/addJavaAPI
but encountering some JNI crash issue, still debugging.

@LuQQiu
Copy link
Collaborator Author

LuQQiu commented May 8, 2024

DONE testing for this PR.
Lancedb requires all the tables in format of table_name.lance

@LuQQiu
Copy link
Collaborator Author

LuQQiu commented May 8, 2024

@QianZhu PTAL, thanks!

@LuQQiu it looks good overall. You will address some optimization and add tests cover some error cases right?

@QianZhu QianZhu requested review from QianZhu and eddyxu May 8, 2024 16:49
assertEquals("dataset_version", tableNames.get(0));
assertEquals("new_empty_dataset", tableNames.get(1));
assertEquals("test", tableNames.get(2));
assertEquals("write_stream", tableNames.get(3));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the returned table names sorted in a certain way?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's always in ascending order (alphabetical order), will add it to the method description


/// TODO(lu) import from lance-jni without duplicate
/// Extend JNIEnv with helper functions.
pub trait JNIEnvExt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will we add unit tests?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, i added some examples of test cases but haven't got time to add more.
https://github.com/lancedb/lance/blob/main/java/core/src/test/java/com/lancedb/lance/JNITest.java

Because testing JNIEnv requires the launch of an actual JVM, so i plan to add the test on the Java side.
Ideally the tests will be passing some object from Java to Rust, Rust side consumes the Java object, convert to rust object, and convert back to java object to return to the Java side. If anything is wrong, the test will fail

OtherLanceDB { message: String, location: Location },
}

impl Error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob it is easier to map errors (with an ErrorExt trait). You can see an example here:

impl<T> PythonErrorExt<T> for std::result::Result<T, LanceError> {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, i will test this tonight and create another PR for the error related changes

F: FnOnce(&mut JNIEnv, &JObject) -> Result<T>;
}

impl JNIEnvExt for JNIEnv<'_> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All functions are returning Result but do we need it if the result is always Ok?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using "Result" for convenient handle the returned result of the inner called methods. JNI needs special error handling differently, require calling env.throw_new(JAVA exception) to throw java exception to end user.

/**
* Represents LanceDB database.
*/
public class Database {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets make the naming conventions be consistent with the rest of the rust/python/nodejs packages.

YOu can take a look of https://docs.rs/lancedb/latest/lancedb/

Copy link

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks i will take a look

I am trying to save the connection on JNI side so that the connection can be reused for create_table, open_table and other funcs

i am pretty sure it is OK to cache the connection on the Java side as well as long as the API is clear :-P

@LuQQiu
Copy link
Collaborator Author

LuQQiu commented May 13, 2024

@QianZhu @eddyxu @walterddr PTAL, thanks!

import org.junit.jupiter.api.io.TempDir;

public class ConnectionTest {
@TempDir
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to close the connection?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the connections are closed in try (Connection conn = Connection.connect(databaseUri)) {

assertEquals("new_empty_dataset", tableNames.get(1));
}
}
}
Copy link
Contributor

@QianZhu QianZhu May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a test where both startAfter and limit are specified?

@github-actions github-actions bot added the Rust Rust related issues label May 14, 2024
Copy link

ACTION NEEDED

Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@LuQQiu LuQQiu changed the title feat(java) add table names java api feat(java): add table names java api May 14, 2024
@github-actions github-actions bot added the enhancement New feature or request label May 14, 2024
@LuQQiu
Copy link
Collaborator Author

LuQQiu commented May 14, 2024

@QianZhu Addressed the comments, added test cases, PTAL

Comment on lines +15 to +17
<artifactId>lancedb-core</artifactId>
<name>LanceDB Core</name>
<packaging>jar</packaging>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the relationship between this and the lancedb/lance/pull/2115?

AFAIK these cannot be identical. (maybe this should be lancedb-core and the other one should be lance-core?)

Copy link
Collaborator Author

@LuQQiu LuQQiu May 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@walterddr This is the LanceDB java module with LanceDB functionalities e.g. embeddings, table names
The other one is Lance java module with Lance Dataset/Fragment functionalities

Put it to the sub module to give space if we want to put other connectors e.g. Spark connectors directly in LanceDB java side

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok i see. so lancedb-core and lance-core are both core modules and doesn't depend on each other (and thus are independent basically?) nice i will pull in both in the trino side.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah yeah!! From the Java Side you need to import both
The only problem later is the packaged jar may be bigger, but we can consider that issue later

@LuQQiu LuQQiu merged commit db712b0 into lancedb:main May 24, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Rust Rust related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants