Contribute the WordLift Vector Store #13028

ziodave · 2024-04-22T15:48:22Z

Description

Add support for the WordLift Vector Store.

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

Yes
No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

Yes
No

Type of Change

New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Added new unit/integration tests
Added new notebook (that tests end-to-end)
I stared at the code and made sure it makes sense

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran make format; make lint to appease the lint gods

review-notebook-app · 2024-04-22T15:48:28Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

logan-markewich · 2024-04-23T19:49:56Z

@ziodave Seems like the tests aren't working. Have you tried running them locally?

ziodave · 2024-04-24T05:15:26Z

> @ziodave Seems like the tests aren't working. Have you tried running them locally?

@logan-markewich yes, sorry, I converted the PR to draft until we fix it. One question: we recreated the ubuntu-latest-unit-tester in our organization's GH runners so that the GH Actions can run on our fork, see https://github.com/wordlift/llama_index/actions/runs/8800952266.

Oddly enough, the tests pass there. Is there a special configuration we need to apply to the GH runner? This is basically the configuration we used:

cc @EthanWordlift

logan-markewich · 2024-04-25T04:56:52Z

@ziodave I don't think any special config is needed.

The errors in the test seem unrelated to env though

# Get the key to use for the operation.
>       key = await self.key_provider.for_query(query)
E       TypeError: object str can't be used in 'await' expression
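
A minimal sketch of one common way this error arises, assuming the test replaces the async key provider with a plain Mock whose method returns a str directly. Only for_query comes from the traceback above; the get_key helper and the "secret" value are hypothetical and are not taken from the PR.

```python
import asyncio
from unittest.mock import AsyncMock, Mock


async def get_key(key_provider, query):
    # Mirrors the failing line: the provider method is awaited.
    return await key_provider.for_query(query)


# A plain Mock returns the str itself, and a str cannot be awaited.
sync_provider = Mock()
sync_provider.for_query.return_value = "secret"

raised_type_error = False
try:
    asyncio.run(get_key(sync_provider, "some query"))
except TypeError:
    raised_type_error = True

# An AsyncMock returns a coroutine that resolves to the configured value.
async_provider = AsyncMock()
async_provider.for_query.return_value = "secret"
key = asyncio.run(get_key(async_provider, "some query"))
```

If the provider is meant to be async, mocking it with AsyncMock (or with an async def stub) keeps the test consistent with the awaited call.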

ziodave · 2024-04-25T05:02:36Z

@logan-markewich we're on it, we'll update the PR soon. Thanks!

ziodave · 2024-04-25T12:11:42Z

@logan-markewich we're ready for review 🙏

llama-index-integrations/vector_stores/llama-index-vector-stores-wordlift/README.md

logan-markewich · 2024-04-26T20:55:12Z

.../vector_stores/llama-index-vector-stores-wordlift/llama_index/vector_stores/wordlift/base.py

+    def __init__(self, key: str):
+        self.key = key
+
+    async def for_add(self, nodes: List[BaseNode]) -> str:


Curious why these methods have to be async? Or what this is even for, actually (they all return the same thing 👀)

ossaleon · 2024-05-13

It's async because the API key is stored on a different server and accessed through an API.
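
To illustrate the reply, here is a rough sketch of why an async key provider interface can make sense, assuming one implementation holds a fixed key and another looks it up remotely. The class names, the fetch_key callable and the fake_fetch stand-in are hypothetical; only for_add, for_query and the key argument appear in the diff and traceback in this thread.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


class StaticKeyProvider:
    """Always returns the same key; async only to match the interface."""

    def __init__(self, key: str):
        self.key = key

    async def for_add(self, nodes: List[Any]) -> str:
        return self.key

    async def for_query(self, query: Any) -> str:
        return self.key


class RemoteKeyProvider:
    """Hypothetical provider that resolves the key through an async lookup."""

    def __init__(self, fetch_key: Callable[[str], Awaitable[str]]):
        self._fetch_key = fetch_key

    async def for_add(self, nodes: List[Any]) -> str:
        return await self._fetch_key("add")

    async def for_query(self, query: Any) -> str:
        return await self._fetch_key("query")


async def fake_fetch(operation: str) -> str:
    # Stand-in for an HTTP call to the server that stores the key.
    await asyncio.sleep(0)
    return f"key-for-{operation}"


async def main():
    static_key = await StaticKeyProvider("my-key").for_query("q")
    remote_key = await RemoteKeyProvider(fake_fetch).for_query("q")
    return static_key, remote_key


static_key, remote_key = asyncio.run(main())
```

Callers await both providers the same way, so the store does not need to know whether the key is local or fetched over the network.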

logan-markewich · 2024-04-26T20:56:49Z

...s/vector_stores/llama-index-vector-stores-wordlift/examples/wordlift_vector_store_demo.ipynb

so if I'm understanding properly, I can't use just any embedding model with wordlift, it has to be a specific one.

I'm unfamiliar with wordlift, but what's the reason for that restriction?

ossaleon · 2024-05-13

It's not necessary; we removed every reference to a specific embedding model.

logan-markewich · 2024-04-26T20:57:48Z

.../vector_stores/llama-index-vector-stores-wordlift/llama_index/vector_stores/wordlift/base.py

+                node_id=node.node_id,
+                embeddings=node.get_embedding(),
+                text=node.get_content(metadata_mode=MetadataMode.NONE) or "",
+                metadata={},


any reason to not store the node metadata here?

ossaleon · 2024-05-13

We don't have any metadata to store.

logan-markewich · 2024-04-26T20:58:45Z

.../vector_stores/llama-index-vector-stores-wordlift/llama_index/vector_stores/wordlift/base.py

+        for node in nodes:
+            node_dict = node.dict()
+            metadata: Dict[str, Any] = node_dict.get("metadata", {})
+            entity_id = metadata.get("entity_id", None)


is an entity id required? What happens if it's not provided?

(I really should go read more about wordlift haha)

ossaleon · 2024-05-13

WordLift Knowledge Graphs are built on Linked Data principles, where each entity is assigned a permanent, dereferenceable URI. When adding nodes to an existing Knowledge Graph, it's essential to include an "entity_id" in the metadata of each loaded document. In the future we may provide an endpoint that generates the ID.
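
As a sketch of the requirement described above, the helper below reads "entity_id" from each node's metadata and fails early when it is missing. It works on plain dicts shaped like the node.dict() output in the diff. The helper name, the ValueError behavior and the example.org URIs are assumptions for illustration; the thread does not say what the store actually does when the ID is absent.

```python
from typing import Any, Dict, List


def extract_entity_ids(node_dicts: List[Dict[str, Any]]) -> List[str]:
    """Collect the entity_id of every node, raising if one is missing."""
    entity_ids = []
    for index, node_dict in enumerate(node_dicts):
        metadata: Dict[str, Any] = node_dict.get("metadata", {})
        entity_id = metadata.get("entity_id")
        if not entity_id:
            raise ValueError(
                f"node at index {index} has no 'entity_id' in its metadata"
            )
        entity_ids.append(entity_id)
    return entity_ids


# Hypothetical documents; the URIs are placeholders, not real entities.
valid_nodes = [
    {"text": "First doc", "metadata": {"entity_id": "https://data.example.org/entity-1"}},
    {"text": "Second doc", "metadata": {"entity_id": "https://data.example.org/entity-2"}},
]
ids = extract_entity_ids(valid_nodes)

missing_error = None
try:
    extract_entity_ids([{"text": "No ID", "metadata": {}}])
except ValueError as exc:
    missing_error = str(exc)
```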

logan-markewich · 2024-04-26T20:59:59Z

...or_stores/llama-index-vector-stores-wordlift/manager_client/api/vector_search_queries_api.py

+from manager_client.rest import RESTResponseType
+
+
+class VectorSearchQueriesApi:


Any plans to publish this as its own package? Like a client SDK? It's fine if it lives here for now though.

ossaleon · 2024-05-13

Yes, we'll publish the API as its own package in the future.

logan-markewich

This looks mostly good to me, just worried about a few UX things that might trip up users. I wonder if we can smooth out some of this or not?

ziodave · 2024-04-27T04:14:04Z

> This looks mostly good to me, just worried about a few UX things that might trip up users. I wonder if we can smooth out some of this or not?

Hello @logan-markewich yes, we'll review your comments and be back soon with updates and answers.

ossaleon and others added 18 commits March 25, 2024 14:03

created main structure of the package

287ddd6

unit test main structure, example notebook draft

9a41faa

unit tests, fixed generated apis

65a38ef

excluded client code from precommit

cf68a0d

testing exclusion of api client

d98b3a1

refactored test to pass ruff hook

9800df6

changed test names

5e01411

exclude manager client from pre-commit-config

8271f84

update vector store, change folder structure

ae0f23b

added init to utils

56b3681

removed async from query function

51b65c9

updated example notebook

27003b4

pointed developers to wordlift website in example notebook

f0ac272

Merge branch 'run-llama:main' into feature/12107-add-support-for-the-wordlift-vector-store

360017d

Merge branch 'main' of github.com:wordlift/llama_index into feature/12107-add-support-for-the-wordlift-vector-store

5d532e4

Merge branch 'feature/12107-add-support-for-the-wordlift-vector-store' of github.com:wordlift/llama_index into feature/12107-add-support-for-the-wordlift-vector-store

02b082e

Merge branch 'main' of github.com:wordlift/llama_index into feature/12107-add-support-for-the-wordlift-vector-store

9fcd917

Merge pull request #1 from wordlift/feature/12107-add-support-for-the-wordlift-vector-store
Feature/12107 add support for the wordlift vector store

05fa5a2

dosubot bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) Apr 22, 2024

EthanWordlift and others added 6 commits April 23, 2024 09:51

Update file end with new line

03d79e2

Update Build file

a6a6de1

Remove unused files and add BUILD files

110eb53

Merge branch 'main' of https://github.com/run-llama/llama_index

7b8f465

restore file

599d559

add aiohttp-retry

3dfab8c

ziodave marked this pull request as draft April 24, 2024 05:11

Merge branch 'run-llama:main' into main

77fbcbd

logan-markewich closed this Apr 25, 2024

logan-markewich reopened this Apr 25, 2024

Update test mock

71a5825

ziodave marked this pull request as ready for review April 25, 2024 12:11

Merge branch 'main' of https://github.com/run-llama/llama_index

9a973fb

logan-markewich reviewed Apr 26, 2024

llama-index-integrations/vector_stores/llama-index-vector-stores-wordlift/README.md (outdated)

logan-markewich reviewed Apr 26, 2024

EthanWordlift and others added 3 commits May 3, 2024 16:18

Merge branch 'main' of https://github.com/run-llama/llama_index

5c7ca29

Merge branch 'run-llama:main' into main

1fd1da4

Revert "Merge branch 'run-llama:main' into main"

7a94a71

This reverts commit 1fd1da4, reversing changes made to 5c7ca29.
