feat: Integration of VLM embedding model #446

Open
wants to merge 53 commits into
base: master

FUYICC
Contributor

@FUYICC FUYICC commented Feb 1, 2024

Description

Issue #445

Summary by CodeRabbit

  • New Features
    • Introduced CLIPEmbedding for image and text embedding functionalities.
  • Bug Fixes
    • Improved file encoding handling in license updates.
  • Tests
    • Added tests for the new CLIPEmbedding functionality, covering initialization, embedding processes, and output dimension retrieval.

@FUYICC FUYICC self-assigned this Feb 1, 2024
@FUYICC FUYICC linked an issue Feb 1, 2024 that may be closed by this pull request
4 tasks
@FUYICC FUYICC closed this Feb 4, 2024
@FUYICC FUYICC reopened this Feb 5, 2024
@FUYICC FUYICC marked this pull request as ready for review February 9, 2024 03:11
@Appointat
Member

@FUYICC Hi, is the PR still in progress? Let me know if you have any difficulties.

@FUYICC
Contributor Author

FUYICC commented Mar 27, 2024

> @FUYICC Hi, is the PR still in progress? Let me know if you have any difficulties.

Thank you for your kind help! Sorry, I've been working mostly on my dissertation for the past 3 weeks, so I haven't had time to move this forward. I'll be back on it starting next week, and we can discuss any questions anytime!

@FUYICC FUYICC changed the title from "Integration of CLIP embedding model" to "Integration of VLM embedding model" Apr 15, 2024
@FUYICC FUYICC requested a review from Wendong-Fan May 5, 2024 17:14
@Wendong-Fan Wendong-Fan added this to the Sprint 4 milestone May 25, 2024
Member

@Wendong-Fan Wendong-Fan left a comment

Thanks for the contribution, and sorry for the late review. I left some comments.

camel/embeddings/vlm_embedding.py (outdated)
Comment on lines 71 to 74
images=obj, return_tensors="pt", padding=True, **kwargs
)
image_feature = (
self.model.get_image_features(**input, **kwargs)
Member

Passing the same kwargs to both calls is redundant and could lead to unexpected behavior if kwargs contains overlapping keys.
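
A minimal sketch of this concern, in plain Python with no CLIP model involved. `fake_get_features`, `processor_output`, and `user_kwargs` are hypothetical stand-ins rather than code from this PR. Unpacking two dicts that share a key into a single call raises a `TypeError`, which is one way an overlap can surface:

```python
def fake_get_features(**kwargs):
    """Hypothetical stand-in for a model call; returns the keyword names it received."""
    return sorted(kwargs)


# Hypothetical processor output and user-supplied kwargs that happen to share a key.
processor_output = {"input_ids": [[1, 2, 3]], "attention_mask": [[1, 1, 1]]}
user_kwargs = {"attention_mask": [[1, 1, 0]]}


def call_with_shared_kwargs():
    # Mirrors the pattern under review: f(**input, **kwargs).
    return fake_get_features(**processor_output, **user_kwargs)


try:
    call_with_shared_kwargs()
    overlap_error = None
except TypeError as exc:
    # Python refuses to bind the same keyword twice.
    overlap_error = str(exc)

print(overlap_error)
```

When the keys do not collide, the call succeeds, but options meant only for the processor are still forwarded to the model, which may reject or misinterpret them.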

Member

How about separating kwargs into two dicts?

def embed_list(
    self,
    objs: List[Union[Image.Image, str]],
    processor_kwargs: dict = {},
    model_kwargs: dict = {},
) -> List[List[float]]:
            text_input = self.processor(
                text=obj,
                return_tensors="pt",
                padding=True,
                **processor_kwargs,
            )
            text_feature = (
                self.model.get_text_features(**text_input, **model_kwargs)
                .squeeze(dim=0)
                .tolist()
            )
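
A self-contained sketch of the suggestion above follows. It is not the PR's code: `StubProcessor`, `StubModel`, `SketchEmbedding`, and the `scale` option are hypothetical stand-ins so the example runs without `transformers` or `torch`, and it handles text inputs only. It also assumes `None` defaults in place of `{}`, so no mutable default is shared between calls.

```python
from typing import Any, Dict, List, Optional


class StubProcessor:
    """Hypothetical stand-in for a CLIP-style processor."""

    def __call__(self, text: str, **kwargs: Any) -> Dict[str, Any]:
        self.last_kwargs = kwargs  # recorded so the routing can be inspected
        return {"input_ids": [ord(ch) for ch in text]}


class StubModel:
    """Hypothetical stand-in for a model exposing get_text_features."""

    def get_text_features(self, input_ids: List[int], **kwargs: Any) -> List[float]:
        self.last_kwargs = kwargs
        scale = kwargs.get("scale", 1.0)  # made-up option, for illustration only
        return [float(len(input_ids)) * scale]


class SketchEmbedding:
    def __init__(self) -> None:
        self.processor = StubProcessor()
        self.model = StubModel()

    def embed_list(
        self,
        objs: List[str],
        processor_kwargs: Optional[Dict[str, Any]] = None,
        model_kwargs: Optional[Dict[str, Any]] = None,
    ) -> List[List[float]]:
        # Each dict is forwarded to exactly one callee, so an option meant
        # for the processor never reaches the model, and vice versa.
        processor_kwargs = processor_kwargs or {}
        model_kwargs = model_kwargs or {}
        results = []
        for obj in objs:
            text_input = self.processor(text=obj, **processor_kwargs)
            results.append(self.model.get_text_features(**text_input, **model_kwargs))
        return results


embedder = SketchEmbedding()
vectors = embedder.embed_list(
    ["cat", "horse"],
    processor_kwargs={"truncation": True},
    model_kwargs={"scale": 2.0},
)
print(vectors)
```

A key in `model_kwargs` that duplicates a processor output key would still raise a `TypeError`, but the caller now controls each call's options separately.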

Comment on lines 82 to 85
text=obj, return_tensors="pt", padding=True, **kwargs
)
text_feature = (
self.model.get_text_features(**input, **kwargs)
Member

Same as above.

camel/embeddings/vlm_embedding.py (outdated; five further resolved threads, collapsed)
@@ -59,7 +59,7 @@ pyowm = { version = "^3.3.0", optional = true }
googlemaps = { version = "^4.10.0", optional = true }
requests_oauthlib = { version = "^1.3.1", optional = true }
unstructured = { extras = ["all-docs"], version = "^0.10.30", optional = true }

pillow = { version = "^10.2.0", optional = true }
Member

Why do we need this library?

Contributor Author

Because we need to determine whether the input is an image in the VLM embedding class.

Member

Also add it under the tools and all entries in [tool.poetry.extras], as well as under [[tool.mypy.overrides]].

FUYICC and others added 6 commits May 27, 2024 20:57
Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
@Wendong-Fan Wendong-Fan changed the title from "Integration of VLM embedding model" to "feat: Integration of VLM embedding model" May 28, 2024
Labels
Embeddings; lgtm (This PR has been approved by a maintainer); size:L (This PR changes 100-499 lines, ignoring generated files)
Projects
Status: Reviewing
Development

Successfully merging this pull request may close these issues.

[Feature Request] Multi-modal RAG(Retrieval-Augmented Generation)
6 participants