
ENH: Support concurrent embedding, update LangChain QA demo with multithreaded embedding creation #348

Open
wants to merge 13 commits into main
Conversation

@jiayini1119 (Contributor) commented Aug 14, 2023

No description provided.

@XprobeBot XprobeBot added the enhancement New feature or request label Aug 14, 2023
@XprobeBot XprobeBot added this to the v0.2.0 milestone Aug 14, 2023
Review comment on xinference/core/model.py (outdated, resolved)
@XprobeBot XprobeBot modified the milestones: v0.2.0, v0.2.1 Aug 21, 2023
@aresnow1 (Contributor) commented:
Embedding is a CPU-intensive call, and even on a stateless actor, requests are not executed concurrently because the event loop is blocked until the first call returns. The embedding operation therefore needs to be wrapped with `to_thread` in the model actor. However, when I tried this, embedding turned out not to be thread-safe for llamacpp: the process core-dumps when it is called concurrently.
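The pattern described above, offloading the blocking embedding call with `asyncio.to_thread` so the actor's event loop stays responsive, can be sketched as follows. `compute_embedding` is a hypothetical stand-in for the real model call, not xinference's actual API:

```python
import asyncio

# Hypothetical stand-in for the CPU-bound embedding computation.
# A real backend would run e.g. a PyTorch model's encode() here.
def compute_embedding(texts):
    return [[float(len(t))] for t in texts]

async def create_embedding(texts):
    # Offload the blocking call to a worker thread so the event loop
    # is free to accept other requests while embedding runs.
    return await asyncio.to_thread(compute_embedding, texts)

async def main():
    # Two embedding requests can now overlap instead of serializing
    # behind the event loop.
    return await asyncio.gather(
        create_embedding(["hello"]),
        create_embedding(["world", "again"]),
    )

results = asyncio.run(main())
print(results)
```

Note that `to_thread` only removes the event-loop bottleneck; it does not make the underlying backend thread-safe, which is exactly the llamacpp problem mentioned above.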

@jiayini1119 (Contributor, Author) replied:
We can first try supporting concurrent embedding creation for PyTorch models.

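One way to reconcile the two comments above, concurrent embedding for thread-safe backends such as PyTorch models while still serializing backends like llamacpp, is a per-model lock that is only taken for unsafe backends. The class and method names below are illustrative, not xinference's real interface:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class EmbeddingModel:
    """Hypothetical wrapper: runs embedding calls concurrently for
    thread-safe backends, and serializes them behind a lock for
    backends that crash under concurrent calls (e.g. llamacpp)."""

    def __init__(self, thread_safe: bool):
        self._thread_safe = thread_safe
        self._lock = threading.Lock()

    def _embed(self, text):
        # Stand-in for the real CPU-bound embedding computation.
        return [float(len(text))]

    def create_embedding(self, text):
        if self._thread_safe:
            # e.g. PyTorch backend: safe to run from many threads.
            return self._embed(text)
        # e.g. llamacpp backend: one call at a time.
        with self._lock:
            return self._embed(text)

model = EmbeddingModel(thread_safe=True)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(model.create_embedding, ["a", "bb", "ccc"]))
print(results)
```

With `thread_safe=False` the same code still works but the lock forces the calls to run one at a time, avoiding the core dump at the cost of throughput.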

@XprobeBot repeatedly moved this PR's milestone between Sep 5, 2023 and May 31, 2024 (from v0.2.1 through to v0.11.4), with the v0.4.4 milestone removed on Sep 19, 2023.
Labels: enhancement (New feature or request)
Projects: none yet
Linked issues: none

4 participants