Watchdog does not kill an idle sentencetransformers backend #2277

Open
joseluisll opened this issue May 9, 2024 · 3 comments
Comments

@joseluisll

Environment, Container Image, Hardware
LocalAI version: 2.14.0

Container Image: localai/localai:v2.14.0-cublas-cuda12-ffmpeg

NVIDIA CUDA 12, Intel x86_64, Ubuntu 22.04 LTS.

Linux makinota08 6.5.0-28-generic 29~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Apr 4 14:39:20 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

16 GB RAM, NVIDIA GeForce GTX 1660 Ti 6 GB, Acer Nitro laptop

Describe the bug
I have LocalAI deployed in a Docker container. I load a sentence-transformers embeddings model and test it successfully using curl. Then I wait for 5 minutes and check in the log that the watchdog tries to kill the process. The log shows that the process is successfully killed, and the watchdog no longer detects the idle connection.
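For reference, the curl test was along these lines (a minimal sketch: the default localhost:8080 endpoint, the model name from the config below, and the input text are assumptions, not the exact request used):

curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "e5-large", "input": "query: hello world"}'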

However, if I run nvidia-smi, I can see the Python process still stuck and loaded into the GPU's memory.

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1660 Ti    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   45C    P8              3W /  80W  |   2268MiB /  6144MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      5574      G   /usr/lib/xorg/Xorg                            4MiB |
|    0   N/A  N/A     25660      C   python                                     2260MiB |
+---------------------------------------------------------------------------------------+

Also, with ps -fe you can see that the process is still there:

root 25660 24649 0 11:33 pts/0 00:00:09 python /build/backend/python/sentencetransformers/sentencetransformers.py --addr 127.0.0.1:39807
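As a manual workaround until this is fixed, the orphaned backend can be cleared by hand (sketch only; the PID is taken from the ps output above):

kill 25660        # ask the orphaned python backend to terminate (SIGTERM)
sleep 2
nvidia-smi        # confirm the 2260MiB allocation is gone
kill -9 25660     # escalate to SIGKILL only if the process ignored SIGTERM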

To Reproduce
Activate the watchdog using the following environment variables (a docker run sketch follows the list):

WATCHDOG_BUSY=true
WATCHDOG_BUSY_TIMEOUT=30m
WATCHDOG_IDLE=true
WATCHDOG_IDLE_TIMEOUT=5m
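A sketch of passing these variables to the container (the image tag is the one from the environment section above; the port mapping and GPU flag are assumptions):

docker run --gpus all -p 8080:8080 \
  -e WATCHDOG_BUSY=true \
  -e WATCHDOG_BUSY_TIMEOUT=30m \
  -e WATCHDOG_IDLE=true \
  -e WATCHDOG_IDLE_TIMEOUT=5m \
  localai/localai:v2.14.0-cublas-cuda12-ffmpeg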

Test the sentencetransformers model with the following YAML config:

name: e5-large
backend: sentencetransformers
embeddings: true
parameters:
  model: intfloat/multilingual-e5-large
context_size: 1024

Wait for 5 minutes.

Expected behavior

The sentencetransformers backend should have been killed by the watchdog after 5 minutes, so the GPU memory is cleared.

Logs
test of embeddings model.txt

@joseluisll added the bug (Something isn't working) and unconfirmed labels on May 9, 2024
@localai-bot
Contributor

It seems like there might be an issue with the watchdog not effectively killing the sentence transformers backend process. To investigate further, we will need to check if any other part of the code or configuration is keeping the process alive or if there are any other threads associated with the process that are preventing the GPU memory from being cleared. We'll also review the logs and related code to pinpoint the root cause of the problem.
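One plausible mechanism worth checking: if the watchdog signals only the immediate child it spawned (for example a shell wrapper that in turn launched the Python interpreter), the grandchild survives and keeps its GPU allocation. A sketch of taking down the whole process group instead (illustrative only; the wrapper assumption is unverified, and the PID is the one from the report):

PID=25660                                  # python backend PID from the ps output above
PGID=$(ps -o pgid= -p "$PID" | tr -d ' ')  # look up its process group
kill -TERM -- "-$PGID"                     # a negative target signals the whole group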

@joseluisll
Author

joseluisll commented May 9, 2024

I have also tested the watchdog with a chat model ('mistral-openorca'), and it works fine; the process is cleared after 5 minutes idle.

@xjm1285

xjm1285 commented May 13, 2024

Same issue when using the vllm backend.
