The VRAM usage is too high and cannot be released. #67

Open

QL-boy opened this issue Apr 10, 2024 · 1 comment

Comments

@QL-boy commented Apr 10, 2024

[Screenshots: LLM VRAM usage after one workflow run]

The model is not unloaded from VRAM after each generation, and using multiple identical nodes loads the model multiple times, resulting in high VRAM usage.
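
For context, a minimal sketch of how the duplicate loads could be avoided with a module-level cache shared by identical nodes. This is illustrative only: the cache and the use of `AutoModelForCausalLM` as the loader are assumptions, not this repo's actual code.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical module-level cache: identical nodes reuse one loaded
# model instead of each holding its own copy in VRAM.
_MODEL_CACHE = {}

def get_model(model_path: str, device: str = "cuda"):
    """Load the model on first request; every later call reuses it."""
    if model_path not in _MODEL_CACHE:
        model = AutoModelForCausalLM.from_pretrained(
            model_path, torch_dtype=torch.float16
        )
        _MODEL_CACHE[model_path] = model.to(device)
    return _MODEL_CACHE[model_path]
```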

The screenshots show the LLM's VRAM usage after running the workflow once on a fresh boot, with the SD model already unloaded automatically.

Even using --disable-smart-memory doesn't help.

Even on a 4090, this consumption is too much to bear.

Is there any way to automatically unload the model from VRAM after each generation? Or is there another solution that could reduce the model's VRAM usage?

QL-boy changed the title from "The video memory usage is too high and cannot be released." to "The VRAM usage is too high and cannot be released." on Apr 13, 2024
@gokayfem (Owner) commented

I'm working on releasing GPU memory after generation. I will add this to all of the VLM nodes.
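
A minimal sketch of what such a release could look like, assuming the node holds a standard PyTorch module. The `unload` helper and the usage lines are illustrative assumptions, not the repo's actual API:

```python
import gc
import torch

def unload(model):
    """Free the VRAM held by a model once generation is done."""
    model.to("cpu")           # move the weights off the GPU first
    gc.collect()              # collect any lingering Python references
    torch.cuda.empty_cache()  # return cached CUDA blocks to the driver

# Usage: after generating, release the model and drop the node's reference.
# output = model.generate(**inputs)
# unload(model)
# model = None
```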
