
Only 12 GB of 24 GB VRAM is used and CUDA utilization is below 10%, but CPU usage is 100% and RAM usage is 35 GB #159

Open
NerounCstate opened this issue Mar 3, 2024 · 1 comment
Labels
question Further information is requested

Comments

@NerounCstate

.\build\bin\Release\main.exe -m .\ReluLLaMA-70B-PowerInfer-GGUF\llama-70b-relu.q4.powerinfer.gguf -n 128 -t 32 -p "Once upon a time"
I tried this command, but generation is very slow and CPU and memory usage are very high. Checking the load output:
llm_load_sparse_model_tensors: offloaded layers from VRAM budget(-2147483648 bytes): 81/80
llm_load_sparse_model_tensors: mem required = 40226.35 MB
llm_load_sparse_model_tensors: VRAM used: 9842.91 MB
Clearly, only about half of my 4090's 24 GB of VRAM is being used.
llama_new_context_with_model: compute buffer total size = 14.50 MB
llama_new_context_with_model: VRAM scratch buffer: 12.94 MB
llama_new_context_with_model: total VRAM used: 10015.84 MB (model: 9842.91 MB, context: 172.94 MB)
This also shows only about 10 GB of VRAM in use.
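
One detail that stands out: the load log reports a negative VRAM budget (-2147483648 bytes), which suggests the budget was left at its default rather than matched to the 4090. A hedged suggestion, assuming your build exposes the --vram-budget option (in GiB) described in the PowerInfer README; the value 22 here is illustrative:

.\build\bin\Release\main.exe -m .\ReluLLaMA-70B-PowerInfer-GGUF\llama-70b-relu.q4.powerinfer.gguf -n 128 -t 32 --vram-budget 22 -p "Once upon a time"

If the flag is supported, the "VRAM used" line in the load log should rise toward the stated budget instead of stopping at ~10 GB.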

@czq693497091

So has this question been solved?
