[Bug] [chatglm3-6b] QUANTIZE_8bit works with the chatglm3-6b model, but QUANTIZE_4bit has no effect and fails with an out-of-memory (OOM) error
#1481 · Open · 3 of 15 tasks
zyf950120 opened this issue on Apr 30, 2024 · 0 comments
Aries-ckt changed the title from the original Chinese to its English translation on Apr 30, 2024
Search before asking
Operating system information
Linux
Python version information
3.10
DB-GPT version
main
Related scenes
Installation Information
Installation From Source
Docker Installation
Docker Compose Installation
Cluster Installation
AutoDL Image
Other
Device information
GPU: 3060 12G
Models information
LLM: chatglm3-6b
What happened
8-bit quantization works: VRAM usage is around 9000 MB, as shown in the screenshot below.
4-bit quantization, however, has no effect. The model appears to load at its original precision and then fails with an out-of-memory error.
With another model, codellama-7b, both 8-bit and 4-bit quantization succeed.
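For reference, a rough back-of-envelope estimate suggests 4-bit weights should fit comfortably in 12 GB. This is only a sketch: the ~6.2B parameter count is an assumption, and it counts weights alone, ignoring activations, the CUDA context, and the KV cache (which is why the measured 8-bit usage of ~9000 MB is higher than the weight size).

```python
# Rough VRAM estimate for the chatglm3-6b weights at different precisions.
# Assumption: ~6.2e9 parameters; runtime overhead (activations, CUDA
# context, KV cache) is ignored, so real usage is higher.
PARAMS = 6.2e9

def weight_gib(bits_per_weight: float) -> float:
    """Size of the model weights alone, in GiB."""
    return PARAMS * bits_per_weight / 8 / 2**30

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_gib(bits):.1f} GiB")
# 16-bit weights: 11.5 GiB
#  8-bit weights:  5.8 GiB
#  4-bit weights:  2.9 GiB
```

So if 4-bit quantization were actually applied, the weights alone would take roughly half of what 8-bit needs, well under the 12 GB on a 3060. An OOM at 4-bit is consistent with the quantization setting being silently ignored.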
What you expected to happen
The warning message says that chatglm and chatglm2 do not support quantization.
But according to the chatglm3 GitHub repository, chatglm3 does support 4-bit quantization.
How to reproduce
none
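Although I marked "none" above, the relevant configuration can be sketched. In DB-GPT the quantization mode is normally selected through the .env file; the key names below are assumptions based on my reading of the template and may differ between versions, so please verify against your own .env.template:

```shell
# Hypothetical excerpt of DB-GPT's .env (key names assumed).
LLM_MODEL=chatglm3-6b
QUANTIZE_8bit=False   # this mode works, ~9000 MB VRAM
QUANTIZE_4bit=True    # expected to load the model in 4-bit, but OOMs instead
```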
Additional context
No response
Are you willing to submit PR?