[Bug] [chatglm3-6b] Using chatglm3-6b model QUANTIZE_8bit can be successful, but QUANTIZE_4bit is invalid, and the error message is that the video memory is not enough OOM #1481

Open
3 of 15 tasks
zyf950120 opened this issue Apr 30, 2024 · 0 comments
Labels: bug (Something isn't working), Waiting for reply

Search before asking

  • I had searched in the issues and found no similar issues.

Operating system information

Linux

Python version information

3.10

DB-GPT version

main

Related scenes

  • Chat Data
  • Chat Excel
  • Chat DB
  • Chat Knowledge
  • Model Management
  • Dashboard
  • Plugins

Installation Information

Device information

GPU: 3060 12G

Models information

LLM: chatglm3-6b

What happened

8-bit quantization works; GPU memory usage is about 9000 MB, as shown in the screenshots below.
image
image
image

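With 4-bit quantization, however, nothing seems to be quantized: the model appears to load at its original precision, and loading fails with an out-of-memory error.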
image
image
image

With a different model, codellama-7b, both 8-bit and 4-bit quantization work.
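As a rough sanity check on the "loaded at original precision" guess, the weight footprint at each precision can be estimated with simple arithmetic. This is a minimal sketch, not a measurement: the parameter count of about 6.2 billion for chatglm3-6b is an assumption, `weight_memory_gib` is a hypothetical helper, and the estimate covers weights only (no CUDA context, activations, or KV cache).

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1024**3


# Assumption: chatglm3-6b has roughly 6.2e9 parameters.
NUM_PARAMS = 6.2e9
# The card in this report is a 3060 with 12 GB of memory.
GPU_GIB = 12.0

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gib = weight_memory_gib(NUM_PARAMS, bits)
    headroom = GPU_GIB - gib
    print(f"{label:>5}: weights ~{gib:.2f} GiB, headroom ~{headroom:.2f} GiB")
```

Under these assumptions, fp16 weights alone come to roughly 11.5 GiB, which leaves almost nothing on a 12 GB card for runtime overhead, while 4-bit weights would need under 3 GiB. An OOM under the 4-bit setting is therefore consistent with the weights not being quantized at all.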

What you expected to happen

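I saw a message saying that chatglm and chatglm2 do not support quantization.
However, according to the chatglm3 GitHub repository, chatglm3 does support 4-bit quantization.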
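For reference, the 4-bit loading path described in the ChatGLM3 README goes through the model's own `quantize` method, which is supplied by the model's remote code rather than by DB-GPT. The sketch below wraps that call in a hypothetical helper, `load_chatglm3_4bit`; it assumes `transformers` is installed, a CUDA device is available, and the model id is `THUDM/chatglm3-6b`. It is not DB-GPT's loading code and says nothing about how `QUANTIZE_4bit` is handled there.

```python
def load_chatglm3_4bit(model_path: str = "THUDM/chatglm3-6b"):
    """Load ChatGLM3 with the 4-bit quantization from the model's remote code.

    Hypothetical helper for illustration only; model_path is an assumption.
    """
    # Imported lazily so that defining this function does not require
    # transformers to be installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
    # quantize(4) comes from ChatGLM's remote modeling code and is applied
    # before the model is moved to the GPU.
    model = model.quantize(4).cuda().eval()
    return tokenizer, model
```

If this standalone call fits on the 12 GB card while the DB-GPT setting does not, that would point at how the quantization option is applied for this model family rather than at the model itself.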

How to reproduce

none

Additional context

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
zyf950120 added the bug and Waiting for reply labels on Apr 30, 2024
Aries-ckt changed the title from "[Bug] [chatglm3-6b] 使用chatglm3-6b模型QUANTIZE_8bit能成功,但是QUANTIZE_4bit无效,报错显存不够OOM" to "[Bug] [chatglm3-6b] Using chatglm3-6b model QUANTIZE_8bit can be successful, but QUANTIZE_4bit is invalid, and the error message is that the video memory is not enough OOM" on Apr 30, 2024