[Question] Differences in quantization logic compared to AWQ #663

Open
wenhuach21 opened this issue May 6, 2024 · 0 comments
Labels
enhancement New feature or request

wenhuach21 commented May 6, 2024

Comparing the asymmetric quantization logic here with AWQ's, there are some differences. A major one is whether the min-max range is forced to include zero.
In AWQ, zero is not forced into the range, as shown in https://github.com/mit-han-lab/llm-awq/blob/main/awq/quantize/quantizer.py#L74,
whereas GPTQ does include zero, as shown in https://github.com/AutoGPTQ/AutoGPTQ/blob/main/auto_gptq/quantization/quantizer.py#L64.

intel/auto-round also follows AutoGPTQ here, but I think the AWQ approach is better.

So my question is: does the AutoGPTQ kernel support a range that does not include zero, or will AutoGPTQ add support for this?
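
To make the distinction concrete, here is a minimal pure-Python sketch of the two range choices. It is not the code from either repository: `asym_qparams`, `fake_quantize`, `max_abs_error`, and the sample `group` are hypothetical names and values chosen for illustration, and the zero point is deliberately left unclamped (real implementations may clamp it).

```python
def asym_qparams(values, n_bits=4, include_zero=True):
    """Return (scale, zero_point) for asymmetric min-max quantization of one group."""
    lo, hi = min(values), max(values)
    if include_zero:
        # GPTQ-style choice: extend the range so 0.0 always lies inside it.
        lo, hi = min(lo, 0.0), max(hi, 0.0)
    max_int = 2 ** n_bits - 1
    scale = max(hi - lo, 1e-5) / max_int  # guard against a zero-width range
    zero_point = -round(lo / scale)       # assumption: left unclamped in this sketch
    return scale, zero_point


def fake_quantize(values, scale, zero_point, n_bits=4):
    """Quantize to integers in [0, 2**n_bits - 1], then dequantize back to floats."""
    max_int = 2 ** n_bits - 1
    out = []
    for v in values:
        q = min(max(round(v / scale) + zero_point, 0), max_int)
        out.append((q - zero_point) * scale)
    return out


def max_abs_error(values, n_bits=4, include_zero=True):
    """Largest round-trip error for the chosen range policy."""
    scale, zero_point = asym_qparams(values, n_bits, include_zero)
    deq = fake_quantize(values, scale, zero_point, n_bits)
    return max(abs(a - b) for a, b in zip(values, deq))


# Hypothetical weight group whose values are all positive and far from zero.
group = [0.50, 0.55, 0.62, 0.70, 0.74, 0.80]

for include_zero in (True, False):
    scale, zero_point = asym_qparams(group, include_zero=include_zero)
    err = max_abs_error(group, include_zero=include_zero)
    print(f"include_zero={include_zero}: scale={scale:.4f}, "
          f"zero_point={zero_point}, max_abs_error={err:.4f}")
```

For a group like this one, excluding zero gives a smaller step size and a smaller round-trip error. In this sketch it also gives a negative zero point, which would not fit in an unsigned 4-bit field.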

@wenhuach21 wenhuach21 added the enhancement New feature or request label May 6, 2024