I am getting a "float division by zero" error whenever I try to quantize Mixtral-based models with AutoGPTQ. Here is my code:
```python
from transformers import AutoTokenizer, TextGenerationPipeline
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import logging

logging.basicConfig(
    format="%(asctime)s %(levelname)s [%(name)s] %(message)s", level=logging.INFO, datefmt="%Y-%m-%d %H:%M:%S"
)

# pretrained_model_dir = "mistralai/Mistral-7B-Instruct-v0.2"
# quantized_model_dir = "Mistral-7B-Instruct-v0.2-gptq"
pretrained_model_dir = "hfl/Chinese-Mixtral-Instruct"
quantized_model_dir = "Chinese-Mixtral-Instruct-gptq"
# pretrained_model_dir = 'cognitivecomputations/dolphin-2.6-mixtral-8x7b'
# quantized_model_dir = "dolphin-2.6-mixtral-8x7b-gptq"

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)
examples = [
    tokenizer(
        "auto-gptq is an easy-to-use model quantization library with user-friendly apis, based on GPTQ algorithm."
    )
]

quantize_config = BaseQuantizeConfig(
    bits=4,  # quantize the model to 4-bit
    # group_size=128,  # it is recommended to set this to 128
    desc_act=True,  # setting this to False can significantly speed up inference, at a slight cost in perplexity
    damp_percent=0.1,
)

# load the un-quantized model; by default it is loaded into CPU memory
model = AutoGPTQForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)

# quantize the model; examples should be a list of dicts whose only keys are "input_ids" and "attention_mask"
model.quantize(examples)

# save the quantized model
model.save_quantized(quantized_model_dir)
```
It looks like nsamples becomes 0, but I don't know how that happens. Has anyone had a similar problem?
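One thing worth ruling out: the script above calibrates with a single short sentence. If the zero division comes from a Mixtral expert that no calibration token is ever routed to (so its `nsamples` stays 0), a broader calibration set might help. This is a hypothesis, not a verified fix. The sketch below shows the shape of a multi-sample `examples` list; `fake_tokenize` is a stand-in for the real `tokenizer(...)` call so the snippet runs without downloading a model.

```python
# Sketch: build several calibration samples instead of one, on the (unverified)
# assumption that a single sample leaves some Mixtral experts with zero tokens.
# fake_tokenize is a hypothetical stand-in for a HF tokenizer; it returns a dict
# with exactly the keys model.quantize() accepts.

def fake_tokenize(text):
    # Mimic tokenizer(text): one pseudo-id per character, full attention mask.
    ids = [ord(c) % 32000 for c in text]
    return {"input_ids": ids, "attention_mask": [1] * len(ids)}

calibration_texts = [
    "auto-gptq is an easy-to-use model quantization library.",
    "Mixture-of-experts models route each token to a subset of experts.",
    "A broader calibration set gives every expert a chance to see tokens.",
]

examples = [fake_tokenize(t) for t in calibration_texts]

# Every example carries only the expected keys, and none is empty.
assert all(set(e) == {"input_ids", "attention_mask"} for e in examples)
assert all(len(e["input_ids"]) > 0 for e in examples)
```

In the real script you would replace `fake_tokenize(t)` with `tokenizer(t)` and pass the longer `examples` list to `model.quantize(examples)` unchanged.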