Quantization after fine-tuning: running the inference command raises an error: load_in_4bit #952

Closed
cjq345994512 opened this issue May 17, 2024 · 3 comments
@cjq345994512

Following this part of the LLM quantization documentation:

awq/gptq quantized models support vllm inference acceleration, and model deployment is also supported.

CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'output/qwen1half-4b-chat/vx-xxx/checkpoint-xxx-merged-awq-int4'

Why does the error below appear?

[INFO:swift] Global seed set to 42
[INFO:swift] device_count: 2
[INFO:swift] Loading the model using model_dir: /root/autodl-tmp/models/swift/output/baichuan2-7b-chat/v16-20240517-135014/checkpoint-220-awq-int4
Traceback (most recent call last):
  File "/root/autodl-tmp/models/swift/swift/cli/infer.py", line 5, in <module>
    infer_main()
  File "/root/autodl-tmp/models/swift/swift/utils/run_utils.py", line 27, in x_main
    result = llm_x(args, **kwargs)
  File "/root/autodl-tmp/models/swift/swift/llm/infer.py", line 263, in llm_infer
    model, template = prepare_model_template(args, device_map=device_map)
  File "/root/autodl-tmp/models/swift/swift/llm/infer.py", line 164, in prepare_model_template
    model, tokenizer = get_model_tokenizer(
  File "/root/autodl-tmp/models/swift/swift/llm/utils/model.py", line 4141, in get_model_tokenizer
    model, tokenizer = get_function(model_dir, torch_dtype, model_kwargs, load_model, **kwargs)
  File "/root/autodl-tmp/models/swift/swift/llm/utils/model.py", line 1117, in get_model_tokenizer_baichuan2
    model, tokenizer = get_model_tokenizer_from_repo(
  File "/root/autodl-tmp/models/swift/swift/llm/utils/model.py", line 826, in get_model_tokenizer_from_repo
    model = automodel_class.from_pretrained(
  File "/root/autodl-tmp/conda/envs/swift/lib/python3.10/site-packages/modelscope/utils/hf_util.py", line 113, in from_pretrained
    module_obj = module_class.from_pretrained(model_dir, *model_args,
  File "/root/autodl-tmp/conda/envs/swift/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/root/.cache/huggingface/modules/transformers_modules/checkpoint-220-awq-int4/modeling_baichuan.py", line 595, in from_pretrained
    if hasattr(config, "quantization_config") and config.quantization_config['load_in_4bit']:
KeyError: 'load_in_4bit'

What could be causing this?
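For context: the checkpoint being loaded was quantized with AWQ, and an AWQ-style quantization_config does not carry a load_in_4bit entry (that key belongs to bitsandbytes-style configs), so the subscript lookup in modeling_baichuan.py raises KeyError. A minimal sketch of the failure mode, with illustrative keys (the exact contents of this checkpoint's config.json are an assumption):

        # Illustrative AWQ-style quantization_config; keys are assumed,
        # not copied from this checkpoint's config.json.
        quantization_config = {"quant_method": "awq", "bits": 4, "group_size": 128}

        quantization_config["load_in_4bit"]      # raises KeyError: 'load_in_4bit'
        quantization_config.get("load_in_4bit")  # returns None (falsy) instead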

@Jintao-Huang
Collaborator

I'll take a look.

Jintao-Huang self-assigned this May 17, 2024
@Jintao-Huang
Collaborator

Modify the baichuan2 code by changing that line to:

        if hasattr(config, "quantization_config") and config.quantization_config.get('load_in_4bit'):
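With .get, a missing key yields None rather than raising, so for AWQ checkpoints the condition is simply falsy and the bitsandbytes 4-bit branch is skipped, while checkpoints that do set load_in_4bit behave as before.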

Jintao-Huang added the question (Further information is requested) and solved labels May 19, 2024
@cjq345994512
Author

In the file "/root/.cache/huggingface/modules/transformers_modules/checkpoint-220-awq-int4/modeling_baichuan.py", I changed

if hasattr(config, "quantization_config") and config.quantization_config['load_in_4bit']:

to

if hasattr(config, "quantization_config") and config.quantization_config.get('load_in_4bit'):

and then ran:

CUDA_VISIBLE_DEVICES=0,1 swift infer --ckpt_dir /root/autodl-tmp/models/baichuan-inc/checkpoint-220-awq-int4

But while it was running, the file I had just modified was automatically reverted, so I got the same error again.
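A likely explanation, though not confirmed in this thread: when loading with trust_remote_code, transformers copies the modeling files from the checkpoint directory into ~/.cache/huggingface/modules/transformers_modules/ on each load, overwriting edits made to the cached copy. Patching the modeling_baichuan.py inside the checkpoint directory itself should persist across loads. A minimal sketch (the path matches the command above; adjust to your layout):

        # Patch the modeling file that ships inside the checkpoint directory,
        # not the auto-generated copy in the huggingface modules cache.
        from pathlib import Path

        src = Path("/root/autodl-tmp/models/baichuan-inc/checkpoint-220-awq-int4/modeling_baichuan.py")
        text = src.read_text(encoding="utf-8")
        text = text.replace(
            "config.quantization_config['load_in_4bit']",
            "config.quantization_config.get('load_in_4bit')",
        )
        src.write_text(text, encoding="utf-8")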
