CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'output/qwen1half-4b-chat/vx-xxx/checkpoint-xxx-merged-awq-int4'
What is causing the error shown below?
[INFO:swift] Global seed set to 42
[INFO:swift] device_count: 2
[INFO:swift] Loading the model using model_dir: /root/autodl-tmp/models/swift/output/baichuan2-7b-chat/v16-20240517-135014/checkpoint-220-awq-int4
Traceback (most recent call last):
File "/root/autodl-tmp/models/swift/swift/cli/infer.py", line 5, in <module>
infer_main()
File "/root/autodl-tmp/models/swift/swift/utils/run_utils.py", line 27, in x_main
result = llm_x(args, **kwargs)
File "/root/autodl-tmp/models/swift/swift/llm/infer.py", line 263, in llm_infer
model, template = prepare_model_template(args, device_map=device_map)
File "/root/autodl-tmp/models/swift/swift/llm/infer.py", line 164, in prepare_model_template
model, tokenizer = get_model_tokenizer(
File "/root/autodl-tmp/models/swift/swift/llm/utils/model.py", line 4141, in get_model_tokenizer
model, tokenizer = get_function(model_dir, torch_dtype, model_kwargs, load_model, **kwargs)
File "/root/autodl-tmp/models/swift/swift/llm/utils/model.py", line 1117, in get_model_tokenizer_baichuan2
model, tokenizer = get_model_tokenizer_from_repo(
File "/root/autodl-tmp/models/swift/swift/llm/utils/model.py", line 826, in get_model_tokenizer_from_repo
model = automodel_class.from_pretrained(
File "/root/autodl-tmp/conda/envs/swift/lib/python3.10/site-packages/modelscope/utils/hf_util.py", line 113, in from_pretrained
module_obj = module_class.from_pretrained(model_dir, *model_args,
File "/root/autodl-tmp/conda/envs/swift/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/root/.cache/huggingface/modules/transformers_modules/checkpoint-220-awq-int4/modeling_baichuan.py", line 595, in from_pretrained
if hasattr(config, "quantization_config") and config.quantization_config['load_in_4bit']:
KeyError: 'load_in_4bit'
What is the reason for this?
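For context, the KeyError can be reproduced in isolation. An AWQ checkpoint's quantization_config carries AWQ fields such as quant_method and bits (the exact dict below is an illustrative assumption, not copied from this checkpoint), while load_in_4bit belongs to bitsandbytes-style configs, so subscripting with it raises:

```python
# Illustrative AWQ-style quantization_config; the real one in config.json
# may contain more fields, but none of them is 'load_in_4bit'.
awq_cfg = {"quant_method": "awq", "bits": 4, "version": "gemm"}

try:
    awq_cfg["load_in_4bit"]            # what modeling_baichuan.py line 595 does
except KeyError as err:
    print("KeyError:", err)            # same failure as in the traceback

print(awq_cfg.get("load_in_4bit"))     # .get() yields None, so the branch is skipped
```

This is why the `.get()` rewrite discussed below silences the error: the bitsandbytes branch is simply never taken for an AWQ config.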
I modified the line in /root/.cache/huggingface/modules/transformers_modules/checkpoint-220-awq-int4/modeling_baichuan.py from
if hasattr(config, "quantization_config") and config.quantization_config['load_in_4bit']:
to
if hasattr(config, "quantization_config") and config.quantization_config.get('load_in_4bit'):
and then re-ran:
CUDA_VISIBLE_DEVICES=0,1 swift infer \
--ckpt_dir /root/autodl-tmp/models/baichuan-inc/checkpoint-220-awq-int4
However, while it was running, the file I had just modified was automatically reverted, so the same error occurred again.
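The reversion is likely because transformers regenerates the cached module under ~/.cache/huggingface/modules from the checkpoint directory at load time, so the durable fix is to patch modeling_baichuan.py inside the checkpoint directory itself. A minimal sketch of that one-line rewrite (reading and writing the actual file is omitted; the line is quoted from the traceback above):

```python
# Edits to the cached copy are clobbered on the next from_pretrained call;
# apply this rewrite to the checkpoint directory's own modeling_baichuan.py.
line = ('if hasattr(config, "quantization_config") '
        "and config.quantization_config['load_in_4bit']:")
patched = line.replace(
    "config.quantization_config['load_in_4bit']",
    "config.quantization_config.get('load_in_4bit')",
)
print(patched)
```

Running this prints the patched condition using dict.get, which returns None instead of raising when the key is absent.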
I was following this part of the LLM quantization documentation: "AWQ/GPTQ quantized models support vLLM inference acceleration; model deployment is also supported."
CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'output/qwen1half-4b-chat/vx-xxx/checkpoint-xxx-merged-awq-int4'