
Got error when exporting model with quantization #3516

Open
1 task done
dickens88 opened this issue Apr 29, 2024 · 0 comments
Labels
pending This problem is yet to be addressed.

Comments


dickens88 commented Apr 29, 2024

Reminder

  • I have read the README and searched the existing issues.

Reproduction

Hi,

I'm using the latest code from master, and I deploy my environment with docker-compose. When I try to export a model with quantization, the backend raises the following error:

llama_factory  | importlib.metadata.PackageNotFoundError: No package metadata was found for The 'optimum>=1.16.0' distribution was not found and is required by this application. 
llama_factory  | To fix: pip install optimum>=1.16.0

After installing the necessary packages with the following commands:

pip install auto_gptq
pip install optimum
pip install -U accelerate bitsandbytes datasets peft transformers

I got a different error:

llama_factory  | Traceback (most recent call last):
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 566, in process_events
llama_factory  |     response = await route_utils.call_process_api(
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/gradio/route_utils.py", line 261, in call_process_api
llama_factory  |     output = await app.get_blocks().process_api(
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1788, in process_api
llama_factory  |     result = await self.call_function(
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1352, in call_function
llama_factory  |     prediction = await utils.async_iteration(iterator)
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 583, in async_iteration
llama_factory  |     return await iterator.__anext__()
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 576, in __anext__
llama_factory  |     return await anyio.to_thread.run_sync(
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 56, in run_sync
llama_factory  |     return await get_async_backend().run_sync_in_worker_thread(
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
llama_factory  |     return await future
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 851, in run
llama_factory  |     result = context.run(func, *args)
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 559, in run_sync_iterator_async
llama_factory  |     return next(iterator)
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 742, in gen_wrapper
llama_factory  |     response = next(iterator)
llama_factory  |   File "/app/src/llmtuner/webui/components/export.py", line 79, in save_model
llama_factory  |     export_model(args)
llama_factory  |   File "/app/src/llmtuner/train/tuner.py", line 57, in export_model
llama_factory  |     model = load_model(tokenizer, model_args, finetuning_args)  # must after fixing tokenizer to resize vocab
llama_factory  |   File "/app/src/llmtuner/model/loader.py", line 113, in load_model
llama_factory  |     model = AutoModelForCausalLM.from_pretrained(**init_kwargs)
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
llama_factory  |     return model_class.from_pretrained(
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3165, in from_pretrained
llama_factory  |     hf_quantizer.validate_environment(
llama_factory  |   File "/usr/local/lib/python3.10/dist-packages/transformers/quantizers/quantizer_gptq.py", line 56, in validate_environment
llama_factory  |     raise ImportError(
llama_factory  | ImportError: Loading a GPTQ quantized model requires optimum (`pip install optimum`) and auto-gptq library (`pip install auto-gptq`)
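Since the ImportError persists even after installing the packages, one possible explanation (an assumption, not confirmed by the logs) is that they were installed on the host rather than inside the container that actually runs LLaMA-Factory. A quick way to check which packages are visible to the interpreter serving the app is a short probe like this, run inside the container (e.g. via `docker compose exec`):

```python
# Diagnostic sketch: list the versions of the packages that the GPTQ
# export path requires, as seen by THIS interpreter. Run it inside the
# container, e.g.:
#   docker compose exec llama_factory python probe.py
# (the service name "llama_factory" is taken from the log prefix above).
from importlib.metadata import PackageNotFoundError, version

for pkg in ("optimum", "auto-gptq", "transformers", "accelerate"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: NOT FOUND in this environment")
```

If `optimum` or `auto-gptq` shows as not found here, installing them from inside the container (`docker compose exec llama_factory pip install optimum auto-gptq`) or rebuilding the image with those packages would be the next thing to try.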

My settings are as follows:
(screenshot of the export settings)

Expected behavior

No response

System Info

No response

Others

No response

@hiyouga hiyouga added the pending This problem is yet to be addressed. label May 1, 2024