RuntimeError: expected scalar type Half but found Float #52

Open

jasperan opened this issue May 24, 2023 · 3 comments

Comments
jasperan commented May 24, 2023

```
CUDA SETUP: Loading binary /home/opc/anaconda3/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda114_nocublaslt.so...
Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://b38eaf88d60145f161.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
/home/opc/anaconda3/lib/python3.9/site-packages/peft/utils/other.py:76: FutureWarning: prepare_model_for_int8_training is deprecated and will be removed in a future version. Use prepare_model_for_kbit_training instead.
  warnings.warn(
/home/opc/anaconda3/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py:318: UserWarning: MatMul8bitLt: inputs will be cast from torch.float32 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
Traceback (most recent call last):
  File "/home/opc/anaconda3/lib/python3.9/site-packages/gradio/routes.py", line 399, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/opc/anaconda3/lib/python3.9/site-packages/gradio/blocks.py", line 1299, in process_api
    result = await self.call_function(
  File "/home/opc/anaconda3/lib/python3.9/site-packages/gradio/blocks.py", line 1022, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/opc/anaconda3/lib/python3.9/site-packages/anyio/to_thread.py", line 28, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
  File "/home/opc/anaconda3/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 818, in run_sync_in_worker_thread
    return await future
  File "/home/opc/anaconda3/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 754, in run
    result = context.run(func, *args)
  File "/home/opc/anaconda3/lib/python3.9/site-packages/gradio/helpers.py", line 588, in tracked_fn
    response = fn(*args)
  File "/home/opc/simple-llama-finetuner/app.py", line 131, in train
    self.trainer.train(
  File "/home/opc/simple-llama-finetuner/trainer.py", line 273, in train
    result = self.trainer.train(resume_from_checkpoint=False)
  File "/home/opc/anaconda3/lib/python3.9/site-packages/transformers/trainer.py", line 1696, in train
    return inner_training_loop(
  File "/home/opc/anaconda3/lib/python3.9/site-packages/transformers/trainer.py", line 1972, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/opc/anaconda3/lib/python3.9/site-packages/transformers/trainer.py", line 2796, in training_step
    self.scaler.scale(loss).backward()
  File "/home/opc/anaconda3/lib/python3.9/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/opc/anaconda3/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/opc/anaconda3/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/home/opc/anaconda3/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 157, in backward
    torch.autograd.backward(outputs_with_grad, args_with_grad)
  File "/home/opc/anaconda3/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/opc/anaconda3/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/home/opc/anaconda3/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py", line 476, in backward
    grad_A = torch.matmul(grad_output, CB).view(ctx.grad_shape).to(ctx.dtype_A)
RuntimeError: expected scalar type Half but found Float
```
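The FutureWarning in the log above already names the replacement API. For context, a minimal sketch of the load-and-prepare step it refers to, assuming an 8-bit LLaMA checkpoint with a PEFT LoRA adapter (the checkpoint name and LoRA hyperparameters are placeholders, not necessarily what this repo uses):

```python
# Sketch only: checkpoint name and LoRA hyperparameters below are
# placeholders, not the values simple-llama-finetuner actually uses.
import torch
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",  # placeholder checkpoint
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

# The replacement the FutureWarning names: casts layer norms (and the
# output embedding) to float32 so the int8 backward sees consistent dtypes.
model = prepare_model_for_kbit_training(model)

model = get_peft_model(model, LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
))
```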
cwzhao commented Jun 9, 2023

Same here.

FJGEODEV commented Jun 26, 2023

Changing to fp16=False in trainer.py should work.
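For reference, fp16=True makes the Trainer route the loss through a GradScaler (the self.scaler.scale(loss).backward() frame in the traceback), which is where the mixed float32/float16 tensors collide. If trainer.py builds a transformers.TrainingArguments, the flag would presumably be set there; a sketch with placeholder values for everything except fp16:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lora-out",           # placeholder
    per_device_train_batch_size=1,   # placeholder
    num_train_epochs=1,              # placeholder
    fp16=False,  # disables the GradScaler path seen in the traceback
)
```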

Training works, but inference breaks. May need help on this.
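For the inference side, a workaround often reported for 8-bit LLaMA + LoRA setups is to run generation under autocast so activations stay in half precision; a sketch, assuming model, tokenizer, and prompt are already in scope:

```python
import torch

# Assumes model, tokenizer, and prompt are already defined.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.autocast("cuda", dtype=torch.float16):
    output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```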

raminmardani commented Aug 8, 2023

Same here; fp16=False didn't solve the issue.
