LoRA fine-tuning of Qwen-1.5 14B produces an extremely large loss. Is there a problem with how the code supports this model?
***** train metrics *****
epoch = 1.0
train_loss = 4784435875.5407
train_runtime = 3:32:59.59
train_samples_per_second = 5.407
train_steps_per_second = 0.037
The detailed training parameters are as follows:
output_dir:"output/firefly-qwen1.5-14b"
model_name_or_path:"Qwen/Qwen1.5-14B-Chat"
train_file:"./data/qwen_20240320.jsonl"
template_name:"qwen"
train_mode:"lora"
num_train_epochs:1
per_device_train_batch_size:2
gradient_accumulation_steps:18
learning_rate:0.0002
max_seq_length:1024
logging_steps:100
save_steps:100
save_total_limit:1
lr_scheduler_type:"constant_with_warmup"
warmup_steps:100
lora_rank:64
lora_alpha:16
lora_dropout:0.05
gradient_checkpointing:true
disable_tqdm:false
optim:"paged_adamw_32bit"
seed:42
fp16:true
report_to:"tensorboard"
dataloader_num_workers:0
save_strategy:"steps"
weight_decay:0
max_grad_norm:0.3
remove_unused_columns:false
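
For anyone reading these numbers, here is a rough sketch of the quantities they imply. It only does arithmetic on the values posted above. The LoRA scaling formula (`lora_alpha / lora_rank`) is the standard one and is assumed to be what the training code applies. The device count is an inference from the rounded throughput figures, not something stated in this report.

```python
# Values copied from the training parameters and metrics above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 18
lora_rank = 64
lora_alpha = 16
train_samples_per_second = 5.407
train_steps_per_second = 0.037
train_runtime_s = 3 * 3600 + 32 * 60 + 59.59  # 3:32:59.59 in seconds

# Samples consumed per optimizer step on a single device.
samples_per_step_per_device = per_device_train_batch_size * gradient_accumulation_steps

# Standard LoRA scaling factor applied to the low-rank update (assumed).
lora_scaling = lora_alpha / lora_rank

# Samples per optimizer step implied by the reported throughput (rounded inputs).
samples_per_step_observed = train_samples_per_second / train_steps_per_second

# Approximate number of devices implied by the two figures above (an inference).
implied_devices = samples_per_step_observed / samples_per_step_per_device

# Approximate number of optimizer steps in the whole run.
approx_optimizer_steps = train_runtime_s * train_steps_per_second

print(f"samples per step per device: {samples_per_step_per_device}")
print(f"LoRA scaling (alpha / rank): {lora_scaling}")
print(f"implied device count: {implied_devices:.2f}")
print(f"approximate optimizer steps: {approx_optimizer_steps:.0f}")
```

Because the reported throughput values are rounded, the device count and step count are approximate.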