
qwen-1.5 14b LoRA fine-tuning loss is extremely large — does the code need adjusting? #224

Open
sarahyongq opened this issue Mar 21, 2024 · 0 comments

@sarahyongq
When LoRA fine-tuning qwen-1.5 14b, the loss is extremely large — is there a problem with the code's support for this model?
***** train metrics *****
epoch = 1.0
train_loss = 4784435875.5407
train_runtime = 3:32:59.59
train_samples_per_second = 5.407
train_steps_per_second = 0.037

The full training arguments are as follows:
output_dir:"output/firefly-qwen1.5-14b"
model_name_or_path:"Qwen/Qwen1.5-14B-Chat"
train_file:"./data/qwen_20240320.jsonl"
template_name:"qwen"
train_mode:"lora"
num_train_epochs:1
per_device_train_batch_size:2
gradient_accumulation_steps:18
learning_rate:0.0002
max_seq_length:1024
logging_steps:100
save_steps:100
save_total_limit:1
lr_scheduler_type:"constant_with_warmup"
warmup_steps:100
lora_rank:64
lora_alpha:16
lora_dropout:0.05
gradient_checkpointing:true
disable_tqdm:false
optim:"paged_adamw_32bit"
seed:42
fp16:true
report_to:"tensorboard"
dataloader_num_workers:0
save_strategy:"steps"
weight_decay:0
max_grad_norm:0.3
remove_unused_columns:false
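For reference, two quantities can be derived from the arguments above (a minimal sketch in plain Python; the variable names simply mirror the argument names and are not part of the Firefly codebase):

```python
# Derived quantities from the training arguments listed above.

per_device_train_batch_size = 2
gradient_accumulation_steps = 18
lora_rank = 64
lora_alpha = 16

# Effective batch size per device per optimizer step:
# gradients are accumulated over 18 micro-batches of size 2.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 36

# Standard LoRA scales the adapter update by alpha / rank,
# so with alpha=16 and rank=64 the adapter output is multiplied by 0.25.
lora_scaling = lora_alpha / lora_rank
print(lora_scaling)  # 0.25
```

Note that alpha (16) is smaller than rank (64) here, giving a scaling factor of 0.25; many LoRA recipes instead set alpha to one or two times the rank.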

@sarahyongq sarahyongq changed the title from "qwen-1.5 14b LoRA fine-tuning loss is extremely large — is there a problem with the code's support?" to "qwen-1.5 14b LoRA fine-tuning loss is extremely large — does the code need adjusting?" Mar 21, 2024
1 participant