Training error #154

Open
loki1017 opened this issue Aug 24, 2023 · 0 comments

Parameter details:
--do_train
--do_eval
--train_file /Work/.../train.json
--validation_file /Work/.../valid.json
--preprocessing_num_workers 10
--prompt_column instruction
--query_column input
--response_column output
--overwrite_cache
--model_name_or_path /Work/pre_model/chatglm2-6b
--output_dir output/IM-04
--overwrite_output_dir
--max_source_length 256
--max_target_length 256
--per_device_train_batch_size 4
--per_device_eval_batch_size 4
--gradient_accumulation_steps 1
--predict_with_generate
--num_train_epochs 1
--logging_strategy steps
--logging_steps 10
--eval_steps 50
--evaluation_strategy steps
--save_steps 1000
--save_strategy steps
--learning_rate 2e-5
--lora_r 8
--model_parallel_mode True
--warmup_ratio 0.05
--weight_decay 0.05
--max_train_samples 500000
--max_eval_samples 10000
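
As a rough check on these settings, the sketch below estimates the total number of optimizer steps. It assumes a single data-parallel process (the device count is not stated in this post), and the helper name `total_optimizer_steps` is hypothetical. Under that assumption, 500000 samples at batch size 4 give 125000 steps, which matches the total in the progress bar in the log, and with --eval_steps 50 the first evaluation runs at step 50.

```python
import math


def total_optimizer_steps(num_samples, per_device_batch_size,
                          grad_accum_steps=1, num_devices=1, num_epochs=1):
    """Estimate the optimizer steps for a run (hypothetical helper).

    Assumption: one batch per device per step and no dropped last batch.
    """
    # Batches produced by the dataloader in one epoch.
    batches_per_epoch = math.ceil(num_samples / (per_device_batch_size * num_devices))
    # One optimizer step every `grad_accum_steps` batches, at least one per epoch.
    steps_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return steps_per_epoch * num_epochs


# Failing configuration from this post: --max_train_samples 500000
print(total_optimizer_steps(500000, 4))  # 125000
# Working configuration from this post: --max_train_samples 200000
print(total_optimizer_steps(200000, 4))  # 50000
```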

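My training set has roughly 900,000 examples and my test set roughly 100,000. I tried different sample sizes. With --max_train_samples 200000 --max_eval_samples 2000 the model trains normally, but with --max_train_samples 500000 --max_eval_samples 10000 I get the following error: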
Traceback (most recent call last):
File "/Work/zhanglongji7036/chatglm_sft/ptuning/main.py", line 566, in <module>
main()
File "/Work/zhanglongji7036/chatglm_sft/ptuning/main.py", line 492, in main
train_result = trainer.train(resume_from_checkpoint=checkpoint)
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/transformers/trainer.py", line 1664, in train
return inner_training_loop(
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/transformers/trainer.py", line 2019, in _inner_training_loop
self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/transformers/trainer.py", line 2300, in _maybe_log_save_evaluate
metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
File "/Work/zhanglongji7036/chatglm_sft/ptuning/trainer_seq2seq.py", line 78, in evaluate
return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/transformers/trainer.py", line 3029, in evaluate
output = eval_loop(
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/transformers/trainer.py", line 3318, in evaluation_loop
metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
File "/Work/zhanglongji7036/chatglm_sft/ptuning/main.py", line 444, in compute_metrics
scores = rouge.get_scores(
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/rouge_chinese/rouge.py", line 116, in get_scores
return self._get_scores(hyps, refs)
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/rouge_chinese/rouge.py", line 129, in _get_scores
sc = fn(
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/rouge_chinese/rouge.py", line 54, in <lambda>
"rouge-1": lambda hyp, ref, **k: rouge_score.rouge_n(hyp, ref, 1, **k),
File "/home/zhanglongji7036/anaconda3/envs/loki/lib/python3.10/site-packages/rouge_chinese/rouge_score.py", line 253, in rouge_n
raise ValueError("Hypothesis is empty.")
ValueError: Hypothesis is empty.
0%| | 50/125000 [06:11<258:02:21, 7.43s/it]

Process finished with exit code 1
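
One possible workaround for the ValueError in the traceback above is to skip the ROUGE call when either side is empty and record zero scores instead. The following is a minimal sketch, not the project's own code: `safe_rouge` and `score_fn` are hypothetical names, and the zero-score dictionary assumes the per-pair result has "rouge-1", "rouge-2" and "rouge-l" entries with "f", "p" and "r" values. This only keeps evaluation from crashing; it does not explain why the model generates nothing.

```python
from typing import Callable, Dict, List

Scores = Dict[str, Dict[str, float]]


def safe_rouge(hypothesis: List[str], reference: List[str],
               score_fn: Callable[[str, str], Scores]) -> Scores:
    """Score one tokenized pair, returning zeros if either side is empty.

    `score_fn` is a stand-in for the real scorer. In the training script it
    might wrap the ROUGE call, e.g. lambda h, r: rouge.get_scores(h, r)[0]
    (an assumption about how the script uses the library).
    """
    # Drop whitespace-only tokens so a list like [' '] counts as empty.
    hyp_text = " ".join(tok for tok in hypothesis if tok.strip())
    ref_text = " ".join(tok for tok in reference if tok.strip())
    if not hyp_text or not ref_text:
        # Assumed result shape: one dict per metric with f/p/r values.
        return {name: {"f": 0.0, "p": 0.0, "r": 0.0}
                for name in ("rouge-1", "rouge-2", "rouge-l")}
    return score_fn(hyp_text, ref_text)
```

Inside `compute_metrics`, the call to the scorer would go through `safe_rouge` so that an empty hypothesis contributes a score of 0 rather than aborting the run.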

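I also printed the model's outputs and found that it generated nothing at all (or at most a single token). I'm very confused by this. Has anyone run into the same problem?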
hypothesis: []
reference: ['约翰', '是', '一名', '软件', '工程师', ',', '居住', '在', '旧金山', '。', '他', '在', '一家', '科技', '公司', '工作', '了', '5', '年', ',', '业余时间', '喜欢', '打网球', '。']


hypothesis: ['《']
reference: ['《', '时间', '旅行者', '的', '妻子', '》', '是', '我', '的', '最', '爱', '之一', ',', '因为', '它', '不仅仅', '是', '一部', '科幻小说', ',', '而是', '一部', '强烈', '的', '情感故事', '。', '它', '讲述', '了', '一位', '时间', '旅行者', '和', '他', '的', '妻子', '之间', '的', '爱情', ',', '跨越', '时空', '交织', '在', '一起', '。', '小说', '的', '结构', '非常', '独特', ',', '以', '非线性', '的', '方式', '呈现', '了', '两位', '主人公', '的', '故事', ',', '让', '读者', '能够', '体会', '到', '他们', '内心', '的', '情感', '和', '思绪', '。', '此外', ',', '作者', 'Audrey', ' ', 'Niffenegger', '深入', '探索', '了', '人类', '的', '情感', '、', '时间', '、', '家庭', '以及', '命运', '等', '主题', ',', '使', '小说', '变得', '更加', '深刻', '和', '有', '意义', '。']


hypothesis: []
reference: ['自然风光', '如此', '迷人', ',', '冬日', '里', '的', '雪花', '在', '阳光', '下', '闪耀', ',', '春天里', '的', '草木', '吐露', '新芽', ',', '秋季', '的', '枫叶', '像', '火焰', '一般', '绚烂', '多彩', ',', '让', '人', '不由得', '沉醉', '其中', '。']


hypothesis: []
reference: ['日本', '发生', '大规模', '地震', ',', '紧急', '启动', '预防措施']
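
To see how widespread the empty outputs are, the tokenized hypotheses can be counted before scoring. This is a small diagnostic sketch under assumptions: `empty_hypothesis_report` is a hypothetical helper, and the input is taken to be a list of token lists like the ones printed above.

```python
from typing import Dict, List


def empty_hypothesis_report(hypotheses: List[List[str]]) -> Dict[str, object]:
    """Count hypotheses that contain no non-whitespace token (hypothetical helper)."""
    empty_indices = [
        i for i, tokens in enumerate(hypotheses)
        if not any(tok.strip() for tok in tokens)
    ]
    return {
        "total": len(hypotheses),
        "empty": len(empty_indices),
        "indices": empty_indices,
    }


# The four hypotheses printed in this post.
printed = [[], ["《"], [], []]
print(empty_hypothesis_report(printed))
# {'total': 4, 'empty': 3, 'indices': [0, 2, 3]}
```

Comparing the empty count between the 2000-sample and 10000-sample evaluation runs would show whether the smaller run also produces empty outputs.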

