I trained both single-turn and multi-turn dialogue with llama3 QLoRA. Single-turn training works well: when I ask questions taken from the training corpus, the model basically follows the trained answers. Multi-turn is noticeably worse: within a conversation, the answers at each turn do not follow the trained content. Given the training scheme described in this project, each turn's answer should perform about as well as the single-turn case. Only when I concatenate the earlier turns' questions and answers exactly as they appear in the training corpus and have the model predict the final answer does the response align with the corpus.

So does this multi-turn training scheme (packing all turns into one sequence and masking the loss so only the answer tokens are supervised) actually not help much? Would it be better to split one multi-turn conversation into multiple samples (e.g., a 4-turn dialogue into 4 samples)?

Could anyone advise on how to train multi-turn dialogue so that every turn is learned, and to do so efficiently?
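To make the two schemes concrete, here is a minimal sketch (not this project's actual code) contrasting them. It assumes a Hugging Face tokenizer and omits the chat template and special tokens that a real llama3 pipeline would insert; `build_packed_sample` and `build_split_samples` are illustrative names.

```python
from transformers import AutoTokenizer

IGNORE_INDEX = -100  # positions labeled -100 are excluded from the loss

def build_packed_sample(tokenizer, turns):
    """Strategy A: one sample per conversation.
    All turns are concatenated; prompt tokens are masked, so a single
    forward pass supervises every answer in the dialogue at once."""
    input_ids, labels = [], []
    for user_msg, assistant_msg in turns:
        prompt_ids = tokenizer.encode(user_msg, add_special_tokens=False)
        answer_ids = tokenizer.encode(assistant_msg, add_special_tokens=False)
        input_ids += prompt_ids + answer_ids
        labels += [IGNORE_INDEX] * len(prompt_ids) + answer_ids  # mask prompt only
    return {"input_ids": input_ids, "labels": labels}

def build_split_samples(tokenizer, turns):
    """Strategy B: one sample per turn.
    Sample i carries turns 0..i-1 as fully masked context and supervises
    only answer i, so a 4-turn dialogue becomes 4 samples and earlier
    turns are re-encoded repeatedly."""
    samples = []
    for i in range(len(turns)):
        input_ids, labels = [], []
        for j, (user_msg, assistant_msg) in enumerate(turns[: i + 1]):
            prompt_ids = tokenizer.encode(user_msg, add_special_tokens=False)
            answer_ids = tokenizer.encode(assistant_msg, add_special_tokens=False)
            input_ids += prompt_ids + answer_ids
            if j == i:  # supervise only the current turn's answer
                labels += [IGNORE_INDEX] * len(prompt_ids) + answer_ids
            else:       # earlier turns are pure context
                labels += [IGNORE_INDEX] * (len(prompt_ids) + len(answer_ids))
        samples.append({"input_ids": input_ids, "labels": labels})
    return samples

# Example usage (model name is illustrative):
# tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
# turns = [("Q1", "A1"), ("Q2", "A2")]
# packed = build_packed_sample(tok, turns)   # 1 sample
# split  = build_split_samples(tok, turns)   # 2 samples
```

Note that with a causal attention mask, strategy A already conditions each answer's loss on exactly the turns that precede it, so the per-token supervision for each answer matches strategy B; the two mainly differ in how the loss is averaged per sample and in compute cost.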
May I ask how large your multi-turn training dataset is? Why is my multi-turn dialogue training so slow? I'm using 4 A100s.