
LLaMA3-8B-Instruct + LoRA fine-tuning at length 8192 on an A800 (80 GB VRAM) #93

Open

12915494174 opened this issue Apr 23, 2024 · 2 comments

@12915494174

For LLaMA3-8B-Instruct + LoRA fine-tuning, can a single A800 (80 GB VRAM) handle a token length of 8192? My task scenario is somewhat unusual and requires fine-tuning on fairly long texts. I used the code provided in this repository and ran into an out-of-memory error during fine-tuning.

@KMnO4-zx
Contributor

We haven't actually tried training at this length. You could try the XTuner training framework; our repository is intended for learning purposes only, and we don't recommend using it in production.
XTuner: https://github.com/InternLM/xtuner
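For reference, the usual memory-saving knobs for long-context LoRA fine-tuning look roughly like the sketch below. This assumes the Hugging Face transformers/peft stack; the model id, LoRA hyperparameters, and optimizer choice are placeholders rather than this repository's exact configuration:

```python
# Minimal sketch: memory-saving settings for LoRA fine-tuning at long sequence
# lengths. Placeholder model id and hyperparameters, not this repo's script.
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",     # placeholder model id
    torch_dtype=torch.bfloat16,                # bf16 halves activation memory vs fp32
    attn_implementation="flash_attention_2",   # requires flash-attn; avoids quadratic attention memory
)
model.gradient_checkpointing_enable()          # trade recompute time for activation memory
model.enable_input_require_grads()             # needed when combining checkpointing with LoRA

lora_config = LoraConfig(
    r=8, lora_alpha=32, lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(
    output_dir="./out",
    per_device_train_batch_size=1,             # batch size 1 at 8192 tokens
    gradient_accumulation_steps=16,            # recover an effective batch size of 16
    bf16=True,
    optim="paged_adamw_8bit",                  # 8-bit optimizer states (needs bitsandbytes)
)
```

At 8192 tokens, bf16 + FlashAttention-2 + gradient checkpointing + batch size 1 with gradient accumulation is typically what lets an 8B LoRA run fit in 80 GB; without checkpointing, activation memory alone can overflow.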

@WEXIJUE

WEXIJUE commented May 13, 2024

> We haven't actually tried training at this length. You could try the XTuner training framework; our repository is intended for learning purposes only, and we don't recommend using it in production. XTuner: https://github.com/InternLM/xtuner

Hello, while fine-tuning llama3 with XTuner, following your code for fine-tuning InternLM2 on the XTuner framework, I ran into a dataset-format problem: the input_ids and labels column names could not be found. Also, the weights I downloaded are in safetensors format, but I got the following error.
[screenshot: error message]
Thank you very much.
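For anyone hitting the same column-name error: HF-style trainers generally expect each example to already contain input_ids and labels columns after preprocessing. Below is a minimal sketch of that mapping step, assuming a JSON dataset with instruction/output fields and a simple prompt template; neither is XTuner's exact schema, which is defined by its own dataset map functions:

```python
# Minimal sketch: turn raw instruction/output records into the input_ids and
# labels columns a trainer looks for. Field names and template are assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

def preprocess(example):
    prompt = f"User: {example['instruction']}\nAssistant: "
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    answer_ids = tokenizer(example["output"], add_special_tokens=False)["input_ids"]
    input_ids = prompt_ids + answer_ids + [tokenizer.eos_token_id]
    # Mask prompt tokens with -100 so the loss is computed only on the answer.
    labels = [-100] * len(prompt_ids) + answer_ids + [tokenizer.eos_token_id]
    return {"input_ids": input_ids, "labels": labels}

dataset = load_dataset("json", data_files="train.json", split="train")
dataset = dataset.map(preprocess, remove_columns=dataset.column_names)
```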
