If I want to pretrain BART- or T5-series models with RetroMAE, how should I go about it?
bart-base-chinese-cluecorpussmall-retromae_batch256_max350.log
Currently, this script doesn't support encoder-decoder architectures.
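For context, the core of RetroMAE's objective is asymmetric masked reconstruction: the encoder sees a moderately masked input, while the decoder side reconstructs from a much more heavily masked one. Porting the recipe to BART/T5 would at minimum mean reproducing that dual masking. Below is a minimal sketch of the masking step in plain Python; the token ids, mask rates, and mask id are hypothetical, and this is not the repo's actual implementation:

```python
import random

def mask_tokens(token_ids, mask_rate, mask_id, special_ids=(0, 1, 2), rng=None):
    """Replace a fraction of tokens with mask_id, skipping special tokens.

    Returns (masked_input, labels), where labels hold the original token at
    masked positions and -100 (ignored by the loss) everywhere else.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tid in token_ids:
        if tid not in special_ids and rng.random() < mask_rate:
            masked.append(mask_id)   # corrupt the input
            labels.append(tid)       # target: reconstruct the original token
        else:
            masked.append(tid)
            labels.append(-100)
    return masked, labels

# Toy token ids; 2/3 stand in for BOS/EOS and 103 for [MASK] (all hypothetical).
ids = [2, 45, 67, 89, 12, 34, 56, 78, 90, 11, 3]

# Asymmetric masking: light on the encoder input, heavy on the decoder input.
enc_input, enc_labels = mask_tokens(ids, mask_rate=0.30, mask_id=103, special_ids=(2, 3))
dec_input, dec_labels = mask_tokens(ids, mask_rate=0.50, mask_id=103, special_ids=(2, 3))
```

The rest of the adaptation (feeding the sentence embedding into the decoder, wiring this into a seq2seq model's loss) is where the encoder-only script and an encoder-decoder model genuinely diverge.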
Thanks. One more question: during pretraining, the report looks like this: {'loss': 2.8222, 'learning_rate': 1.1313075087080098e-05, 'step': 103000, 'epoch': 1.3}. In your pretraining experiments, how did you decide when to stop — just by watching the loss curve? I also noticed that the loss occasionally ticks up slightly (small, not dramatic changes) during training. Did you encounter this in your pretraining runs, and how did you judge when to stop?