Great work. I really appreciate that the code is public (and that the paper is so clearly written).

I had a question about the MNTP config. Can you confirm which learning rate was used when running run_mntp.py for Llama 2 / Llama 3? Looking at the configs under train_configs, it seems the TrainingArguments default of 5e-5 was used?

Sorry if this is already in the paper and I missed it. I read Appendix D.1.1, but I am not sure where to find the details behind "same training parameters as RoBERTa MNTP training".

-- Thanks.
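For context, here is a minimal sketch of the fallback behavior I am asking about, assuming the train config JSON simply omits learning_rate (the config contents below are illustrative, not copied from the repo):

```python
import json

# Hypothetical MNTP train config; field names follow HF TrainingArguments.
# Note that "learning_rate" is absent, as it appears to be in train_configs.
config = json.loads(
    '{"model_name_or_path": "meta-llama/Llama-2-7b-hf",'
    ' "per_device_train_batch_size": 32}'
)

# When the JSON omits it, HF TrainingArguments defaults learning_rate to 5e-5.
HF_DEFAULT_LR = 5e-5
learning_rate = config.get("learning_rate", HF_DEFAULT_LR)
print(learning_rate)  # -> 5e-05
```

If that is indeed what happened (the default being silently picked up rather than a deliberately chosen value), a confirmation would be very helpful.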