
MNTP learning rate #76

Closed
spookyQubit opened this issue May 17, 2024 · 2 comments

Comments

@spookyQubit

Great work. Really appreciate that the code is public (and also the paper is so clearly written).

I had a question about the MNTP config. Can you please confirm what learning rate was used when running run_mntp.py for llama2/llama3? Looking at the configs under train_configs, it seems that the default learning rate in TrainingArguments (5e-5) was used — is that correct?

Sorry if this is already in the paper and I missed it (I read Appendix D.1.1, but I am not sure where to find the details on whether the same training parameters as the RoBERTa MNTP training were used).

-- Thanks.

@vaibhavad
Collaborator

Hi @spookyQubit, thanks for your interest in our work.

We modified the Hugging Face MLM training script. As in that script, we used the default learning rate from TrainingArguments, which is 5e-5.
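
For anyone else checking this, here is a minimal sketch (not the repository's actual code; the output_dir value is just a placeholder) showing that TrainingArguments falls back to 5e-5 when the training config does not set a learning rate, and how one would override it:

```python
from transformers import TrainingArguments

# No learning_rate passed -> TrainingArguments uses its default of 5e-5,
# which is what the MNTP runs relied on.
args = TrainingArguments(output_dir="mntp-llama2")
print(args.learning_rate)  # 5e-05

# To use a different value, set it explicitly (here 1e-4 is just an example):
args_custom = TrainingArguments(output_dir="mntp-llama2", learning_rate=1e-4)
print(args_custom.learning_rate)  # 0.0001
```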

@spookyQubit
Author

Thanks, @vaibhavad.
