Hi! I haven't been able to find an answer to my question, so I'm opening an issue here. I'm fine-tuning the GPT-2 XL model with the Trainer for 10 epochs, and I'd like to save the data seen by the model during each epoch. More specifically, I want to save the data seen by the model every 242 steps: the data seen from step 1 to step 242, from step 243 to step 484, and so on until the end of the 10th epoch. I'm not sure how to do this, since the data is reshuffled after each epoch. Is it possible to use TrainerCallback here?
These are my training args:

    training_args = TrainingArguments(
        f"models/XL",
        evaluation_strategy="steps",
        learning_rate=2e-5,
        weight_decay=0.01,
        push_to_hub=False,
        num_train_epochs=10,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        save_strategy="epoch",
        save_steps=242,
        fp16=True,
        report_to="none",
        logging_strategy="steps",
        logging_steps=100,
    )
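To make the "steps 1 to 242, 243 to 484, ..." grouping concrete, here is a minimal sketch of the bookkeeping in plain Python. It is not a Trainer feature: `StepWindowRecorder` and the `example_id` column are hypothetical names introduced for illustration. The idea is to wrap whatever collate function is already in use and note which example ids pass through it, closing a window every 242 calls. This only lines up with training steps under some assumptions: one collate call per optimizer step (no gradient accumulation, single process, no prefetching dataloader workers), the id column reaching the collator (as far as I know this needs `remove_unused_columns=False`), and recording being switched off while evaluation runs, since evaluation batches would otherwise be counted too.

```python
class StepWindowRecorder:
    """Hypothetical wrapper around a collate function.

    Records the ids of the examples in every batch and groups the
    batches into windows of `window` consecutive calls.
    """

    def __init__(self, collate_fn, window=242, id_key="example_id"):
        self.collate_fn = collate_fn  # the collator that would be used anyway
        self.window = window          # number of steps per saved window
        self.id_key = id_key          # assumed per-example id column
        self.enabled = True           # set to False while evaluating
        self.step = 0                 # number of recorded batches so far
        self._current = []            # batches of the window in progress
        self.windows = []             # finished windows

    def __call__(self, features):
        if self.enabled:
            self.step += 1
            self._current.append([f[self.id_key] for f in features])
            if self.step % self.window == 0:
                self.flush()
        # Drop the id column so the wrapped collator sees its usual input.
        stripped = [
            {k: v for k, v in f.items() if k != self.id_key} for f in features
        ]
        return self.collate_fn(stripped)

    def flush(self):
        """Close the window in progress (also call once after training)."""
        if self._current:
            first = self.step - len(self._current) + 1
            self.windows.append(
                {"first_step": first, "last_step": self.step,
                 "batches": self._current}
            )
            self._current = []


# Toy run: windows of 2 steps, with an identity function as the collator.
recorder = StepWindowRecorder(lambda feats: feats, window=2)
toy_batches = [
    [{"example_id": 0, "x": 1}, {"example_id": 1, "x": 2}],
    [{"example_id": 2, "x": 3}],
    [{"example_id": 3, "x": 4}],
]
for batch in toy_batches:
    recorder(batch)
recorder.flush()  # close the last, partial window
```

Because the ids are recorded in the order the batches are actually built, the per-epoch shuffling does not matter here: each window simply lists whatever was drawn during those steps. Each entry of `recorder.windows` could then be written to disk, for example as JSON.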
I'd appreciate any directions. Thanks :)