Strange training dynamics for ImageReward model. #64

Open
bhattg opened this issue Oct 9, 2023 · 3 comments
Labels
training How to train ImageReward

Comments


bhattg commented Oct 9, 2023

Hi! I am trying to train a reward model, and I am confused about why neither the gradients nor the loss change during the initial iterations of training. Only after some steps do they suddenly start to change, and then learning completes.

The learning dynamics are shown in the attached screenshot.
[Image: Screen Shot 2023-10-08 at 6 21 46 PM]

@learn01one

Hello, which versions of Python and CUDA are you using? Thank you.

Member

xujz18 commented Nov 5, 2023

This is a very interesting discovery, and I believe it may be related to the learning rate schedule and warm-up settings, although there could be other factors worth exploring.

@xujz18 xujz18 added the training How to train ImageReward label Nov 5, 2023
Author

bhattg commented Nov 6, 2023

Hello, sorry for the late reply to the version question: I am using Python 3.10.13 and CUDA 11.7.

The experiment was run with torch 1.13.0.

Regarding the learning dynamics, I am using the following settings:

--fix_rate 0.7 --lr 1e-05 --lr-decay-style cosine --warmup 0.0 --batch_size 32 --accumulation_steps 1 --epochs 50
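
For reference, a minimal sketch of how a generic linear warm-up plus cosine-decay schedule behaves with the `--lr 1e-05` and `--warmup 0.0` values above. This is an illustration only: the exact formula used by the ImageReward training code may differ, and the function `lr_at_step`, its parameters, and the step count of 1000 are hypothetical names and values, not taken from the repository.

```python
import math


def lr_at_step(step, total_steps, peak_lr=1e-5, warmup_frac=0.0, min_lr=0.0):
    """Generic linear warm-up followed by cosine decay (illustrative only).

    All names here are assumptions made for this sketch; they do not
    correspond to functions in the ImageReward code base.
    """
    warmup_steps = int(warmup_frac * total_steps)
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from peak_lr / warmup_steps up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Cosine decay from peak_lr down to min_lr over the remaining steps.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


total = 1000  # hypothetical number of optimizer steps
for s in (0, 250, 500, 750, 1000):
    print(s, lr_at_step(s, total))
```

Under this formulation, `warmup_frac=0.0` means the learning rate is already at its peak value on the first step and only decreases afterwards. Comparing the learning rate logged during training against a curve like this is one way to check whether the schedule lines up with the flat region in the plot.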

3 participants