Strange training dynamics for ImageReward model. #64

bhattg · 2023-10-09T01:35:09Z

Hi! I am trying to train a reward model, and I am confused about why neither the gradients nor the loss change during the initial iterations of training. Only after some steps do they suddenly start changing, and then learning completes.

The attached plot shows the learning dynamics.

learn01one · 2023-10-17T02:22:08Z

Hello, which versions of Python and CUDA are you using? Thank you.

xujz18 · 2023-11-05T07:22:07Z

This is a very interesting discovery, and I believe it may be related to the learning rate schedule and warm-up settings, although there could be other factors worth exploring.

bhattg · 2023-11-06T18:33:47Z

Hello, sorry for the late reply. To answer the question about versions: I am using Python 3.10.13 and CUDA 11.7.

The experiment was run using torch 1.13.0.

Regarding the learning dynamics, I am using the following settings:

--fix_rate 0.7 --lr 1e-05 --lr-decay-style cosine --warmup 0.0 --batch_size 32 --accumulation_steps 1 --epochs 50
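
For reference, here is a minimal sketch of how a cosine schedule with an optional linear warmup is commonly computed, to make it easier to reason about the learning rate in the first steps. It is not taken from the ImageReward code: the function `lr_at`, the value of `total_steps`, the decay to zero, and the reading of `--warmup` as a fraction of total steps are all assumptions made for illustration.

```python
import math


def lr_at(step, total_steps, base_lr=1e-5, warmup_frac=0.0):
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    Hypothetical helper; assumes `warmup_frac` is a fraction of `total_steps`.
    """
    warmup_steps = int(warmup_frac * total_steps)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; depends on dataset size, batch size, and epochs
for step in (0, 10, 100, 500, 1000):
    no_warmup = lr_at(step, total_steps, warmup_frac=0.0)
    with_warmup = lr_at(step, total_steps, warmup_frac=0.1)
    print(f"step={step:5d}  warmup=0.0: {no_warmup:.3e}  warmup=0.1: {with_warmup:.3e}")
```

Under these assumptions, `--warmup 0.0` means the learning rate is already at its full value of 1e-05 at step 0, so comparing the logged learning rate against a curve like this one may help confirm or rule out the schedule as the cause of the flat start.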

xujz18 added the training How to train ImageReward label Nov 5, 2023
