
[MiniLLM] Is it normal to get negative loss at some steps? #183

Open
lllyyyqqq opened this issue Apr 3, 2024 · 1 comment

Comments

@lllyyyqqq

First, excellent work! I am trying to reproduce the results on my own data and have changed some of your code. During training, at some steps I get negative rl_loss, reg_loss, and pg_loss. Is this normal behaviour?

@t1101675
Contributor

t1101675 commented Apr 4, 2024

It seems abnormal to get negative losses.

  • pg_loss and reward have opposite signs (see this function), and the reward equals log p, which is negative. Therefore, pg_loss should be positive.
  • reg_loss can be viewed as the token-level reverse KLD between the teacher model and the student model, which should be non-negative.
  • rl_loss is simply pg_loss + reg_loss, so it should be positive as well (see the sketch below).
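
As a sanity check, here is a minimal, self-contained sketch that reproduces these sign arguments with random logits. It is not the actual MiniLLM implementation; the tensor names, shapes, and the importance-ratio form of the policy-gradient term are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, seq_len, vocab = 2, 5, 10

student_logits = torch.randn(batch, seq_len, vocab)
teacher_logits = torch.randn(batch, seq_len, vocab)
actions = torch.randint(0, vocab, (batch, seq_len))   # tokens sampled by the student
rollout_logprobs = -torch.rand(batch, seq_len)        # log-probs of the rollout policy (<= 0)

student_logprobs = F.log_softmax(student_logits, dim=-1)
teacher_logprobs = F.log_softmax(teacher_logits, dim=-1)
new_logprobs = torch.gather(student_logprobs, -1, actions.unsqueeze(-1)).squeeze(-1)

# reward = log p_teacher(action) <= 0; the importance ratio is always > 0,
# so pg_loss = -(ratio * reward).mean() has the opposite sign of the reward, i.e. >= 0.
reward = torch.gather(teacher_logprobs, -1, actions.unsqueeze(-1)).squeeze(-1)
ratio = torch.exp(new_logprobs - rollout_logprobs)
pg_loss = -(ratio * reward).mean()

# reg_loss: token-level reverse KLD KL(student || teacher), non-negative by Gibbs' inequality.
student_probs = student_logprobs.exp()
reg_loss = (student_probs * (student_logprobs - teacher_logprobs)).sum(-1).mean()

rl_loss = pg_loss + reg_loss
print(pg_loss.item(), reg_loss.item(), rl_loss.item())   # all three should be >= 0
```

If your modified training code violates these invariants (for example, a flipped sign, a reward that is not a log-probability, or forward instead of reverse KLD in the regularizer), that would explain the negative losses you observe.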
