
[MiniLLM] Is it normal to get negative loss at some steps? #183

Open
lllyyyqqq opened this issue Apr 3, 2024 · 1 comment

Comments

@lllyyyqqq

First, excellent work! I am trying to reproduce the results on my own data and have changed some of your code. During training, at some steps I get negative rl_loss, reg_loss, and pg_loss. Is this normal behaviour?

@t1101675
Contributor

t1101675 commented Apr 4, 2024

It seems abnormal to get negative losses.

  • pg_loss and reward have opposite signs (see this function), and the reward equals log p, which is negative. Therefore, pg_loss should be positive.
  • reg_loss can be viewed as the token-level reverse KLD between the teacher model and the student model, which should be non-negative.
  • rl_loss is simply pg_loss + reg_loss, so it should be positive as well (see the sketch below).
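
As a sanity check, here is a minimal, self-contained sketch that reproduces these sign arguments with random logits. It is not the actual MiniLLM implementation; the tensor names, shapes, and the importance-ratio form of the policy-gradient term are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, seq_len, vocab = 2, 5, 10

student_logits = torch.randn(batch, seq_len, vocab)
teacher_logits = torch.randn(batch, seq_len, vocab)
actions = torch.randint(0, vocab, (batch, seq_len))   # tokens sampled by the student
rollout_logprobs = -torch.rand(batch, seq_len)        # log-probs of the rollout policy (<= 0)

student_logprobs = F.log_softmax(student_logits, dim=-1)
teacher_logprobs = F.log_softmax(teacher_logits, dim=-1)
new_logprobs = torch.gather(student_logprobs, -1, actions.unsqueeze(-1)).squeeze(-1)

# reward = log p_teacher(action) <= 0; the importance ratio is always > 0,
# so pg_loss = -(ratio * reward).mean() has the opposite sign of the reward, i.e. >= 0.
reward = torch.gather(teacher_logprobs, -1, actions.unsqueeze(-1)).squeeze(-1)
ratio = torch.exp(new_logprobs - rollout_logprobs)
pg_loss = -(ratio * reward).mean()

# reg_loss: token-level reverse KLD KL(student || teacher), non-negative by Gibbs' inequality.
student_probs = student_logprobs.exp()
reg_loss = (student_probs * (student_logprobs - teacher_logprobs)).sum(-1).mean()

rl_loss = pg_loss + reg_loss
print(pg_loss.item(), reg_loss.item(), rl_loss.item())   # all three should be >= 0
```

If your modified training code violates these invariants (for example, a flipped sign, a reward that is not a log-probability, or forward instead of reverse KLD in the regularizer), that would explain the negative losses you observe.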
