Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update cosine decay to not clamp output, match other implementations #1138

Merged
merged 1 commit into from
May 20, 2024

Conversation

jlwitthuhn
Copy link
Contributor

@jlwitthuhn jlwitthuhn commented May 18, 2024

Proposed changes

This change updates how the 'minimum' value is treated by the cosine decay scheduler. The existing implementation clamps the output to the provided minimum and the new implementation instead smoothly transitions from the initial to final learning rate. This picture clarifies the difference between these two when decaying from 1.0 to 0.5 over 100 steps:
image

With this change the behavior will also match that of pytorch, optax, and keras

Also renamed 'minimum' to 'end' because this implementation can 'decay' upwards.

EDIT: I'm also not completely sold that removing minimum is the right move because it still might be useful to someone. Could be best to keep that and also add end as a separate parameter. If anyone has opinions about that I'd be happy to change it.

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

Copy link
Member

@awni awni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for fixing that.

I think it's fine to leave the minimum option out for now.

@awni awni merged commit 7e5674d into ml-explore:main May 20, 2024
5 checks passed
jkaercher pushed a commit to jkaercher/mlx that referenced this pull request May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants