Add support for ibm granite #758
Conversation
…into pc/granite
The KV cache change broke my original implementation, but I found a workaround :) Ready for review @awni ✅
@Blaizzy this diff includes some previous PRs. Can you rebase on origin's main and force push to your branch?
…into pc/granite
Hey @awni, it's done ✅
    attention_bias: bool = False
    mlp_bias: bool = False
Was there a reason you added these / changed this file?
Granite 3B and 8B require them, and it's different from the base Llama models.
Oh I see, I did not realize those models had `model_type = llama`. Makes sense.
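For context, here is a minimal sketch of how config flags like these typically get threaded into the attention projections in an MLX model file; the class and field names below are illustrative assumptions, not copied verbatim from the PR:

```python
from dataclasses import dataclass

import mlx.nn as nn


@dataclass
class ModelArgs:
    hidden_size: int
    num_attention_heads: int
    attention_bias: bool = False  # Granite 3B/8B configs set this to True
    mlp_bias: bool = False


class Attention(nn.Module):
    def __init__(self, args: ModelArgs):
        super().__init__()
        dim = args.hidden_size
        # The config flag decides whether the Q/K/V/O projections carry a bias term,
        # which is what differs from the base Llama models.
        self.q_proj = nn.Linear(dim, dim, bias=args.attention_bias)
        self.k_proj = nn.Linear(dim, dim, bias=args.attention_bias)
        self.v_proj = nn.Linear(dim, dim, bias=args.attention_bias)
        self.o_proj = nn.Linear(dim, dim, bias=args.attention_bias)
```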
llms/mlx_lm/models/gpt_bigcode.py
def create_additive_causal_mask(N: int, offset: int = 0):
    rinds = mx.arange(offset + N)
    linds = mx.arange(offset, offset + N) if offset else rinds
    mask = linds[:, None] < rinds[None]
    return mask * -1e9
Would you mind refactoring this into base.py and calling that from this model and the llama.py model? Ultimately we'd like all the models to use this same version, but we can start with that and then update from there rather than duplicate the function everywhere.
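A rough sketch of what that refactor could look like, assuming the helper moves into llms/mlx_lm/models/base.py and both model files import it from there (exact layout is an assumption):

```python
# llms/mlx_lm/models/base.py
import mlx.core as mx


def create_additive_causal_mask(N: int, offset: int = 0):
    rinds = mx.arange(offset + N)
    linds = mx.arange(offset, offset + N) if offset else rinds
    mask = linds[:, None] < rinds[None]
    # Additive mask: future positions get a large negative value, visible ones get 0.
    return mask * -1e9


# llms/mlx_lm/models/gpt_bigcode.py and llms/mlx_lm/models/llama.py then
# import the shared helper instead of redefining it:
# from .base import create_additive_causal_mask
```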
Sure, it's done ✅
llms/mlx_lm/models/gpt_bigcode.py
    attn_pdrop: float = 0.1
    embd_pdrop: float = 0.1
    resid_pdrop: float = 0.1
Remove these unused arguments.
They are being used in the attention, embedding, and MLP blocks :)
Right, let's remove those Dropout layers as well. We don't add dropout layers to model files here since we don't use them for training.
Got it!
Done ✅
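For illustration, a hedged sketch of an MLP block with the dropout knobs stripped out; the module and layer names are hypothetical, not taken from the PR:

```python
import mlx.nn as nn


class MLP(nn.Module):
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.c_fc = nn.Linear(dim, hidden_dim)
        self.c_proj = nn.Linear(hidden_dim, dim)
        # No nn.Dropout(resid_pdrop) here: dropout is a no-op at inference time,
        # so the layer and its config field would just be dead weight.

    def __call__(self, x):
        return self.c_proj(nn.gelu(self.c_fc(x)))
```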
Very nicely done! I just left a couple of minor comments. Please check them and then we can merge it!
Thank you very much, @awni! I addressed all the comments. Let me know if there's anything else.
This PR adds support for IBM Granite.
Twitter: @Prince_Canuma
Todo: