Add support for ibm granite #758
Conversation
…into pc/granite
The KV cache change broke my original implementation, but I found a workaround :) Ready for review @awni ✅
@Blaizzy this diff includes some previous PRs. Can you rebase on origin's main and force push to your branch?
…into pc/granite
Hey @awni, it's done ✅
    attention_bias: bool = False
    mlp_bias: bool = False
Was there a reason you added these / changed this file?
Granite 3B and 8B require them, and it's different from the base Llama models.
Oh I see, I did not realize those models had `model_type = llama`. Makes sense.
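For context, here is a minimal sketch of how config flags like these typically get threaded into the attention projections in an MLX model file; the class and field names below are illustrative assumptions, not copied verbatim from the PR:

```python
from dataclasses import dataclass

import mlx.nn as nn


@dataclass
class ModelArgs:
    hidden_size: int
    num_attention_heads: int
    attention_bias: bool = False  # Granite 3B/8B configs set this to True
    mlp_bias: bool = False


class Attention(nn.Module):
    def __init__(self, args: ModelArgs):
        super().__init__()
        dim = args.hidden_size
        # The config flag decides whether the Q/K/V/O projections carry a bias term,
        # which is what differs from the base Llama models.
        self.q_proj = nn.Linear(dim, dim, bias=args.attention_bias)
        self.k_proj = nn.Linear(dim, dim, bias=args.attention_bias)
        self.v_proj = nn.Linear(dim, dim, bias=args.attention_bias)
        self.o_proj = nn.Linear(dim, dim, bias=args.attention_bias)
```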
llms/mlx_lm/models/gpt_bigcode.py
def create_additive_causal_mask(N: int, offset: int = 0):
    rinds = mx.arange(offset + N)
    linds = mx.arange(offset, offset + N) if offset else rinds
    mask = linds[:, None] < rinds[None]
    return mask * -1e9
Would you mind refactoring this into base.py and calling that from this model and the llama.py model? Ultimately we'd like all the models to use this same version, but we can start with that and then update from there rather than duplicate the function everywhere.
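A rough sketch of what that refactor could look like, assuming the helper moves into llms/mlx_lm/models/base.py and both model files import it from there (exact layout is an assumption):

```python
# llms/mlx_lm/models/base.py
import mlx.core as mx


def create_additive_causal_mask(N: int, offset: int = 0):
    rinds = mx.arange(offset + N)
    linds = mx.arange(offset, offset + N) if offset else rinds
    mask = linds[:, None] < rinds[None]
    # Additive mask: future positions get a large negative value, visible ones get 0.
    return mask * -1e9


# llms/mlx_lm/models/gpt_bigcode.py and llms/mlx_lm/models/llama.py then
# import the shared helper instead of redefining it:
# from .base import create_additive_causal_mask
```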
Sure, it's done ✅
llms/mlx_lm/models/gpt_bigcode.py
    attn_pdrop: float = 0.1
    embd_pdrop: float = 0.1
    resid_pdrop: float = 0.1
Remove these unused arguments.
They are being used in the attention, embedding, and MLP blocks :)
Right, let's remove those Dropout layers as well. We don't add dropout layers to model files here since we don't use them for training.
Got it!
Done ✅
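For illustration, a hedged sketch of an MLP block with the dropout knobs stripped out; the module and layer names are hypothetical, not taken from the PR:

```python
import mlx.nn as nn


class MLP(nn.Module):
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.c_fc = nn.Linear(dim, hidden_dim)
        self.c_proj = nn.Linear(hidden_dim, dim)
        # No nn.Dropout(resid_pdrop) here: dropout is a no-op at inference time,
        # so the layer and its config field would just be dead weight.

    def __call__(self, x):
        return self.c_proj(nn.gelu(self.c_fc(x)))
```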
Very nicely done! I just left a couple of minor comments. Please check them and then we can merge it!
Thank you very much, @awni! I addressed all the comments. Let me know if there's anything else.
This PR adds support for IBM Granite.
Twitter: @Prince_Canuma
Todo: