
Add support for DBRX #623

Draft · wants to merge 3 commits into main
Conversation

LaaZa (Contributor) commented Mar 28, 2024

Adds support for databricks/dbrx.

NOT tested — the model is far too big for me to test, and the tiny dummy model yujiepan/dbrx-tiny-random has in_features that are too small.

If possible, please test and report here.

Requires trust_remote_code=True and the tiktoken package.

Requires transformers>=4.40.0.

Closes #621
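For anyone testing, here is a hedged usage sketch of how one might drive the quantization with AutoGPTQ once this PR is applied. The model id, calibration text, output directory, and the 4-bit/128-group settings are illustrative choices, not taken from the PR itself:

```python
# Hedged sketch of quantizing DBRX with AutoGPTQ + this PR.
# The model id, calibration text, output dir, and quantize settings
# below are illustrative, not prescribed by the PR.
quantize_config = {"bits": 4, "group_size": 128, "desc_act": False}

def quantize_dbrx(model_id="databricks/dbrx-instruct",
                  out_dir="dbrx-instruct-4bit-gptq"):
    # Heavy path: needs the full checkpoint plus auto_gptq and tiktoken
    # installed, so the imports live inside the function.
    from transformers import AutoTokenizer
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoGPTQForCausalLM.from_pretrained(
        model_id,
        BaseQuantizeConfig(**quantize_config),
        trust_remote_code=True,  # required for DBRX, per the PR description
    )
    examples = [tokenizer("A short calibration sample for GPTQ.")]
    model.quantize(examples)
    model.save_quantized(out_dir)
```

In practice you would pass a few hundred calibration examples, not one; the single sample is only to keep the sketch short.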

Qubitium (Contributor) commented:

@LaaZa I will test this.

Qubitium (Contributor) commented Mar 29, 2024

@LaaZa The quantize process is not working. The end result is 243 GB, which is not the expected pre-quant size divided by 8. The quantize time of ~20 minutes is also too fast for a model of this size.
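For scale, a back-of-the-envelope check (my arithmetic, not from the thread; the ~132B total parameter count is DBRX's published figure, and the estimate ignores group-scale overhead and any layers left unquantized):

```python
# Back-of-the-envelope size check for DBRX (~132B total parameters).
# Ignores group-scale/zero-point overhead and unquantized modules,
# so real GPTQ output would be somewhat above the 4-bit figure.
PARAMS = 132e9

fp16_gb = PARAMS * 2 / 1e9    # 2 bytes per weight in fp16  -> ~264 GB
int4_gb = PARAMS * 0.5 / 1e9  # 0.5 bytes per weight at 4-bit -> ~66 GB

observed_gb = 243  # the size Qubitium reported above
```

The 243 GB result is several times the expected 4-bit footprint, consistent with the expert weights not actually being quantized.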

Qubitium (Contributor) commented Mar 29, 2024

Output of print(model.modules) for dbrx-instruct:

<bound method Module.modules of DbrxForCausalLM(
  (transformer): DbrxModel(
    (wte): Embedding(100352, 6144)
    (blocks): ModuleList(
      (0-39): 40 x DbrxBlock(
        (norm_attn_norm): DbrxNormAttentionNorm(
          (norm_1): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
          (attn): DbrxAttention(
            (Wqkv): Linear(in_features=6144, out_features=8192, bias=False)
            (out_proj): Linear(in_features=6144, out_features=6144, bias=False)
            (rotary_emb): DbrxRotaryEmbedding()
          )
          (norm_2): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
        )
        (ffn): DbrxFFN(
          (router): DbrxRouter(
            (layer): Linear(in_features=6144, out_features=16, bias=False)
          )
          (experts): DbrxExperts(
            (mlp): DbrxExpertGLU()
          )
        )
      )
    )
    (norm_f): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=6144, out_features=100352, bias=False)
)>
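Reading that dump, the only per-block Linear layers GPTQ could target might be listed in AutoGPTQ's inside_layer_modules convention roughly like this (a sketch based on the printed names, not code from this PR; note that the fused DbrxExpertGLU exposes no Linear layers at all):

```python
# Sketch: per-DbrxBlock Linear layers visible in the dump above, written
# in the nested-list style AutoGPTQ model definitions use for
# inside_layer_modules. Names are copied from the printed module tree.
# DbrxExpertGLU holds its expert weights as fused plain parameters, so
# it contributes no Linear entries here — hence the quantization failure.
inside_layer_modules = [
    ["norm_attn_norm.attn.Wqkv"],
    ["norm_attn_norm.attn.out_proj"],
    ["ffn.router.layer"],
]
```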

maziyarpanahi commented:

Hi @LaaZa
I have started the quantization based on this PR. It's at 27/40 now. Shall I also test #625, or will this PR have the latest changes?

LaaZa (Contributor, Author) commented Mar 29, 2024

> Hi @LaaZa
> I have started the quantization based on this PR. It's at 27/40 now. Shall I also test #625, or will this PR have the latest changes?

Due to the unusual architecture of the MoE implementation, the original model seems not to quantize. There are ways to convert it to something that is more likely to work, but that requires a slightly different version than what this PR currently has. I'll keep this updated according to whatever is the most beneficial way forward, but I would suggest waiting for support in transformers, so we have a more standardized target.
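To make the "convert it to something that more likely works" idea concrete, here is a minimal sketch of the core step: splitting a fused expert weight matrix into per-expert blocks that could back ordinary Linear layers. Plain Python lists stand in for weight tensors, and all shapes and names are toy values, not DBRX's real dimensions:

```python
# Why the fused MoE blocks GPTQ: DbrxExpertGLU keeps every expert's
# weights concatenated in one parameter, so there are no per-expert
# Linear modules for the quantizer to find. Slicing the fused matrix
# row-wise recovers one weight block per expert (toy shapes below).
num_experts, ffn_dim, hidden = 4, 3, 2

# fused weight: (num_experts * ffn_dim) rows of `hidden` columns
fused = [[float(r * hidden + c) for c in range(hidden)]
         for r in range(num_experts * ffn_dim)]

def split_experts(fused, num_experts, ffn_dim):
    """Slice the fused matrix into one (ffn_dim x hidden) block per expert."""
    return [fused[e * ffn_dim:(e + 1) * ffn_dim] for e in range(num_experts)]

per_expert = split_experts(fused, num_experts, ffn_dim)
```

Each block could then initialize a separate Linear module, giving GPTQ something to quantize; converting back after quantization is the part that needs a different code version than this PR.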

maziyarpanahi commented:

> Due to the unusual architecture of the MoE implementation, the original model seems not to quantize. There are ways to convert it to something that is more likely to work, but that requires a slightly different version than what this PR currently has. I'll keep this updated according to whatever is the most beneficial way forward, but I would suggest waiting for support in transformers, so we have a more standardized target.

Perfect, thanks. Please let me know if I can still test something.

@LaaZa LaaZa marked this pull request as draft March 31, 2024 14:53
maziyarpanahi commented Apr 4, 2024

Hi @Qubitium
I like both models; unfortunately, I cannot say which one was better.
Thank you for your hard work, and please let me know if I can test any other model.

PS: I made a short doc for anyone else who wanted to quickly test these models: https://gist.github.com/maziyarpanahi/e2e005addef6bb9ca97f9ff6c4c4f0d5

# Conflicts:
#	auto_gptq/modeling/__init__.py
#	auto_gptq/modeling/_const.py
#	auto_gptq/modeling/auto.py
LaaZa (Contributor, Author) commented Apr 19, 2024

I think this is ready for retesting with transformers 4.40.0.
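Before retesting, a quick guard can confirm the environment meets the transformers>=4.40.0 floor this PR requires. The helper below is an illustrative sketch of mine, not part of AutoGPTQ, and it deliberately ignores dev/rc suffixes:

```python
# Minimal version-floor check for the transformers>=4.40.0 requirement
# stated in this PR. `meets_minimum` is an illustrative helper, not an
# AutoGPTQ API; it compares only the first three numeric components and
# ignores dev/rc suffixes for simplicity.
def meets_minimum(installed: str, minimum: str = "4.40.0") -> bool:
    def as_tuple(version: str):
        parts = []
        for piece in version.split(".")[:3]:
            digits = "".join(ch for ch in piece if ch.isdigit())
            parts.append(int(digits or 0))
        return tuple(parts)
    return as_tuple(installed) >= as_tuple(minimum)
```

Usage: pass `transformers.__version__` and fail fast with a clear message instead of a confusing load-time error.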

Successfully merging this pull request may close these issues.

[FEATURE] ADD Support DBRX
3 participants