
Is it possible to add support for Infini-attention? #292

Open
sdmorrey opened this issue May 11, 2024 · 2 comments

@sdmorrey

There's some work being done to implement Infini-attention from https://arxiv.org/pdf/2404.07143

In a nutshell, it allows essentially unlimited context length without incurring the quadratic attention penalty. There's a proof of concept running a 10M-token context in less than 32 GB of RAM here:
https://github.com/mustafaaljadery/gemma-2B-10M
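The reason the cost stays linear rather than quadratic is that each segment only attends locally and reads from a fixed-size compressive memory carried between segments. Here's a rough PyTorch sketch of that outer loop (the names and the `attend_fn` hook are illustrative, not taken from either repo):

```python
import torch

def infini_attention_stream(q, k, v, segment_len, attend_fn):
    # q, k, v: (batch, heads, seq, head_dim). The sequence is processed
    # in fixed-size segments; only a constant-size compressive memory
    # (mem, z) is carried between segments, so cost is linear in seq.
    b, h, n, d = q.shape
    mem = q.new_zeros(b, h, d, d)   # compressive memory M
    z = q.new_zeros(b, h, d)        # normalization term z
    outputs = []
    for s in range(0, n, segment_len):
        qs, ks, vs = (t[:, :, s:s + segment_len] for t in (q, k, v))
        # attend_fn: local softmax attention plus a memory read, blended
        # by a learned gate; returns segment output and updated memory.
        out, mem, z = attend_fn(qs, ks, vs, mem, z)
        outputs.append(out)
    return torch.cat(outputs, dim=2)
```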

I believe we will see more models adopting this approach, and official support would be a huge benefit to the community.

I don't have the Rust chops to pull this off, but I thought I'd at least bring it to your attention, since you already have Phi-3 with 128k context working.

Thanks for all your hard work!

@EricLBuehler
Owner

Thank you for letting me know! I think this would be a valuable addition, and I'll try to implement it. I took a look at the implementation you linked, and I think this is the key change. Is that correct?

https://github.com/mustafaaljadery/gemma-2B-10M/blob/main/src/gemma.py#L488-L548

I'm looking forward to implementing this.

@nidhoggr-nil

Looks interesting. I wonder if there are any downsides; it probably depends on how well the model compresses its knowledge (so that information loss stays minimal) and on how the practical side of the memory implementation handles larger contexts.

Looks like that is the correct place. The functions for memory retrieval and storage also need to be implemented, i.e. sections 2.1.1 and 2.1.2 of the paper; a sketch of those follows below.
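For reference, here's a minimal PyTorch sketch of those two pieces plus the gating that blends them, following the paper's notation (sigma is ELU + 1); this is illustrative, not the linked repo's code:

```python
import torch
import torch.nn.functional as F

def elu_plus_one(x):
    # Non-negative feature map sigma(x) = ELU(x) + 1 used for the
    # linear-attention memory.
    return F.elu(x) + 1.0

def memory_retrieval(q, mem, z):
    # Section 2.1.1: A_mem = (sigma(Q) M) / (sigma(Q) z)
    # q: (batch, heads, seg_len, d_k), mem: (b, h, d_k, d_v), z: (b, h, d_k)
    sq = elu_plus_one(q)
    numer = sq @ mem                                   # (b, h, n, d_v)
    denom = (sq @ z.unsqueeze(-1)).clamp(min=1e-6)     # (b, h, n, 1)
    return numer / denom

def memory_update(k, v, mem, z, use_delta=True):
    # Section 2.1.2: linear update M += sigma(K)^T V, or the "delta"
    # variant that first subtracts what the memory already retrieves.
    sk = elu_plus_one(k)
    if use_delta:
        v = v - (sk @ mem) / (sk @ z.unsqueeze(-1)).clamp(min=1e-6)
    mem = mem + sk.transpose(-2, -1) @ v               # (b, h, d_k, d_v)
    z = z + sk.sum(dim=-2)                             # (b, h, d_k)
    return mem, z

def gate(a_mem, a_dot, beta):
    # Learned per-head scalar beta blends long-term (memory) and local
    # (softmax) attention: sigmoid(beta)*A_mem + (1 - sigmoid(beta))*A_dot.
    g = torch.sigmoid(beta).view(1, -1, 1, 1)
    return g * a_mem + (1.0 - g) * a_dot
```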
