Alternative attentions: ReBased linear flashattn and LWM's RingAttention #114

Closed
wants to merge 1 commit

Conversation


@kabachuha commented Mar 18, 2024

Adds support for subquadratic ReBased linear self-attention (FlashAttention-based) and LargeWorldModel's RingAttention, allowing far larger context sizes with the same VRAM usage, and without departing from the Transformer architecture, unlike RNNs and Mamba-like models.

Closes #107
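
For context, the two techniques avoid the quadratic attention cost in different ways. Below is a minimal, self-contained PyTorch sketch of generic linear (kernel) attention, using a simple elu(x) + 1 feature map rather than the actual ReBased kernel from this PR, purely to illustrate why memory scales linearly with sequence length:

```python
# Minimal sketch of linear (kernel) attention. This is NOT the ReBased kernel
# from this PR; it uses a simple elu(x) + 1 feature map to illustrate how
# replacing softmax with a feature map makes attention O(N) in sequence length.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, heads, seq_len, dim)
    q = F.elu(q) + 1  # positive feature map phi(q)
    k = F.elu(k) + 1  # positive feature map phi(k)
    # Summarize keys/values into a (dim, dim) state instead of a
    # (seq_len, seq_len) attention matrix.
    kv = torch.einsum("bhnd,bhne->bhde", k, v)
    # Normalizer: phi(q_i) . sum_j phi(k_j)
    z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps)
    return torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)

q = k = v = torch.randn(1, 8, 4096, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([1, 8, 4096, 64])
```

RingAttention instead keeps exact softmax attention but shards the sequence across devices and rotates key/value blocks around a ring. The following single-process simulation (hypothetical toy code, not LargeWorldModel's distributed implementation) shows the block-wise online-softmax accumulation that lets each query block attend to the full sequence without ever materializing the full score matrix:

```python
# Single-process simulation of the RingAttention idea: the sequence is split
# into blocks (stand-ins for devices); each query block sees every key/value
# block as the K/V blocks pass around the "ring". Non-causal, single head,
# for clarity only.
import torch

def ring_attention_sim(q, k, v, num_blocks=4):
    # q, k, v: (seq_len, dim)
    scale = q.shape[-1] ** -0.5
    k_blocks, v_blocks = k.chunk(num_blocks), v.chunk(num_blocks)
    outputs = []
    for qi in q.chunk(num_blocks):
        # Online-softmax accumulators for this query block.
        acc = torch.zeros_like(qi)
        row_max = torch.full((qi.shape[0], 1), float("-inf"))
        row_sum = torch.zeros(qi.shape[0], 1)
        for kj, vj in zip(k_blocks, v_blocks):  # one trip around the ring
            scores = (qi @ kj.T) * scale
            new_max = torch.maximum(row_max, scores.max(dim=-1, keepdim=True).values)
            correction = torch.exp(row_max - new_max)
            p = torch.exp(scores - new_max)
            acc = acc * correction + p @ vj
            row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
            row_max = new_max
        outputs.append(acc / row_sum)
    return torch.cat(outputs)

q = k = v = torch.randn(1024, 64)
ref = torch.softmax((q @ k.T) / 64 ** 0.5, dim=-1) @ v
print((ring_attention_sim(q, k, v) - ref).abs().max())  # ~1e-6: matches exact attention
```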

@zhengzangw
Collaborator

Thank you for your contribution! Currently, we do not want to involve more libraries. Since the project is under rapid development, we will return to this at a later stage.

@zhengzangw
Collaborator

Closed, as we are not ready to test different attention mechanisms at the moment.

@zhengzangw closed this May 9, 2024
Development

Successfully merging this pull request may close these issues.

Support for subquadratic attention methods such as Linear Attention