Alternative attentions: ReBased linear flashattn and LWM's RingAttention #114

Closed
wants to merge 1 commit

Conversation


@kabachuha commented Mar 18, 2024

Adds support for subquadratic ReBased linear self-attention (FlashAttention-based) and LargeWorldModel's RingAttention, allowing far larger context sizes with the same VRAM usage, and without departing from the Transformer architecture, unlike RNNs and Mamba-like models.

Closes #107
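
For context, the two techniques avoid the quadratic attention cost in different ways. Below is a minimal, self-contained PyTorch sketch of generic linear (kernel) attention, using a simple elu(x) + 1 feature map rather than the actual ReBased kernel from this PR, purely to illustrate why memory scales linearly with sequence length:

```python
# Minimal sketch of linear (kernel) attention. This is NOT the ReBased kernel
# from this PR; it uses a simple elu(x) + 1 feature map to illustrate how
# replacing softmax with a feature map makes attention O(N) in sequence length.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, heads, seq_len, dim)
    q = F.elu(q) + 1  # positive feature map phi(q)
    k = F.elu(k) + 1  # positive feature map phi(k)
    # Summarize keys/values into a (dim, dim) state instead of a
    # (seq_len, seq_len) attention matrix.
    kv = torch.einsum("bhnd,bhne->bhde", k, v)
    # Normalizer: phi(q_i) . sum_j phi(k_j)
    z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps)
    return torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)

q = k = v = torch.randn(1, 8, 4096, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([1, 8, 4096, 64])
```

RingAttention instead keeps exact softmax attention but shards the sequence across devices and rotates key/value blocks around a ring. The following single-process simulation (hypothetical toy code, not LargeWorldModel's distributed implementation) shows the block-wise online-softmax accumulation that lets each query block attend to the full sequence without ever materializing the full score matrix:

```python
# Single-process simulation of the RingAttention idea: the sequence is split
# into blocks (stand-ins for devices); each query block sees every key/value
# block as the K/V blocks pass around the "ring". Non-causal, single head,
# for clarity only.
import torch

def ring_attention_sim(q, k, v, num_blocks=4):
    # q, k, v: (seq_len, dim)
    scale = q.shape[-1] ** -0.5
    k_blocks, v_blocks = k.chunk(num_blocks), v.chunk(num_blocks)
    outputs = []
    for qi in q.chunk(num_blocks):
        # Online-softmax accumulators for this query block.
        acc = torch.zeros_like(qi)
        row_max = torch.full((qi.shape[0], 1), float("-inf"))
        row_sum = torch.zeros(qi.shape[0], 1)
        for kj, vj in zip(k_blocks, v_blocks):  # one trip around the ring
            scores = (qi @ kj.T) * scale
            new_max = torch.maximum(row_max, scores.max(dim=-1, keepdim=True).values)
            correction = torch.exp(row_max - new_max)
            p = torch.exp(scores - new_max)
            acc = acc * correction + p @ vj
            row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
            row_max = new_max
        outputs.append(acc / row_sum)
    return torch.cat(outputs)

q = k = v = torch.randn(1024, 64)
ref = torch.softmax((q @ k.T) / 64 ** 0.5, dim=-1) @ v
print((ring_attention_sim(q, k, v) - ref).abs().max())  # ~1e-6: matches exact attention
```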

@zhengzangw
Collaborator

Thank you for your contribution! Currently, we do not want to involve more libraries. Since the project is under rapid development, we will return to this at a later stage.

@zhengzangw
Collaborator

Closed, as we are not ready to test different attention mechanisms at the moment.

@zhengzangw closed this May 9, 2024
Development

Successfully merging this pull request may close these issues.

Support for subquadratic attention methods such as Linear Attention