
[WIP v1 - deprecated] entmax 1.5 for attention and outputs, faster implementation of sparsemax #1541

Open
wants to merge 4 commits into base: master

Conversation

bpopeters
Contributor

This pull request adds support for entmax 1.5, a sparse alternative to softmax that we describe in our ACL paper, Sparse Sequence-to-Sequence Models. It uses the implementations of sparsemax and entmax 1.5 from the entmax package, available via pip.
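
For context, here is a minimal sketch of how these normalizers can stand in for softmax over attention scores. It assumes the `sparsemax` and `entmax15` functions exported by the pip-installable `entmax` package; the tensor shapes are purely illustrative and this is not the code in this PR.

```python
import torch
from entmax import sparsemax, entmax15

# Illustrative attention logits: (batch, target length, source length)
scores = torch.randn(2, 5, 7)

attn_softmax = torch.softmax(scores, dim=-1)   # dense: every weight > 0
attn_sparsemax = sparsemax(scores, dim=-1)     # sparse: many weights exactly 0
attn_entmax15 = entmax15(scores, dim=-1)       # alpha = 1.5, between the two

# Each result is a valid probability distribution over the last dimension.
assert torch.allclose(attn_entmax15.sum(dim=-1), torch.ones(2, 5))
```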

This pull request does not include support for entmax with other alpha values. I suspect the code for that will be a little more involved, but I can get to it soon.

It also does not include support for entmax attention in transformers, but I can probably make that PR next week as well.

One potential issue is that our entmax code does not support Python 2. I don't know who still needs Python 2 support for OpenNMT.

@vince62s
Member

Hi Ben, welcome back.
Up to now, we have tried to keep the code Python 2 compatible (which is a requirement in Travis, as you can see). I understand it is a bit obsolete (plus Python 3 is a requirement for distributed training), but is there much work involved in making it compatible?

@bpopeters
Contributor Author

It probably would not require very many changes, but it isn't really on our agenda, since Python 2 is only supported until the end of the year.

@vince62s vince62s changed the title entmax 1.5 for attention and outputs, faster implementation of sparsemax [WIP] entmax 1.5 for attention and outputs, faster implementation of sparsemax Sep 4, 2019
@vince62s vince62s changed the title [WIP] entmax 1.5 for attention and outputs, faster implementation of sparsemax [WIP v1 - deprecated] entmax 1.5 for attention and outputs, faster implementation of sparsemax Jan 19, 2023