
Torch 2.0 compile model #283

Open
andrecharneca opened this issue Dec 1, 2023 · 3 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@andrecharneca

Are there any plans to add torch.compile speed-ups to LMQL Transformers models? Thanks

@lbeurerkellner lbeurerkellner added the enhancement New feature or request label Dec 10, 2023
@lbeurerkellner
Collaborator

Hi Andre, can you recommend any resources on how torch.compile improves inference speed with, e.g., transformers?

In general I am definitely not opposed to adding it.

@andrecharneca
Author

For example: https://huggingface.co/docs/transformers/main/perf_torch_compile . Although that guide benchmarks Vision Transformers, the results should be similar for language models.
After experimenting with torch.compile myself, I found that for LLMs the compilation can take quite a while, so the performance gains really depend on the specific use case. It would still be a nice feature to add, since it's so simple.

@lbeurerkellner lbeurerkellner added the good first issue Good for newcomers label Feb 27, 2024
@lbeurerkellner
Collaborator

Marking this as a good first issue.

The feature can be added to https://github.com/eth-sri/lmql/blob/main/src/lmql/models/lmtp/backends/transformers_model.py, where an optional lmql serve-model argument could enable it, so that compilation is done once before model serving begins.
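One possible shape for this is a small opt-in helper applied at model-load time. This is only a sketch: the `maybe_compile` function and the `compile_model` flag name are assumptions, not LMQL's actual code or CLI surface.

```python
import torch

def maybe_compile(model, enabled: bool):
    """Optionally wrap a loaded model with torch.compile.

    Called once at startup (e.g. when a hypothetical serve-model flag
    is set), so the compilation cost is paid before requests arrive
    rather than on the first served query.
    """
    if enabled:
        return torch.compile(model)
    return model
```

Keeping it behind a flag that defaults to off preserves current behavior, since compilation time can outweigh the gains for short-lived sessions.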
