
Torch 2.0 compile model #283

Open
andrecharneca opened this issue Dec 1, 2023 · 3 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@andrecharneca

Are there any plans to add torch.compile speed-ups to LMQL Transformers models? Thanks

@lbeurerkellner lbeurerkellner added the enhancement New feature or request label Dec 10, 2023
@lbeurerkellner
Collaborator

Hi Andre, can you recommend any resources on how torch.compile improves inference speed with, e.g., transformers?

In general I am definitely not opposed to adding it.

@andrecharneca
Author

For example: https://huggingface.co/docs/transformers/main/perf_torch_compile . Although that guide benchmarks Vision Transformers, the results should be similar for language models.
After experimenting with torch.compile myself, I found that for LLMs the compilation can take quite a while, so the performance gains really depend on the specific use case. It would still be a nice feature to add, since it's so simple.

@lbeurerkellner lbeurerkellner added the good first issue Good for newcomers label Feb 27, 2024
@lbeurerkellner
Collaborator

Marking this as a good first issue.

The feature can be added to https://github.com/eth-sri/lmql/blob/main/src/lmql/models/lmtp/backends/transformers_model.py, where an optional lmql serve-model argument could enable it, so that compilation is done once before model serving begins.
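One possible shape for this is a small opt-in helper applied at model-load time. This is only a sketch: the `maybe_compile` function and the `compile_model` flag name are assumptions, not LMQL's actual code or CLI surface.

```python
import torch

def maybe_compile(model, enabled: bool):
    """Optionally wrap a loaded model with torch.compile.

    Called once at startup (e.g. when a hypothetical serve-model flag
    is set), so the compilation cost is paid before requests arrive
    rather than on the first served query.
    """
    if enabled:
        return torch.compile(model)
    return model
```

Keeping it behind a flag that defaults to off preserves current behavior, since compilation time can outweigh the gains for short-lived sessions.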
