Pull requests: InternLM/lmdeploy
Pull requests list
[benchmark] optimize benchmark: counting tokenlizer tokens and error requests
#1607
opened May 17, 2024 by
NiuBlibing
Update doc for prefix caching
documentation
Improvements or additions to documentation
#1597
opened May 16, 2024 by
ispobock
Balance vision model weights on multi gpus
improvement
#1591
opened May 14, 2024 by
irexyc
2 tasks done
support mistral and llava_mistral in turbomind
improvement
#1579
opened May 10, 2024 by
lvhan028
1 task done
[Feature] Support vl models quantization
enhancement
New feature or request
#1553
opened May 7, 2024 by
AllentDan
8 tasks done
fix: update api_server_backend.py to adapt latest gradio
improvement
#1541
opened May 3, 2024 by
kv-chiu
Optimize kernel launch for triton2.2.0 and triton2.3.0
improvement
#1499
opened Apr 25, 2024 by
grimoire
Add docs of support new vl model
documentation
Improvements or additions to documentation
#1332
opened Mar 22, 2024 by
irexyc
remove chat template config in turbomind engine
BC-breaking
#1161
opened Feb 20, 2024 by
irexyc
Visualize layer activations and weights to simplify the quantization process.
#607
opened Oct 24, 2023 by
HIT-cwh