Issues: vllm-project/vllm
[Bug]: Can't use offline inference embedding
bug · #4908 · opened May 19, 2024 by Fanb1ing

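For context on the issue above: vLLM versions from around this period expose an `encode` entry point on the `LLM` class for embedding models. A minimal sketch, assuming a vLLM build with embedding support and a supported embedding model (intfloat/e5-mistral-7b-instruct is an assumption here, not taken from the issue):

```python
# Minimal sketch of offline embedding inference with vLLM.
# Model choice and enforce_eager are assumptions, not from the issue.
from vllm import LLM

llm = LLM(model="intfloat/e5-mistral-7b-instruct", enforce_eager=True)

# encode() returns one output per prompt; each carries an embedding vector.
outputs = llm.encode(["Hello, my name is", "The capital of France is"])
for output in outputs:
    print(len(output.outputs.embedding))  # embedding dimensionality
```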
[Bug]: Cannot use FlashAttention-2 backend because the flash_attn package is not found
bug · #4906 · opened May 19, 2024 by maxin9966

[Bug]: llm_engine_example.py (more requests) gets stuck
bug · #4904 · opened May 19, 2024 by CsRic

[Feature]: Support for Falcon-11B model (Falcon 2)
feature request · #4902 · opened May 18, 2024 by s-smits

[Usage]: Profiling Prefill and Decode Phases Separately
usage · #4900 · opened May 18, 2024 by Msiavashi

[Usage]: Passing a guided_json in offline inference
usage · #4899 · opened May 18, 2024 by ccdv-ai

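At the time of the question above, guided JSON decoding was wired into vLLM's OpenAI-compatible server rather than the offline `LLM` API. As a hedged reference sketch of the documented server-side route, passing guided_json through the OpenAI client's extra_body (the schema below is illustrative, not from the issue):

```python
# Sketch: guided JSON via vLLM's OpenAI-compatible server, not the offline API.
# Assumes a running server, e.g.:
#   python -m vllm.entrypoints.openai.api_server --model <model>
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

schema = {  # illustrative JSON schema
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

completion = client.chat.completions.create(
    model="<model>",  # must match the model the server was launched with
    messages=[{"role": "user", "content": "Describe a person as JSON."}],
    extra_body={"guided_json": schema},  # vLLM-specific extension field
)
print(completion.choices[0].message.content)
```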
[Bug]: CohereForAI/c4ai-command-r-v01 OSError: [Errno 12] Cannot allocate memory
bug · #4891 · opened May 17, 2024 by epignatelli

[Bug]: assert parts[0] == "base_model" AssertionError
bug · #4883 · opened May 17, 2024 by Edisonwei54

[Usage]: Why can't I set the number of GPUs when using "tensor_parallel_size"?
usage · #4882 · opened May 17, 2024 by GodHforever

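For the tensor_parallel_size question above: in the offline API the GPU count is controlled by this constructor argument, while GPU selection is done outside vLLM. A minimal sketch (the model name is a placeholder):

```python
# Sketch: tensor_parallel_size sets how many GPUs the engine shards the model across.
from vllm import LLM, SamplingParams

# Spawns one worker per GPU; tensor_parallel_size=4 uses 4 GPUs.
# To restrict WHICH GPUs are used, set CUDA_VISIBLE_DEVICES before launching.
llm = LLM(model="meta-llama/Llama-2-13b-hf", tensor_parallel_size=4)

outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```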
[Installation]: Is there a plan to update the pip package installation method for the CPU backend?
installation · #4881 · opened May 17, 2024 by Zhenzhong1

[Usage]: gpu memory usage when using tensor parallel
usage · #4880 · opened May 17, 2024 by DaiJianghai

[Bug]: A single LoRA request error makes all in-flight requests error
bug · #4879 · opened May 17, 2024 by jinzhen-lin

[Bug]: Shape error encountered in speculative decoding when enable_lora=True
bug · #4872 · opened May 17, 2024 by mitchellstern

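For reproduction context on the shape-error report above, a hedged sketch of a configuration that combines speculative decoding with LoRA. Model names are placeholders, and in vLLM of this era speculative decoding also required the v2 block manager:

```python
# Sketch: speculative decoding plus LoRA, the combination the report above hits.
# All model names are placeholders, not taken from the issue.
from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-2-13b-hf",
    speculative_model="meta-llama/Llama-2-7b-hf",  # draft model
    num_speculative_tokens=5,
    use_v2_block_manager=True,  # required for spec decode at the time
    enable_lora=True,
)
```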
[Feature]: Health check for restart policy
feature request · #4867 · opened May 16, 2024 by pseudotensor

[Usage]: distributed inference with KubeRay
usage · #4865 · opened May 16, 2024 by hetian127

[Misc]: a question about chunked prefill in the flash-attn backend
misc · #4863 · opened May 16, 2024 by HarryWu99

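On the chunked-prefill question above: chunked prefill is toggled by an engine argument that splits long prompt prefills into chunks so they can be batched alongside decode steps. A minimal sketch (the model name and token budget are illustrative):

```python
# Sketch: enabling chunked prefill so long prefills are split into chunks
# and scheduled together with decode tokens. Values are illustrative.
from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-2-7b-hf",
    enable_chunked_prefill=True,
    max_num_batched_tokens=2048,  # token budget per scheduler step
)
```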
[Bug]: No CUDA GPUs are available on 'CPU' use
bug · #4858 · opened May 16, 2024 by mcr-ksh

[Usage]: How to determine how many concurrent requests can be supported in an acceptable time with the demo API server?
usage · #4853 · opened May 16, 2024 by senbinyu

[Bug]: Qwen1.5-72B L20x8 latest vLLM TPOT slower than v0.4.0.post, 48ms vs 39ms, why?
bug · #4852 · opened May 16, 2024 by DefTruth

[Misc]: Assertion with no description in vllm with DeepSeekMath 7b model, why, how to fix?
misc · #4849 · opened May 16, 2024 by brando90

[Feature]: Build and publish Neuron docker image
feature request · #4838 · opened May 15, 2024 by yaronr

[Bug]: Running vllm docker image with neuron fails
bug · #4836 · opened May 15, 2024 by yaronr

[New Model]: Google's PaliGemma family of models
new model · #4833 · opened May 15, 2024 by nfplay

ProTip! Exclude everything labeled bug with -label:bug.