InternLM / lmdeploy Public

Issues: InternLM/lmdeploy

[Benchmark] benchmarks on different cuda architecture with mo...

#815 opened Dec 11, 2023 by lvhan028

Open 6

Sign up for the InternLM (书生·浦语) LLM hands-on camp: master the full fine-tuning, deployment, and evaluation pipeline in two weeks

#890 opened Dec 26, 2023 by vansin

Open

107 Open 710 Closed

Issues list

[Bug] got error when pip install. docker img works though, python ver3.11

#1633 opened May 21, 2024 by fyxc

2 tasks done

Are there any plans to support CUDA 11.7?

#1632 opened May 21, 2024 by dlin511

[Bug] Does a service deployed with lmdeploy support controlling model output by passing stop_words?

#1631 opened May 21, 2024 by qiuxuezhe123

2 tasks

[Bug] Repeated output when running inference on qwen1.5-14b-chat with turbomind

#1629 opened May 21, 2024 by qiuxuezhe123

2 tasks

GPU memory usage increases instead of decreasing after quantizing internvl-v1.5 with KV cache quantization (int8 or int4)

#1626 opened May 21, 2024 by qingchunlizhi

1 of 2 tasks

[Feature] Layer Wise Calibration and Quantization of Models (To quantize model on Low VRAM GPU)

#1625 opened May 21, 2024 by Tushar-ml

[Feature] specify gpus in pipeline

#1624 opened May 21, 2024 by kleinzcy

Can the GPTQ and AWQ inference kernels be used interchangeably?

#1623 opened May 21, 2024 by sleepwalker2017

[Feature] Implement COG-VLM2

#1622 opened May 20, 2024 by isidentical

[Bug] hang when many requests

#1619 opened May 20, 2024 by NiuBlibing

2 tasks done

[Feature] Grammar/structured output support

#1614 opened May 19, 2024 by nidhoggr-nil

[Bug] Deployed multimodal model produces abnormal output in multi-turn conversations

#1612 opened May 17, 2024 by wssywh

2 tasks done

[Feature] Throw exception when response error

#1610 opened May 17, 2024 by NiuBlibing

[Bug] Deployed llava-v1.6-34b keeps producing repeated output

#1604 opened May 16, 2024 by wssywh

2 tasks

[Bug] Unrecognized configuration class when quantizing llava

#1601 opened May 16, 2024 by zjysteven

2 tasks done

Support for Pali gemma

#1596 opened May 15, 2024 by bks5881

[Docs] Add docs to NVTX options

#1595 opened May 15, 2024 by yyccli

[Bug] llava, cuda out of memory

#1593 opened May 15, 2024 by xiaoyudxy

1 of 2 tasks

Allow passing session_id manually to AsyncEngine's stream_infer function, enabling parallel inference across multiple stream_infer calls [Feature]

#1590 opened May 14, 2024 by NagatoYuki0943

[Feature] Support W4A8KV4 Quantization(QServe/QoQ)

#1587 opened May 13, 2024 by wanzhenchn

[Bug] change h_input_length_buf_ before synchronization

#1584 opened May 11, 2024 by mengmeexix

2 tasks done

[Feature] Support for LLaVA-NeXT Qwen1.5-110, Qwen1.5-72B, LLaMA3-8B (awaiting response)

#1583 opened May 11, 2024 by Iven2132

[Feature] Is persistent batch supported for the decoder in enc-dec models? (awaiting response)

#1581 opened May 10, 2024 by Oldpan

Inference quality degrades when deploying the CodeQwen1.5 model with turbomind

#1580 opened May 10, 2024 by Lanyu123

[Bug] Deploying internlm/internlm-xcomposer-vl-7b and internlm/internlm-xcomposer2-vl-7b with docker both fail with errors

#1577 opened May 10, 2024 by ye7love7

2 tasks done
