bug: [TensorRT-LLM Server not work] #2834

tmape · 2024-04-26T09:38:45Z

Describe the bug
I am attempting to connect Open Web-UI to access Jan's server for utilizing the TensorRT-LLM model (Mistral 7B Instructions v0.1 INT4). However, I am experiencing issues and it does not work as expected. I have tried switching to a standard gguf model (Meta-Llama-3-8B-Instruct.Q8_0), which functions correctly. When chatting with Jan, both the TensorRT-LLM and gguf models work as expected.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your issue.

Environment details

Operating System: [Windows 11]
Jan Version: [0.4.12]
Processor: [Intel Core i7]
RAM: [64GB]
Any additional relevant hardware specifics: [RTX 4060]

Logs

2024-04-26T09:19:08.690Z [SERVER]::{"reqId":"req-t","res":{},"req":{"method":"POST","url":"/v1/chat/completions","hostname":"192.168.0.101:1337"},"msg":"incoming request"}

2024-04-26T09:19:08.832Z [SERVER]::{"reqId":"req-t","res":{"statusCode":500},"req":{},"msg":"request completed","responseTime":133.75729999993928}

Additional context

Server Options: 0.0.0.0, 1337
API Prefix: /v1
Cross-Origin-Resource-Sharing (CORS): Disable
Verbose Server Logs: Enable

The text was updated successfully, but these errors were encountered:

louis-jan · 2024-05-05T15:12:40Z

Duplicated
#2373

tmape added the type: bug Something isn't working label Apr 26, 2024

louis-jan closed this as not planned Won't fix, can't repro, duplicate, stale May 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bug: [TensorRT-LLM Server not work] #2834

bug: [TensorRT-LLM Server not work] #2834

tmape commented Apr 26, 2024

louis-jan commented May 5, 2024

bug: [TensorRT-LLM Server not work] #2834

bug: [TensorRT-LLM Server not work] #2834

Comments

tmape commented Apr 26, 2024

louis-jan commented May 5, 2024