h2ogpt on Ubuntu server #1591
Comments
Hi, can you give your exact run line? For CPU it can be slow when inputting a large context, so you can reduce top_k_docs etc.
Hello, thank you for answering! This is my exact run line: python generate.py --base_model=TheBloke/Mistral-7B-Instruct-v0.2-GGUF --prompt_type=mistral --max_seq_len=4096 (note that it's too slow even when I don't add any context (source = LLM), e.g. for simple requests like "hello" or "hi").
Hi, the same command line yields very fast results for me on CPU, but I added the top_k_docs limit. I also added the other settings mentioned, though those wouldn't matter for pure LLM chat mode. Then when I go to http://127.0.0.1:7860, I see about 2-3 tokens per second, maybe 2 words per second, on my CPU system with an i9.
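For concreteness, a run line with that limit added might look like the following (a sketch based on the flags already shown in this thread; top_k_docs controls how many retrieved document chunks are sent to the model, and 3 is just an illustrative value, not a recommendation from this thread):

python generate.py --base_model=TheBloke/Mistral-7B-Instruct-v0.2-GGUF --prompt_type=mistral --max_seq_len=4096 --top_k_docs=3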
I'm running h2ogpt on an Ubuntu server; you'll find the server specifications attached. However, model execution is too slow (TheBloke/Mistral-7B-Instruct-v0.2-GGUF), and sometimes it doesn't generate a response at all. Any recommendations? What exactly could be the problem?
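One way to separate raw model speed from h2ogpt overhead is to time the same GGUF file directly with llama-cpp-python (the backend h2ogpt uses for GGUF models). A minimal sketch, assuming the model file has already been downloaded; the local filename and thread count below are hypothetical and should be adjusted to the server:

import time
from llama_cpp import Llama

# Load the GGUF model directly; n_ctx mirrors --max_seq_len=4096 from the run line above.
llm = Llama(
    model_path="mistral-7b-instruct-v0.2.Q4_K_M.gguf",  # hypothetical local path
    n_ctx=4096,
    n_threads=8,  # set to the number of physical cores on the server
)

# Time a short completion and report tokens per second.
start = time.time()
out = llm("[INST] Say hello. [/INST]", max_tokens=64)
elapsed = time.time() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")

If this standalone check is also slow (a few tokens per second), the bottleneck is CPU inference itself rather than anything in h2ogpt, and reducing context (top_k_docs, max_seq_len) or moving to a GPU are the usual remedies.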