
Add GPU usage #4153

Merged
merged 1 commit into ollama:main from gpu_verbose_response on May 8, 2024

Conversation

@dhiltgen (Collaborator) commented May 4, 2024

Help users understand how much of the model fits into their GPU without having to inspect the server log.

A few examples from different systems and models:

eval rate:            4.40 tokens/s
gpu usage:            1 GPU (14/27 layers) 3.2 GB (2.0 GB GPU)

eval rate:            6.64 tokens/s
gpu usage:            1 GPU (27/27 layers) 3.2 GB

eval rate:            18.44 tokens/s
gpu usage:            2 GPUs (27/33 layers) 27 GB (24 GB GPU)

eval rate:            19.58 tokens/s
gpu usage:            CPU (0/27 layers) 3.2 GB
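The usage lines above could be produced by a small formatter along these lines. This is a hypothetical sketch, not the actual code from this PR; the function name `formatGPUUsage` and its parameters are invented for illustration:

```go
package main

import "fmt"

// formatGPUUsage builds a human-readable summary like the examples above.
// gpuCount is the number of GPUs used (0 means CPU-only inference),
// offloaded/total are layer counts, totalGB is the model's full size, and
// gpuGB is the portion resident in VRAM.
func formatGPUUsage(gpuCount, offloaded, total int, totalGB, gpuGB float64) string {
	var where string
	switch gpuCount {
	case 0:
		where = "CPU"
	case 1:
		where = "1 GPU"
	default:
		where = fmt.Sprintf("%d GPUs", gpuCount)
	}
	s := fmt.Sprintf("%s (%d/%d layers) %.1f GB", where, offloaded, total, totalGB)
	// Show the GPU-resident size only when the model is partially offloaded;
	// fully-offloaded and CPU-only cases omit the parenthetical.
	if gpuCount > 0 && offloaded < total {
		s += fmt.Sprintf(" (%.1f GB GPU)", gpuGB)
	}
	return s
}

func main() {
	fmt.Println(formatGPUUsage(1, 14, 27, 3.2, 2.0)) // partial offload
	fmt.Println(formatGPUUsage(1, 27, 27, 3.2, 0))   // fully on GPU
	fmt.Println(formatGPUUsage(0, 0, 27, 3.2, 0))    // CPU only
}
```

The partial-offload branch mirrors the first example's `(2.0 GB GPU)` suffix, while the fully-offloaded and CPU cases match the shorter lines.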

@dhiltgen dhiltgen force-pushed the gpu_verbose_response branch 5 times, most recently from 284a45f to 68fd3eb, on May 7, 2024 16:28
Review comment on llm/server.go (outdated, resolved)
@mxyng (Contributor) commented May 7, 2024

#4190 broke lint on Windows; gofmt is still a problem.

This records more GPU usage information for eventual UX inclusion.
@dhiltgen (Collaborator, Author) commented May 8, 2024

Still chewing on the optimal UX, so I've removed the user-facing output from this PR; it lays the groundwork for a follow-up PR that will expose the data in the UX.

@dhiltgen dhiltgen changed the title from "Add GPU usage to verbose metrics" to "Add GPU usage" on May 8, 2024
@dhiltgen dhiltgen merged commit ee49844 into ollama:main May 8, 2024
15 checks passed
@dhiltgen dhiltgen deleted the gpu_verbose_response branch May 8, 2024 23:39
3 participants