I have tried multiple values of num_predict between 30 and 100 with two models (llama2 and llava).
In every case the last sentence is often cut short, which makes the output inconvenient to use in applications.
I'm not entirely sure whether this is a bug or just expected behaviour that should be handled some other way (prompt engineering, perplexity, postprocessing...).
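One of the mitigations mentioned above, postprocessing, can be sketched as a small helper that drops a trailing fragment cut off mid-sentence. This is an illustrative snippet only, not part of Ollama; the function name is made up:

```python
import re


def trim_incomplete_sentence(text: str) -> str:
    """Drop a trailing fragment that was cut off mid-sentence.

    Keeps everything up to (and including) the last sentence-ending
    punctuation mark; returns the text unchanged if none is found.
    """
    # Locate the last '.', '!' or '?' in the text (the lookahead
    # asserts no further sentence terminator follows it).
    match = re.search(r"[.!?](?=[^.!?]*$)", text)
    if match is None:
        return text  # no complete sentence found; keep everything
    return text[: match.end()].rstrip()
```

For example, `trim_incomplete_sentence("First sentence. Second sent")` returns `"First sentence."`. A real application might also want to handle abbreviations and decimal points, which this naive regex treats as sentence boundaries.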
Hi @mariomorvan, thanks for the issue. This is expected behavior: num_predict decides how many tokens (roughly, word pieces) will be generated before output stops. You'd want to leave enough headroom for at least one complete sentence.
That said, I totally understand you'd like to limit the length and still receive a complete answer. This is something we'll consider in the future! A good tip is to mention the desired length of the response in the prompt itself. For example: "answer this question in a single sentence of no more than 10 words"; the language model will often oblige :)
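The suggested workaround can be sketched as a helper that builds an Ollama `/api/generate` request combining both controls. The helper name and the 3-tokens-per-word margin are illustrative assumptions, not part of Ollama's API:

```python
def build_length_limited_request(model: str, question: str, max_words: int) -> dict:
    """Build an Ollama /api/generate payload that limits length two ways:

    1. An explicit instruction in the prompt, the main mechanism: the
       model usually finishes its sentence within the stated budget.
    2. num_predict as a hard safety cap, left generous enough that a
       complete sentence still fits (~3 tokens per word, a rough margin).
    """
    return {
        "model": model,
        "prompt": (
            f"{question}\n"
            f"Answer in a single sentence of no more than {max_words} words."
        ),
        "options": {"num_predict": max_words * 3},
    }
```

The resulting dict can be POSTed as JSON to the local Ollama server (by default at `http://localhost:11434/api/generate`), or the same `options` dict can be passed to the Python client's `generate` call.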
Thanks! That looks like a useful and often effective workaround.
What is the issue?
The problem has already been mentioned by several people in langgenius/dify#2461 (comment).
OS: macOS
GPU: Intel
CPU: Intel
Ollama version: 0.1.33