
Unfinished sentences when setting num_predict parameter #4230

Closed
mariomorvan opened this issue May 7, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@mariomorvan

What is the issue?

I have tried multiple values of num_predict between 30 and 100 and two models (llama2 and llava).
In all cases the last sentence is often cut short, making it quite inconvenient to use in applications.
Not entirely sure if it is a bug or just expected behaviour that should be handled otherwise (prompt engineering, perplexity, postprocessing...).

The problem has already been mentioned by several people in this issue langgenius/dify#2461 (comment)
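For context, a minimal sketch of how the option gets set (the prompt, model tag, and cap value here are illustrative, not the exact original reproduction):

```python
import requests

# Minimal reproduction sketch: request a completion with a hard token
# cap via the num_predict option. Prompt and values are illustrative.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "Explain what a hash table is.",
        "stream": False,
        "options": {"num_predict": 50},  # cap the output at ~50 tokens
    },
)
# With a low cap, the text frequently stops mid-sentence.
print(response.json()["response"])
```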

OS

macOS

GPU

Intel

CPU

Intel

Ollama version

0.1.33

@mariomorvan mariomorvan added the bug Something isn't working label May 7, 2024
@jmorganca
Member

Hi @mariomorvan, thanks for the issue. This is expected behavior: num_predict decides how many tokens (roughly, words) will be output. You'd want to leave enough headroom for at least one complete sentence.

That said, I totally understand you'd like to limit the length and receive a complete answer. This is something we'll consider in the future! A good tip for this is to mention the desired length of the response in the prompt. For example, "answer this question in a single sentence of no more than 10 words" - the language model will often oblige :)
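A sketch of that workaround against a local Ollama endpoint (the prompt wording and the generous num_predict value are illustrative assumptions, not a confirmed recipe):

```python
import requests

# Workaround sketch: state the desired length in the prompt itself and
# keep num_predict high enough that the model can finish on its own.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": (
            "Answer this question in a single sentence of no more than "
            "10 words: what is a hash table?"
        ),
        "stream": False,
        # Generous headroom: the model should hit its natural stop
        # token well before the cap, so the sentence is not cut off.
        "options": {"num_predict": 100},
    },
)
print(response.json()["response"])
```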

@mariomorvan
Author

Thanks - looks like a useful and often effective workaround

