
Implement Ollama as a high-level service #510

Merged: 30 commits merged into carlrobertoh:master on May 7, 2024

Conversation

@boswelja (Contributor) commented Apr 24, 2024

Implementing Ollama as a high-level service type. This has a few advantages:

  • Unlocks the full use of Ollama APIs that aren't necessarily OpenAI-compatible (see the rough request sketch after the feature list below)
  • Significantly simplifies plugin setup for Ollama users
  • Makes it significantly easier to expand supported features (we no longer have to deal with service type → template → compatibility API checks)

Right now, the above benefits have translated into:

  • Support for inline code completions (limited for now; full support is pending an API for FIM tasks, ollama/ollama#3869)
  • Support for multimodality
  • A model selector surfaced right next to the host input
  • A dedicated Ollama option in the chat window's service dropdown
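For anyone unfamiliar with the native API, this is roughly what a plain (non-streaming) request against Ollama's /api/generate endpoint looks like. The host and model name are just placeholders, and the actual plugin goes through the llm-client library rather than raw HTTP calls:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

fun main() {
    // Ollama's native API (as opposed to its OpenAI-compatible layer) lives under /api/*.
    // The host and model below are placeholders for whatever the user has configured.
    val body = """
        {
          "model": "codellama",
          "prompt": "Write a Kotlin function that reverses a string.",
          "stream": false
        }
    """.trimIndent()

    val request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:11434/api/generate"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()

    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())

    // With "stream": false the reply is a single JSON object whose "response"
    // field holds the generated text.
    println(response.body())
}
```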

Currently blocked by:

Screenshots: (omitted)

@boswelja (Contributor, Author)

Please be aware that I have no idea how the IDE plugin API works; there are bound to be issues 😆

@PhilKes (Contributor) commented Apr 25, 2024

Maybe I missed some discussion here, but I thought @carlrobertoh didn't want to support Ollama as a high-level service?
We had that discussion when I had already implemented exactly this using the Ollama API in #361.
If that opinion has changed, I'm all for it, of course 😁

@boswelja (Contributor, Author)

We did a little negotiating 😛
#441 (comment)

@carlrobertoh (Owner)

Since this is a popular request and Ollama doesn't support an API for OpenAI-compatible text completions, I've decided to make an exception. However, I'd still like to keep the others as they are and provide better documentation on how to configure them. 🙂

@PhilKes (Contributor) commented Apr 25, 2024

> Since this is a popular request and Ollama doesn't support an API for OpenAI-compatible text completions, I've decided to make an exception. However, I'd still like to keep the others as they are and provide better documentation on how to configure them. 🙂

Good timing, I was actually working on adding /v1/completions support to Ollama 😂

But instead I have now opened a PR adding support for llama.cpp's /infill API for FIM code completions:
ollama/ollama#3907
This would resolve @boswelja's ollama/ollama#3869.
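For illustration, the request shape is roughly this: the editor sends the text before and after the caret, and the server applies the model's FIM tokens itself. Field names follow llama.cpp's server, the values are placeholders, and the Ollama-side API may end up looking slightly different:

```kotlin
// Rough shape of a llama.cpp /infill request: the client sends the code before
// and after the caret, and the server builds the FIM prompt with the model's
// own tokens. Exact options may vary between server versions.
val infillRequest = """
    {
      "input_prefix": "fun reverse(s: String): String {\n    return ",
      "input_suffix": "\n}",
      "n_predict": 64,
      "stream": false
    }
""".trimIndent()
```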

@boswelja (Contributor, Author)

Wow, that made progress way faster than I expected, thanks @PhilKes! I'll hold off on this for a bit and see if we can get that API in for the first release, which would effectively solve code completions.

@carlrobertoh (Owner)

Nice! In the meantime, we could switch the llama.cpp completions to the /infill API as well

@linpan commented Apr 28, 2024

Ollama as a high-level service should support /v1/completions. Keep it up!

@boswelja (Contributor, Author)

> Preventing multimodal inputs

Re: this, it looks like the Ollama APIs handle it relatively gracefully (screenshot omitted).

Is this an acceptable solution, at least for now? Users can still attach files; they are just ignored.
In the future, I think we can check the model "families" that Ollama reports to see whether it contains "clip", but I'm not sure that's a silver bullet just yet.
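Something like this rough sketch (not what's in the PR) is what I had in mind: ask /api/show for the model details and look for "clip" among the reported families. A real implementation would parse the JSON properly instead of string-matching:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Rough sketch only: fetch the model's details from Ollama and check whether
// "clip" appears among the reported families (it does for multimodal models
// such as llava). Real code should parse the JSON instead of string-matching.
fun supportsImages(host: String, model: String): Boolean {
    val request = HttpRequest.newBuilder()
        .uri(URI.create("$host/api/show"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("""{"name": "$model"}"""))
        .build()
    val body = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .body()
    return body.contains("\"clip\"")
}

fun main() {
    println(supportsImages("http://localhost:11434", "llava"))
}
```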

@artem-zinnatullin

Hey @boswelja, just wanted to confirm that I got your PR at its latest state (dc32216) working with the master branch of https://github.com/carlrobertoh/llm-client published to mavenLocal(). Great work!

A few thoughts:

  • Really cool that you're pulling the available models into a dropdown in settings, very easy to use!
  • The settings UI should note why certain models can't be used for code completion; it also doesn't seem to properly render the enable/disable state when switching models
  • It seems we need some llama3-specific FIM mappings; I can't get it to produce meaningful code completions 😭
  • Your code is really clean and a pleasure to read; the plugin itself could use less copy-paste between settings-related logic and the actual business logic 😅

Hope this motivates you to push this PR further! Happy to test new changes and help work out Ollama support.

@boswelja (Contributor, Author)

Thanks @artem-zinnatullin! I'm aware of the potential issues with toggling code completion; I was beaten to the punch by the custom OpenAI service completions, which implement this slightly differently, so I'm halfway through refactoring to match that.

While we wait for ollama/ollama#3907, I'll try to split this into smaller PRs so that there's less to review all at once :)

@linpan commented May 4, 2024

@boswelja

@boswelja (Contributor, Author) commented May 4, 2024

Yes that's me

@PhilKes (Contributor) commented May 6, 2024

@boswelja I think we shouldn't rely on the /infill API for now.
I thought ggerganov/llama.cpp#6689 would enable the FIM prompt templates to be loaded automatically for all models in llama.cpp, but as I understand it, that's not the case. Right now llama.cpp only knows how to determine the correct FIM tokens (prefix, suffix, middle) for CodeGemma and CodeLlama. At least that was my experience when I tried to test /infill with CodeQwen (ggerganov/llama.cpp#7102 (comment)).
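Just to make the problem concrete: as long as the server can't do this for us, the plugin has to keep a per-model mapping along these lines (the CodeLlama and CodeGemma token sets below are the documented ones; any other model needs its own entry, which is exactly the maintenance I was hoping /infill would remove):

```kotlin
// Sketch of a client-side fill-in-the-middle prompt mapping. Each model family
// expects its own sentinel tokens around the prefix and suffix; a model that
// isn't listed here simply can't be used for code completion.
enum class InfillTemplate(val build: (prefix: String, suffix: String) -> String) {
    CODE_LLAMA({ prefix, suffix -> "<PRE> $prefix <SUF>$suffix <MID>" }),
    CODE_GEMMA({ prefix, suffix ->
        "<|fim_prefix|>$prefix<|fim_suffix|>$suffix<|fim_middle|>"
    })
}

fun main() {
    val prompt = InfillTemplate.CODE_LLAMA.build(
        "fun reverse(s: String): String {\n    return ",
        "\n}"
    )
    println(prompt)
}
```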

It would be really nice not to have to bother with infill prompt templates in the CodeGPT plugin itself, but I think llama.cpp's /infill does not yet offer what we need. Maybe someone else knows more about that than I do. I'm still waiting on feedback for ollama/ollama#3907.

If not, I would propose rolling back #513 and not relying on it for the Ollama service implementation either.

@boswelja (Contributor, Author) commented May 6, 2024

Fair enough, we can move forward sticking with generate for now. Thanks for the detailed investigation!

@carlrobertoh (Owner)

Huh, that's probably the reason why I rolled back the /infill API in the first place, although I never actually investigated why some of the models weren't working as expected.

@PhilKes Let's revert the last change :)

@boswelja is the PR ready for review? I might push some changes on the fly, or perhaps merge it as is, since I'm planning on integrating another new service, which might cause some merge conflicts.

@boswelja (Contributor, Author) commented May 7, 2024

I was about to say "no, I've got a couple of smaller PRs that should go in first", but it looks like they're merged now!

I'll resolve the conflicts and do another once-over. I think the only other thing I wanted input on is #510 (comment).

# Conflicts:
#	gradle/libs.versions.toml
#	src/main/java/ee/carlrobert/codegpt/completions/CompletionRequestService.java
#	src/main/kotlin/ee/carlrobert/codegpt/actions/CodeCompletionFeatureToggleActions.kt
#	src/main/kotlin/ee/carlrobert/codegpt/codecompletions/CodeGPTInlineCompletionProvider.kt
@boswelja (Contributor, Author) commented May 7, 2024

Current issues:

  • When refreshing the model list, the model dropdown doesn't update with discovered models
    • This is most noticeable when first setting up: initially there are no models, but refreshing then loads them
    • You can still chat with the model, but the dropdown shows the wrong service and model as active
  • Can upload images to models that don't support image inputs (they just ignore them)
    • Is this even an issue?

I'm not really sure how to fix the first one.

boswelja marked this pull request as ready for review on May 7, 2024, 08:15
@carlrobertoh (Owner) commented May 7, 2024

> When refreshing the model list, the model dropdown doesn't update with discovered models

I made a few minor changes, including fixing the dropdown refresh issues. I also removed the availableModels state since there's no need to maintain any record of available models, as they are always requested via API.

Edit: Will revert the removal

> Can upload images to models that don't support image inputs (they just ignore them)
>   • Is this even an issue?

I don't think it's an issue at the moment. Let's keep it.
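To clarify "always requested via API": the settings form simply asks Ollama's /api/tags endpoint for the installed models whenever the list is (re)populated. A rough sketch, outside the actual llm-client code:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Rough sketch: list the locally installed models straight from Ollama's
// /api/tags endpoint each time the dropdown is refreshed, instead of caching
// them in the settings state. The response is JSON with a "models" array.
fun fetchInstalledModels(host: String): String {
    val request = HttpRequest.newBuilder()
        .uri(URI.create("$host/api/tags"))
        .GET()
        .build()
    return HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .body()
}

fun main() {
    println(fetchInstalledModels("http://localhost:11434"))
}
```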

@carlrobertoh (Owner)

Everything seems to be working more or less; code completions still need to be improved, but other than that, it seems good. I'll try to provide better documentation on how to set up everything soon as well. Also, if something pops up, then I'll fix it on the fly.

Furthermore, you can expect the feature to be released sometime early next week, hopefully even earlier.

A big thank you to everyone for your help and support! ❤️

carlrobertoh merged commit e40630d into carlrobertoh:master on May 7, 2024 (2 checks passed)
carlrobertoh added a commit that referenced this pull request May 13, 2024
* Initial implementation of Ollama as a service

* Fix model selector in tool window

* Enable image attachment

* Rewrite OllamaSettingsForm in Kt

* Create OllamaInlineCompletionModel and use it for building completion template

* Add support for blocking code completion on models that we don't know support it

* Allow disabling code completion settings

* Disable code completion settings when an unsupported model is entered

* Track FIM template in settings as a derived state

* Update llm-client

* Initial implementation of model combo box

* Add Ollama icon and display models as list

* Make OllamaSettingsState immutable & convert OllamaSettings to Kotlin

* Add refresh models button

* Distinguish between empty/needs refresh/loading

* Avoid storing any model if the combo box is empty

* Fix icon size

* Back to mutable settings
There were some bugs with immutable settings

* Store available models in settings state

* Expose available models in model dropdown

* Add dark icon

* Cleanups for CompletionRequestProvider

* Fix checkstyle issues

* refactor: migrate to SimplePersistentStateComponent

* fix: add code completion stop tokens

* fix: display only one item in the model popup action group

* fix: add back multi model selection

---------

Co-authored-by: Carl-Robert Linnupuu <carlrobertoh@gmail.com>