
Implement Ollama as a high-level service #510

Merged: 30 commits merged into carlrobertoh:master on May 7, 2024

Conversation

@boswelja (Contributor) commented Apr 24, 2024

Implementing Ollama as a high-level service type. This has a few advantages:

  • Unlocks the full use of Ollama APIs that aren't necessarily OpenAI-compatible (see the rough request sketch after the feature list below)
  • Significantly simplifies plugin setup for Ollama users
  • Makes it significantly easier to expand supported features (we no longer have to deal with service type → template → compatibility API checks)

Right now, the above benefits have translated into:

  • Support for inline code completions (limited for now; full support is pending an API for FIM tasks, ollama/ollama#3869)
  • Support for multimodality
  • A model selector surfaced right next to the host input
  • A dedicated Ollama option in the chat window's service dropdown
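For anyone unfamiliar with the native API, this is roughly what a plain (non-streaming) request against Ollama's /api/generate endpoint looks like. The host and model name are just placeholders, and the actual plugin goes through the llm-client library rather than raw HTTP calls:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

fun main() {
    // Ollama's native API (as opposed to its OpenAI-compatible layer) lives under /api/*.
    // The host and model below are placeholders for whatever the user has configured.
    val body = """
        {
          "model": "codellama",
          "prompt": "Write a Kotlin function that reverses a string.",
          "stream": false
        }
    """.trimIndent()

    val request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:11434/api/generate"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()

    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())

    // With "stream": false the reply is a single JSON object whose "response"
    // field holds the generated text.
    println(response.body())
}
```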

Currently blocked by:

Screenshots: (omitted)

@boswelja (Contributor, Author)

Please be aware that I have no idea how the IDE plugin API works; there are bound to be issues 😆

@PhilKes (Contributor) commented Apr 25, 2024

Maybe I missed some discussion here, but I thought @carlrobertoh didn't want to support Ollama as a high-level service?
We had that discussion when I had already implemented exactly this using the Ollama API in #361.
If that opinion has changed, I'm all for it, of course 😁

@boswelja (Contributor, Author)

We did a little negotiating 😛
#441 (comment)

@carlrobertoh (Owner)

Since this is a popular request and Ollama doesn't support an API for OpenAI-compatible text completions, I've decided to make an exception. However, I'd still like to keep the others as they are and provide better documentation on how to configure them. 🙂

@PhilKes (Contributor) commented Apr 25, 2024

> Since this is a popular request and Ollama doesn't support an API for OpenAI-compatible text completions, I've decided to make an exception. However, I'd still like to keep the others as they are and provide better documentation on how to configure them. 🙂

Good timing, I was actually working on adding /v1/completions support to Ollama 😂

But instead I have now opened a PR adding support for llama.cpp's /infill API for FIM code completions:
ollama/ollama#3907
This would resolve @boswelja's ollama/ollama#3869.
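For illustration, the request shape is roughly this: the editor sends the text before and after the caret, and the server applies the model's FIM tokens itself. Field names follow llama.cpp's server, the values are placeholders, and the Ollama-side API may end up looking slightly different:

```kotlin
// Rough shape of a llama.cpp /infill request: the client sends the code before
// and after the caret, and the server builds the FIM prompt with the model's
// own tokens. Exact options may vary between server versions.
val infillRequest = """
    {
      "input_prefix": "fun reverse(s: String): String {\n    return ",
      "input_suffix": "\n}",
      "n_predict": 64,
      "stream": false
    }
""".trimIndent()
```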

@boswelja (Contributor, Author)

Wow, that made progress way faster than I expected, thanks @PhilKes! I'll hold off on this for a bit and see if we can get that API in for the first release, which would effectively solve code completions.

@carlrobertoh (Owner)

Nice! In the meantime, we could switch the llama.cpp completions to the /infill API as well

@linpan commented Apr 28, 2024

Ollama as a high-level service should support /v1/completions. Keep it up!

@boswelja (Contributor, Author)

> Preventing multimodal inputs

Re: this, it looks like the Ollama APIs handle it relatively gracefully (screenshot omitted).

Is this an acceptable solution, at least for now? Users can still attach files; they are just ignored.
In the future, I think we can check the model "families" that Ollama reports to see whether it contains "clip", but I'm not sure that's a silver bullet just yet.
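Something like this rough sketch (not what's in the PR) is what I had in mind: ask /api/show for the model details and look for "clip" among the reported families. A real implementation would parse the JSON properly instead of string-matching:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Rough sketch only: fetch the model's details from Ollama and check whether
// "clip" appears among the reported families (it does for multimodal models
// such as llava). Real code should parse the JSON instead of string-matching.
fun supportsImages(host: String, model: String): Boolean {
    val request = HttpRequest.newBuilder()
        .uri(URI.create("$host/api/show"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("""{"name": "$model"}"""))
        .build()
    val body = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .body()
    return body.contains("\"clip\"")
}

fun main() {
    println(supportsImages("http://localhost:11434", "llava"))
}
```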

@artem-zinnatullin

Hey @boswelja, just wanted to confirm that I got your PR at its latest state (dc32216) working with the master branch of https://github.com/carlrobertoh/llm-client published to mavenLocal(). Great work!

A few thoughts:

  • Really cool that you're pulling the available models into a dropdown in settings, very easy to use!
  • The settings UI should note why certain models can't be used for code completion; it also doesn't seem to properly render the enable/disable state when switching models
  • It seems we need some llama3-specific FIM mappings; I can't get it to produce meaningful code completions 😭
  • Your code is really clean and a pleasure to read; the plugin itself could use less copy-paste between settings-related logic and the actual business logic 😅

Hope this motivates you to push this PR further! Happy to test new changes and help work out Ollama support.

@boswelja (Contributor, Author)

Thanks @artem-zinnatullin! I'm aware of the potential issues with toggling code completion; I was beaten to the punch by the custom OpenAI service completions, which implement this slightly differently, so I'm halfway through refactoring to match that.

While we wait for ollama/ollama#3907, I'll try to split this into smaller PRs so that there's less to review all at once :)

@linpan commented May 4, 2024

@boswelja

@boswelja (Contributor, Author) commented May 4, 2024

Yes that's me

@PhilKes (Contributor) commented May 6, 2024

@boswelja I think we shouldn't rely on the /infill API for now.
I thought ggerganov/llama.cpp#6689 would enable the FIM prompt templates to be loaded automatically for all models in llama.cpp, but as I understand it, that's not the case. Right now llama.cpp only knows how to determine the correct FIM tokens (prefix, suffix, middle) for CodeGemma and CodeLlama. At least that was my experience when I tried to test /infill with CodeQwen (ggerganov/llama.cpp#7102 (comment)).
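Just to make the problem concrete: as long as the server can't do this for us, the plugin has to keep a per-model mapping along these lines (the CodeLlama and CodeGemma token sets below are the documented ones; any other model needs its own entry, which is exactly the maintenance I was hoping /infill would remove):

```kotlin
// Sketch of a client-side fill-in-the-middle prompt mapping. Each model family
// expects its own sentinel tokens around the prefix and suffix; a model that
// isn't listed here simply can't be used for code completion.
enum class InfillTemplate(val build: (prefix: String, suffix: String) -> String) {
    CODE_LLAMA({ prefix, suffix -> "<PRE> $prefix <SUF>$suffix <MID>" }),
    CODE_GEMMA({ prefix, suffix ->
        "<|fim_prefix|>$prefix<|fim_suffix|>$suffix<|fim_middle|>"
    })
}

fun main() {
    val prompt = InfillTemplate.CODE_LLAMA.build(
        "fun reverse(s: String): String {\n    return ",
        "\n}"
    )
    println(prompt)
}
```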

It would be really nice not to have to bother with infill prompt templates in the CodeGPT plugin itself, but I think llama.cpp's /infill does not yet offer what we need. Maybe someone else knows more about that than I do. I'm still waiting on feedback for ollama/ollama#3907.

If not, I would propose rolling back #513 and not relying on it for the Ollama service implementation either.

@boswelja (Contributor, Author) commented May 6, 2024

Fair enough, we can move forward sticking with generate for now. Thanks for the detailed investigation!

@carlrobertoh (Owner)

Huh, that's probably the reason why I rolled back the /infill API in the first place, although I never actually investigated why some of the models weren't working as expected.

@PhilKes Let's revert the last change :)

@boswelja is the PR ready for review? I might push some changes on the fly, or perhaps merge it as is, since I'm planning on integrating another new service, which might cause some merge conflicts.

@boswelja (Contributor, Author) commented May 7, 2024

I was about to say "no, I've got a couple of smaller PRs that should go in first", but it looks like they're merged now!

I'll resolve the conflicts and do another once-over. I think the only other thing I wanted input on is #510 (comment).

# Conflicts:
#	gradle/libs.versions.toml
#	src/main/java/ee/carlrobert/codegpt/completions/CompletionRequestService.java
#	src/main/kotlin/ee/carlrobert/codegpt/actions/CodeCompletionFeatureToggleActions.kt
#	src/main/kotlin/ee/carlrobert/codegpt/codecompletions/CodeGPTInlineCompletionProvider.kt
@boswelja (Contributor, Author) commented May 7, 2024

Current issues:

  • When refreshing the model list, the model dropdown doesn't update with discovered models
    • This is most noticeable when first setting up: initially there are no models, but refreshing then loads them
    • You can still chat with the model, but the dropdown shows the wrong service and model as active
  • Can upload images to models that don't support image inputs (they just ignore them)
    • Is this even an issue?

I'm not really sure how to fix the first one.

boswelja marked this pull request as ready for review on May 7, 2024, 08:15
@carlrobertoh (Owner) commented May 7, 2024

> When refreshing the model list, the model dropdown doesn't update with discovered models

I made a few minor changes, including fixing the dropdown refresh issues. I also removed the availableModels state since there's no need to maintain any record of available models, as they are always requested via API.

Edit: Will revert the removal

> Can upload images to models that don't support image inputs (they just ignore them)
>   • Is this even an issue?

I don't think it's an issue at the moment. Let's keep it.
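To clarify "always requested via API": the settings form simply asks Ollama's /api/tags endpoint for the installed models whenever the list is (re)populated. A rough sketch, outside the actual llm-client code:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Rough sketch: list the locally installed models straight from Ollama's
// /api/tags endpoint each time the dropdown is refreshed, instead of caching
// them in the settings state. The response is JSON with a "models" array.
fun fetchInstalledModels(host: String): String {
    val request = HttpRequest.newBuilder()
        .uri(URI.create("$host/api/tags"))
        .GET()
        .build()
    return HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .body()
}

fun main() {
    println(fetchInstalledModels("http://localhost:11434"))
}
```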

@carlrobertoh (Owner)

Everything seems to be working more or less; code completions still need to be improved, but other than that, it seems good. I'll try to provide better documentation on how to set up everything soon as well. Also, if something pops up, then I'll fix it on the fly.

Furthermore, you can expect the feature to be released sometime early next week, hopefully even earlier.

A big thank you to everyone for your help and support! ❤️

carlrobertoh merged commit e40630d into carlrobertoh:master on May 7, 2024 (2 checks passed)
carlrobertoh added a commit that referenced this pull request May 13, 2024
* Initial implementation of Ollama as a service

* Fix model selector in tool window

* Enable image attachment

* Rewrite OllamaSettingsForm in Kt

* Create OllamaInlineCompletionModel and use it for building completion template

* Add support for blocking code completion on models that we don't know support it

* Allow disabling code completion settings

* Disable code completion settings when an unsupported model is entered

* Track FIM template in settings as a derived state

* Update llm-client

* Initial implementation of model combo box

* Add Ollama icon and display models as list

* Make OllamaSettingsState immutable & convert OllamaSettings to Kotlin

* Add refresh models button

* Distinguish between empty/needs refresh/loading

* Avoid storing any model if the combo box is empty

* Fix icon size

* Back to mutable settings
There were some bugs with immutable settings

* Store available models in settings state

* Expose available models in model dropdown

* Add dark icon

* Cleanups for CompletionRequestProvider

* Fix checkstyle issues

* refactor: migrate to SimplePersistentStateComponent

* fix: add code completion stop tokens

* fix: display only one item in the model popup action group

* fix: add back multi model selection

---------

Co-authored-by: Carl-Robert Linnupuu <carlrobertoh@gmail.com>