feat: custom function plugin support #798

Draft: wants to merge 3 commits into dev
Conversation

@tjbck (Contributor) commented Feb 19, 2024

Added custom function plugin support, which opens up some exciting possibilities. With this addition, we can integrate custom functions into our RAG pipeline or utilise them for post-processing LLM responses. Imagine chaining custom functions with RAG, or the LLM automatically triggering functions to provide more accurate answers.

If you have other specific use cases in mind, I'm open to tailoring this PR towards them as well! Let me know what you guys think!


Will start working on this in the near future, I appreciate your patience 🙌

@tjbck (Contributor, Author) commented Feb 19, 2024

Example function:

"""
    This function retrieves weather data for a specified city.

    It takes keyword arguments (kwargs), constructs a Payload object, and sends a GET request 
    to the OpenWeatherMap API. The function fetches weather data based on the specified city.

    Args:
        kwargs (dict): A dictionary containing key-value pairs where 'location' is expected
        as a key. Example: {'location': 'London'}

    Returns:
        dict: A dictionary containing weather data for the specified city. 
        The data includes various weather attributes such as temperature, humidity, 
        weather condition, etc., returned in JSON format from the OpenWeatherMap API.

    The function constructs a URL for the API request by embedding the city name 
    into the query parameters. It also includes the API key (stored in the variable 'API') 
    for authentication with the OpenWeatherMap API. The response from the API is converted 
    into JSON format and returned.
"""

import requests
from pydantic import BaseModel

API = "your api key"


class Payload(BaseModel):
    location: str


def main(kwargs):
    payload = Payload(**kwargs)
    r = requests.get(
        f"https://api.openweathermap.org/data/2.5/weather?q={payload.location}&appid={API}"
    )
    data = r.json()
    return data
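
A minimal usage sketch, assuming a valid OpenWeatherMap key in 'API' and that the caller passes the arguments as a plain dict matching the Payload fields above:

# Illustrative invocation only.
if __name__ == "__main__":
    weather = main({"location": "London"})
    # Response keys follow OpenWeatherMap's current-weather JSON.
    print(weather.get("main", {}).get("temp"))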

@RWayne93

Will this work similar to the OpenAI spec? I know Ollama recently added this.

@tjbck (Contributor, Author) commented Feb 19, 2024

@RWayne93 Not sure I understand what you're saying :/ Could you elaborate a bit more? I conceptualise this feature being used either to pre-process prompts or to post-process LLM responses; hope that clarifies!
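
Purely to illustrate that split (the hook names are hypothetical, not the PR's actual interface), a plugin could expose two entry points:

# Hypothetical hook names, for illustration only.
def pre_process(prompt: str) -> str:
    # Runs before the request reaches the model, e.g. to inject retrieved context.
    return prompt


def post_process(response: str) -> str:
    # Runs on the model's answer, e.g. to filter or reformat it.
    return response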

@tjbck changed the base branch from main to dev on February 25, 2024, 02:35
@drawingthesun

(Quoting the example function from the comment above.)

How would this function be called from inside the chat UI?

@tjbck (Contributor, Author) commented Mar 2, 2024

@drawingthesun it's all in my head at the moment 😅 open to suggestions!

@thiner commented Mar 15, 2024

(Quoting @tjbck's reply above.)

Sounds like LangChain-like functionality. How about building it on top of the LangChain Python/JS SDK?

@RWayne93

(Quoting the exchange between @tjbck and @thiner above.)

So when I initially saw this, I assumed it was adding support for the function calling that Ollama had recently added.

@BuildBackBuehler

(Quoting the PR description above.)

Hey, I'm interested in being able to migrate some plugins from Oobabooga; not sure if you're aware of them or have used OB, but beyond RAG they opened up web-search capabilities and an enhanced context-window cache. Would be amazing if those could simply be dropped in 😂. I'll have to try out your PR!

@IIPedro commented Mar 24, 2024

I see huge potential in this feature, and would therefore like to elaborate on how it could be integrated into the web UI. As mentioned earlier, pre-processing and post-processing are essential to the end user, and I would also add that an integration should be easy to use, share and modify, mostly because the vast majority of users may not be experienced with coding. I would therefore propose an extensive and well-documented interface that can run simple Python code. I want to emphasize that this proposal implies running arbitrary code, which makes the system potentially vulnerable; I will elaborate on how to make it as secure as possible.

First and foremost, the function calling should be separated into two parts: before and after the main response.

The first pass (before the main response) should consist of structured LLM JSON calls over all the functions selected by the user (including a "none" function, which skips function usage), with a user-defined maximum function call limit and a counter of how many functions have been called. Each function in this pass should return an answer string, which is concatenated to the previous functions' answer strings. This is useful for contextualization and information-enhanced answering. Here's an example:

The user asks what time it is. The maximum function call limit is 3, and there is one function available (current time) plus the default "none" function. An LLM pass makes a structured function call, which invokes "current time", returns a context (answer string) and adds a function-usage notification to the next function-calling prompt. Since the next (2nd) function call will see a string saying "current time" has already been used, it will most likely select "none", ending the loop before a third call. As a consequence of tool usage, the user prompt will include the context returned by the function, just as context is returned by RAG, so the LLM will know the current time.

The last pass (after the main response) should work just like the first pass, except that instead of contextualizing the prompt, it rewrites the answer as desired by the user. Functions for this type of pass should take the answer itself as an argument, making it possible to analyze how many words there are, which words, and what type of content is present. Use of this information should be optional, but it allows for post-processing such as filtering certain words with regex, counting the words in the answer, and more. In this pass, each function should return a finish boolean, a rewrite boolean and a string. If the rewrite boolean is true, the answer should be completely rewritten by the LLM using the string as context; otherwise the string is passed to the next function. If the finish boolean is true, the answer is returned to the user. This differs from the first pass in that you can define an order in which the functions are called, and they are always called in that order.
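
A rough sketch of the two passes described above; every name here (call_llm, the function dicts, the return conventions) is a placeholder for illustration, not something this PR defines:

# Illustrative sketch only; call_llm and the function registries are hypothetical.
def first_pass(prompt, functions, max_calls, call_llm):
    """Pre-response pass: the LLM picks functions until it picks 'none' or hits the limit."""
    context, used = [], []
    for _ in range(max_calls):
        name, kwargs = call_llm(prompt, list(functions), already_used=used)  # structured JSON call
        if name == "none":
            break
        context.append(functions[name](kwargs))  # each function returns an answer string
        used.append(name)
    return "\n".join(context)  # concatenated and injected into the prompt, like RAG context


def last_pass(answer, post_functions, call_llm):
    """Post-response pass: functions run in a fixed order and may trigger a rewrite."""
    note = ""
    for fn in post_functions:
        finish, rewrite, note = fn(answer, note)
        if rewrite:
            answer = call_llm("Rewrite the answer using this context:\n" + note + "\n\nAnswer:\n" + answer)
        if finish:
            break
    return answer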

Secondly, there should be a Python interpreter and a dependency installer in the interface. The Python interpreter can be used to write functions, whereas the dependency installer MUST be used manually to install function dependencies, just like a pip install. It's worth mentioning that a community tab would be good, since I'm not aware of how much the community codes; it would be user-friendly either way. Besides, functions could be defined by modelfiles as well. This would make modelfiles different from Ollama's, but as there are plans to add llama.cpp inference, I believe these would be the right steps towards modularity with LLMs.

That's mostly it. I'll edit this message if I think about more stuff. Thanks! Feedback is much appreciated.

@jmfirth (Contributor) commented May 5, 2024

I’ve been periodically checking this PR for a couple months and wanted to leave some thoughts.

Long term I would love to see both server-side Python tools and client-side JS/TS and/or Pyodide-based tools.

The approach you took for server-side Python tools feels sensible: a folder of Python tool scripts with predictable names, and security through whitelisted capability. Two thoughts: 1) should security be opt-in at the tool or at the user-preference level, and 2) should tools be their own Python module with their own requirements, as I believe is the case with TGUI plugins?
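
On point 2, a minimal sketch of the "folder of tool scripts with predictable names" idea, assuming a tools/ directory and a main() entry point per script (neither is necessarily what this PR ships):

# Hypothetical loader: import every script in tools/ and expose its main() by file name.
import importlib.util
from pathlib import Path


def load_tools(folder: str = "tools") -> dict:
    tools = {}
    for path in Path(folder).glob("*.py"):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        if hasattr(module, "main"):
            tools[path.stem] = module.main
    return tools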

Either way, I’ve been pining to see this capability come to my favorite chat UI. Would love to see it land in any capacity, even with loud preview/unstable/will-change disclaimers, to see what I and others can do with it.

@TheMrCodes

Hi, same here as @jmfirth. I'm a programmer by trade and would love to have a GUI to quickly tinker with LLM pipeline ideas.

For me, the perfect solution would be a plugin or integration folder in the backend where mini-projects or scripts are placed and can be called using a ChatGPT-plugin-like structure, with the added benefits of ACLs per user and plugin, and a UI that can display intermediate outputs behind a collapsible button.

As a future vision, it would also be cool to be able to code up frontend widgets that are display-only and can be injected into the chat.

But for now I don't really know whether the target audience of this project isn't more DevOps and homelab people. In that case it would be handy to have a CLI or a configuration file to install or enable/disable integrations.
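
To illustrate that last point (nothing like this exists in the PR; the names are invented), the enable/disable switch could be as small as a mapping the backend checks before loading an integration:

# Hypothetical settings; integration names are made up for illustration.
ENABLED_INTEGRATIONS = {
    "weather": True,
    "web_search": False,
}


def is_enabled(name: str) -> bool:
    return ENABLED_INTEGRATIONS.get(name, False)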

@mrjones2014

Regarding whether the target audience of this project is more DevOps and homelab people:

If you need some context, I am running it on my home server through NixOS, managed over SSH. Putting configuration or code files on the server is no problem.
