
Cannot specify role: system in LLM::Anthropic #603

Closed
kokuyouwind opened this issue May 2, 2024 · 8 comments

Comments

@kokuyouwind
Contributor

Description

In OpenAI and Ollama, it is possible to specify system, user, and assistant as roles.
But in Anthropic, only user and assistant are accepted; specifying system raises an error.

> Langchain::LLM::Anthropic.new.chat(messages: [{ role: 'system', content: 'Act as a professional programmer.'}, { role: 'user', content: 'What is LLM?' }])
#<Langchain::LLM::AnthropicResponse:0x00000001295e3768
 @model=nil,
 @raw_response=
  {"type"=>"error",
   "error"=>
    {"type"=>"invalid_request_error",
     "message"=>"messages: Unexpected role \"system\". The Messages API accepts a top-level `system` parameter, not \"system\" as an input message role."}}>

I know that the Anthropic API specification does not allow system as a message role and instead requires a top-level system parameter.
However, as an LLM framework, I believe it is preferable to absorb the differences between each service and offer a common interface.

Reference case: Python Library

In the python library, ChatAnthropic accepts ChatPromptTemplate.from_messages([("system", system), ("human", human)]).
https://python.langchain.com/docs/integrations/chat/anthropic/

Proposal

As a preprocessing step in LLM::Anthropic#chat, how about extracting the role: system messages from messages, combining them with the top-level system argument, and joining them all with line breaks into the system parameter of the API?
This would not break the existing behavior and could also handle cases with multiple role: system messages.
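A minimal sketch of this preprocessing, assuming a hypothetical helper name (not the library's actual internals):

```ruby
# Hypothetical sketch of the proposed preprocessing. It splits out the
# role: system messages, joins them (and any top-level system argument)
# with line breaks, and returns the remaining chat messages untouched.
def extract_system_prompt(messages, system: nil)
  system_messages, chat_messages = messages.partition { |m| m[:role] == "system" }
  combined = [system, *system_messages.map { |m| m[:content] }].compact.join("\n")
  [combined.empty? ? nil : combined, chat_messages]
end
```

The result could then be passed to Anthropic as the top-level system parameter alongside the filtered messages array.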

@andreibondarev
Collaborator

@kokuyouwind Thank you for this proposal! I'm curious: do you have a need for this in your applications? Would this make passing the system role easier for you?

@kokuyouwind
Contributor Author

kokuyouwind commented May 3, 2024

@andreibondarev

Thanks for the reply.

I'm curious do you have a need for this in your applications? Would this make passing the system role easier for you?

I am building a development tool that allows users to freely configure their preferred LLM.
The tool generates messages containing role: system and calls chat, so Anthropic cannot be used.

ref. kokuyouwind/rbs_goose@6609103#diff-439afd54f569f0ca0e76999e2d9819d89e82376de99b31652b1aafc81ec64144R91-R93

To avoid this, one of the following is needed:

  • Change how messages are generated depending on whether the LLM is Anthropic.
  • Give up on system role messages and fold their content into user role messages.
  • Modify LLM::Anthropic to accept role: system messages (this proposal).

If this proposal is accepted, the tool does not need to care about the LLM type and can be implemented simply.
I also think it is a good idea to make the interface compatible so that Agents, etc. can be used with any LLM in the future.
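The second workaround above (folding system messages into a user message) can be sketched as follows; the helper name is hypothetical:

```ruby
# Hypothetical workaround: merge all role: system content into the first
# role: user message so the request is valid for Anthropic's Messages API.
def fold_system_into_user(messages)
  system_text = messages.select { |m| m[:role] == "system" }
                        .map { |m| m[:content] }.join("\n")
  rest = messages.reject { |m| m[:role] == "system" }
  return rest if system_text.empty?

  first_user = rest.find { |m| m[:role] == "user" }
  first_user[:content] = "#{system_text}\n\n#{first_user[:content]}" if first_user
  rest
end
```

This keeps the caller LLM-agnostic at the cost of losing the distinct system role.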

@andreibondarev
Collaborator

@kokuyouwind Thank you for your PR and for using this library in your gem 😄

I'd like to actually think through this after the Langchain::Assistant Anthropic support is added here: #543.

We would need to add an AnthropicMessage class like this one: https://github.com/patterns-ai-core/langchainrb/pull/513/files#diff-86baf19d3db04ca4b773792c27230e17bb4ba4f9373d17688b8a2f67de6f9c28

@kokuyouwind
Contributor Author

kokuyouwind commented May 8, 2024

@andreibondarev
OK, it is true that a Messages class layer seems better suited than a raw-data layer for absorbing incompatibilities between LLMs.

Personally, I would be happy if it could also be used in cases where Assistant is not involved (cases where LLM::XXXClient#chat is called directly, or where a chain such as RAG or a QA bot is set up).
For that purpose, a new layer such as LLM::Messages or Prompt::ChatPromptTemplate would need to be created instead of Assistants::Messages, so that it can be passed to LLM::Client#chat. That would be a major modification and may be difficult to undertake immediately.

@kokuyouwind
Contributor Author

You may close this issue and pull request #604, as I have resolved all of my original issues by folding everything into the user message.

If you would consider it, could we open another issue, "Separating implementations not related to tools from Assistant"?
As you say, Assistant can absorb the differences between LLMs, but it seems to include implementations that are not directly related to tools.
By separating these from Assistant, I think we could absorb incompatibilities in how system prompts are passed even when tools are not used, and also handle cases where we want to use Threads directly, as mentioned in #608.

@andreibondarev
Collaborator

@kokuyouwind I've been thinking that the #chat(messages: []) method could accept the Langchain::Messages::* instances directly. For example:

message_1 = Langchain::Messages::AnthropicMessage.new(role: "user", content: "hi!")
message_2 = Langchain::Messages::AnthropicMessage.new(role: "assistant", content: "Hey! How can I help?")
message_3 = Langchain::Messages::AnthropicMessage.new(role: "user", content: "Help me debug my computer")

Langchain::LLM::Anthropic.new(...).chat(messages: [message_1, message_2, message_3])

@kokuyouwind
Contributor Author

@andreibondarev
I think that is an excellent idea.
To make it more generic, I think it would be better to use per-role, LLM-independent message classes rather than AnthropicMessage.

message_1 = Langchain::Messages::UserMessage.new("hi!")
message_2 = Langchain::Messages::AssistantMessage.new("Hey! How can I help?")
message_3 = Langchain::Messages::UserMessage.new("Help me debug my computer")

Langchain::LLM::Anthropic.new(...).chat(messages: [message_1, message_2, message_3])

The class names above are aligned with the role notation, but they could instead be aligned with Python LangChain's message names, such as HumanMessage and AIMessage.
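A rough sketch of such LLM-independent classes and an adapter mapping them to Anthropic's payload shape; all names here are illustrative assumptions, not existing langchainrb API:

```ruby
# Hypothetical per-role message classes, independent of any LLM provider.
module Langchain
  module Messages
    UserMessage      = Struct.new(:content) { def role; "user"; end }
    AssistantMessage = Struct.new(:content) { def role; "assistant"; end }
    SystemMessage    = Struct.new(:content) { def role; "system"; end }
  end
end

# A provider-specific adapter could then lift system messages into the
# top-level system parameter that Anthropic's Messages API expects.
def to_anthropic_payload(messages)
  system = messages.select { |m| m.role == "system" }.map(&:content).join("\n")
  chat   = messages.reject { |m| m.role == "system" }
                   .map { |m| { role: m.role, content: m.content } }
  { system: (system unless system.empty?), messages: chat }.compact
end
```

With this split, each provider class handles its own quirks while callers stay provider-agnostic.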

@kokuyouwind
Contributor Author

I did not notice that a discussion forum had been created in #629.
I will close this issue so that we can discuss it there.
