LLM Observability: Part 1 (OpenAI) #1058

Merged: 10 commits into main, May 22, 2024
Conversation

@langchain4j (Owner) commented May 7, 2024

Issue

#199

Change

  • Added ModelListener, ChatLanguageModelRequest, and ChatLanguageModelResponse that are compatible (i.e., they have all the required attributes) with the OTel LLM semconv draft.
  • Added an option to attach multiple ModelListener<ChatLanguageModelRequest, ChatLanguageModelResponse> to OpenAiChatModel and OpenAiStreamingChatModel (pilot module).

ChatLanguageModelRequest

public class ChatLanguageModelRequest {

    private final String model;
    private final Double temperature;
    private final Double topP;
    private final Integer maxTokens;
    private final List<ChatMessage> messages;
    private final List<ToolSpecification> toolSpecifications;
}

ChatLanguageModelResponse

public class ChatLanguageModelResponse {

    private final String id;
    private final String model;
    private final TokenUsage tokenUsage;
    private final FinishReason finishReason;
    private final AiMessage aiMessage;
}

Example

ModelListener<ChatLanguageModelRequest, ChatLanguageModelResponse> modelListener =
        new ModelListener<ChatLanguageModelRequest, ChatLanguageModelResponse>() {

            @Override
            public void onRequest(ChatLanguageModelRequest request) {
                // handle request
            }

            @Override
            public void onResponse(ChatLanguageModelResponse response, ChatLanguageModelRequest request) {
                // handle response
            }

            @Override
            public void onError(Throwable error, ChatLanguageModelResponse response, ChatLanguageModelRequest request) {
                // handle error
            }
        };

OpenAiChatModel model = OpenAiChatModel.builder()
        .apiKey(...)
        .listeners(singletonList(modelListener))
        .build();

General checklist

  • There are no breaking changes
  • I have added unit and integration tests for my change
  • I have manually run all the unit and integration tests in the module I have added/changed, and they are all green
  • I have manually run all the unit and integration tests in the core and main modules, and they are all green
  • I have added/updated the documentation
  • I have added an example in the examples repo (only for "big" features)

@langchain4j (Owner, Author) commented May 7, 2024

@brunobat could you please take a look? 🙏

@geoand (Contributor) commented May 13, 2024

I'm also very interested in this so we can augment the observability features already present in the Quarkus implementation.

I also think that having something like this in sooner rather than later would be very helpful, as these new integration points could definitely prove useful for various tasks.

@geoand (Contributor) commented May 13, 2024

I believe that onResponse could also be passed ChatLanguageModelRequest

@langchain4j (Owner, Author) commented

@geoand yeah I was just adding it :)

@geoand (Contributor) commented May 13, 2024

💪🏼

@langchain4j (Owner, Author) commented May 13, 2024

@geoand do you think id can still be useful for something (now that request is passed to onResponse())?

@geoand (Contributor) commented May 13, 2024

I very much doubt it's useful

@langchain4j langchain4j marked this pull request as ready for review May 21, 2024 16:18
@langchain4j langchain4j changed the title Draft: LLM Observability LLM Observability May 21, 2024
@langchain4j langchain4j changed the title LLM Observability LLM Observability Part 1 May 22, 2024
@langchain4j langchain4j changed the title LLM Observability Part 1 LLM Observability: Part 1 May 22, 2024
@langchain4j langchain4j merged commit 6818e27 into main May 22, 2024
6 checks passed
@langchain4j langchain4j deleted the llm-observability branch May 22, 2024 11:14
@geoand (Contributor) commented May 23, 2024

I am actually thinking that the listener might need to be changed to:

public interface ModelListener<Request, Response> {

    default OnRequestResult<Request> onRequest(Request request) {
        return null;
    }

    default void onResponse(Response response, OnRequestResult<Request> onRequestResult) {

    }

    default void onError(Throwable error, Response response, OnRequestResult<Request> onRequestResult) {

    }

    /**
     * This name is horrible and should be replaced by something better
     */
    interface OnRequestResult<Request> {
        Request request();
    }
}

The reason is that an integration might want to include some custom data when the request is created and that data might need to be accessible when the result comes back (or an error occurs).
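
For example, a hypothetical listener against this interface could carry a start timestamp through OnRequestResult (a sketch; all names are illustrative):

public class ElapsedTimeListener implements ModelListener<ChatLanguageModelRequest, ChatLanguageModelResponse> {

    // Carries the original request plus the start time of the call
    static final class TimedRequest implements OnRequestResult<ChatLanguageModelRequest> {

        final ChatLanguageModelRequest request;
        final long startNanos = System.nanoTime();

        TimedRequest(ChatLanguageModelRequest request) {
            this.request = request;
        }

        @Override
        public ChatLanguageModelRequest request() {
            return request;
        }
    }

    @Override
    public OnRequestResult<ChatLanguageModelRequest> onRequest(ChatLanguageModelRequest request) {
        return new TimedRequest(request);
    }

    @Override
    public void onResponse(ChatLanguageModelResponse response, OnRequestResult<ChatLanguageModelRequest> onRequestResult) {
        long elapsedNanos = System.nanoTime() - ((TimedRequest) onRequestResult).startNanos;
        // record elapsedNanos in whatever metrics backend is in use
    }
}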

WDYT?

@langchain4j (Owner, Author) commented

@geoand can you provide an example when this might be required?

@geoand (Contributor) commented May 23, 2024

I am thinking that with OpenTelemetry, if I want to open a new span on request and close it when the call ends, I need a way to keep hold of that span. Now, OTel might have a way to do that already (with a ThreadLocal, or some pluggable strategy), but other APIs might not.
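
For illustration, a minimal sketch of that ThreadLocal workaround under the current API (this assumes onResponse/onError run on the same thread as onRequest, which streaming models may not guarantee):

import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class ThreadLocalOtelListener implements ModelListener<ChatLanguageModelRequest, ChatLanguageModelResponse> {

    private final Tracer tracer;

    // Per-thread state is the only place to stash the span/scope with the current API
    private final ThreadLocal<Span> currentSpan = new ThreadLocal<>();
    private final ThreadLocal<Scope> currentScope = new ThreadLocal<>();

    public ThreadLocalOtelListener(Tracer tracer) {
        this.tracer = tracer;
    }

    @Override
    public void onRequest(ChatLanguageModelRequest request) {
        Span span = tracer.spanBuilder("ChatCompletions " + request.model()).startSpan();
        currentSpan.set(span);
        currentScope.set(span.makeCurrent());
    }

    @Override
    public void onResponse(ChatLanguageModelResponse response, ChatLanguageModelRequest request) {
        endSpan(null);
    }

    @Override
    public void onError(Throwable error, ChatLanguageModelResponse response, ChatLanguageModelRequest request) {
        endSpan(error);
    }

    private void endSpan(Throwable error) {
        Scope scope = currentScope.get();
        if (scope != null) {
            scope.close();
            currentScope.remove();
        }
        Span span = currentSpan.get();
        if (span != null) {
            if (error != null) {
                span.recordException(error);
            }
            span.end();
            currentSpan.remove();
        }
    }
}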

@brunobat commented

We can add to the request object the OTel context
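
Purely as a sketch of that suggestion (hypothetical; no such field exists in the merged PR):

import io.opentelemetry.context.Context;

public class ChatLanguageModelRequest {

    private final String model;
    private final Double temperature;
    // ... other existing fields ...
    private final Context otelContext; // hypothetical: OTel context captured when the request is created
}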

@geoand (Contributor) commented May 23, 2024

Right, but other APIs might not have that kind of capability

@langchain4j langchain4j changed the title LLM Observability: Part 1 LLM Observability: Part 1 (OpenAI) May 23, 2024
@geoand (Contributor) commented May 24, 2024

Here is another example:

With the current API, how would one use Micrometer to add a timed metric of the operation?

P.S. I take complete responsibility for not spotting this issue earlier.
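
For context, the closest one could probably get with the current signatures is correlating the callbacks through a side map keyed on the request instance — workable but fragile (entries leak if neither callback fires), and the metric names below are made up:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

public class MicrometerTimingListener implements ModelListener<ChatLanguageModelRequest, ChatLanguageModelResponse> {

    private final MeterRegistry registry;

    // Correlate onRequest with onResponse/onError by request identity,
    // since the current API offers no per-call state object
    private final Map<ChatLanguageModelRequest, Timer.Sample> samples = new ConcurrentHashMap<>();

    public MicrometerTimingListener(MeterRegistry registry) {
        this.registry = registry;
    }

    @Override
    public void onRequest(ChatLanguageModelRequest request) {
        samples.put(request, Timer.start(registry));
    }

    @Override
    public void onResponse(ChatLanguageModelResponse response, ChatLanguageModelRequest request) {
        Timer.Sample sample = samples.remove(request);
        if (sample != null) {
            sample.stop(registry.timer("langchain4j.chat", "model", request.model()));
        }
    }

    @Override
    public void onError(Throwable error, ChatLanguageModelResponse response, ChatLanguageModelRequest request) {
        Timer.Sample sample = samples.remove(request);
        if (sample != null) {
            sample.stop(registry.timer("langchain4j.chat.error", "model", request.model()));
        }
    }
}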

@geoand (Contributor) commented May 24, 2024

> We can add to the request object the OTel context

Looking at this more, I don't see how one could close the Scope with the current LangChain4j API.

@geoand (Contributor) commented May 24, 2024

I have a draft proposal of what I would like to do here.

Using that in Quarkus LangChain4j, I would use it like so:

public class OpenTelemetryChatLanguageModelListener implements ModelListener<ChatLanguageModelRequest, OpenTelemetryChatLanguageModelListener.ScopeHoldingOnRequestResult, ChatLanguageModelResponse> {

    private final Tracer tracer;

    @Inject
    public OpenTelemetryChatLanguageModelListener(Tracer tracer) {
        this.tracer = tracer;
    }

    @Override
    public ScopeHoldingOnRequestResult onRequest(ChatLanguageModelRequest request) {
        String name = "ChatCompletions " + request.model();

        Span span = tracer.spanBuilder(name).startSpan();
        Scope scope = span.makeCurrent();

        // TODO: implement

        return new ScopeHoldingOnRequestResult(request, scope);
    }

    @Override
    public void onResponse(ChatLanguageModelResponse chatLanguageModelResponse, ScopeHoldingOnRequestResult request) {
        // TODO: implement
        request.scope.close();
    }

    @Override
    public void onError(Throwable error, ChatLanguageModelResponse chatLanguageModelResponse,
                        ScopeHoldingOnRequestResult request) {
        // TODO: implement
        request.scope.close();
    }

    public static class ScopeHoldingOnRequestResult implements OnRequestResult<ChatLanguageModelRequest> {

        private final ChatLanguageModelRequest request;
        private final Scope scope;

        private ScopeHoldingOnRequestResult(ChatLanguageModelRequest request, Scope scope) {
            this.request = request;
            this.scope = scope;
        }

        @Override
        public ChatLanguageModelRequest request() {
            return request;
        }
    }
}

@brunobat commented

Will comment on the PR
