bug: AzureChatOpenAI model name not recorded correctly #2029

Open
arthurGrigo opened this issue May 9, 2024 · 8 comments
@arthurGrigo

arthurGrigo commented May 9, 2024

Describe the bug

The 'Total Cost' in Langfuse shows a lower value than langchain's get_openai_callback() returns.

Tested with Azure's GPT API. Not sure if it's the same with OpenAI's API.

langchain's get_openai_callback(): Total Cost (USD): $0.0010149999999999998
langfuse UI: Total Cost: $0.0008

This discrepancy becomes more obvious with complex chains.
I have already seen differences of up to 3x.

To reproduce

Requirements:

langfuse version: 2.27.0

Python Version: 3.11.9 | packaged by Anaconda, Inc. | (main, Apr 19 2024, 16:40:41) [MSC v.1916 64 bit (AMD64)]

Package Information

langchain_core: 0.1.46
langchain: 0.1.16
langchain_community: 0.0.34
langsmith: 0.1.51
langchain_experimental: 0.0.57
langchain_openai: 0.1.4
langchain_text_splitters: 0.0.1
langgraph: 0.0.39


Code to reproduce:

import os

AZURE_API_BASE_GPT = os.getenv('AZURE_OPENAI_ENDPOINT_GPT_3_5')
AZURE_API_KEY_GPT = os.getenv('AZURE_OPENAI_KEY_GPT_3_5')
DEPLOYMENT_NAME_GPT = os.getenv('GPT_DEPLOYMENT_NAME_GPT_3_5')

gpt_parameter_temperature = 0.0

openai_api_type = "azure"
openai_api_version = "2023-05-15"

from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    azure_endpoint=AZURE_API_BASE_GPT,
    openai_api_key=AZURE_API_KEY_GPT,
    azure_deployment=DEPLOYMENT_NAME_GPT,
    temperature=gpt_parameter_temperature,
    openai_api_version=openai_api_version,
    openai_api_type=openai_api_type,
    callbacks=[],
    verbose=True,
)

# e.g. "gpt-35-turbo-0613"
deployment_name = DEPLOYMENT_NAME_GPT

enable_tracing_langfuse = True

LANGFUSE_HOST = os.getenv('LANGFUSE_HOST', "http://localhost:3000")
LANGFUSE_PUBLIC_KEY = os.getenv('LANGFUSE_PUBLIC_KEY', "pk-...")
LANGFUSE_SECRET_KEY = os.getenv('LANGFUSE_SECRET_KEY', "sk-...")


callbacks = []

if enable_tracing_langfuse:
    print("enable_tracing_langfuse")
    from langfuse.callback import CallbackHandler
    callback_langfuse = CallbackHandler(
        public_key=LANGFUSE_PUBLIC_KEY,
        secret_key=LANGFUSE_SECRET_KEY,
        host=LANGFUSE_HOST,
    )

    callbacks.append(callback_langfuse)

from langchain_core.runnables.config import RunnableConfig

MAX_CONCURRENCY = 2

runnable_conf = RunnableConfig(
    max_concurrency=MAX_CONCURRENCY,
    run_name="my_langfuse_experiment",
    callbacks=callbacks,
    tags=[
        deployment_name,
        f"temp={gpt_parameter_temperature}",
    ],
)

from langchain_core.runnables import RunnableParallel
from langchain_core.output_parsers.string import StrOutputParser
from langchain_community.callbacks import get_openai_callback
from langchain.prompts import ChatPromptTemplate

str_prsr = StrOutputParser()

prompt = ChatPromptTemplate.from_template("Write a sentence with 200 short words about {thing}.")

# Makes no difference if parallel or not ...
# chain = RunnableParallel(
#     run_1=prompt | llm | str_prsr,
#     run_2=prompt | llm | str_prsr,
#     run_3=prompt | llm | str_prsr,
#     run_4=prompt | llm | str_prsr,
# )

chain = prompt | llm | str_prsr

user_input = {"thing": "apples"}

with get_openai_callback() as cb:
    sentences = chain.invoke(user_input, config=runnable_conf)

print(cb)

# Tokens Used: 512
#	Prompt Tokens: 18
#	Completion Tokens: 494
# Successful Requests: 1
# Total Cost (USD): $0.0010149999999999998



# The langfuse UI shows the same token usage information but the 'Total Cost' is different 
# In langfuse 'Total Cost' = $0.0008

# This discrepancy becomes more obvious with complex chains.
# I have already seen differences of up to 3x.
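
For context on where the gap could come from: the same token counts priced against two different model entries reproduce both numbers. A minimal worked example follows, assuming the published gpt-3.5-turbo-0613 prices that get_openai_callback applies and a hypothetical cheaper entry that a bare 'gpt-35-turbo' name might match (the actual Langfuse price table is an assumption here):

# Hedged illustration: price the observed usage (18 prompt / 494 completion tokens)
# under two different price entries.
prompt_tokens, completion_tokens = 18, 494

# gpt-3.5-turbo-0613: $0.0015 / 1K prompt tokens, $0.0020 / 1K completion tokens
cost_0613 = prompt_tokens * 0.0015 / 1000 + completion_tokens * 0.0020 / 1000
print(cost_0613)  # ~0.001015 -> matches get_openai_callback()

# hypothetical cheaper entry, e.g. $0.0005 / 1K prompt, $0.0015 / 1K completion
cost_other = prompt_tokens * 0.0005 / 1000 + completion_tokens * 0.0015 / 1000
print(cost_other)  # ~0.00075 -> in the ballpark of the $0.0008 shown in the Langfuse UI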

SDK and container versions

No response

Additional information

No response

Are you interested in contributing a fix for this bug?

Yes

@marcklingen
Member

Thanks for reporting this. Are the token counts correct in Langfuse? What model name gets recorded?

@arthurGrigo
Author

Thanks for reporting this. Are the token counts correct in Langfuse? What model name gets recorded?

I think you are on the right track!
In the returned completion object, the model_name is 'gpt-35-turbo'.
Shouldn't this be 'gpt-35-turbo-0613'?

@marcklingen
Member

There are usually two model names available:

  1. the one used to create the request
  2. the one included in the response

Usually the response name is more specific, as the request does not need to include the model version.

Here the opposite seems to be the case. We'll need to have a look and add some tests to CI.

Are the token counts correct?
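
A minimal sketch, reusing the llm instance from the repro above, of how to surface both names; the exact response_metadata fields depend on the langchain-core / langchain-openai versions in use, so treat this as illustrative rather than canonical:

# Hedged sketch: compare the name used for the request (the Azure deployment)
# with the model name reported back in the response metadata.
result = llm.invoke("Say hi.")

print("request/deployment name:", llm.deployment_name)  # e.g. "gpt-35-turbo-0613"
print("response model name:", result.response_metadata.get("model_name"))  # e.g. "gpt-35-turbo"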

@arthurGrigo
Author

There are usually two model names available:

  1. the one used to create the request
  2. the one included in the response

Usually the response name is more specific, as the request does not need to include the model version.

Here the opposite seems to be the case. We'll need to have a look and add some tests to CI.

Are the token counts correct?

Yes, token counts are the same.
Thanks for your quick reply!

@marcklingen
Member

Perfect, thanks for confirming.

@marcklingen marcklingen changed the title bug: Total Cost different from langchains get_openai_callback() - LCEL with AzureChatOpenAI bug: AzureChatOpenAI model name not recorded correctly May 9, 2024
@arthurGrigo
Author

arthurGrigo commented May 20, 2024

Note how the request body says "model": "gpt-3.5-turbo" even though I use GPT-4 in this example. Azure knows which model to use because I created a deployment in Azure and associated it with "gpt4-1106-preview".

When using Azure, you have to create a named deployment and assign a model to it. When calling the API, the URL holds the deployment name. You could try to parse the model name from the deployment name in the URL, which would make it easy for the user. If a deployment has an arbitrary name that doesn't contain the model name, the user would need to provide a dict that maps deployment names to model names (see the sketch below).

Ideally, check how langchain does it in get_openai_callback().
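
As a rough illustration of that idea (not the Langfuse implementation; the mapping dict and helper name are hypothetical), here is a sketch that extracts the deployment name from the Azure request URL and maps it to a billable model name:

# Hypothetical helper: resolve a model name for pricing from the Azure
# deployment name embedded in the request URL
# (/openai/deployments/<deployment>/chat/completions).
DEPLOYMENT_TO_MODEL = {
    # maintained by the user whenever the deployment name doesn't contain the model name
    "gpt4-1106-preview": "gpt-4-1106-preview",
    "gpt-35-turbo-0613": "gpt-3.5-turbo-0613",
}

def resolve_model_name(request_url: str, fallback: str) -> str:
    parts = request_url.split("/")
    if "deployments" in parts:
        deployment = parts[parts.index("deployments") + 1]
        return DEPLOYMENT_TO_MODEL.get(deployment, fallback)
    return fallback

url = "https://xyz.openai.azure.com/openai/deployments/gpt4-1106-preview/chat/completions"
print(resolve_model_name(url, fallback="gpt-3.5-turbo"))  # -> "gpt-4-1106-preview"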

Azure API call example:

{
    "body": null,
    "code": null,
    "type": null,
    "param": null,
    "message": "Connection error.",
    "request": {
        "url": {
            "_uri_reference": [
                "https",
                "",
                "xyz.openai.azure.com",
                null,
                "/openai/deployments/gpt4-1106-preview/chat/completions",
                "api-version=2023-05-15",
                null
            ]
        },
        "method": "POST",
        "stream": {
            "_stream": {
                "messages": [
                    {
                        "content": "...",
                        "role": "user"
                    }
                ],
                "model": "gpt-3.5-turbo",
                "n": 1,
                "stream": false,
                "temperature": 1
            }
        },
        "headers": {},
        "_content": {
            "messages": [
                {
                    "content": "...",
                    "role": "user"
                }
            ],
            "model": "gpt-3.5-turbo",
            "n": 1,
            "stream": false,
            "temperature": 1
        },
        "extensions": {
            "timeout": {
                "pool": null,
                "read": null,
                "write": null,
                "connect": null
            }
        }
    }
}

@sbhadana

sbhadana commented Jun 3, 2024

Same for me: using gpt-4-32k-0613 from Azure OpenAI, but Langfuse reports gpt-3.5-turbo on the dashboard.

@marcklingen
Member

Same for me: using gpt-4-32k-0613 from Azure OpenAI, but Langfuse reports gpt-3.5-turbo on the dashboard.

Are you on the latest SDK version?
