Use appropriate wait time for retry based on the error message. #14

Closed
ekzhu opened this issue May 11, 2023 · 7 comments · May be fixed by microsoft/FLAML#1045
Labels
enhancement New feature or request llm issues related to LLM

Comments

@ekzhu
Collaborator

ekzhu commented May 11, 2023

[flaml.autogen.oai.completion: 05-11 00:50:35] {217} INFO - retrying in 10 seconds...
Traceback (most recent call last):
  File "[...\.venv\Lib\site-packages\flaml\autogen\oai\completion.py]", line 193, in _get_response
    response = openai_completion.create(request_timeout=request_timeout, **config)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...\.venv\Lib\site-packages\openai\api_resources\completion.py]", line 25, in create
    return super().create(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...\.venv\Lib\site-packages\openai\api_resources\abstract\engine_api_resource.py]", line 153, in create
    response, _, api_key = requestor.request(
                           ^^^^^^^^^^^^^^^^^^
  File "[...\.venv\Lib\site-packages\openai\api_requestor.py]", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...\.venv\Lib\site-packages\openai\api_requestor.py]", line 620, in _interpret_response
    self._interpret_response_line(
  File "[...\.venv\Lib\site-packages\openai\api_requestor.py]", line 683, in _interpret_response_line
    raise self.handle_error_response(
openai.error.RateLimitError: Requests to the Completions_Create Operation under Azure OpenAI API version 2022-12-01 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 59 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.
[flaml.autogen.oai.completion: 05-11 00:50:45] {217} INFO - retrying in 10 seconds...

The error message says "Please retry after 59 seconds", but FLAML keeps retrying in 10-second intervals.
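For illustration, a minimal sketch (not FLAML code; the helper name and fallback are hypothetical) of how the suggested wait could be parsed out of the error message instead of using the fixed interval:

```python
import re

def retry_after_seconds(error_message: str, default: float = 10.0) -> float:
    # Both messages in this thread embed the hint as "retry after 59 seconds"
    # or "try again in 20s"; fall back to the fixed interval otherwise.
    match = re.search(
        r"(?:retry after|try again in)\s*(\d+)\s*s", error_message, re.IGNORECASE
    )
    return float(match.group(1)) if match else default
```

Sleeping for `retry_after_seconds(str(err))` instead of the fixed 10 seconds would then honor the server's hint.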

@sonichi
Collaborator

sonichi commented May 11, 2023

Thanks. It would be nice to adjust the retry time according to the error message.
One workaround for now: set flaml.oai.retry_time = 60 in your code, if 60 seconds is the most common retry time required.
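As a snippet, assuming the module-level attribute mentioned above:

```python
import flaml

# Workaround: raise the fixed retry interval to match the 59-second
# wait the Azure error message asks for.
flaml.oai.retry_time = 60
```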

@sonichi sonichi added enhancement New feature or request good first issue Good for newcomers labels May 11, 2023
@sonichi sonichi transferred this issue from microsoft/FLAML Sep 23, 2023
@Pavel-hb

I also get this error:


[autogen.oai.completion: 09-27 12:50:43] {236} INFO - retrying in 10 seconds...
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/autogen/oai/completion.py", line 206, in _get_response
    response = openai_completion.create(**config)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/abstract/engine_api_resource.py", line 155, in create
    response, _, api_key = requestor.request(
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 299, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 710, in _interpret_response
    self._interpret_response_line(
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 775, in _interpret_response_line
    raise self.handle_error_response(
openai.error.RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-6oXqS68sE8UL8MSONa3w2IpY on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.
INFO:autogen.oai.completion:retrying in 10 seconds...
(the same traceback and RateLimitError repeat identically on the next retry)

@RobertWeaver

Is there a configuration option to set the maximum number of requests that can be made per minute? That would help avoid hitting the requests-per-minute rate limit. 😅
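For example, something like this hypothetical client-side throttle (not an existing autogen option, just a sketch of what I mean):

```python
import time

class RequestThrottle:
    """Enforce a minimum gap between API calls so that at most
    `rpm` requests are issued per minute."""

    def __init__(self, rpm: int):
        self.min_interval = 60.0 / rpm
        self._last_call = 0.0

    def wait(self) -> None:
        # Sleep just long enough to respect the configured rate.
        sleep_for = self._last_call + self.min_interval - time.monotonic()
        if sleep_for > 0:
            time.sleep(sleep_for)
        self._last_call = time.monotonic()


throttle = RequestThrottle(rpm=3)  # the free-tier limit from the log above
# call throttle.wait() before each completion request
```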

@sonichi
Collaborator

sonichi commented Sep 30, 2023

@Pavel-hb @RobertWeaver I added some answers in #53. Could you take a look and let me know if they answer your questions?

@sonichi sonichi added llm issues related to LLM and removed good first issue Good for newcomers labels Oct 22, 2023
@raghavgrover13

raghavgrover13 commented Nov 22, 2023

I also observed the same behavior: the retry happens every 10 seconds, when it should be based on the time given in the error message.
Also, is there a way to dynamically reduce max tokens? I have seen instances with GPT-3.5 Turbo 16k where the output of one chat goes beyond the token limit. Should one calculate the token count of the system and user prompts and then set max tokens to the model's token limit minus the tokens already used by those prompts?
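Roughly what I mean, sketched with tiktoken (approximate, since the per-message overhead of the chat format is ignored; the function name and margin are just illustrative):

```python
import tiktoken

def dynamic_max_tokens(messages, model="gpt-3.5-turbo-16k",
                       model_limit=16385, margin=16):
    # Count tokens already spent on the system and user prompts, then
    # leave the remainder (minus a small safety margin) for the completion.
    enc = tiktoken.encoding_for_model(model)
    used = sum(len(enc.encode(m["content"])) for m in messages)
    return max(model_limit - used - margin, 0)
```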

@sonichi
Collaborator

sonichi commented Dec 3, 2023

The retry time behavior has been changed in v0.2. https://microsoft.github.io/autogen/docs/Installation#python
Dynamically reducing max tokens is an idea @kevin666aa may want to take note of.

@yiranwu0
Collaborator

yiranwu0 commented Dec 4, 2023

> The retry time behavior has been changed in v0.2. https://microsoft.github.io/autogen/docs/Installation#python Dynamically reducing max tokens is an idea @kevin666aa may want to take note of.

The "calculate the token count in system and user prompt" are already the features of CompressibeAgent.

I think "dynamically reducing max tokens" is not necessary to have. The similar thing is achieved by the "TERMINATE" mode of CompressibeAgent: when the token count is smaller than max token limit, the completion will be allowed. OpenAI will automatically return when max token limit of the model is reached. After that, the next completion will be terminated due to token count limit. So, no retry due to token limit will be happen.

@raghavgrover13 Can you checkout https://github.com/microsoft/autogen/blob/main/notebook/agentchat_compression.ipynb to see if it helps with your problem?
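For reference, a rough sketch of what using it could look like, based on that notebook (check the notebook for the exact compress_config options; config_list stands in for your usual LLM config):

```python
from autogen.agentchat.contrib.compressible_agent import CompressibleAgent

assistant = CompressibleAgent(
    name="assistant",
    llm_config={"config_list": config_list},
    # TERMINATE mode: stop instead of retrying once the context limit is hit.
    compress_config={"mode": "TERMINATE"},
)
```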

@ekzhu ekzhu closed this as not planned (won't fix, can't repro, duplicate, stale) Mar 28, 2024