[Bug]: Using the new GPT-4o as the LLM sometimes makes Pythagora spew out an endless loop, which stops the next agent from being able to do anything #924

Closed
MathieuDawes opened this issue May 15, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@MathieuDawes

Version

VisualStudio Code extension

Operating System

MacOS

What happened?

When I run Pythagora with GPT-4o, it produces very long JSON outputs and keeps repeating the code for each step over and over. With gpt-4-turbo, it doesn't repeat the code unnecessarily in the JSON instructions.
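
For anyone trying to confirm the repetition, here is a minimal sketch that counts duplicated code across steps in a JSON response. The response shape (a top-level `steps` list whose items carry a `code` string) and the function name `repeated_code_steps` are assumptions made for illustration; they are not taken from Pythagora's actual output format.

```python
import json
from collections import Counter


def repeated_code_steps(raw_response: str, min_repeats: int = 2) -> dict[str, int]:
    """Return code snippets that occur at least `min_repeats` times across steps.

    Assumes a hypothetical response shape: {"steps": [{"code": "..."}, ...]}.
    """
    steps = json.loads(raw_response).get("steps", [])
    # Collapse whitespace so trivially reformatted copies still count as repeats.
    counts = Counter(" ".join(step.get("code", "").split()) for step in steps)
    # Ignore steps without code and keep only snippets seen often enough.
    return {code: n for code, n in counts.items() if code and n >= min_repeats}
```

Running this over the raw output from both models would give a rough count of how often identical code is emitted; the field names need adjusting to whatever the real response contains.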

@MathieuDawes MathieuDawes added the bug Something isn't working label May 15, 2024
@RKonnerth-A

I have the same bug. Or is it a feature for creating a vector knowledge base for the project?

@senko
Collaborator

senko commented Jun 4, 2024

Thanks for reporting this @MathieuDawes and @RKonnerth-A.

We've had a lot of problems with GPT-4o. In our use cases it produced much worse results than gpt-4-turbo. We also compared gpt-4-turbo with gpt-4-turbo-preview and found that gpt-4-turbo-preview works best.

Currently it's unclear whether that's due to any optimizations OpenAI did to make gpt-4o run faster/cheaper (as we're at the bleeding/expensive edge of AI capabilities, any such change might have an outsized impact on us), or whether our prompts are too fine-tuned (pun intended!) for gpt-4-turbo-preview, or a combination of the two.

We're currently doing a deeper analysis but for now recommend using gpt-4-turbo-preview for the best results.

@senko senko closed this as completed Jun 4, 2024