AI Model for complex structures #735

Open
3 tasks done
pietz opened this issue Jan 12, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@pietz

pietz commented Jan 12, 2024

First check

  • I added a descriptive title to this issue.
  • I used the GitHub search to look for a similar issue and didn't find it.
  • I searched the Marvin documentation for this feature.

Describe the current behavior

The AI model works well for fairly simple data schemas, but as complexity grows and output documents get longer, it starts to fall apart.

The first stage I observed is that it starts "quoting" content from the original document: changing date formats or summarizing text no longer works because the model resorts to copy & paste. The second stage, when things get even more complex, is that you start getting validation errors and the model becomes unusable.

The output length limit is also a problem in some of my use cases: generating a list of 50 elements might exceed it. I started experimenting with breaking complex or long models apart into multiple smaller and simpler models. This works so well that I was able to convert all my GPT-4 use cases to GPT-3.5.

I would like to suggest a change that handles recursive AI model calls. At the moment you mark your top-level model as the AI model with a decorator, which processes all nested pydantic models in a single call. I think it would be great if I could apply the same decorator to some of those nested models, in which case marvin would split its LLM call into multiple calls.

That way you could dynamically experiment with different levels of complexity until the results are to your liking. Furthermore, these calls could be executed asynchronously, which would greatly reduce response time.

Describe the proposed behavior

import marvin
from pydantic import BaseModel, Field
from typing import Optional

class Job(BaseModel):
    """A Job or position extracted from the resume"""
    position: str = Field(..., description="Name of the position")
    company: str = Field(..., description="Company name")
    start_date: Optional[str] = Field(None, description="Start date of the job")
    end_date: Optional[str] = Field(None, description="End date of the job or 'Present'")
    top_keywords: Optional[list[str]] = Field(None, description="List of max. top 10 keywords, skills and technologies used for the job")

class Degree(BaseModel):
    """Degree or other type of education extracted from the resume"""
    name: str = Field(..., description="Name of the degree and field of study")
    institution: Optional[str] = Field(None, description="University name")
    start_date: Optional[str] = Field(None, description="Start date of the studies")
    end_date: Optional[str] = Field(None, description="End date of the studies")

@marvin.ai_model(client=client, temperature=0.0)  # client: a pre-configured LLM client
class Resume(BaseModel):
    """Resume data extracted from the resume"""
    name: str = Field(..., description="The name of the person")
    email: Optional[str] = Field(None, description="Email address of the person")
    phone: Optional[str] = Field(None, description="Phone number of the person")
    location: Optional[str] = Field(None, description="Current residence of the person")
    websites: Optional[str] = Field(None, description="Websites like LinkedIn, GitHub, Behance, etc.")
    work_experience: Optional[list[Job]] = None
    education: Optional[list[Degree]] = None
    skills: Optional[list[str]] = Field(None, description="List of core skills and technologies")
    languages: Optional[list[str]] = Field(None, description="List of languages spoken by the person")

This model would likely be too complex to work well in the real world. What if we could apply the ai_model decorator to the other two models as well, so that calling the final Resume model makes marvin split the request into three pieces?
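To make the idea concrete, here is a minimal sketch of how the call planning could work. Everything here is illustrative: `ai_model` is a stand-in registry decorator and `plan_calls` a hypothetical helper, neither is part of marvin's API.

```python
from typing import get_args, get_origin, get_type_hints

AI_MODELS: set[type] = set()

def ai_model(cls: type) -> type:
    """Stand-in decorator: register a class as one that gets its own LLM call."""
    AI_MODELS.add(cls)
    return cls

def plan_calls(cls: type) -> list[type]:
    """Return the extraction order: decorated sub-models first, the parent last."""
    calls: list[type] = []
    for ann in get_type_hints(cls).values():
        # unwrap list[X] to X before checking the registry
        target = get_args(ann)[0] if get_origin(ann) is list else ann
        if isinstance(target, type) and target in AI_MODELS:
            calls.extend(plan_calls(target))
    calls.append(cls)
    return calls
```

Applied to the Resume example above, this would yield three calls: one for the jobs, one for the degrees, and one for the remaining top-level fields.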

Example Use

No response

Additional context

What do you think?

@pietz pietz added the enhancement New feature or request label Jan 12, 2024
@zzstoatzz
Collaborator

zzstoatzz commented Jan 17, 2024

hi @pietz

> taking complex or long models apart in multiple smaller and simpler models

I have also found a lot of success with this strategy

this is an interesting idea

> handle recursive AI model calls

on first thought I'm inclined to say that I would want to write a util myself that implements something like a recursive pattern to fill out a parent and then child models. I'd be interested to see any sketches you have on implementation!
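One way such a util could look, as a rough sketch: `call_llm` is a hypothetical stand-in for a single extraction call (cls, source text, field names to fill), and nothing below is actual marvin API.

```python
from typing import Callable, get_type_hints

def fill(cls: type, text: str, call_llm: Callable, sub_models: set[type]) -> dict:
    """Recursively fill a model: registered child models each get their own
    extraction call; the parent call only covers the remaining fields."""
    result: dict = {}
    for name, ann in get_type_hints(cls).items():
        if ann in sub_models:
            result[name] = fill(ann, text, call_llm, sub_models)
    remaining = [n for n in get_type_hints(cls) if n not in result]
    # one LLM call for the parent's own fields
    result.update(call_llm(cls, text, remaining))
    return result
```

The child calls are independent of each other, which is also where the async speedup mentioned above would come from.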

@pietz
Author

pietz commented Jan 21, 2024

The cleanest API from a user perspective would be: if a "sub model" subclasses marvin.Model instead of plain BaseModel, the call is split into multiple recursive calls.

Consider this example:

class Address(BaseModel):
    street: str
    zip: str
    state: str

class User(marvin.Model):
    name: str
    email: str
    address: Address

This would lead to a single call, as implemented at the moment. However...

class Address(marvin.Model):
    street: str
    zip: str
    state: str

class User(marvin.Model):
    name: str
    email: str
    address: Address

...would lead to two calls. Address would be called standalone first, since it consists only of non-BaseModel field types. It would return the parsed address, which is assigned to the address field of User. Next, the User model would be called without address, because that value already exists.
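The dispatch described above could be sketched like this, with `AIModel` standing in for marvin.Model and `needs_split` a hypothetical helper (neither exists in marvin as shown):

```python
from typing import get_type_hints

class AIModel:
    """Stand-in marker base class for marvin.Model (an assumption)."""

def needs_split(cls: type) -> bool:
    """True if any field is itself an AIModel subclass, meaning the
    extraction should be split into multiple recursive calls."""
    return any(
        isinstance(ann, type) and issubclass(ann, AIModel)
        for ann in get_type_hints(cls).values()
    )
```

For the first example (Address as plain BaseModel) this returns False and a single call is made; for the second (Address as marvin.Model) it returns True and Address is extracted first.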

What I'm asking for is a major piece of functionality, and with it comes the idea of async calls, which I will open a separate issue for.

What do you think?

@zzstoatzz
Collaborator

zzstoatzz commented Jan 30, 2024

i like this idea! I have a rough sense of how we can do this and I can explore some implementations soon

@pietz
Author

pietz commented Jan 31, 2024

@zzstoatzz What are some ways of helping Marvin development at the moment? I'm using this library a lot and I would like to contribute instead of asking for features all the time.

@zzstoatzz
Collaborator

we are open to all types of contributions - feel free to open a draft PR that achieves what you want to see and then we can chat on that!
