how to generate labels for FUYU #305

Open
aamir-gmail opened this issue Nov 11, 2023 · 1 comment

Comments

@aamir-gmail

In models/fuyu/processing_fuyu.py, what is the purpose of the special_token_id argument of the get_labels method, and how do I obtain it?
For example, my prompt is "Extract text from this image". I pass the image and this text to the Fuyu processor and get input_ids back, but I am not sure how to derive labels from input_ids with the method above.

@Luodian
Owner

Luodian commented Feb 1, 2024

The special_token_id comes from Fuyu's design: it is the id of the \x04 token, which is used to separate questions from answers.

If I remember correctly, Fuyu's template is:
"{question}\n\x04{answer}\x04".

Our template is:
"User:{question} Assistant:\x04{answer}\x04".

We also use it to locate the answer's position, since during training the labels must be masked everywhere except the {answer}.
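To make the two templates concrete, here is a minimal sketch that builds both prompt strings and recovers the answer span from the separators. The helper names (`fuyu_prompt`, `otter_prompt`, `answer_span`) are hypothetical and only for illustration; they are not part of the repository.

```python
SEP = "\x04"  # separator character described above


def fuyu_prompt(question: str, answer: str) -> str:
    """Fuyu-style template as recalled above (hypothetical helper)."""
    return f"{question}\n{SEP}{answer}{SEP}"


def otter_prompt(question: str, answer: str) -> str:
    """Template used in this project, as quoted above (hypothetical helper)."""
    return f"User:{question} Assistant:{SEP}{answer}{SEP}"


def answer_span(text: str) -> str:
    """Return the text between the first pair of separators."""
    start = text.index(SEP) + 1
    end = text.index(SEP, start)
    return text[start:end]


example = otter_prompt("Extract text from this image", "Hello world")
print(repr(example))
print(answer_span(example))
```

The same idea carries over to token ids: once the text is tokenized, the id of the separator token marks where the answer starts and ends.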

The code is here:

# src/otter_ai/models/fuyu/processing_fuyu.py
import torch

def get_labels(self, input_ids, special_token_id, masking_number=-100):
    # Initialize the labels tensor filled with masking_number (ignored by the loss)
    labels = torch.full_like(input_ids, masking_number)

    # Iterate through each sequence in the batch
    for i in range(input_ids.shape[0]):
        seq = input_ids[i]
        # Find the positions of special_token_id; flatten() keeps this 1-D
        # even when there is only a single match
        indices = (seq == special_token_id).nonzero(as_tuple=False).flatten()
        # Pair the positions (open, close); this expects an even, non-zero count
        paired_indices = indices.reshape(-1, 2)
        for start, end in paired_indices:
            # Unmask the answer tokens and the closing special token
            labels[i, start + 1 : end + 1] = seq[start + 1 : end + 1]

    return labels
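
As a rough usage sketch, the snippet below runs a standalone copy of the function on a toy batch and checks that the masked positions do not contribute to the loss. Everything here is an assumption for illustration: the token ids are made up, `SEP_ID = 4` is a hypothetical id (in a real setup the id of the \x04 token would be looked up in the tokenizer's vocabulary), and the random logits stand in for model output. The one-position shift that causal language models apply between logits and labels is left out for brevity.

```python
import torch
import torch.nn.functional as F


def get_labels(input_ids, special_token_id, masking_number=-100):
    """Standalone copy of the method above, without `self`."""
    labels = torch.full_like(input_ids, masking_number)
    for i in range(input_ids.shape[0]):
        seq = input_ids[i]
        indices = (seq == special_token_id).nonzero(as_tuple=False).flatten()
        for start, end in indices.reshape(-1, 2):
            labels[i, start + 1 : end + 1] = seq[start + 1 : end + 1]
    return labels


SEP_ID = 4  # hypothetical id of the \x04 token
PAD_ID = 0  # hypothetical padding id

# Two toy sequences: prompt tokens, SEP, answer tokens, SEP, padding
input_ids = torch.tensor([
    [11, 12, 13, SEP_ID, 21, 22, SEP_ID, PAD_ID],
    [11, 12, SEP_ID, 31, 32, 33, SEP_ID, PAD_ID],
])

labels = get_labels(input_ids, SEP_ID)
print(labels)

# Random logits stand in for model output: (batch, seq_len, vocab_size)
torch.manual_seed(0)
vocab_size = 40
logits = torch.randn(2, 8, vocab_size)

flat_logits = logits.view(-1, vocab_size)
flat_labels = labels.view(-1)

# Positions labelled -100 are skipped by cross_entropy via ignore_index
loss = F.cross_entropy(flat_logits, flat_labels, ignore_index=-100)

# Same loss computed explicitly on the unmasked (answer) positions only
keep = flat_labels != -100
loss_answer_only = F.cross_entropy(flat_logits[keep], flat_labels[keep])
print(torch.allclose(loss, loss_answer_only))
```

Only the answer tokens and the closing separator keep their ids in `labels`; the prompt, the opening separator, and the padding stay at -100 and are ignored by the loss.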
