Cannot run HF example codes on all three codeLlama-Python-hf models #166

Open
jessyford opened this issue Nov 20, 2023 · 1 comment

jessyford commented Nov 20, 2023

I understand this might be a Hugging Face-related problem, but I cannot find the answer anywhere, so I am asking for help here.

The Hugging Face page gives the following example code for the Code Llama model:

from transformers import LlamaForCausalLM, CodeLlamaTokenizer

# Load the tokenizer and the base (non-Python, non-Instruct) 7B checkpoint.
tokenizer = CodeLlamaTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = LlamaForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")

# <FILL_ME> marks the span the model should fill in.
PROMPT = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''
input_ids = tokenizer(PROMPT, return_tensors="pt")["input_ids"]
generated_ids = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens and splice them into the prompt.
filling = tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
print(PROMPT.replace("<FILL_ME>", filling))

The output of this example looks like this:

def remove_non_ascii(s: str) -> str:
    """ Remove non-ASCII characters from a string.

    Args:
        s: The string to remove non-ASCII characters from.

    Returns:
        The string with non-ASCII characters removed.
    """
    result = ""
    for c in s:
        if ord(c) < 128:
            result += c
    return result
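
For context on what `<FILL_ME>` does in the prompt: the Code Llama tokenizer appears to treat it as a split point, cutting the prompt into a prefix and a suffix and wrapping them in dedicated infilling tokens before generation. The sketch below only imitates that split on plain strings. `split_on_fill` and `build_infill_prompt` are hypothetical helpers, not library functions, and the `<PRE>`, `<SUF>`, `<MID>` spellings and spacing are assumptions about the infilling layout rather than something confirmed in this thread.

```python
from typing import Tuple

FILL_TOKEN = "<FILL_ME>"


def split_on_fill(prompt: str) -> Tuple[str, str]:
    """Split a prompt into (prefix, suffix) around a single fill marker."""
    if prompt.count(FILL_TOKEN) != 1:
        raise ValueError("expected exactly one <FILL_ME> marker")
    prefix, suffix = prompt.split(FILL_TOKEN)
    return prefix, suffix


def build_infill_prompt(prompt: str) -> str:
    """Hypothetical helper: lay out prefix and suffix with sentinel markers."""
    prefix, suffix = split_on_fill(prompt)
    # Assumed layout; a real tokenizer inserts special token ids, not text.
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"


PROMPT = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''
prefix, suffix = split_on_fill(PROMPT)
print(repr(prefix))  # everything up to and including the opening docstring quotes
print(repr(suffix))  # the code that must follow the filled-in span
```

The point of the sketch is that infilling relies on extra sentinel tokens around the prefix and suffix, so a model has to have seen those tokens during training for the prompt to mean anything to it.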

This works fine with all of the original Code Llama models and the Code Llama Instruct models. However, all three CodeLlama-Python models print a large number of "Assertion srcIndex < srcSelectDimSize failed" errors and fail to finish running.
The second strange thing is that if I delete the <FILL_ME> part of the PROMPT when using a CodeLlama-Python model, the error no longer appears, but there is still no output.

So my questions are:

  1. Why do these "Assertion srcIndex < srcSelectDimSize failed" errors happen with CodeLlama-Python, and why is there no output after I delete <FILL_ME> from the PROMPT? As I understand it, CodeLlama-Python is just further trained on Python tasks, so it should not be fundamentally different from the original Code Llama and Code Llama Instruct models.

  2. Why does the README on the Hugging Face page say that CodeLlama-Python cannot do infilling? Why would specializing on Python tasks make the model unable to do infilling? Is the problem in question 1 related to CodeLlama-Python's lack of infilling support?

Thank you so much for your precious time.

humza-sami commented Dec 24, 2023

As far as I know, CodeLlama-Python is not meant for infilling. Please refer to its documentation.

This model is not fine-tuned on an infilling dataset. It is fine-tuned only on a next-token-prediction dataset.
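
A possible way to connect this to the assertion in the question: `srcIndex < srcSelectDimSize` is a bounds check in PyTorch's CUDA index-select code, and it commonly fails when a token id is at least as large as the model's embedding table. If the tokenizer emits infilling token ids that a Python checkpoint's embedding table does not contain, that would produce this kind of error. This is an assumption, not something confirmed in this thread. A minimal sketch of the check, where `out_of_range_ids` is a hypothetical helper and the ids and table size are made up for illustration:

```python
from typing import List


def out_of_range_ids(input_ids: List[int], num_embeddings: int) -> List[int]:
    """Return the token ids that an embedding table of this size cannot look up."""
    return [i for i in input_ids if i < 0 or i >= num_embeddings]


# Illustrative numbers only (assumptions, not read from a real checkpoint):
# a 32000-row embedding table and a prompt that contains three larger ids.
example_ids = [1, 822, 32007, 3639, 32008, 32009]
bad = out_of_range_ids(example_ids, num_embeddings=32000)
print(bad)  # [32007, 32008, 32009]
```

With the real objects from the example code, the inputs would be `input_ids[0].tolist()` and `model.get_input_embeddings().num_embeddings`. A non-empty result would point at the tokenizer and checkpoint disagreeing about the vocabulary. Running the same generation on CPU usually replaces the CUDA assertion with a more readable index error.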
