
generate.json() gives ValidationError when run with mistral-7b-instruct-v0.2.Q6_K.gguf #837

Open · DorotaBjoorn opened this issue Apr 25, 2024 · 8 comments
Labels: correctness (Everything related to the generation correctness), JSON

Comments

@DorotaBjoorn commented Apr 25, 2024

Describe the issue as clearly as possible:

Example code with Pydantic and generate.json() throws a ValidationError.
The code is run from a Jupyter notebook.
The output is OK if age: int is removed from the Pydantic class.

Steps/code to reproduce the bug:

from outlines import models, generate
from llama_cpp import Llama

llm = Llama("/models/mistral-7b-instruct-v0.2.Q6_K.gguf", n_gpu_layers=10, n_ctx=0, verbose=False)
model = models.LlamaCpp(llm) 

from pydantic import BaseModel, Field
class User(BaseModel):
    first_name: str
    last_name: str
    age: int

generator = generate.json(model, User, whitespace_pattern="")

result = generator(
    """Based on user information create a user profile with the fields first_name, last_name, age.
    User information is: Jane Doe age=10"""
)

print(result)

Expected result:

User(first_name="Jane", last_name="Doe", age=10)

Error message:

JSONDecodeError                           Traceback (most recent call last)
File ~/LLM-diploma-project/venv/lib/python3.10/site-packages/pydantic/main.py:1097, in BaseModel.parse_raw(cls, b, content_type, encoding, proto, allow_pickle)
   1096 try:
-> 1097     obj = parse.load_str_bytes(
   1098         b,
   1099         proto=proto,
   1100         content_type=content_type,
   1101         encoding=encoding,
   1102         allow_pickle=allow_pickle,
   1103     )
   1104 except (ValueError, TypeError) as exc:

File ~/LLM-diploma-project/venv/lib/python3.10/site-packages/pydantic/deprecated/parse.py:49, in load_str_bytes(b, content_type, encoding, proto, allow_pickle, json_loads)
     48         b = b.decode(encoding)
---> 49     return json_loads(b)  # type: ignore
     50 elif proto == Protocol.pickle:

File /usr/lib/python3.10/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:

File /usr/lib/python3.10/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
    333 """Return the Python representation of ``s`` (a ``str`` instance
    334 containing a JSON document).
    335 
    336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338 end = _w(s, end).end()

File /usr/lib/python3.10/json/decoder.py:353, in JSONDecoder.raw_decode(self, s, idx)
    352 try:
--> 353     obj, end = self.scan_once(s, idx)
    354 except StopIteration as err:

JSONDecodeError: Unterminated string starting at: line 1 column 34 (char 33)

During handling of the above exception, another exception occurred:

ValidationError                           Traceback (most recent call last)
Cell In[8], line 9
      5     age: int
      7 generator = generate.json(model, User, whitespace_pattern="")
----> 9 result = generator(
     10     """Based on user information create a user profile with the fields first_name, last_name, age.
     11     User information is: Jane Doe age=10"""
     12 )
     14 print(result)

File ~/LLM-diploma-project/venv/lib/python3.10/site-packages/outlines/generate/api.py:501, in SequenceGeneratorAdapter.__call__(self, prompts, max_tokens, stop_at, seed, **model_specific_params)
    489 generation_params = self.prepare_generation_parameters(
    490     max_tokens, stop_at, seed
    491 )
    493 completions = self.model.generate(
    494     prompts,
    495     generation_params,
   (...)
    498     **model_specific_params,
    499 )
--> 501 return format(completions)

File ~/LLM-diploma-project/venv/lib/python3.10/site-packages/outlines/generate/api.py:487, in SequenceGeneratorAdapter.__call__.<locals>.format(sequences)
    485     return [format(sequence) for sequence in sequences]
    486 else:
--> 487     return self.format_sequence(sequences)

File ~/LLM-diploma-project/venv/lib/python3.10/site-packages/outlines/generate/json.py:50, in json.<locals>.<lambda>(x)
     48     regex_str = build_regex_from_schema(schema, whitespace_pattern)
     49     generator = regex(model, regex_str, sampler)
---> 50     generator.format_sequence = lambda x: schema_object.parse_raw(x)
     51 elif callable(schema_object):
     52     schema = pyjson.dumps(get_schema_from_signature(schema_object))

File ~/LLM-diploma-project/venv/lib/python3.10/site-packages/pydantic/main.py:1124, in BaseModel.parse_raw(cls, b, content_type, encoding, proto, allow_pickle)
   1117     # ctx is missing here, but since we've added `input` to the error, we're not pretending it's the same
   1118     error: pydantic_core.InitErrorDetails = {
   1119         # The type: ignore on the next line is to ignore the requirement of LiteralString
   1120         'type': pydantic_core.PydanticCustomError(type_str, str(exc)),  # type: ignore
   1121         'loc': ('__root__',),
   1122         'input': b,
   1123     }
-> 1124     raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
   1125 return cls.model_validate(obj)

ValidationError: 1 validation error for User
__root__
  Unterminated string starting at: line 1 column 34 (char 33) [type=value_error.jsondecode, input_value='{"first_name":"Jane","last_name":"Doe', input_type=str]

Outlines/Python version information:

Outlines 0.0.40
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]

accelerate==0.28.0
aiohttp==3.9.3
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.3.0
asttokens==2.4.1
astunparse==1.6.3
async-timeout==4.0.3
attrs==23.2.0
beautifulsoup4==4.12.3
certifi==2024.2.2
charset-normalizer==3.3.2
ci-info==0.3.0
click==8.1.7
cloudpickle==3.0.0
comm==0.2.2
configobj==5.0.8
configparser==6.0.1
contourpy==1.2.1
cycler==0.12.1
dataclasses-json==0.6.4
datasets==2.18.0
debugpy==1.8.1
decorator==5.1.1
Deprecated==1.2.14
dill==0.3.8
dirtyjson==1.0.8
diskcache==5.6.3
distro==1.9.0
etelemetry==0.3.1
evaluate==0.4.1
exceptiongroup==1.2.0
executing==2.0.1
fastapi==0.110.1
filelock==3.13.3
fonttools==4.51.0
frozenlist==1.4.1
fsspec==2024.2.0
greenlet==3.0.3
guidance==0.1.13
h11==0.14.0
httpcore==1.0.5
httplib2==0.22.0
httpx==0.27.0
huggingface-hub==0.20.3
idna==3.6
interegular==0.3.3
ipykernel==6.29.4
ipython==8.23.0
isodate==0.6.1
jedi==0.19.1
Jinja2==3.1.3
joblib==1.3.2
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter_client==8.6.1
jupyter_core==5.7.2
kiwisolver==1.4.5
lark==1.1.9
llama-index==0.10.26
llama-index-agent-openai==0.2.1
llama-index-cli==0.1.11
llama-index-core==0.10.26
llama-index-embeddings-huggingface==0.2.0
llama-index-embeddings-openai==0.1.7
llama-index-extractors-entity==0.1.2
llama-index-indices-managed-llama-cloud==0.1.5
llama-index-legacy==0.9.48
llama-index-llms-huggingface==0.1.4
llama-index-llms-llama-cpp==0.1.3
llama-index-llms-openai==0.1.14
llama-index-multi-modal-llms-openai==0.1.4
llama-index-program-guidance==0.1.2
llama-index-program-openai==0.1.5
llama-index-question-gen-openai==0.1.3
llama-index-readers-file==0.1.13
llama-index-readers-llama-parse==0.1.4
llama-parse==0.4.0
llama_cpp_python==0.2.62
llamaindex-py-client==0.1.15
llvmlite==0.42.0
lmql==0.7.3
looseversion==1.3.0
lxml==5.2.1
MarkupSafe==2.1.5
marshmallow==3.21.1
matplotlib==3.8.4
matplotlib-inline==0.1.6
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
nest-asyncio==1.6.0
networkx==3.2.1
nibabel==5.2.1
nipype==1.8.6
nltk==3.8.1
numba==0.59.1
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.4.99
nvidia-nvtx-cu12==12.1.105
openai==1.16.0
ordered-set==4.1.0
outlines==0.0.40
packaging==24.0
pandas==2.2.1
parso==0.8.3
pathlib==1.0.1
pexpect==4.9.0
pillow==10.3.0
platformdirs==4.2.0
prompt-toolkit==3.0.43
protobuf==5.26.1
prov==2.0.0
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==15.0.2
pyarrow-hotfix==0.6
pydantic==2.6.4
pydantic_core==2.16.3
pydot==2.0.0
pyformlang==1.0.9
Pygments==2.17.2
PyMuPDF==1.24.0
PyMuPDFb==1.24.0
pyparsing==3.1.2
pypdf==4.1.0
python-dateutil==2.9.0.post0
pytz==2024.1
pyxnat==1.6.2
PyYAML==6.0.1
pyzmq==25.1.2
rdflib==7.0.0
referencing==0.35.0
regex==2023.12.25
requests==2.31.0
responses==0.18.0
rpds-py==0.18.0
safetensors==0.4.2
scikit-learn==1.4.2
scipy==1.13.0
sentence-transformers==2.7.0
seqeval==1.2.2
simplejson==3.19.2
six==1.16.0
sniffio==1.3.1
soupsieve==2.5
span-marker==1.5.0
SQLAlchemy==2.0.29
stack-data==0.6.3
starlette==0.37.2
striprtf==0.0.26
sympy==1.12
tenacity==8.2.3
termcolor==2.4.0
threadpoolctl==3.4.0
tiktoken==0.6.0
tokenizers==0.15.2
torch==2.2.2
tornado==6.4
tqdm==4.66.2
traitlets==5.14.2
traits==6.3.2
transformers==4.39.3
triton==2.2.0
typing-inspect==0.9.0
typing_extensions==4.11.0
tzdata==2024.1
urllib3==2.2.1
uvicorn==0.29.0
wcwidth==0.2.13
wrapt==1.16.0
xxhash==3.4.1
yarl==1.9.4

Context for the issue:

I would like to present Outlines as an intuitive and fast alternative to Guidance or LMQL in my diploma project.

@DorotaBjoorn (Author)

Could the issue actually be with the model I am using, rather than with mistralai/Mistral-7B-v0.1?

@DorotaBjoorn changed the title from "generate.json() gives ValidationError" to "generate.json() gives ValidationError when run with mistral-7b-instruct-v0.2.Q6_K.gguf" on Apr 25, 2024
@wang-haoxian

I have the same problem with mistral-7b-instruct-v0.2.Q5_K_S.gguf:

JSONDecodeError: Unterminated string starting at: line 5 column 5 (char 24)

My code is also from the example.

from llama_cpp import Llama
from outlines import generate, models

def add(a: int, b: int):
    return a + b


llm = Llama("./mistral-7b-instruct-v0.2.Q5_K_S.gguf", chat_format="mistral")
model = models.LlamaCpp(llm)
generator = generate.json(model, add)
result = generator("Return two integers named a and b respectively. a is odd and b even.")

print(add(**result))

@rlouf (Member) commented May 3, 2024

Can you try generator = generate.json(model, add, whitespace_pattern="")?

@rlouf added the correctness (Everything related to the generation correctness) label and removed the bug label on May 3, 2024
@wang-haoxian

> Can you try generator = generate.json(model, add, whitespace_pattern="")?

Yes, it works as expected with this.
It's kind of confusing, though; I didn't find this in the docs.
Thank you.

@wang-haoxian

> Can you try generator = generate.json(model, add, whitespace_pattern="")?

After digging around, I found that the model stops before the JSON is complete.
By setting max_tokens=100, or any value big enough for the whole output JSON, it works.
I think it's important to be able to detect when the model's output has been cut short by the generation controls.
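
A minimal sketch of that workaround, assuming the same add example as above (max_tokens=256 is an arbitrary value, chosen only to be comfortably larger than the expected JSON):

from llama_cpp import Llama
from outlines import generate, models

def add(a: int, b: int):
    return a + b

llm = Llama("./mistral-7b-instruct-v0.2.Q5_K_S.gguf", chat_format="mistral")
model = models.LlamaCpp(llm)
generator = generate.json(model, add, whitespace_pattern="")

# SequenceGeneratorAdapter.__call__ accepts max_tokens (visible in the traceback
# above), so the token budget can be raised at call time and the generation is
# no longer truncated mid-JSON.
result = generator(
    "Return two integers named a and b respectively. a is odd and b even.",
    max_tokens=256,
)
print(add(**result))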

@DorotaBjoorn (Author) commented May 6, 2024 via email

@rlouf (Member) commented May 6, 2024


We may need to override llama.cpp's default value of max_tokens=16.
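
For illustration, a minimal sketch of where that default bites, assuming llama-cpp-python's create_completion API (whose max_tokens parameter defaults to 16) and the llm instance from the earlier comments:

# With no explicit max_tokens, llama-cpp-python caps the completion at 16 tokens,
# which is easily too short for a full JSON object and leaves a string unterminated.
out = llm.create_completion("Produce a JSON user profile for Jane Doe, age 10.")
print(out["choices"][0]["text"])  # may stop mid-string, matching the error above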

@allo- commented May 7, 2024

I have the same problem even with a higher max_tokens if I do not re-run generator = generate.json(model, User) before each call to the generator.
I wonder whether the LlamaCpp backend is failing to clear the output of the previous call correctly.

The max_tokens problem itself seems solvable by annotating str fields with typing.Annotated and a max_length (in characters, so take care that it matches your max_tokens), as in the example below.

Example:

import typing

import pydantic

class Person(pydantic.BaseModel):
    name: str
    # Cap the generated string at 300 characters so the constrained output
    # cannot run past the token budget mid-field.
    description: typing.Annotated[str, pydantic.StringConstraints(strip_whitespace=True, max_length=300)]
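
A minimal usage sketch under the same assumptions (model built with models.LlamaCpp as in the earlier comments; the prompt and max_tokens value are illustrative):

from outlines import generate

generator = generate.json(model, Person, whitespace_pattern="")
result = generator(
    "Create a person named Ada with a one-sentence description.",
    max_tokens=512,  # comfortably above the 300-character description cap
)
print(result)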
