Quotes aren't always parsed properly #324

Open
nopepper opened this issue Feb 16, 2024 · 3 comments
Labels
bug Something isn't working good first issue Good for newcomers

Comments

nopepper commented Feb 16, 2024

For example:

import lmql

@lmql.query()
def quote():
    '''lmql
    "\"[VAL]\""
    return VAL
    '''

Fails with the error:

File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\IPython\core\interactiveshell.py:3553 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  Cell In[16], line 1
    @lmql.query()

  File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\api\queries.py:108 in wrapper
    return query(fct, input_variables=input_variables, is_async=is_async, calling_frame=calling_frame, **extra_args)

  File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\api\queries.py:130 in query
    module = load(temp_lmql_file, output_writer=silent)

  File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\api\queries.py:22 in load
    module = compiler.compile(filepath)

  File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\language\compiler.py:924 in compile
    transformations.transform(q)

  File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\language\compiler.py:789 in transform
    t = T(query).transform()

  File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\language\compiler.py:346 in transform
    self.query.prompt = [self.visit(p) for p in self.query.prompt]
...
  File <unknown>:1
    f""""[VAL]""""
                 ^
SyntaxError: unterminated string literal (detected at line 1)

There is an ugly workaround for now:

@lmql.query()
def quote():
    '''lmql
    q = "\""
    "\"[VAL]{q}"
    return VAL
    '''
@lbeurerkellner lbeurerkellner added the bug Something isn't working label Feb 20, 2024
@lbeurerkellner lbeurerkellner added the good first issue Good for newcomers label Feb 27, 2024
Collaborator

lbeurerkellner commented Feb 27, 2024

Thanks for reporting. Marking this as a good first issue.

The fix is likely somewhere close to https://github.com/eth-sri/lmql/blob/main/src/lmql/language/compiler.py#L428, where we compile LLM query strings into multi-line strings in the compiled representation of the program.

@Saibo-creator
Contributor

I did some investigation and here are my findings:

import lmql

@lmql.query()
def quote_at_begin():
    '''lmql
    "\"x\"=123"
    '''
# This case passes without any errors.

@lmql.query()
def quote_at_end():
    '''lmql
    "123=\"x\""
    '''
# This triggers a SyntaxError:
# SyntaxError: unterminated string literal (detected at line 1)

So only quote_at_end causes an error.
Could this be a bug in Python's ast parser?
ast.parse(f""""x"==123""") works, but ast.parse(f"""123=="x"""") does not.

A temporary workaround is to check whether the string ends with a quote, append a space so that the parser accepts it, and then remove the space immediately after parsing.
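
A minimal sketch of that padding workaround, using only the standard ast module rather than the LMQL compiler itself. The helper name eval_qstring_with_padding is hypothetical, and the sketch assumes the compiled form is an f-string wrapped in triple quotes, as the error message in the report suggests:

```python
import ast

def eval_qstring_with_padding(qstring: str) -> str:
    """Hypothetical helper: pad a trailing quote, parse, then drop the padding."""
    needs_padding = qstring.endswith('"')
    padded = qstring + " " if needs_padding else qstring

    # Assumption: the compiled representation wraps the text in f"""...""".
    tree = ast.parse('f"""' + padded + '"""', mode="eval")

    if needs_padding:
        node = tree.body
        # An f-string parses to JoinedStr; its last value is the trailing literal.
        last = node.values[-1] if isinstance(node, ast.JoinedStr) else node
        last.value = last.value[:-1]  # remove the space that was added above

    # Evaluating is only acceptable here because the input is a fixed test string.
    return eval(compile(tree, "<qstring>", "eval"))

print(eval_qstring_with_padding('123=="x"'))
```

The padding never reaches the final string because it is stripped from the AST node before the tree is compiled.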

@lbeurerkellner
Collaborator

I think the behavior of ast.parse is actually correct here. In Python, """"a""" is valid, whereas """a"""" is not. After reading the opening """, the parser's scanner looks for the next """ and ends the string literal there. An extra " after that point therefore starts a new string that is never closed, which is reported as an unterminated string literal.
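
This can be checked directly with the standard library. The sketch below is independent of LMQL; the names leading, trailing and parses are illustrative only:

```python
import ast

# Extra quote at the START: after the opening """, the fourth quote is content.
leading = '""""a"""'
# Extra quote at the END: the string closes at the first """ it meets, so the
# last quote opens a new string that is never terminated.
trailing = '"""a""""'

def parses(source: str) -> bool:
    """Return True if `source` is accepted by ast.parse."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

print(parses(leading), parses(trailing))
print(repr(ast.literal_eval(leading)))
```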

To fix this issue, I think it should be enough to just add:

if compiled_qstring.endswith("\""):
    compiled_qstring = compiled_qstring[:-1] + "\\\""

This prevents a """" (four quotes) sequence in compiled_qstring, as the last quote in a qstring will always be escaped. It may need more testing to check whether it covers all cases correctly, though.
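
A standalone sketch of how the proposed escaping could be exercised outside the compiler. The functions escape_trailing_quote, wrap and parses are hypothetical helpers, and wrapping in f"""...""" is an assumption based on the error output in the report:

```python
import ast

def escape_trailing_quote(compiled_qstring: str) -> str:
    """Escape a final double quote so it cannot merge with the closing quotes."""
    if compiled_qstring.endswith('"'):
        compiled_qstring = compiled_qstring[:-1] + '\\"'
    return compiled_qstring

def wrap(qstring: str) -> str:
    """Assumed shape of the compiled form: an f-string in triple quotes."""
    return 'f"""' + qstring + '"""'

def parses(source: str) -> bool:
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

original = '"[VAL]"'
fixed = escape_trailing_quote(original)

print(parses(wrap(original)))  # the unescaped form, as in the report
print(parses(wrap(fixed)))     # the form with the last quote escaped
```

One case that seems worth including in further testing is a qstring that already ends in a backslash followed by a quote, since escaping only the final character would then change which character the backslash applies to.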

3 participants