LMQL doesn't run when fed latin accented characters #290

Open
OPaivaHeitor opened this issue Dec 6, 2023 · 1 comment
Labels
bug Something isn't working

@OPaivaHeitor

Hello, I've been running into issues when using LMQL with strings containing accented characters:

When running

import lmql

@lmql.query
def test():
    '''lmql
    "Q: J'adore déguster du café dans un café authentique à Paris."
    "A: [ANSWER]" where STOPS_AT(ANSWER, ".")
    print(ANSWER)
    '''

The following message is shown:

Traceback (most recent call last):
  File "E: ... \LMQLTesting.py", line 296, in <module>
    @lmql.query
     ^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\lmql\api\queries.py", line 130, in query
    module = load(temp_lmql_file, output_writer=silent)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\lmql\api\queries.py", line 22, in load
    module = compiler.compile(filepath)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\lmql\language\compiler.py", line 902, in compile
    contents = f.read()
               ^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 19: invalid continuation byte

However, if I run the same code with the accents removed:

@lmql.query
def test():
    '''lmql
    "Q: J'adore deguster du cafe dans un cafe authentique a Paris."
    "A: [ANSWER]" where STOPS_AT(ANSWER, ".")
    print(ANSWER)
    '''

then LMQL is able to give me a reply from the model:

C'est génial ! Il y a tellement de cafés authentiques à Paris, tu as de la chance de pouvoir en profiter.

lbeurerkellner added the bug label Dec 10, 2023
@lbeurerkellner
Collaborator

lbeurerkellner commented Dec 10, 2023

Thanks for reporting this. It looks like the compiler fails while reading the code, before it even parses it. We can have a look. This may be related to Windows and file encodings; accented characters should not normally cause problems.
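
To illustrate the file-encoding hypothesis, here is a minimal Python sketch. It is not LMQL's actual code. It assumes the temporary query file was written with a legacy Windows code page such as cp1252 and then read back as UTF-8; the traceback only confirms the read side. The helper `round_trip` and the `.lmql` suffix are hypothetical and only serve the example.

```python
import os
import tempfile

text = "Q: J'adore déguster du café."

# Assumption: the writer used a legacy code page (cp1252), where "é" is the single byte 0xE9.
legacy_bytes = text.encode("cp1252")

decode_error = None
try:
    # In UTF-8, 0xE9 starts a three-byte sequence, so the following ASCII byte is rejected.
    legacy_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    decode_error = err
    print(err)


def round_trip(content: str, encoding: str = "utf-8") -> str:
    """Write content to a temporary file and read it back using one explicit encoding."""
    fd, path = tempfile.mkstemp(suffix=".lmql")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(content)
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# With the same explicit encoding on both sides, the accented text survives unchanged.
restored = round_trip(text)
```

If the write side does rely on the locale default, running Python in UTF-8 mode (`python -X utf8` or `PYTHONUTF8=1`) might serve as a workaround, though that is untested against this issue.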
