
lmql run not working for llama models even though same script works in playground and also for openai models #343

iurimatias opened this issue Mar 26, 2024 · 1 comment

In short: a very simple script works in both the playground and on the command line when using OpenAI models, but with llama.cpp it works only in the playground, not on the command line.

filename: test_llama.lmql

sample(1, 0.2)
    """Hello! my name is [NAME]."""
from
    lmql.model("llama.cpp:D:\models\llama-2-7b.Q4_0.gguf", trust_remote_code=True, endpoint="192.168.1.23:8080")

This works in the playground but not on the command line (i.e. lmql run test_llama.lmql).
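For debugging, the same file can also be driven through the Python API that the CLI wraps; the trace below shows cli.py dispatching to asyncio.run(lmql.run_file(...)), so this sketch should fail with the identical error:

import asyncio
import lmql

# Drive the same query file through the API that `lmql run` uses
# internally (cli.py calls asyncio.run(lmql.run_file(path)), as the
# trace below shows); on v0.7 this is expected to fail identically.
results = asyncio.run(lmql.run_file("test_llama.lmql"))
print(results)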

filename: test_openai.lmql

sample(1, 0.2)
    """Hello! my name is [NAME]."""
from
    "chatgpt"

This works on both.

Trace

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Exception in thread Thread-1 (load):
Traceback (most recent call last):
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/tokenizer.model/resolve/main/tokenizer_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1403, in hf_hub_download
    raise head_call_error
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1261, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1674, in get_hf_file_metadata
    r = _request_wrapper(
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 369, in _request_wrapper
    response = _request_wrapper(
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 393, in _request_wrapper
    hf_raise_for_status(response)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 352, in hf_raise_for_status
    raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6602f667-65af9e757de6bc72081ec15d;a84ddd00-5a5c-44a9-9a9a-000e2938e73e)

Repository Not Found for url: https://huggingface.co/tokenizer.model/resolve/main/tokenizer_config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/tokenizer.py", line 43, in load
    t = loader()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/tokenizer.py", line 349, in loader
    t = TransformersTokenizer.from_pretrained(model_identifier, **kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/tokenizers/hf_tokenizer.py", line 28, in from_pretrained
    tokenizer = AutoTokenizer.from_pretrained(model_identifier, **kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 779, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 612, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/utils/hub.py", line 421, in cached_file
    raise EnvironmentError(
OSError: tokenizer.model is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
Traceback (most recent call last):
  File "/Users/username/miniconda3/envs/py310/bin/lmql", line 8, in <module>
    sys.exit(main())
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/cli.py", line 260, in main
    command[0]()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/cli.py", line 71, in cmd_run
    results = asyncio.run(lmql.run_file(absolute_path, **kwargs))
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/api/run.py", line 9, in run_file
    return await q(*args, **kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/lmql_runtime.py", line 230, in __acall__
    results = await interpreter.run(self.fct, **query_kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/tracing/tracer.py", line 240, in wrapper
    return await fct(*args, **kwargs)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/interpreter.py", line 955, in run
    self.root_state = await self.advance(self.root_state)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/interpreter.py", line 385, in advance
    await continue_for_more_prompt_stmts()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/interpreter.py", line 365, in continue_for_more_prompt_stmts
    await query_head.continue_()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/multi_head_interpretation.py", line 140, in continue_
    await self.handle_current_arg()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/multi_head_interpretation.py", line 112, in handle_current_arg
    await self.advance(res)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/multi_head_interpretation.py", line 89, in advance
    await self.handle_current_arg()
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/multi_head_interpretation.py", line 111, in handle_current_arg
    res = await fct(*self.current_args[1], **self.current_args[2])
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/interpreter.py", line 166, in set_model
    self.interpreter.set_model(model_name)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/interpreter.py", line 323, in set_model
    VocabularyMatcher.init(model_handle.get_tokenizer())
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/ops/token_set.py", line 42, in init
    if tokenizer.name in VocabularyMatcher._instances:
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/tokenizer.py", line 88, in name
    return self.tokenizer_impl.name
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/tokenizer.py", line 83, in tokenizer_impl
    tokenizer_not_found_error(self.model_identifier)
  File "/Users/username/miniconda3/envs/py310/lib/python3.10/site-packages/lmql/runtime/tokenizer.py", line 366, in tokenizer_not_found_error
    raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'tokenizer.model' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
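Reading the trace, the identifier handed to the tokenizer loader is the literal string 'tokenizer.model' (note the 404 URL https://huggingface.co/tokenizer.model/resolve/main/tokenizer_config.json), which suggests v0.7 is deriving a bad tokenizer name from the .gguf file path. A possible workaround, not verified here and assuming the tokenizer argument that the LMQL llama.cpp docs describe: pass an explicit Hugging Face tokenizer so nothing has to be inferred from the path (huggyllama/llama-7b is only an illustrative choice):

import lmql

# Hedged workaround sketch: name the tokenizer explicitly instead of
# letting LMQL derive it from the local .gguf path. Note the doubled
# backslashes, needed when the path lives in a regular Python string.
m = lmql.model(
    "llama.cpp:D:\\models\\llama-2-7b.Q4_0.gguf",
    tokenizer="huggyllama/llama-7b",  # illustrative; any matching vocab
    endpoint="192.168.1.23:8080",
)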
iurimatias (author) commented:
Another detail: this was working up to release v0.7b3; it breaks on v0.7.

Version v0.7b3 works

pip install --force-reinstall lmql==v0.7b3
lmql run test_query.lmql

Result: the prompt executes and prints a result.

Version v0.7 breaks

pip install --force-reinstall lmql==v0.7
lmql run test_query.lmql

Result: error, with the same trace as above.
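When bisecting like this, it can be worth confirming which build is actually active before each run; a stdlib-only check (no lmql-specific attribute assumed):

from importlib.metadata import version

# Print the installed lmql version to confirm the force-reinstall took.
print(version("lmql"))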
