
Difficulty getting started! #352

Open
mcchung52 opened this issue May 4, 2024 · 3 comments
@mcchung52 commented May 4, 2024

LMQL looks very promising (having played with Guidance), so I want to make it work, but I'm having issues from the get-go trying to run it locally. I'm really hoping I can get some help.

IMMEDIATE GOAL: What is the simplest way to make this work?

Context:
I have several gguf models that I want to run on my MacBook Pro (pre-M, Intel), i.e. on the CPU, which I have done many times before from Python code, though slowly.

I want to:
1. run the model directly in Python code
2a. run the model behind an API, e.g. localhost:8081 (rough sketch below)
2b. (not possible on my Mac, but possible on my PC) run the gguf via LM Studio on the PC, expose ip:port there, and have Python code on the Mac talk to it
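From skimming the local-models docs, I think option 2a would look roughly like this (untested on my machine): start a long-running server with "lmql serve-model llama.cpp:<path-to-weights>.gguf", then connect from query code without the "local:" prefix, so LMQL talks to that server instead of loading the weights in-process:

import lmql

# Rough sketch, assuming an `lmql serve-model` process is already running on
# the default endpoint; the identifier should match the one given to serve-model.
m = lmql.model("llama.cpp:/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf")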

Code:

import lmql

model_path = "/Users/mchung/Desktop/proj-ai/models/"
# model = "wizardcoder-python-13b-v1.0.Q4_K_S.gguf"
model = "codeqwen-1_5-7b-chat-q8_0.gguf"
# model = "mistral-7b-instruct-v0.2.Q5_K_M.gguf"

# "local:llama.cpp:" loads the gguf in-process via llama-cpp-python
m = f"local:llama.cpp:{model_path + model}"
print(m)

@lmql.query(model=lmql.model(m, verbose=True))
def query_function():
    '''lmql
    """A great dad joke. A indicates the punchline
    Q:[JOKE]
    A:[PUNCHLINE]""" where STOPS_AT(JOKE, "?") and \
                           STOPS_AT(PUNCHLINE, "\n")
    '''

response = query_function()
print(response)

Thanks in advance.

@mcchung52 (Author) commented:

For now, I'm getting this error:
raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'huggyllama/llama-7b' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
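If I read the docs right, 'huggyllama/llama-7b' is just the default tokenizer LMQL assumes for llama.cpp models, so the error seems to mean no tokenizer backend (e.g. transformers) is installed to load it. Apparently you can also point lmql.model at the model's actual HF tokenizer explicitly; the tokenizer id below is my guess for this gguf, not verified:

import lmql

# Sketch: name the matching Hugging Face tokenizer explicitly instead of the
# 'huggyllama/llama-7b' fallback; this still needs 'transformers' installed.
m = lmql.model(
    "local:llama.cpp:/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf",
    tokenizer="Qwen/CodeQwen1.5-7B-Chat",  # assumed tokenizer id for this model
)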

@sashokbg commented:

Hello @mcchung52, I had similar issues. Please check this issue: #350

@miqaP commented May 13, 2024

Hi,
I had the same error when trying to run a model with the llama.cpp loader.
It is not really clear from the documentation, but to run a model through llama.cpp you have to install the package lmql[hf] (instead of plain lmql), plus llama-cpp-python to provide the inference backend.
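For example (exact versions aside), something like pip install "lmql[hf]" llama-cpp-python should pull in both. A quick sanity check that the two backends are importable afterwards:

import llama_cpp      # inference backend, provided by llama-cpp-python
import transformers   # tokenizer backend, pulled in by lmql[hf]
print(llama_cpp.__version__, transformers.__version__)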

Does that help?
