LMQL looks very promising (having played with Guidance), so I want to make this work, but I'm having issues from the get-go trying to run it locally. I'm really hoping I can get some help.
IMMEDIATE GOAL: What is the simplest way to make this work?
Context:
I have several GGUF models on my machine that I want to run on my MacBook Pro (pre-M, Intel), i.e. on CPU. I have run them from Python code many times before, though slowly.
I want to:
1. Run the model directly in Python code.
2a. Run the model by exposing it via an API, e.g. localhost:8081.
2b. (Not possible on my Mac, but possible on my PC) Run the GGUF via LM Studio on the PC, expose its ip:port, and have the Python code on the Mac tap into it.
Code:
import lmql

model_path = "/Users/mchung/Desktop/proj-ai/models/"
# model = "wizardcoder-python-13b-v1.0.Q4_K_S.gguf"
model = "codeqwen-1_5-7b-chat-q8_0.gguf"
# model = "mistral-7b-instruct-v0.2.Q5_K_M.gguf"

m = f"local:llama.cpp:{model_path + model}"
print(m)

@lmql.query(model=lmql.model(m, verbose=True))
def query_function():
    '''lmql
    """A great dad joke. A indicates the punchline
    Q:[JOKE]
    A:[PUNCHLINE]""" where STOPS_AT(JOKE, "?") and \
        STOPS_AT(PUNCHLINE, "\n")
    '''
    return "What's the best way to learn Python?"

response = query_function()
print(response)
Thanks in advance.
For now, I'm getting this error:
raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'huggyllama/llama-7b' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
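For context, LMQL falls back to the transformers tokenizer for huggyllama/llama-7b when no tokenizer is specified for a llama.cpp model, which is why that identifier shows up even though the code never references it. A minimal sketch of pinning the tokenizer explicitly, assuming Qwen/CodeQwen1.5-7B-Chat is the Hugging Face repo that matches this GGUF:

import lmql

# Assumption: Qwen/CodeQwen1.5-7B-Chat ships the tokenizer matching the
# codeqwen-1_5-7b-chat GGUF; swap in whichever repo matches your model.
m = lmql.model(
    "local:llama.cpp:/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf",
    tokenizer="Qwen/CodeQwen1.5-7B-Chat",
    verbose=True,
)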
Hi,
I had the same error when trying to run a model with the llama.cpp loader.
It is not really clear from the documentation, but to run a model with llama.cpp you have to install the package lmql[hf] (instead of plain lmql), plus llama-cpp-python to provide the inference backend.
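A minimal sketch of the full setup, assuming a pip-based environment; the tokenizer repo and host details are illustrative, not tested on your exact models:

# In the shell, install the tokenizer and inference backends:
#   pip install "lmql[hf]" llama-cpp-python
#
# For your 2a-style goal, LMQL can also serve the GGUF as a long-lived
# process that query code connects to (default port 8080):
#   lmql serve-model llama.cpp:/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf

import lmql

# Without the "local:" prefix, LMQL connects to the running serve-model
# process instead of loading the model in-process.
m = lmql.model(
    "llama.cpp:/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf",
    tokenizer="Qwen/CodeQwen1.5-7B-Chat",  # assumption: tokenizer repo matching the GGUF
    # endpoint="192.168.1.50:8080",  # assumption: for 2b, point at the PC serving the model
)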