
Mac M2 running AirLLM with garage-bAInd/Platypus2-7B gets error "Input must be a file-like object opened in binary mode, or string" #116

Open
wuxiongwei opened this issue Feb 23, 2024 · 6 comments

Comments

@wuxiongwei
```
ValueError                                Traceback (most recent call last)
Cell In[23], line 2
      1 import mlx.core as mx
----> 2 generation_output = model.generate(
      3     mx.array(input_tokens['input_ids']),
      4     max_new_tokens=3,
      5     use_cache=True,
      6     return_dict_in_generate=True)
      8 print(generation_output)

File ~/opt/anaconda3/envs/air_llm_python_3_8/lib/python3.8/site-packages/airllm/airllm_llama_mlx.py:254, in AirLLMLlamaMlx.generate(self, x, temperature, max_new_tokens, **kwargs)
    252 def generate(self, x, temperature=0, max_new_tokens=None, **kwargs):
    253     tokens = []
--> 254     for token in self.model_generate(x, temperature=temperature):
    255         tokens.append(token)
    258         if len(tokens) >= max_new_tokens:

File ~/opt/anaconda3/envs/air_llm_python_3_8/lib/python3.8/site-packages/airllm/airllm_llama_mlx.py:281, in AirLLMLlamaMlx.model_generate(self, x, temperature, max_new_tokens)
    278 mask = mask.astype(self.tok_embeddings.weight.dtype)
    280 self.record_memory('before_loading_tok')
--> 281 update_weights = ModelPersister.get_model_persister().load_model(self.layer_names_dict['embed'], self.checkpoint_path)
    283 self.record_memory('after_loading_tok')
    284 self.tok_embeddings.update(update_weights['tok_embeddings'])
...
     97 #available = psutil.virtual_memory().available / 1024 / 1024
     98 #print(f"loaded {layer_name}, available mem: {available:.02f}")
    100 layer_state_dict = map_torch_to_mlx(layer_state_dict)

ValueError: [load] Input must be a file-like object opened in binary mode, or string
```

@Verdagon

I just had this same problem, and I think I have a fix. In `mlx_model_persister.py`, change:

```python
mx.load(to_load_path)
```

to:

```python
mx.load(str(to_load_path))
```

@wuxiongwei, can you help verify the fix works for you?
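For context, here is a minimal stand-in sketch (not mlx itself) of why the fix works: the native loader accepts only a `str` path or a binary file-like object, so a `pathlib.Path` is rejected even though it points at the right file, and `str(...)` is the correct coercion. The function and path names below are illustrative only.

```python
from pathlib import Path

def load_checkpoint(source):
    """Stand-in for a native loader (like mx.load) that accepts only a
    str path or a file-like object opened in binary mode."""
    if not isinstance(source, str) and not hasattr(source, "read"):
        raise ValueError(
            "[load] Input must be a file-like object opened in binary mode, or string"
        )
    return f"loaded from {source}"

# Hypothetical path, mirroring to_load_path in mlx_model_persister.py
to_load_path = Path("model") / "tok_embeddings.safetensors"

try:
    load_checkpoint(to_load_path)  # a Path object: rejected by the type check
except ValueError as e:
    print(e)

print(load_checkpoint(str(to_load_path)))  # str(...) satisfies the check
```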

@8-momo-8

8-momo-8 commented Mar 16, 2024

@Verdagon, yes, it works after converting the path to a string:

```python
mx.load(str(to_load_path))
```

@mustangs0786

Help, where do I have to put the above code? Here is what I'm running:

```python
input_text = [
    #'What is the capital of United States?',
    'I like',
]

MAX_LENGTH = 128
input_tokens = model.tokenizer(input_text,
                               return_tensors="np",
                               return_attention_mask=False,
                               truncation=True,
                               max_length=MAX_LENGTH,
                               padding=False)

input_tokens

generation_output = model.generate(
    mx.array(input_tokens['input_ids']),
    max_new_tokens=3,
    use_cache=True,
    return_dict_in_generate=True)

print(generation_output)
```

@shahfasal

@mustangs0786 Same error for me. Were you able to figure it out?

@shahfasal

@Verdagon Where do I find this file, `mlx_model_persister.py`?

@shahfasal

Got it, it's under `airllm/persist/`.
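For anyone else hunting for the file: a general way to locate an installed package's directory in your own environment, using only the standard library. This is a sketch; `airllm` is the assumed package name, and the `persist/mlx_model_persister.py` path comes from the comment above.

```python
import importlib.util
from pathlib import Path

def package_dir(name):
    """Return the directory of an installed package, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent

# For this issue, the file to patch would be (if airllm is installed):
#   package_dir("airllm") / "persist" / "mlx_model_persister.py"
print(package_dir("json"))  # demo with a stdlib package
```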
