segmentation fault python3 airllm2.py #129

Open
taozhiyuai opened this issue Apr 25, 2024 · 3 comments

@taozhiyuai

Mac, native conda, mlx installed.

```
(native) taozhiyu@603e5f4a42f1 downloads % pip show mlx airllm
Name: mlx
Version: 0.11.1
Summary: A framework for machine learning on Apple silicon.
Home-page: https://github.com/ml-explore/mlx
Author: MLX Contributors
Author-email: mlx@group.apple.com
License:
Location: /Users/taozhiyu/miniconda3/envs/native/lib/python3.12/site-packages
Requires:
Required-by:

Name: airllm
Version: 2.8.3
Summary: AirLLM allows single 4GB GPU card to run 70B large language models without quantization, distillation or pruning.
Home-page: https://github.com/lyogavin/Anima/tree/main/air_llm
Author: Gavin Li
Author-email: gavinli@animaai.cloud
License:
Location: /Users/taozhiyu/miniconda3/envs/native/lib/python3.12/site-packages
Requires: accelerate, huggingface-hub, optimum, safetensors, scipy, torch, tqdm, transformers
Required-by:
(native) taozhiyu@603e5f4a42f1 downloads % python3 airllm2.py
found index file...
found_layers:{'model.embed_tokens.': False, 'model.layers.0.': False, 'model.layers.1.': False, 'model.layers.2.': False, 'model.layers.3.': False, 'model.layers.4.': False, 'model.layers.5.': False, 'model.layers.6.': False, 'model.layers.7.': False, 'model.layers.8.': False, 'model.layers.9.': False, 'model.layers.10.': False, 'model.layers.11.': False, 'model.layers.12.': False, 'model.layers.13.': False, 'model.layers.14.': False, 'model.layers.15.': False, 'model.layers.16.': False, 'model.layers.17.': False, 'model.layers.18.': False, 'model.layers.19.': False, 'model.layers.20.': False, 'model.layers.21.': False, 'model.layers.22.': False, 'model.layers.23.': False, 'model.layers.24.': False, 'model.layers.25.': False, 'model.layers.26.': False, 'model.layers.27.': False, 'model.layers.28.': False, 'model.layers.29.': False, 'model.layers.30.': False, 'model.layers.31.': False, 'model.layers.32.': False, 'model.layers.33.': False, 'model.layers.34.': False, 'model.layers.35.': False, 'model.layers.36.': False, 'model.layers.37.': False, 'model.layers.38.': False, 'model.layers.39.': False, 'model.layers.40.': False, 'model.layers.41.': False, 'model.layers.42.': False, 'model.layers.43.': False, 'model.layers.44.': False, 'model.layers.45.': False, 'model.layers.46.': False, 'model.layers.47.': False, 'model.layers.48.': False, 'model.layers.49.': False, 'model.layers.50.': False, 'model.layers.51.': False, 'model.layers.52.': False, 'model.layers.53.': False, 'model.layers.54.': False, 'model.layers.55.': False, 'model.layers.56.': False, 'model.layers.57.': False, 'model.layers.58.': False, 'model.layers.59.': False, 'model.layers.60.': False, 'model.layers.61.': False, 'model.layers.62.': False, 'model.layers.63.': False, 'model.layers.64.': False, 'model.layers.65.': False, 'model.layers.66.': False, 'model.layers.67.': False, 'model.layers.68.': False, 'model.layers.69.': False, 'model.layers.70.': False, 'model.layers.71.': False, 'model.layers.72.': False, 'model.layers.73.': False, 'model.layers.74.': False, 'model.layers.75.': False, 'model.layers.76.': False, 'model.layers.77.': False, 'model.layers.78.': False, 'model.layers.79.': False, 'model.norm.': False, 'lm_head.': False}
some layer splits found, some are not, re-save all layers in case there's some corruptions.
0%| | 0/83 [00:00<?, ?it/s]Loading shard 1/30
zsh: segmentation fault python3 airllm2.py
(native) taozhiyu@603e5f4a42f1 downloads % /Users/taozhiyu/miniconda3/envs/native/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
(native) taozhiyu@603e5f4a42f1 downloads %
```

```python
from airllm import AutoModel

MAX_LENGTH = 128

# could use hugging face model repo id:
model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct")

# or use model's local path...
model = AutoModel.from_pretrained("/Users/taozhiyu/Downloads/Meta-Llama-3-70B-Instruct")

input_text = [
    'What is the capital of United States?',
]

input_tokens = model.tokenizer(input_text,
    return_tensors="pt",
    return_attention_mask=False,
    truncation=True,
    max_length=MAX_LENGTH,
    padding=False)

generation_output = model.generate(
    input_tokens['input_ids'].cuda(),
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True)

output = model.tokenizer.decode(generation_output.sequences[0])

print(output)
```

@Proryanator

Which MacBook are you using? An M3 Max? 🤔 I've seen an issue like this here, as well as in the Core ML Stable Diffusion repo, specific to M3 MacBooks.

@taozhiyuai

> Which MacBook are you using? An M3 Max? 🤔 I've seen an issue like this here, as well as in the Core ML Stable Diffusion repo, specific to M3 MacBooks.

M3 Max, 128GB.

@Proryanator

Proryanator commented Apr 30, 2024

That was my suspicion, yeah. Very odd. I have a 36GB M3 Max.
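
To narrow down where the crash happens, something like the following might help. It is only a sketch, not a fix: it uses Python's standard `faulthandler` module to dump the Python-level traceback when the process receives a fatal signal such as SIGSEGV, and it gathers the machine and package details asked about above. The helper names `package_version` and `collect_env_report` are hypothetical, made up for this example, and the package list is an assumption based on the `pip show` output in the report.

```python
import faulthandler
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def collect_env_report(packages=("mlx", "airllm", "torch")):
    """Hypothetical helper: gather details worth pasting into the issue."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # Typically "arm64" for a native Apple silicon interpreter.
        "machine": platform.machine(),
    }
    for name in packages:
        report[name] = package_version(name)
    return report


if __name__ == "__main__":
    # Must be enabled before the crashing call so the traceback is written
    # to stderr when the interpreter receives a fatal signal.
    faulthandler.enable()
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
    # The body of airllm2.py (AutoModel.from_pretrained(...) and so on)
    # would follow here.
```

The same traceback dump can be had without editing the script by running `python3 -X faulthandler airllm2.py`. The traceback only shows which Python call was active when the native code crashed, so it would point at the failing step rather than explain the cause.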
