[Question] Baichuan-7B multi-GPU native deployment, plus int8 and int4 quantized deployment #127

Open
5 tasks done
potong opened this issue Aug 29, 2023 · 0 comments
Labels
question Further information is requested

Comments

potong commented Aug 29, 2023

Required prerequisites

Questions

This supports Baichuan-7B native (fp16) deployment as well as int8 and int4 quantized deployment; code as follows:

import torch
from accelerate import dispatch_model
from transformers import AutoTokenizer, AutoModelForCausalLM

def auto_configure_device_map(num_gpus: int):
    # Baichuan-7B has 32 transformer layers; spread them evenly across GPUs.
    num_trans_layers = 32
    per_gpu_layers = num_trans_layers / num_gpus
    # Keep the embedding on GPU 0 and the final norm / LM head on the last GPU.
    device_map = {'model.embed_tokens': 0,
                  'model.norm': num_gpus - 1, 'lm_head': num_gpus - 1}
    for i in range(num_trans_layers):
        device_map[f'model.layers.{i}'] = int(i // per_gpu_layers)

    return device_map


MODEL_NAME = "baichuan-inc/baichuan-7B"

# Fall back to 0 (not None) so the comparisons below work on CPU-only machines
NUM_GPUS = torch.cuda.device_count() if torch.cuda.is_available() else 0
device_map = auto_configure_device_map(NUM_GPUS) if NUM_GPUS > 0 else None
device = torch.device("cuda") if NUM_GPUS > 0 else torch.device("cpu")
device_dtype = torch.half if NUM_GPUS > 0 else torch.float

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)

# int8 quantization by default; change .quantize(8) to .quantize(4) for int4,
# or drop the .quantize(8) call entirely for native (fp16) deployment
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=device_dtype, trust_remote_code=True).quantize(8)
if device_map is not None:
    model = dispatch_model(model, device_map=device_map)
model = model.eval()

Thanks to everyone in #50 for sharing this valuable approach.
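As a quick sanity check, the layer partition that `auto_configure_device_map` produces can be inspected in pure Python, without loading the model or needing any GPUs. This sketch (the example values below are my own, not from the original post) shows that with 2 GPUs the 32 layers split 16/16:

```python
def auto_configure_device_map(num_gpus: int):
    # Same logic as above: 32 transformer layers split evenly across GPUs,
    # embedding on GPU 0, final norm and LM head on the last GPU.
    num_trans_layers = 32
    per_gpu_layers = num_trans_layers / num_gpus
    device_map = {'model.embed_tokens': 0,
                  'model.norm': num_gpus - 1, 'lm_head': num_gpus - 1}
    for i in range(num_trans_layers):
        device_map[f'model.layers.{i}'] = int(i // per_gpu_layers)
    return device_map

dm = auto_configure_device_map(2)
print(dm['model.layers.0'])   # 0  (first half on GPU 0)
print(dm['model.layers.15'])  # 0
print(dm['model.layers.16'])  # 1  (second half on GPU 1)
print(dm['lm_head'])          # 1
```

Note that with a GPU count that does not divide 32 evenly (e.g. 3), the integer floor division still assigns every layer to a valid device index in `[0, num_gpus - 1]`.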

Checklist

  • I have provided all relevant and necessary information above.
  • I have chosen a suitable title for this issue.
@potong potong added the question Further information is requested label Aug 29, 2023