
PLEASE HELP ME - OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB. GPU #113

Open · codeolder opened this issue May 5, 2024 · 5 comments


@codeolder

Attachments: Untitled, Code.txt
BasicTransformerBlock is using checkpointing
Loaded model config from [options/SUPIR_v0.yaml]
Loaded state_dict from [/opt/data/private/AIGC_pretrain/SDXL_cache/sd_xl_base_1.0_0.9vae.safetensors]
Loaded state_dict from [/opt/data/private/AIGC_pretrain/SUPIR_cache/SUPIR-v0Q.ckpt]
Loading vision tower: openai/clip-vit-large-patch14-336
Loading checkpoint shards: 67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 2/3 [00:23<00:11, 11.70s/it]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\PC\Downloads\SUPIR\test.py:72 in │
│ │
│ 69 model = model.to(SUPIR_device) │
│ 70 # load LLaVA │
│ 71 if use_llava: │
│ ❱ 72 │ llava_agent = LLavaAgent(LLAVA_MODEL_PATH, device=LLaVA_device, load_8bit=args.load_ │
│ 73 else: │
│ 74 │ llava_agent = None │
│ 75 │
│ │
│ C:\Users\PC\Downloads\SUPIR\llava\llava_agent.py:27 in __init__ │
│ │
│ 24 │ │ │ device_map = 'auto' │
│ 25 │ │ model_path = os.path.expanduser(model_path) │
│ 26 │ │ model_name = get_model_name_from_path(model_path) │
│ ❱ 27 │ │ tokenizer, model, image_processor, context_len = load_pretrained_model( │
│ 28 │ │ │ model_path, None, model_name, device=self.device, device_map=device_map, │
│ 29 │ │ │ load_8bit=load_8bit, load_4bit=load_4bit) │
│ 30 │ │ self.model = model │
│ │
│ C:\Users\PC\Downloads\SUPIR\llava\model\builder.py:103 in load_pretrained_model │
│ │
│ 100 │ │ │ │ model = LlavaMPTForCausalLM.from_pretrained(model_path, low_cpu_mem_usag │
│ 101 │ │ │ else: │
│ 102 │ │ │ │ tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False) │
│ ❱ 103 │ │ │ │ model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_us │
│ 104 │ else: │
│ 105 │ │ # Load language model │
│ 106 │ │ if model_base is not None: │
│ │
│ C:\Users\PC\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\modeling_util │
│ s.py:2795 in from_pretrained │
│ │
│ 2792 │ │ │ │ mismatched_keys, │
│ 2793 │ │ │ │ offload_index, │
│ 2794 │ │ │ │ error_msgs, │
│ ❱ 2795 │ │ │ ) = cls._load_pretrained_model( │
│ 2796 │ │ │ │ model, │
│ 2797 │ │ │ │ state_dict, │
│ 2798 │ │ │ │ loaded_state_dict_keys, # XXX: rename? │
│ │
│ C:\Users\PC\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\modeling_util │
│ s.py:3123 in _load_pretrained_model │
│ │
│ 3120 │ │ │ │ ) │
│ 3121 │ │ │ │ │
│ 3122 │ │ │ │ if low_cpu_mem_usage: │
│ ❱ 3123 │ │ │ │ │ new_error_msgs, offload_index, state_dict_index = _load_state_dict_i │
│ 3124 │ │ │ │ │ │ model_to_load, │
│ 3125 │ │ │ │ │ │ state_dict, │
│ 3126 │ │ │ │ │ │ loaded_keys, │
│ │
│ C:\Users\PC\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\modeling_util │
│ s.py:698 in _load_state_dict_into_meta_model │
│ │
│ 695 │ │ │ state_dict_index = offload_weight(param, param_name, state_dict_folder, stat │
│ 696 │ │ elif not load_in_8bit: │
│ 697 │ │ │ # For backward compatibility with older versions of accelerate │
│ ❱ 698 │ │ │ set_module_tensor_to_device(model, param_name, param_device, **set_module_kw │
│ 699 │ │ else: │
│ 700 │ │ │ if param.dtype == torch.int8 and param_name.replace("weight", "SCB") in stat │
│ 701 │ │ │ │ fp16_statistics = state_dict[param_name.replace("weight", "SCB")] │
│ │
│ C:\Users\PC\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\utils\modeling. │
│ py:149 in set_module_tensor_to_device │
│ │
│ 146 │ │ if value is None: │
│ 147 │ │ │ new_value = old_value.to(device) │
│ 148 │ │ elif isinstance(value, torch.Tensor): │
│ ❱ 149 │ │ │ new_value = value.to(device) │
│ 150 │ │ else: │
│ 151 │ │ │ new_value = torch.tensor(value, device=device) │
│ 152 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB. GPU
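
Note that the failing allocation is tiny (50 MiB), which usually means the card is already nearly full from the SUPIR weights before the LLaVA checkpoint shards finish loading. A quick way to confirm is sketched below in plain PyTorch; it is independent of SUPIR, and device index 0 is an assumption:

import torch

def report_vram(device: int = 0) -> None:
    # Compare the card's total memory against what the CUDA allocator
    # already holds just before LLavaAgent is constructed.
    props = torch.cuda.get_device_properties(device)
    gib = 2 ** 30
    print(f"{props.name}: total {props.total_memory / gib:.1f} GiB, "
          f"reserved {torch.cuda.memory_reserved(device) / gib:.1f} GiB, "
          f"allocated {torch.cuda.memory_allocated(device) / gib:.1f} GiB")

report_vram()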

@FurkanGozukara

What GPU do you have? We have an auto installer and a version that works on as little as 8 GB of VRAM (FP8 + tiled VAE + CPU offloading).
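
For illustration, here is a minimal sketch of the CPU-offloading part of that setup, reusing the variable names visible in the test.py excerpt in the traceback above (SUPIR_device, LLaVA_device, LLavaAgent); the auto installer's actual code and flags may differ:

# Keep SUPIR on the GPU and run the LLaVA captioner on the CPU so the two
# models never have to fit in 8 GB of VRAM at the same time. This is a
# sketch under stated assumptions, not the installer's exact code.
SUPIR_device = 'cuda:0'
LLaVA_device = 'cpu'  # offload the captioner to system RAM

model = model.to(SUPIR_device)
if use_llava:
    # bitsandbytes 8-bit/4-bit quantization requires CUDA, so leave both
    # off when the captioner runs on the CPU
    llava_agent = LLavaAgent(LLAVA_MODEL_PATH, device=LLaVA_device,
                             load_8bit=False, load_4bit=False)
else:
    llava_agent = None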

@codeolder (Author)

The GPU I use is an RTX 4060. Can you please provide that auto installer?
I have installed this project many times and it has always failed.

@FurkanGozukara

The GPU I use is an RTX 4060. Can you please provide that auto installer? I have installed this project many times and it has always failed.

Here is our video; it would work great on your GPU:

https://youtu.be/OYxVEvDf284

@codeolder (Author)

I tried registering an account and following the link mentioned in the video. Can you send me the file, or does this cost money?

@yuanzhi-zhu

@codeolder On a 40 GB A100, I made it work by setting load_4bit=True in test.py.
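
Concretely, that edit in test.py would look roughly like the following; the keyword names match the LLavaAgent call shown in the traceback, and load_4bit relies on bitsandbytes quantization, which needs a CUDA device:

# Sketch of the suggested change: load the LLaVA weights 4-bit quantized
# (roughly a quarter of the fp16 footprint) instead of in full precision.
if use_llava:
    llava_agent = LLavaAgent(LLAVA_MODEL_PATH, device=LLaVA_device,
                             load_8bit=False, load_4bit=True)
else:
    llava_agent = None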
