Training questions #44
Comments
Hello, we used gradient checkpointing and 8-bit Adam for training, which let us fit a batch size of 6 on a single A100 GPU.
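For a rough sense of why 8-bit Adam frees up memory, here is a back-of-the-envelope sketch. It is not taken from the authors' code. The parameter count (about 2.6 billion, roughly one SDXL UNet) is an assumption, and the estimate ignores the small quantization statistics that 8-bit optimizers store alongside their states.

```python
# Rough Adam optimizer-state memory estimate (illustrative only).
# ASSUMPTION: ~2.6e9 trainable parameters, roughly one SDXL UNet.
TRAINABLE_PARAMS = 2.6e9


def adam_state_gib(num_params: float, bytes_per_state: int) -> float:
    """Memory for Adam's two moment estimates per parameter, in GiB."""
    return num_params * 2 * bytes_per_state / 2**30


fp32_state = adam_state_gib(TRAINABLE_PARAMS, 4)  # 32-bit optimizer states
int8_state = adam_state_gib(TRAINABLE_PARAMS, 1)  # 8-bit optimizer states

print(f"32-bit Adam states: {fp32_state:.1f} GiB")
print(f"8-bit Adam states:  {int8_state:.1f} GiB")
```

Under these assumptions the optimizer states shrink by a factor of four. Gradient checkpointing is a separate saving: it reduces activation memory by recomputing activations during the backward pass.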
Thanks @yisol. Are you perhaps not doing EMA? Also, if you could share a rough work-in-progress training script here, that'd be really helpful - just to get a better understanding of the differences from mine. It doesn't have to be a working script.
Did you use noise_offset or snr_gamma (=5) during training?
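For context on what snr_gamma controls, here is a minimal sketch of Min-SNR loss weighting. It assumes epsilon-prediction and is not the authors' implementation. The function name `min_snr_weight` is hypothetical; `snr` is the signal-to-noise ratio at a given timestep, alpha_bar / (1 - alpha_bar).

```python
def min_snr_weight(snr: float, gamma: float = 5.0) -> float:
    """Per-timestep loss weight min(snr, gamma) / snr (epsilon-prediction).

    Hypothetical helper for illustration. High-SNR (low-noise) timesteps
    are down-weighted, while timesteps with snr <= gamma keep weight 1.
    """
    if snr <= 0:
        raise ValueError("snr must be positive")
    return min(snr, gamma) / snr


# Example: a low-noise timestep (snr = 10) and a noisier one (snr = 2).
low_noise_weight = min_snr_weight(10.0)
high_noise_weight = min_snr_weight(2.0)
```

With gamma = 5, any timestep whose SNR exceeds 5 contributes less to the loss, which is the effect the question is asking about.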
Hey @nom, I am trying to replicate the training. It would be great if you could share a glimpse of your script; even a rough idea of the approach would help.
@nom, could you share the fine-tuning code with me?
Hey, great work! Quick question on training.
I was wondering how you're fitting two SDXL UNets (garment UNet and tryon UNet) on a single A800 with a per-GPU batch size of 24/4 = 6 (assuming 4x A800 in total). I see you're using FP16 models, but are you doing any optimizations to bring memory down, like precomputing embeddings/features, 8-bit Adam, or gradient accumulation? I'm trying to reproduce training, but I can only fit 3 samples at 1024x768 resolution in 80 GB of VRAM, and a single training step takes ~1.3 seconds on an H100. I'm already using the tricks above (8-bit Adam, precomputed VAE embeddings, frozen garment UNet).
Also curious about training speed if you can share. Thanks!
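The numbers in the question can be made concrete with a little arithmetic. This sketch uses only the figures quoted above (global batch 24, an assumed 4 GPUs, 3 samples per step at about 1.3 s per step); the helper names are hypothetical.

```python
import math


def per_gpu_batch(global_batch: int, num_gpus: int) -> int:
    """Samples each GPU processes per optimizer step."""
    return global_batch // num_gpus


def accumulation_steps(global_batch: int, micro_batch: int, num_gpus: int) -> int:
    """Gradient-accumulation steps needed to reach the global batch size."""
    return math.ceil(global_batch / (micro_batch * num_gpus))


# Figures quoted in the question; the 4-GPU count is the asker's assumption.
batch_per_gpu = per_gpu_batch(24, 4)
steps_single_gpu = accumulation_steps(24, micro_batch=3, num_gpus=1)
samples_per_second = 3 / 1.3

print(f"per-GPU batch: {batch_per_gpu}")
print(f"accumulation steps on one GPU: {steps_single_gpu}")
print(f"throughput: {samples_per_second:.2f} samples/s")
```

On a single GPU with 3 samples per step, matching a global batch of 24 would take several accumulation steps per optimizer update, which is why per-step speed matters for reproducing the run.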