[Performance 4/6] Precompute is_sdxl_inpaint flag #15806

Open: wants to merge 3 commits into base: dev

huchenlei
Contributor

Description

According to lllyasviel/stable-diffusion-webui-forge#716 (comment), the check for whether the model is an SDXL inpaint model calls state_dict on every sampling step. state_dict is a very expensive function that costs ~40ms. This overhead applies to all inference regardless of model type, which is dumb.

This PR precomputes is_sdxl_inpaint flag so that we do not call state_dict on every sampling step.
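A minimal sketch of the idea, not the code in this PR: `ToyModel`, `detect_inpaint`, `load_model`, and both sampling functions are hypothetical stand-ins, and the weight key and the "9 input channels means inpaint" test are assumptions made for illustration. The toy `state_dict` only counts its calls so the difference between checking per step and checking once at load time is visible.

```python
# Assumed key of the first input convolution; illustrative only.
INPUT_WEIGHT_KEY = "diffusion_model.input_blocks.0.0.weight"


class ToyModel:
    """Stand-in for a loaded model; not the webui's real class."""

    def __init__(self, in_channels):
        # Store only the weight shape; a real model would hold tensors.
        self._shapes = {INPUT_WEIGHT_KEY: (320, in_channels, 3, 3)}
        self.state_dict_calls = 0
        self.is_sdxl_inpaint = False

    def state_dict(self):
        # A real state_dict rebuilds a full mapping on every call,
        # which is the expensive part. Here we just count the calls.
        self.state_dict_calls += 1
        return dict(self._shapes)


def detect_inpaint(model):
    # Assumption: an inpaint model takes 9 input channels.
    shape = model.state_dict().get(INPUT_WEIGHT_KEY)
    return shape is not None and shape[1] == 9


def sample_checking_every_step(model, steps):
    # Before: the check (and state_dict) runs once per sampling step.
    hits = 0
    for _ in range(steps):
        if detect_inpaint(model):
            hits += 1
    return hits


def load_model(in_channels):
    # After: compute the flag once, when the model is loaded.
    model = ToyModel(in_channels)
    model.is_sdxl_inpaint = detect_inpaint(model)
    return model


def sample_with_precomputed_flag(model, steps):
    # The sampling loop only reads the cached attribute.
    hits = 0
    for _ in range(steps):
        if model.is_sdxl_inpaint:
            hits += 1
    return hits
```

With 20 steps, the first variant calls `state_dict` 20 times, while the second calls it once during loading and never inside the loop.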

Original PR that introduced this check: #14390

Screenshots/videos:

image

@huchenlei huchenlei changed the title Precompute is_sdxl_inpaint flag [Performance 4/6] Precompute is_sdxl_inpaint flag May 15, 2024
@huchenlei huchenlei changed the base branch from master to dev May 15, 2024 20:50
@Panchovix

Panchovix commented May 16, 2024

Just wanted to comment that all these performance PRs are amazing! I get pretty similar speeds vs Forge on an RTX 4090. (It seems that A1111 with these PRs actually generates a tad faster than Forge, but it takes a bit more time to start generating.)

@huchenlei
Contributor Author

> Just wanted to comment that all these performance PRs are amazing! I get pretty similar speeds vs Forge on an RTX 4090. (It seems that A1111 with these PRs actually generates a tad faster than Forge, but it takes a bit more time to start generating.)

There are 2 more PRs to come, but they are not as straightforward, so they might take longer to prepare. I am also merging all the performance fixes into https://github.com/huchenlei/stable-diffusion-webui/tree/all_perf so you don't need to apply these PRs one by one.
