Inference produces broken images or videos #369

HioZx · 2024-05-06T08:52:23Z

They all look like this, and I don't know what is going on.

JamesTensor · 2024-05-07T06:53:43Z

me too

zhengzangw · 2024-05-09T09:14:37Z

This is not expected. Please pull the latest repo and run it again. If the problem persists, please provide your running command and environment so that we can investigate.
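One way to gather the environment part of such a report is sketched below. It is only an illustration: the helper name `collect_versions` and the choice of packages are assumptions rather than anything the maintainers asked for, and the script relies solely on the standard library.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages):
    """Map each distribution name to its installed version string."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The package is absent from the current environment.
            report[name] = "not installed"
    return report


# Assumed selection of packages that seem relevant to inference.
PACKAGES = ["torch", "xformers", "flash-attn", "colossalai", "diffusers", "transformers"]

for name, ver in collect_versions(PACKAGES).items():
    print(f"{name} {ver}")
```

Pasting the printed lines together with the exact command is usually shorter than a full `pip list` dump.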

HioZx · 2024-05-09T11:37:03Z

command: python scripts/inference.py configs/opensora-v1-1/inference/sample.py --ckpt-path /home/hio/code/STDiT2/model.safetensors --prompt "A beautiful sunset over the city" --num-frames 1 --image-size 512 512

environment:
absl-py 2.1.0
accelerate 0.29.1
addict 2.4.0
aiosignal 1.3.1
annotated-types 0.6.0
anykeystore 0.2
appdirs 1.4.4
attrs 23.2.0
bcrypt 4.1.2
beartype 0.18.5
beautifulsoup4 4.12.3
certifi 2022.12.7
cffi 1.16.0
cfgv 3.4.0
charset-normalizer 2.1.1
click 8.1.7
cmake 3.25.0
colossalai 0.3.6
contexttimer 0.3.3
contourpy 1.2.1
cryptacular 1.6.2
cryptography 42.0.5
cycler 0.12.1
decorator 5.1.1
defusedxml 0.7.1
Deprecated 1.2.14
diffusers 0.27.2
dill 0.3.8
distlib 0.3.8
docker-pycreds 0.4.0
einops 0.7.0
fabric 3.2.2
filelock 3.13.3
flash-attn 2.5.8
fonttools 4.51.0
frozenlist 1.4.1
fsspec 2024.3.1
ftfy 6.2.0
gdown 5.1.0
gitdb 4.0.11
GitPython 3.1.43
google 3.0.0
greenlet 3.0.3
grpcio 1.62.1
huggingface-hub 0.22.2
hupper 1.12.1
identify 2.5.35
idna 3.4
importlib_metadata 7.1.0
invoke 2.2.0
Jinja2 3.1.2
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lit 15.0.7
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.8.4
mdurl 0.1.2
mmengine 0.10.3
mpmath 1.3.0
msgpack 1.0.8
networkx 3.2.1
ninja 1.11.1.1
nodeenv 1.8.0
numpy 1.26.3
nvidia-cublas-cu11 11.11.3.6
nvidia-cuda-cupti-cu11 11.8.87
nvidia-cuda-nvrtc-cu11 11.8.89
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cudnn-cu11 8.7.0.84
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.3.0.86
nvidia-cusolver-cu11 11.4.1.48
nvidia-cusparse-cu11 11.7.5.86
nvidia-nccl-cu11 2.19.3
nvidia-nvtx-cu11 11.8.86
oauthlib 3.2.2
opencv-python 4.9.0.80
opensora 1.1.0
packaging 24.0
pandarallel 1.6.5
pandas 2.2.2
paramiko 3.4.0
PasteDeploy 3.1.0
pbkdf2 1.3
pillow 10.2.0
pip 22.3.1
plaster 1.1.2
plaster-pastedeploy 1.0.1
platformdirs 4.2.0
pre-commit 3.7.0
protobuf 4.25.3
psutil 5.9.8
pyarrow 16.0.0
pyav 12.0.5
pycparser 2.22
pydantic 2.6.4
pydantic_core 2.16.3
Pygments 2.17.2
PyNaCl 1.5.0
pyparsing 3.1.2
pyramid 2.0.2
pyramid-mailer 0.15.1
PySocks 1.7.1
python-dateutil 2.9.0.post0
python3-openid 3.2.0
pytz 2024.1
PyYAML 6.0.1
ray 2.10.0
referencing 0.34.0
regex 2023.12.25
repoze.sendmail 4.4.1
requests 2.28.1
requests-oauthlib 2.0.0
rich 13.7.1
rotary-embedding-torch 0.5.3
rpds-py 0.18.0
safetensors 0.4.2
sentencepiece 0.2.0
sentry-sdk 1.44.1
setproctitle 1.3.3
setuptools 65.5.1
six 1.16.0
smmap 5.0.1
soupsieve 2.5
SQLAlchemy 2.0.29
sympy 1.12
tensorboard 2.16.2
tensorboard-data-server 0.7.2
termcolor 2.4.0
timm 0.9.16
tokenizers 0.15.2
tomli 2.0.1
torch 2.2.2+cu118
torchaudio 2.2.2+cu118
torchvision 0.17.2+cu118
tqdm 4.66.2
transaction 4.0
transformers 4.39.3
translationstring 1.4
triton 2.2.0
typing_extensions 4.8.0
tzdata 2024.1
urllib3 1.26.13
velruse 1.1.1
venusian 3.1.0
virtualenv 20.25.1
wandb 0.16.6
wcwidth 0.2.13
WebOb 1.8.7
Werkzeug 3.0.2
wheel 0.38.4
wrapt 1.16.0
WTForms 3.1.2
wtforms-recaptcha 0.3.2
xformers 0.0.25.post1+cu118
yapf 0.40.2
zipp 3.18.1
zope.deprecation 5.0
zope.interface 6.2
zope.sqlalchemy 3.1

I changed two parameters, enable_flashattn and enable_layernorm_kernel, setting both to False.
configs/opensora-v1-1/inference/sample.py:
```python
num_frames = 16
frame_interval = 3
fps = 24
image_size = (240, 426)
multi_resolution = "STDiT2"

model = dict(
    type="STDiT2-XL/2",
    from_pretrained=None,
    input_sq_size=512,
    qk_norm=True,
    enable_flashattn=False,
    enable_layernorm_kernel=False,
)
vae = dict(
    type="VideoAutoencoderKL",
    from_pretrained="/home/hio/code/sd-vae-ft-ema",
    # cache_dir=None, # "/mnt/hdd/cached_models",
    micro_batch_size=4,
)
text_encoder = dict(
    type="t5",
    from_pretrained="/home/hio/code/t5-v1_1-xxl",
    # cache_dir=None, # "/mnt/hdd/cached_models",
    model_max_length=200,
)
scheduler = dict(
    type="iddpm",
    num_sampling_steps=100,
    cfg_scale=7.0,
    cfg_channel=3,  # or None
)
dtype = "fp16"

prompt_path = "./assets/texts/t2v_samples.txt"
prompt = None  # prompt has higher priority than prompt_path

batch_size = 1
seed = 42
save_dir = "./samples/samples/"
```

zhengzangw · 2024-05-09T12:31:50Z

I think you have not pulled the latest version, because the latest version uses enable_flash_attn instead of enable_flashattn.

Besides, we do not use --ckpt-path /home/hio/code/STDiT2/model.safetensors. You can try omitting --ckpt-path, since our code now supports automatic downloading.
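A quick way to tell whether a local config still carries the old key is sketched below. The helper `find_legacy_keys` and the `LEGACY_KEYS` table are hypothetical and not part of Open-Sora; the only fact taken from this thread is the rename from `enable_flashattn` to `enable_flash_attn`.

```python
# Hypothetical helper, not part of Open-Sora.
# Old key name -> new key name, as mentioned in this thread.
LEGACY_KEYS = {"enable_flashattn": "enable_flash_attn"}


def find_legacy_keys(model_cfg):
    """Return {old_key: new_key} for every legacy key present in model_cfg."""
    return {old: new for old, new in LEGACY_KEYS.items() if old in model_cfg}


# A trimmed copy of the model dict posted above.
model = dict(
    type="STDiT2-XL/2",
    enable_flashattn=False,
    enable_layernorm_kernel=False,
)

for old, new in find_legacy_keys(model).items():
    print(f"config uses '{old}'; the latest version expects '{new}'")
```

If the loop prints anything, the config was probably written against an older checkout.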

buxianggaimingzi · 2024-05-10T03:24:38Z

By the way, enable_flashattn in the training config has not been changed to enable_flash_attn in the latest version, which may cause OOM during training.

zhengzangw · 2024-05-10T05:10:52Z

Open-Sora/configs/opensora-v1-1/inference/sample.py

Line 13 in c6cc021

enable_flash_attn=True,

But here we do use enable_flash_attn.

buxianggaimingzi · 2024-05-10T06:20:18Z

There may be some misunderstanding. I mean the training configuration has not been modified yet:

Open-Sora/configs/opensora-v1-1/train/stage3.py

Line 46 in c6cc021

enable_flashattn=True,

There is no problem with the inference configuration. It’s just that I saw the field enable_flash_attn mentioned here, so I mentioned it by the way.
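To find any remaining config files that still use the old spelling, a small scan along these lines might help. It is a sketch only: `find_legacy_lines` and `scan_configs` are hypothetical names, and scanning every `.py` file under a root directory is an assumption about how the configs are laid out.

```python
import re
from pathlib import Path

# Matches the old key only; "enable_flash_attn" does not match this pattern.
LEGACY_PATTERN = re.compile(r"\benable_flashattn\b")


def find_legacy_lines(text):
    """Return (line_number, stripped_line) pairs that mention the old key."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if LEGACY_PATTERN.search(line)
    ]


def scan_configs(root):
    """Scan every .py file under root and collect the lines using the old key."""
    hits = {}
    for path in Path(root).rglob("*.py"):
        found = find_legacy_lines(path.read_text(encoding="utf-8"))
        if found:
            hits[str(path)] = found
    return hits
```

Calling `scan_configs("configs")` from the repository root would return a mapping from file path to the offending lines, or an empty dict if nothing is left to rename.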

HioZx · 2024-05-10T08:06:20Z

I pulled the latest repo and got the same result. I didn't install apex, so enable_flash_attn and enable_layernorm_kernel were disabled, but that shouldn't cause the failure.
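For reference, the two flags could be derived from what is actually installed rather than edited by hand. The sketch below assumes that `enable_flash_attn` depends on the `flash_attn` package and `enable_layernorm_kernel` on `apex`; that mapping is an assumption based on this thread, and the helper names are hypothetical.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


def optional_kernel_flags():
    """Build the two optional flags from the packages that are available."""
    return dict(
        enable_flash_attn=is_installed("flash_attn"),  # assumed dependency
        enable_layernorm_kernel=is_installed("apex"),  # assumed dependency
    )


print(optional_kernel_flags())
```

The resulting dict could be merged into the `model` config, so that a missing package only turns off the matching optimization.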

xunshengliuyin · 2024-05-15T07:44:25Z

me too

zhengzangw added the question Further information is requested label May 9, 2024

zhengzangw mentioned this issue May 10, 2024

change flashattn to flash_attn #387

Merged

zhengzangw closed this as completed in #387 May 10, 2024
