G3AN: Disentangling Appearance and Motion for Video Generation

Project Page | Paper

This is the official PyTorch implementation of the CVPR 2020 paper "G3AN: Disentangling Appearance and Motion for Video Generation".

Requirements

Python 3.6
CUDA 9.2
cuDNN 7.1
PyTorch 1.4+
scikit-video
tensorboard
moviepy
PyAV
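
To check quickly whether the Python packages listed above are importable, a small script along these lines may help. It is only a sketch: the import names (`torch`, `skvideo`, `tensorboard`, `moviepy`, `av`) are the usual module names for these packages, and the helper `missing_packages` is a hypothetical name that is not part of this repository. It does not check versions, CUDA, or cuDNN.

```python
import importlib.util

# Display name -> import name. These mappings are the usual ones for each
# package; they are not taken from this repository.
REQUIRED = {
    "PyTorch": "torch",
    "scikit-video": "skvideo",
    "tensorboard": "tensorboard",
    "moviepy": "moviepy",
    "PyAV": "av",
}


def missing_packages(required=REQUIRED):
    """Return the display names of packages whose module cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed packages are importable.")
```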

Dataset

You can download the original UvA-NEMO dataset from https://www.uva-nemo.org/ and use https://github.com/1adrianb/face-alignment to crop face regions. We also provide our preprocessed version here.
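
If you crop the faces yourself, one possible way to turn detected landmarks into a crop region is sketched below. This is an illustration only, not the preprocessing used for the provided data: it assumes you already have a list of `(x, y)` landmark coordinates for a frame (for example from a face-alignment tool), and the function name `square_crop_box` and the `margin` value are hypothetical.

```python
def square_crop_box(landmarks, frame_w, frame_h, margin=0.25):
    """Compute a face crop box (left, top, right, bottom) from 2D landmarks.

    The box is centred on the landmark bounding box and enlarged by `margin`.
    It is square unless clamping to the frame borders cuts it.
    """
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]

    # Centre of the landmark bounding box.
    cx = (min(xs) + max(xs)) / 2.0
    cy = (min(ys) + max(ys)) / 2.0

    # Half side length: the larger extent, enlarged by the margin.
    extent = max(max(xs) - min(xs), max(ys) - min(ys))
    half = extent * (1.0 + margin) / 2.0

    # Clamp to the frame so the crop never leaves the image.
    left = max(0, int(round(cx - half)))
    top = max(0, int(round(cy - half)))
    right = min(frame_w, int(round(cx + half)))
    bottom = min(frame_h, int(round(cy + half)))
    return left, top, right, bottom


# Example with three made-up landmark points in a 200x200 frame.
print(square_crop_box([(40, 50), (80, 50), (60, 90)], 200, 200))
```

Using one box per video (for instance, computed from the first frame) rather than one per frame avoids jitter in the cropped clip; whether that matches the provided preprocessed data is not stated here.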

Pretrained model

Download the G3AN pretrained model on UvA-NEMO from here.

Inference

For sampling NUM videos and saving them under ./demos/EXP_NAME

python demo_random.py --model_path $MODEL_PATH --n $NUM --demo_name $EXP_NAME

For sampling N appearances with M motions and saving them under ./demos/EXP_NAME

python demo_nxm.py --model_path $MODEL_PATH --n_za_test $N --n_zm_test $M --demo_name $EXP_NAME
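
The idea behind the N x M demo is that every appearance code is combined with every motion code, so each row of the result shares an identity and each column shares a motion. The sketch below shows only that pairing logic with the standard library. It is not the code in demo_nxm.py: the function names and the latent dimensions (`dim_za`, `dim_zm`) are placeholders chosen for illustration.

```python
import itertools
import random


def sample_codes(n, dim, rng):
    """Draw n latent codes of length dim from a standard normal distribution."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


def nxm_pairs(n_za, n_zm, dim_za=128, dim_zm=10, seed=0):
    """Pair every appearance code za[i] with every motion code zm[j].

    Returns a list of (i, j, za_i, zm_j) tuples in row-major order, so the
    N*M generated videos can be laid out as an N x M grid.
    The dimensions are illustrative defaults, not values read from the model.
    """
    rng = random.Random(seed)
    za = sample_codes(n_za, dim_za, rng)  # appearance codes
    zm = sample_codes(n_zm, dim_zm, rng)  # motion codes
    return [
        (i, j, za[i], zm[j])
        for i, j in itertools.product(range(n_za), range(n_zm))
    ]


pairs = nxm_pairs(3, 4)
print(len(pairs))  # 3 appearances x 4 motions = 12 combinations
```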

For sampling N appearances with different video lengths (9 different video lengths) and saving them under ./demos/EXP_NAME

python demo_multilength.py --model_path $MODEL_PATH --n_za_test $N --demo_name $EXP_NAME

Training

python train.py --data_path $DATASET --exp_name $EXP_NAME

Evaluation

Generate 5000 videos for evaluation and save them in $GEN_PATH

python generate_videos.py --gen_path $GEN_PATH

Move into the evaluation folder

cd evaluation

Download the feature extractor resnext-101-kinetics.pth from here to the current folder. Pre-computed UvA-NEMO dataset stats can be found in stats/uva.npz. If you would like to compute them yourself, save all the training videos in $UVA_PATH and run the following (to obtain 64x64 videos, you need to specify the output size when using ffmpeg),

python precalc_stats.py --data_path $UVA_PATH
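
For the ffmpeg resizing step mentioned above, a minimal sketch is given below. It assumes ffmpeg is on your PATH and that the input videos are already square face crops (the `scale` filter stretches non-square input). The helper names `build_resize_cmd` and `resize_video` are hypothetical and not part of this repository.

```python
import subprocess


def build_resize_cmd(src, dst, size=64):
    """Build an ffmpeg command that rescales a video to size x size pixels."""
    return [
        "ffmpeg",
        "-y",                          # overwrite dst if it already exists
        "-i", src,                     # input video
        "-vf", f"scale={size}:{size}",  # output resolution
        dst,
    ]


def resize_video(src, dst, size=64):
    """Run ffmpeg and raise CalledProcessError if it fails."""
    subprocess.run(build_resize_cmd(src, dst, size), check=True)


# Print the command instead of running it, so the example works without ffmpeg.
print(" ".join(build_resize_cmd("input.mp4", "output_64.mp4")))
```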

To compute FID

python fid.py $GEN_PATH stats/uva_64.npz
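
For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to feature activations, here presumably those of the ResNeXt-101 extractor. Writing $(\mu_g, \Sigma_g)$ for the mean and covariance of the generated-video features and $(\mu_r, \Sigma_r)$ for those of the real videos (the notation is ours, and we assume fid.py follows this standard definition):

```latex
\mathrm{FID} = \lVert \mu_g - \mu_r \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_g + \Sigma_r - 2\,(\Sigma_g \Sigma_r)^{1/2} \right)
```

Lower values mean the generated feature statistics are closer to those of the real data.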

I have provided npz files for both 64 and 128 resolutions. You can obtain an FID of around 60 (64x64) and 130 (128x128) by evaluating the provided model. Here I improved the original video discriminator by using a (2+1)D ConvNet instead of a 3D ConvNet.
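
To illustrate the difference between the two discriminator designs: a (2+1)D convolution replaces one t x d x d 3D kernel with a 1 x d x d spatial convolution followed by a t x 1 x 1 temporal convolution. The sketch below only compares weight counts for a single layer. The channel sizes and the intermediate width `mid` are made-up example values, biases are ignored, and none of this is read from the discriminator in this repository.

```python
def conv3d_params(c_in, c_out, t, d):
    """Weights in a full 3D convolution with a t x d x d kernel (no bias)."""
    return c_in * c_out * t * d * d


def conv2plus1d_params(c_in, c_out, t, d, mid):
    """Weights in a (2+1)D block (no bias).

    Spatial step:  1 x d x d convolution, c_in -> mid channels.
    Temporal step: t x 1 x 1 convolution, mid -> c_out channels.
    """
    spatial = c_in * mid * d * d
    temporal = mid * c_out * t
    return spatial + temporal


# Example layer: 64 -> 64 channels, 3x3x3 kernel, 64 intermediate channels.
print(conv3d_params(64, 64, 3, 3))           # 110592
print(conv2plus1d_params(64, 64, 3, 3, 64))  # 49152
```

With these example values the factorized block uses fewer weights, and it places an extra nonlinearity between the spatial and temporal steps when an activation is inserted there.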

TODOs

Unconditional Generation
Evaluation
Conditional Generation

Citation

If you find this code useful for your research, please consider citing our paper:

@InProceedings{Wang_2020_CVPR,
    author = {Wang, Yaohui and Bilinski, Piotr and Bremond, Francois and Dantcheva, Antitza},
    title = {{G3AN}: Disentangling Appearance and Motion for Video Generation},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2020}
}

Acknowledgement

Part of the evaluation code is adapted from evan. I have moved most of the operations from the CPU to the GPU to accelerate the computation. We thank the authors for their contribution to the community.
