Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization

This repo includes the authors' PyTorch implementation of the paper, presented at Computer Vision and Pattern Recognition (CVPR) 2023.

[arxiv]

Introduction

In this work, we take a deep look into the generalization ability of binary classifiers for the task of deepfake detection. Specifically,

  • We discover that deepfake detection models supervised only by binary labels are highly sensitive to the identity information in the images, a phenomenon termed Implicit Identity Leakage in the paper.

  • Based on our analyses, we propose a simple yet effective method, termed the ID-unaware Deepfake Detection Model, to reduce the influence of the ID representation, successfully outperforming other state-of-the-art methods.


Updates

  • [03/2023] Released the training and test code for our model.
  • [03/2023] Released the pretrained weights.

Dependencies

  • Python >= 3.6
  • PyTorch >= 1.6.0
  • OpenCV >= 4.4.0
  • Scipy >= 1.4.1
  • NumPy >= 1.19.5
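
A quick way to confirm these versions in your environment (a minimal check, assuming the packages are already installed; the cv2 module is provided by opencv-python):

import sys
import cv2
import numpy
import scipy
import torch

# Print the interpreter and package versions to compare against the list above.
print("Python  :", sys.version.split()[0])
print("PyTorch :", torch.__version__)
print("OpenCV  :", cv2.__version__)
print("SciPy   :", scipy.__version__)
print("NumPy   :", numpy.__version__)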

Data Preparation

Take FF++ as an example:
  1. Download the dataset from FF++ and put it under ./data.
.
└── data
    └── FaceForensics++
        ├── original_sequences
        │   └── youtube
        │       └── raw
        │           └── videos
        │               └── *.mp4
        └── manipulated_sequences
            ├── Deepfakes
            │   └── raw
            │       └── videos
            │           └── *.mp4
            ├── Face2Face
            │   ...
            ├── FaceSwap
            │   ...
            ├── NeuralTextures
            │   ...
            └── FaceShifter
                ...
  2. Download the landmark detector from here and put it in the folder ./lib.

  3. Run the code below to extract frames from the FF++ videos and save them under ./train_images or ./test_images according to the train/test split of the original dataset (a simplified sketch of the frame-sampling step follows after this list).

     python3 lib/extract_frames_ldm_ff++.py
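
A minimal sketch of the frame-sampling part of this step, assuming only OpenCV; the repository's script also relies on the landmark detector from step 2 to locate and crop faces, which this sketch omits, and the paths and sampling interval below are illustrative:

import os
import cv2

def sample_frames(video_path, out_dir, every_n=10):
    # Save every n-th frame of a video as a PNG image (n is illustrative).
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            cv2.imwrite(os.path.join(out_dir, f"{idx:06d}.png"), frame)
        idx += 1
    cap.release()

# Illustrative usage with a path from the layout above.
sample_frames("./data/FaceForensics++/original_sequences/youtube/raw/videos/000.mp4",
              "./train_images/000")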
    

Pretrained weights

You can download pretrained weights here.
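
The README does not state where to place the file or how the test config refers to it; as a hedged aid, the snippet below simply loads a downloaded checkpoint on CPU and prints its first few entries so you can match it against the backbone chosen in the cfg (the file name and the assumption that it stores a plain state dict are illustrative):

import torch

# Load the checkpoint on CPU and peek at its contents; adjust the path to
# wherever you saved the downloaded weights.
ckpt = torch.load("./pretrained/caddm_weights.pkl", map_location="cpu")
for name, value in list(ckpt.items())[:10]:
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
    print(name, shape)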

Evaluations

To evaluate the model performance, please run:

python3 test.py --cfg ./configs/caddm_test.cfg

Results

Our model achieved the following performance on:

Training Data   Backbone          FF++     Celeb-DF   DFDC
FF++            ResNet-34         99.70%   91.15%     71.49%
FF++            EfficientNet-b3   99.78%   93.08%     73.34%
FF++            EfficientNet-b4   99.79%   93.88%     73.85%

Note: the metric is video-level AUC.
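
As a rough illustration of what video-level AUC means here, the sketch below averages per-frame fake scores within each video and then computes AUC over videos with scikit-learn; aggregating frames by their mean score is an assumption about the protocol, so refer to test.py for the exact procedure:

from collections import defaultdict

import numpy as np
from sklearn.metrics import roc_auc_score

def video_level_auc(frame_scores, frame_video_ids, video_labels):
    # Group per-frame fake probabilities by video, average them, and score
    # the resulting per-video predictions against the video labels.
    per_video = defaultdict(list)
    for score, vid in zip(frame_scores, frame_video_ids):
        per_video[vid].append(score)
    vids = sorted(per_video)
    y_true = [video_labels[v] for v in vids]
    y_score = [float(np.mean(per_video[v])) for v in vids]
    return roc_auc_score(y_true, y_score)

# Toy example: one real and one fake video, three frames each.
print(video_level_auc([0.1, 0.2, 0.3, 0.9, 0.8, 0.7],
                      ["real_000"] * 3 + ["fake_000"] * 3,
                      {"real_000": 0, "fake_000": 1}))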

Training

To train our model from scratch, please run:

python3 train.py --cfg ./configs/caddm_train.cfg

Citation

Coming soon

Acknowledgements

Contact

If you have any questions, please feel free to contact us via jirenhe@megvii.com.