PyTorch reimplementation of "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks" (Radford et al., 2016).
Figure 1: DCGAN generator used for LSUN (Radford et al., 2016).
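The generator in Figure 1 maps a 100-dimensional noise vector to a 64×64 RGB image through a stack of transposed convolutions with batch norm and ReLU, ending in a tanh. A minimal PyTorch sketch (layer sizes follow Radford et al., 2016; class and variable names are my own, not necessarily those used in this repository):

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN generator sketch: z in R^100 -> 3x64x64 image in [-1, 1]."""
    def __init__(self, z_dim=100, ngf=64):
        super().__init__()
        self.net = nn.Sequential(
            # project z to a 4x4 feature map
            nn.ConvTranspose2d(z_dim, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(inplace=True),
            # 4x4 -> 8x8
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(inplace=True),
            # 8x8 -> 16x16
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(inplace=True),
            # 16x16 -> 32x32
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(inplace=True),
            # 32x32 -> 64x64, tanh squashes outputs to [-1, 1]
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        # accept z of shape [N, 100] and reshape to [N, 100, 1, 1]
        return self.net(z.view(z.size(0), z.size(1), 1, 1))
```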
Equation 1: Loss of the discriminator and generator (Lucic et al., 2018).
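For reference, Equation 1 corresponds to the standard GAN objectives. In the notation of Lucic et al. (2018), with data distribution $p_{\mathrm{data}}$ and noise prior $p(z)$, the discriminator and (non-saturating) generator losses can be written roughly as:

```latex
\mathcal{L}_D = -\,\mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right]
              - \,\mathbb{E}_{z \sim p(z)}\left[\log\big(1 - D(G(z))\big)\right],
\qquad
\mathcal{L}_G = -\,\mathbb{E}_{z \sim p(z)}\left[\log D(G(z))\right]
```

This is a reconstruction from the cited paper, not a transcription of the equation image above.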
Download CelebA from one of the two sources:
- https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
- https://www.kaggle.com/datasets/jessicali9530/celeba-dataset
Ensure your data folder follows this tree structure and naming convention:
|-- celeba
| |-- data // image folder or prepro_data folder
| |-- landmarks.csv

The original CelebA dataset consists of human faces on diverse backgrounds. Those cluttered backgrounds make the task harder for the DCGAN generator. To simplify it, I used a pretrained YOLOv8 medium (v0.2) face detector to extract tight face crops from the original CelebA images (see prepro.py).
Download the pretrained YOLOv8 medium weights and specify the path to the data directory in ./utils/prepro.py. After that, you are ready to run:

```shell
python3 ./utils/prepro.py
```

NOTE: The preprocessing step using the YOLOv8 model is optional.
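The crop step itself is simple once the detector has returned a bounding box. A sketch of the idea (this is not the actual prepro.py code; the detector call is omitted, and the box coordinates and margin below are hypothetical stand-ins for YOLOv8 output):

```python
from PIL import Image

def crop_face(img, box, margin=0.1):
    """Crop a face given an (x1, y1, x2, y2) box, with a small relative margin."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    # expand the box by `margin` on each side, clamped to the image bounds
    x1 = max(0, int(x1 - margin * w))
    y1 = max(0, int(y1 - margin * h))
    x2 = min(img.width, int(x2 + margin * w))
    y2 = min(img.height, int(y2 + margin * h))
    # resize the tight crop to the 64x64 resolution DCGAN trains on
    return img.crop((x1, y1, x2, y2)).resize((64, 64))

# example on a blank CelebA-sized image with a hypothetical detection box
img = Image.new("RGB", (178, 218))
face = crop_face(img, (40, 60, 140, 180))
```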
Modify the training script train_celeba.py to point to your data directory, then run:

```shell
python3 train_celeba.py
```

This script trains both the generator and the discriminator on the CelebA dataset. It will automatically create:
- a file dcgan_report_celeba.csv where the losses of both the generator and the discriminator are stored,
- a directory checkpoints_dcgan where the weights are saved,
- a directory celeba_preview where example images are saved during training.
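The loss curves in Figure 2 can be reproduced from the report CSV. A small sketch using matplotlib (the column names `d_loss` and `g_loss` are an assumption — check the header of the CSV produced by your run):

```python
import csv
import matplotlib
matplotlib.use("Agg")  # headless backend, so this also works over SSH
import matplotlib.pyplot as plt

def plot_report(path, out_png="losses.png"):
    """Plot discriminator/generator losses from the training report CSV."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    # assumed column names; adjust to the actual CSV header
    d = [float(r["d_loss"]) for r in rows]
    g = [float(r["g_loss"]) for r in rows]
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(d)
    ax1.set_title("Discriminator loss")
    ax2.plot(g)
    ax2.set_title("Generator loss")
    for ax in (ax1, ax2):
        ax.set_xlabel("step")
    fig.savefig(out_png)
    return out_png

# plot_report("dcgan_report_celeba.csv")
```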
Figure 2: Loss of the DCGAN discriminator (left) and the DCGAN generator (right) when training on CelebA.
Figure 3: Image generated by the DCGAN generator by following the instructions from this repository.
Using the trained generator, you can easily interpolate between two noise samples and project each intermediate point through the generator as follows:
```python
import torch

# Generator output shape is [1, 3, 64, 64]
noise_start = torch.rand((1, 100)) * 2 - 1  # shape [1, 100], uniform in [-1, 1]
noise_end = torch.rand((1, 100)) * 2 - 1    # shape [1, 100]

n_steps = 16
steps = torch.linspace(0, 1, n_steps)       # shape [n_steps]

zs = []
for i in range(n_steps):
    alpha = steps[i]
    # linear interpolation between the two noise samples
    z = (1 - alpha) * noise_start + alpha * noise_end
    zs.append(z)

inter_noise = torch.cat(zs, dim=0)          # shape [n_steps, 100]
fake_imgs = generator(inter_noise)          # shape [n_steps, 3, 64, 64]
```
Figure 4: Interpolation between a series of 9 noise samples. All noise samples were projected using the trained DCGAN generator (from left-to-right).
Since the original LSUN dataset is large (> 43 GB), I used a subset of it.
Download the subset (or the original one), and ensure the following tree structure inside the data directory:
|-- LSUN
| |-- bedroom
| | |-- 0 // image folder
| | |-- 1 // image folder
| | |-- 2 // image folder
| | |-- ...

Modify the training script train_lsun.py to point to your data directory, then run:
```shell
python3 train_lsun.py
```

This script trains both the generator and the discriminator on the LSUN dataset, creating the same report file, checkpoint directory, and preview directory as described for CelebA.
Figure 5: Loss of the DCGAN discriminator (left) and the DCGAN generator (right) when training on LSUN.
Figure 6: Image generated by the DCGAN generator by following the instructions from this repository.
- OS: Fedora Linux 42 (Workstation Edition) x86_64
- CPU: AMD Ryzen 5 2600X (12) @ 3.60 GHz
- GPU: NVIDIA GeForce RTX 3060 Ti (8 GB VRAM)
- RAM: 32 GB DDR4 3200 MHz
- CelebA training time: < 3 hours
- LSUN training time: < 3 hours
@misc{radford2016unsupervisedrepresentationlearningdeep,
title={Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks},
author={Alec Radford and Luke Metz and Soumith Chintala},
year={2016},
eprint={1511.06434},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1511.06434},
}

@misc{goodfellow2014generativeadversarialnetworks,
title={Generative Adversarial Networks},
author={Ian J. Goodfellow and Jean Pouget-Abadie and Mehdi Mirza and Bing Xu and David Warde-Farley and Sherjil Ozair and Aaron Courville and Yoshua Bengio},
year={2014},
eprint={1406.2661},
archivePrefix={arXiv},
primaryClass={stat.ML},
url={https://arxiv.org/abs/1406.2661},
}

@misc{lucic2018ganscreatedequallargescale,
title={Are GANs Created Equal? A Large-Scale Study},
author={Mario Lucic and Karol Kurach and Marcin Michalski and Sylvain Gelly and Olivier Bousquet},
year={2018},
eprint={1711.10337},
archivePrefix={arXiv},
primaryClass={stat.ML},
url={https://arxiv.org/abs/1711.10337},
}

@inproceedings{liu2015faceattributes,
title = {Deep Learning Face Attributes in the Wild},
author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}
}

@article{yu15lsun,
author = {Yu, Fisher and Zhang, Yinda and Song, Shuran and Seff, Ari and Xiao, Jianxiong},
title = {LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop},
journal = {arXiv preprint arXiv:1506.03365},
year = {2015}
}