StyleGAN is a Generative Adversarial Network proposed by NVIDIA researchers. It builds upon the Progressive Growing GAN architecture to produce photo-realistic synthetic images. NVIDIA has released three research papers on StyleGAN as of June 2021; their contributions are briefly summarized below:
StyleGAN (A Style-Based Generator Architecture for Generative Adversarial Networks):
- Bilinear Sampling improves image quality.
- Mapping Network transforms latent space to w-space (Intermediate latent space).
- Synthesis Network uses AdaIN (Adaptive Instance Normalization) to generate images.
- Constant Initial Input improves performance; since w-space and AdaIN already control the generated images, a learned constant can replace the latent input.
- Gaussian Noise makes the generated image look more realistic by bringing in finer features.
- Mixing Regularization performs style mixing where images are generated from two intermediate styles.
- Perceptual Path Length measures the difference between successive images when interpolating between two noise inputs.
- Linear Separability separates the latent space with a linear hyperplane based on attribute.
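The AdaIN operation at the heart of the synthesis network can be sketched as follows. This is a minimal NumPy sketch, not the repository's API; the function and parameter names are illustrative (in StyleGAN, `style_scale` and `style_bias` come from a learned affine transform of w):

```python
import numpy as np

def adain(content, style_scale, style_bias, eps=1e-8):
    """Adaptive Instance Normalization (AdaIN) sketch.

    content: feature maps of shape (C, H, W)
    style_scale, style_bias: per-channel style parameters of shape (C,)
    """
    # Normalize each channel to zero mean and unit variance...
    mean = content.mean(axis=(1, 2), keepdims=True)
    std = content.std(axis=(1, 2), keepdims=True)
    normalized = (content - mean) / (std + eps)
    # ...then re-scale and re-shift it with the style parameters.
    return style_scale[:, None, None] * normalized + style_bias[:, None, None]

# Example: style a random 4-channel 8x8 feature map.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))
y = adain(x, style_scale=rng.normal(size=4), style_bias=rng.normal(size=4))
```

Because the statistics are computed per channel per sample (instance normalization), the style parameters fully determine each channel's mean and variance, which is how w-space controls the generated image at every resolution.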
StyleGAN2 (Analyzing and Improving the Image Quality of StyleGAN):
- Weight Demodulation removes the droplet artifacts that were in the original StyleGAN.
- Lazy Regularization alleviates heavy memory usage and computation cost of regularization.
- Perceptual Path Length Regularization encourages smooth mapping from latent space to generated image to achieve feature disentanglement.
- No Growing replaces the Progressive Growing architecture with skip connections in the generator and residual connections in the discriminator, preventing phase artifacts.
- Large Networks yield better results where high-resolution images have more influence.
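Weight demodulation can be sketched as below. This is a NumPy sketch under the assumption of a 2D conv weight tensor; names are illustrative. It folds the style's modulation and a normalizing demodulation directly into the convolution weights, instead of normalizing the activations as AdaIN does:

```python
import numpy as np

def modulate_demodulate(weight, style, eps=1e-8):
    """StyleGAN2 weight (de)modulation sketch.

    weight: conv weights of shape (out_ch, in_ch, k, k)
    style:  per-input-channel scales of shape (in_ch,)
    """
    # Modulate: scale each input channel of the weights by the style.
    w = weight * style[None, :, None, None]
    # Demodulate: rescale so each output channel's weights have unit norm.
    # This statistically undoes the modulation's effect on activation
    # magnitudes without the explicit normalization blamed for droplets.
    sigma = np.sqrt((w ** 2).sum(axis=(1, 2, 3), keepdims=True) + eps)
    return w / sigma

rng = np.random.default_rng(0)
w = modulate_demodulate(rng.normal(size=(8, 4, 3, 3)), rng.normal(size=4))
```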
StyleGAN2-ADA (Training Generative Adversarial Networks with Limited Data):
- Adaptive Discriminator Augmentation augments the data given to the discriminator to overcome overfitting, without the augmentations leaking into generated images.
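The adaptive part of ADA can be sketched as a feedback loop: an overfitting heuristic is measured during training, and the augmentation probability p is nudged toward a target value. The sketch below is schematic; the heuristic, target, and step size are assumptions standing in for the paper's tuned values, not this repository's implementation:

```python
def update_ada_p(p, r_t, target=0.6, adjust=0.01):
    """Adjust augmentation probability p from an overfitting heuristic.

    r_t: overfitting signal, e.g. the fraction of real images the
         discriminator classifies as real (rises as it overfits).
    """
    if r_t > target:
        p += adjust   # discriminator is overfitting -> augment more
    else:
        p -= adjust   # discriminator has slack -> augment less
    return min(max(p, 0.0), 1.0)  # keep p a valid probability
```

Every augmentation is then applied to discriminator inputs (both real and fake) with probability p, which is what keeps the augmentations from leaking into the generator's output distribution.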
Install "libsndfile" using the following commands for handling audio files in Linux:
sudo apt update -y
sudo apt install libsndfile1 -y
Install "ffmpeg" using the following commands:
sudo apt install ffmpeg [Ubuntu]
brew install ffmpeg [macOS]
Full install (compiling ffmpeg from source):
cd ~/ffmpeg_sources && \
wget -O ffmpeg-snapshot.tar.bz2 https://ffmpeg.org/releases/ffmpeg-snapshot.tar.bz2 && \
tar xjvf ffmpeg-snapshot.tar.bz2 && \
cd ffmpeg && \
PATH="$HOME/bin:$PATH" PKG_CONFIG_PATH="$HOME/ffmpeg_build/lib/pkgconfig" ./configure \
--prefix="$HOME/ffmpeg_build" \
--pkg-config-flags="--static" \
--extra-cflags="-I$HOME/ffmpeg_build/include" \
--extra-ldflags="-L$HOME/ffmpeg_build/lib" \
--extra-libs="-lpthread -lm" \
--ld="g++" \
--bindir="$HOME/bin" \
--enable-gpl \
--enable-gnutls \
--enable-libaom \
--enable-libass \
--enable-libfdk-aac \
--enable-libfreetype \
--enable-libmp3lame \
--enable-libopus \
--enable-libsvtav1 \
--enable-libdav1d \
--enable-libvorbis \
--enable-libvpx \
--enable-libx264 \
--enable-libx265 \
--enable-nonfree && \
PATH="$HOME/bin:$PATH" make && \
make install && \
hash -r
Install required libraries using the following command:
$ pip install -r requirements.txt
- diags/: Architecture diagrams from all three papers.
- stylegan2/: All things StyleGAN2!
- notebooks/: Notebooks for generating latent space from sound.
- examples/: A folder of example videos.
- train.py: Main file to kick-start training, generate sample images, and generate images from interpolation.
- infer.py: Simplified inference file.
- Study.md: Study material.
Start training with the following command:
$ python train.py --data='/path/to/dataset'
This command accepts various other command-line arguments.
Generate sample images from the latest checkpoint after training with the following command:
$ python train.py --generate
Generate a video of interpolation from the latest checkpoint after training with the following command:
$ python train.py --generate-interpolation
Generate video and images from interpolation from the latest checkpoint after training with the following command:
$ python train.py --generate-interpolation --save-frames
Use the following command to run Streamlit App:
$ streamlit run run.py
You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501
Network URL: http://192.168.1.4:8501
Check out the GIF below for demo!
I would like to thank the following people for sharing their work:
- Phil Wang: This code is adapted from Phil Wang's lucidrains/stylegan2-pytorch. I found his version of the official NVlabs PyTorch implementation, NVlabs/stylegan2-ada-pytorch, easier to comprehend. His experiments are also interesting and informative.
- Yiyang Li: The pre-trained Super-Resolution GAN generator model used in this repository is from Yiyang Li's yiyang7/Super_Resolution_with_CNNs_and_GANs.