AttnGAN_TRANS

PyTorch implementation for reproducing the key results of the AttnGAN_TRANS models from the paper Transformer Models for Enhancing AttnGAN based Text to Image Generation by S. Naveen, M. S. S. Ram Kiran, M. Indupriya, T. V. Manikanta and P. V. Sudeep.

Code Setup

The bird experiments are implemented in Google Colab.
The coco experiments are implemented on our local machine (NVIDIA Quadro RTX 8000).

Dependencies

Python 3.6
PyTorch

In addition, when running on a local machine, add the project folder to PYTHONPATH and pip install the following packages:

python-dateutil
easydict
pandas
torchfile
nltk
scikit-image

If using Colab, all the dependencies will be available by default.
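
Before training on a local machine, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: the helper name missing_packages is hypothetical, and the mapping simply pairs each pip package name with the module name it installs.

```python
import importlib.util

# Map each pip package name from the list above to the module it installs.
REQUIRED = {
    "python-dateutil": "dateutil",
    "easydict": "easydict",
    "pandas": "pandas",
    "torchfile": "torchfile",
    "nltk": "nltk",
    "scikit-image": "skimage",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; install with: pip install " + " ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only looks the modules up and does not import them, so it runs quickly even when large packages such as pandas are installed.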

Data

Download our preprocessed metadata for bird and coco and save it to data/.
Download the birds dataset and extract it to data/birds/.
Download the coco dataset and extract the images to data/coco/.
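
The layout described above can be checked with a short script before launching a run. This is an illustrative sketch only: the helper name missing_data_dirs is hypothetical, and it checks nothing beyond the presence of the data/birds/ and data/coco/ directories named in this section.

```python
from pathlib import Path

# Sub-directories of data/ named in the steps above.
EXPECTED_DIRS = ("birds", "coco")


def missing_data_dirs(data_root="data"):
    """Return the expected dataset directories that do not exist yet."""
    root = Path(data_root)
    return [
        str(root / name)
        for name in EXPECTED_DIRS
        if not (root / name).is_dir()
    ]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("Dataset directories found.")
```

If only one dataset is needed, the other directory will be reported as missing and can be ignored.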

Training

Pre-train encoder models: Use these bird and coco files to pre-train the encoder models.
Train models: Use these bird and coco files to train the models.
The *.yml files are example configuration files for training and evaluating our models.

Pretrained Models

Encoder
- AttnGAN_GPT, AttnGAN_BERT, AttnGAN_XL for bird.
- AttnGAN_GPT for coco.
Pretrained Models
- AttnGAN_GPT, AttnGAN_BERT, AttnGAN_XL for bird.
- AttnGAN_GPT for coco.

Sampling

Sampling requires changing the configuration in the cfg file.

To generate images for the pre-extracted embeddings: Set cfg.TRAIN.FLAG = False and cfg.B_VALIDATION = True.
To generate images for custom text input: Set cfg.TRAIN.FLAG = False and cfg.B_VALIDATION = False.

Note: For T2I generation from custom text, add the custom captions to the example_caption.txt file.
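
The two flag combinations above can be summarised in a small helper. This is a sketch under stated assumptions: configure_sampling is a hypothetical name, and a SimpleNamespace stands in for the repository's cfg object so that the example runs on its own; only the flag names cfg.TRAIN.FLAG and cfg.B_VALIDATION come from this README.

```python
from types import SimpleNamespace


def configure_sampling(cfg, custom_text):
    """Set the sampling flags described above on a cfg-like object.

    custom_text=False: generate images for the pre-extracted embeddings.
    custom_text=True:  generate images for custom text input.
    """
    cfg.TRAIN.FLAG = False              # sampling, not training
    cfg.B_VALIDATION = not custom_text  # True only for pre-extracted embeddings
    return cfg


# Stand-in for the real cfg object (an assumption made for illustration).
cfg = SimpleNamespace(TRAIN=SimpleNamespace(FLAG=True), B_VALIDATION=False)

configure_sampling(cfg, custom_text=False)
print(cfg.TRAIN.FLAG, cfg.B_VALIDATION)  # False True
```

In practice the same two values would be set in the cfg file itself; the helper only makes the mapping between the two modes and the flags explicit.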

Validation

We compute the Inception Score for models trained on birds using StackGAN-inception-model.
We compute the Inception Score for models trained on coco using improved-gan/inception_score.
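
For readers unfamiliar with the metric, the Inception Score is exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is the classifier's class distribution for a generated image and p(y) is the average of those distributions. The sketch below computes it from a list of probability vectors in plain Python. It is an illustration of the formula only, not the code in the two tools named above, which may differ in details such as averaging over several splits; the function name inception_score is chosen here for illustration.

```python
import math


def inception_score(probs):
    """Compute exp(mean KL(p(y|x) || p(y))) from per-image class probabilities.

    probs: list of lists, one probability vector p(y|x) per generated image.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal p(y): average of the per-image distributions.
    marginal = [sum(p[k] for p in probs) / n for k in range(num_classes)]
    total_kl = 0.0
    for p in probs:
        for k in range(num_classes):
            if p[k] > 0.0:  # 0 * log(0) is treated as 0
                total_kl += p[k] * math.log(p[k] / marginal[k])
    return math.exp(total_kl / n)


# Identical, uninformative predictions give the minimum score.
uniform = [[0.25, 0.25, 0.25, 0.25] for _ in range(8)]
print(inception_score(uniform))  # 1.0
```

Confident predictions spread evenly over the classes push the score toward the number of classes, which is why a higher score is read as sharper and more diverse samples.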

Sample Results

Below are example results showing the generated images and attention maps of each AttnGAN_TRANS model for the respective text captions.

In both results, the first row is generated by the AttnGAN_BERT model.
In both results, the second row is generated by the AttnGAN_XL model.
In both results, the third row is generated by the AttnGAN_GPT model.

Captions:
- This bird has wings that are green and has a yellow belly.
- This bird has wings that are black and has large eyes with yellow and blue as the main colors and black as an accent.

Creating an API

The evaluation code for bird is configured in this file to generate a URL for the API.

Citing AttnGAN

If you find AttnGAN_TRANS useful in your research, please consider citing:

@article{naveen2021attngantrans,
  author    = {S. Naveen and M. S. S. Ram Kiran and M. Indupriya and T. V. Manikanta and P. V. Sudeep},
  title     = {Transformer Models for Enhancing AttnGAN based Text to Image Generation},
  year      = {2021},
  journal   = {{IMAVIS}}
}

Reference

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks [code]
