sairamkiran9/AttnGAN-trans

AttnGANTRANS

PyTorch implementation for reproducing the key results of the AttnGANTRANS models from the paper Transformer Models for Enhancing AttnGAN based Text to Image Generation by S. Naveen, M. S. S. Ram Kiran, M. Indupriya, T. V. Manikanta and P. V. Sudeep.

Code Setup

  • The bird experiments were run in Google Colab.
  • The coco experiments were run on our local machine (NVIDIA Quadro RTX 8000).

Dependencies

  • Python 3.6

  • PyTorch

In addition, add the project folder to PYTHONPATH and pip install the following packages when running on a local machine:

  • python-dateutil
  • easydict
  • pandas
  • torchfile
  • nltk
  • scikit-image

If you use Colab, all the dependencies are available by default.
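
Before training on a local machine, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_packages` and the `REQUIRED` mapping are illustrative, and the mapping assumes the usual import names for each pip package (for example, `scikit-image` is imported as `skimage`).

```python
import importlib.util

# Maps each pip package name to the module name it is imported as.
# The package list comes from the Dependencies section; "torch" is PyTorch.
# This mapping is an illustrative helper, not part of the repository.
REQUIRED = {
    "torch": "torch",
    "python-dateutil": "dateutil",
    "easydict": "easydict",
    "pandas": "pandas",
    "torchfile": "torchfile",
    "nltk": "nltk",
    "scikit-image": "skimage",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only looks up each module without importing it, so it runs quickly even where PyTorch is installed.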

Data

  • Download our preprocessed metadata for bird and coco and save it to data/.
  • Download the birds dataset and extract it to data/birds/.
  • Download the coco dataset and extract the images to data/coco/.
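
The directory layout described above can be verified with a short script before starting a run. This is a hedged sketch: the helper `missing_data_dirs` is hypothetical, and it only checks that the three directories named in this section exist, not that their contents are complete.

```python
from pathlib import Path

# Directories named in the Data section, relative to the project root.
EXPECTED_DIRS = ("data", "data/birds", "data/coco")


def missing_data_dirs(project_root="."):
    """Return the expected data directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_data_dirs():
        print(f"Missing directory: {d}")
```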

Training

  • Pre-train encoder models: Use these bird and coco files to train the encoder models.
  • Train models: Use these bird and coco files to train the models.
  • *.yml files are example configuration files for training and evaluating our models.

Pretrained Models

Sampling

To sample, change the following settings in the cfg file.

  • To generate images for the pre-extracted embeddings: Set cfg.TRAIN.FLAG = False and cfg.B_VALIDATION = True.

  • To generate images for custom text input: Set cfg.TRAIN.FLAG = False and cfg.B_VALIDATION = False.

    Note: If we are using T2I training, we need to add custom examples to the example_caption.txt file.
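
The two flag combinations above can be summarised in a small sketch. It uses `types.SimpleNamespace` as a stand-in for the project's cfg object, which is an assumption made to keep the example self-contained; the helper `run_mode` and its return labels are hypothetical and only illustrate how `cfg.TRAIN.FLAG` and `cfg.B_VALIDATION` select the behaviour.

```python
from types import SimpleNamespace


def run_mode(cfg):
    """Describe what a run does for the given flag combination."""
    if cfg.TRAIN.FLAG:
        return "train"
    if cfg.B_VALIDATION:
        # TRAIN.FLAG = False, B_VALIDATION = True
        return "sample_from_pre_extracted_embeddings"
    # TRAIN.FLAG = False, B_VALIDATION = False
    return "sample_from_custom_text"


# Stand-in for the project's cfg object (assumption, not the real config).
cfg = SimpleNamespace(TRAIN=SimpleNamespace(FLAG=True), B_VALIDATION=False)

# Switch from training to sampling on the pre-extracted embeddings.
cfg.TRAIN.FLAG = False
cfg.B_VALIDATION = True
print(run_mode(cfg))

# Switch to sampling from custom text input.
cfg.B_VALIDATION = False
print(run_mode(cfg))
```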

Validation

Sample Results

Below are example results showing the generated image and attention maps of each AttnGANTRANS model for the respective text captions.

  • The first row in both results is generated by the AttnGANBERT model.
  • The second row in both results is generated by the AttnGANXL model.
  • The third row in both results is generated by the AttnGANGPT model.

Captions:

  • This bird has wings that are green and has a yellow belly.
  • This bird has wings that are black and has large eyes with yellow and blue as the main colors and black as an accent.

Creating an API

The evaluation code for bird is configured in this file to generate a URL for the API.

Citing AttnGANTRANS

If you find AttnGANTRANS useful in your research, please consider citing:

@article{naveen2021transformer,
  author  = {S. Naveen and M. S. S. Ram Kiran and M. Indupriya and T. V. Manikanta and P. V. Sudeep},
  title   = {Transformer Models for Enhancing AttnGAN based Text to Image Generation},
  year    = {2021},
  journal = {{IMAVIS}}
}

Reference