This is the official PyTorch implementation of our ICLR 2018 paper Learning to Count Objects in Natural Images for Visual Question Answering. In this paper, we introduce a counting component that allows VQA models to count objects from an attention map, achieving state-of-the-art results on the number category of VQA v2.
The core module is fully contained in counting.py. If you want to use the counting component, that is the only file you need.
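To give an idea of how the module slots into a VQA model, here is a minimal usage sketch. The constructor argument name, the input shapes (boxes as (n, 4, m), attention as (n, m)), and the output size are assumptions based on the paper rather than a verified API, so check the docstrings in counting.py for the exact interface.

```python
import torch
from counting import Counter  # the module provided in counting.py

# Hypothetical sizes: batch of 32 questions, 10 object proposals each,
# counting up to 10 objects.
n, m, max_objects = 32, 10, 10

counter = Counter(max_objects)  # assumed constructor argument

boxes = torch.rand(n, 4, m)       # bounding box coordinates per proposal
attention = torch.rand(n, m)      # attention weights over the proposals

# Assumed output: a count feature vector of shape (n, max_objects + 1)
# that the rest of the VQA model can consume.
count_features = counter(boxes, attention)
```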
Check out the READMEs in the vqa-v2 directory (VQA v2) and the toy directory (our toy dataset) for more specific information on how to train and evaluate on these datasets.
At the time of writing, our accuracy on number questions is state-of-the-art for both single and ensemble models. Our overall accuracy is, as far as we know, the second best among single models (see MFH), though our approach is complementary to theirs.
| Yes/No | Number | Other | All |
|---|---|---|---|
| 83.56 | 51.39 | 59.11 | 68.41 |
UPDATE: With the results of this year's VQA Challenge, our number results are no longer state-of-the-art. However, Bilinear Attention Networks [code] use this counting component with their improved attention model and reach 54.04% on the number category, which is the new state of the art there. This supports our claim that a better attention model should lead to further improvements in counting through our counting module.
To cite the paper:

@InProceedings{zhang2018vqacount,
author = {Yan Zhang and Jonathon Hare and Adam Pr\"ugel-Bennett},
title = {Learning to Count Objects in Natural Images for Visual Question Answering},
booktitle = {International Conference on Learning Representations},
year = {2018},
eprint = {1802.05766},
url = {https://openreview.net/forum?id=B12Js_yRb},
}