VQA is a new dataset containing open-ended questions about images. These questions require an understanding of vision, language and commonsense knowledge to answer.
Subscribe to our group for updates!
Details on downloading the latest dataset may be found on the download webpage.
October 2015: Full release (v1.0)
July 2015: Beta v0.9 release
June 2015: Beta v0.1 release
Papers reporting results on the VQA dataset should:
1) Report test-standard accuracies, which can be computed in either of the non-test-dev phases, i.e., "test2015" or "Challenge test2015", at the following links: [oe-real | oe-abstract | mc-real | mc-abstract].
2) Compare their test-standard accuracies with those on the corresponding test2015 leaderboards [oe-real-leaderboard | oe-abstract-leaderboard | mc-real-leaderboard | mc-abstract-leaderboard].
For more details, please see the challenge page.