Towards a Visual Turing Challenge
Mateusz Malinowski, Mario Fritz

TL;DR
This paper discusses the challenges in defining and evaluating visual Turing tests, emphasizing the need for benchmarks that incorporate social consensus and handle ambiguous outputs in visual question-answering tasks.
Contribution
It analyzes the difficulties in quantifying progress in visual Turing challenges and proposes considering social consensus over curated datasets for benchmarking.
Findings
Illustration of the discussed solutions on a recently presented visual question-answering dataset
Discussion on the limitations of curated datasets and the need for social consensus
Highlighting the importance of handling ambiguous outputs in evaluation
Abstract
As language and visual understanding by machines progresses rapidly, we are observing an increasing interest in holistic architectures that tightly interlink both modalities in a joint learning and inference process. This trend has allowed the community to progress towards more challenging and open tasks and refueled the hope of achieving the old AI dream of building machines that could pass a Turing test in open domains. In order to steadily make progress towards this goal, we realize that quantifying performance becomes increasingly difficult. Therefore we ask how we can precisely define such challenges and how we can evaluate different algorithms on these open tasks. In this paper, we summarize and discuss such challenges as well as try to give answers where appropriate options are available in the literature. We exemplify some of the solutions on a recently presented dataset of…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
