Towards a Visual Turing Challenge

Mateusz Malinowski; Mario Fritz

arXiv:1410.8027·cs.AI·August 23, 2021·33 cites

Open Access

TL;DR

This paper discusses the challenges in defining and evaluating visual Turing tests, emphasizing the need for benchmarks that incorporate social consensus and handle ambiguous outputs in visual question-answering tasks.

Contribution

It analyzes why progress on visual Turing challenges is difficult to quantify and proposes benchmarking against social consensus rather than relying solely on curated datasets.

Findings

01

Introduction of a visual question-answering dataset for the visual Turing challenge

02

Discussion of the limitations of curated datasets and the need for social consensus

03

Emphasis on handling ambiguous outputs in evaluation
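
The idea of scoring against social consensus instead of a single curated answer can be illustrated with a minimal sketch. This is not the paper's evaluation measure: the function names, the `mode` parameter, and the exact-match similarity are all assumptions made for illustration. It only shows how a predicted answer might be compared with several human answers to the same question, so that an ambiguous question does not penalize every reasonable answer.

```python
from typing import Callable, Sequence


def exact_match(a: str, b: str) -> float:
    """Placeholder similarity (an assumption): 1.0 if the answers match
    after trimming whitespace and lowercasing, else 0.0."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def consensus_score(
    predicted: str,
    human_answers: Sequence[str],
    similarity: Callable[[str, str], float] = exact_match,
    mode: str = "average",
) -> float:
    """Score one predicted answer against several human answers.

    mode="average": mean similarity over all annotators, so agreeing with
                    the majority earns a higher score.
    mode="max":     best similarity over annotators, so agreeing with any
                    single annotator is enough.
    Both modes are hypothetical illustrations, not the paper's definitions.
    """
    if not human_answers:
        raise ValueError("at least one human answer is required")
    sims = [similarity(predicted, h) for h in human_answers]
    if mode == "average":
        return sum(sims) / len(sims)
    if mode == "max":
        return max(sims)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical question with three annotator answers.
answers = ["table", "desk", "Table"]
average = consensus_score("table", answers)          # two of three agree
best = consensus_score("table", answers, mode="max")  # at least one agrees
```

A softer `similarity` function, for example one based on word relatedness, could be passed in place of `exact_match` so that near-synonyms receive partial credit.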

Abstract

As language and visual understanding by machines progress rapidly, we are observing an increasing interest in holistic architectures that tightly interlink both modalities in a joint learning and inference process. This trend has allowed the community to progress towards more challenging and open tasks and has refueled the hope of achieving the old AI dream of building machines that could pass a Turing test in open domains. In order to steadily make progress towards this goal, we realize that quantifying performance becomes increasingly difficult. Therefore we ask how we can precisely define such challenges and how we can evaluate different algorithms on these open tasks. In this paper, we summarize and discuss such challenges as well as try to give answers where appropriate options are available in the literature. We exemplify some of the solutions on a recently presented dataset of…

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning