Constructing Hierarchical Q&A Datasets for Video Story Understanding

Yu-Jung Heo; Kyoung-Woon On; Seongho Choi; Jaeseo Lim; Jinah Kim; Jeh-Kwang Ryu; Byung-Chull Bae; Byoung-Tak Zhang

arXiv:1904.00623·cs.AI·April 2, 2019·5 cites

TL;DR

This paper proposes a hierarchical approach to constructing video Q&A datasets that incorporate story-level understanding. Questions are graded on three criteria (memory capacity, logical complexity, and the DIKW pyramid) so that the dataset can measure the level at which an AI system comprehends a video story.

Contribution

The paper introduces a hierarchical method for constructing video Q&A datasets, grounded in story-understanding criteria, to address the bias and limited variance in question difficulty found in existing video Q&A datasets.

Findings

01 Hierarchical difficulty levels improve assessment of video understanding.

02 Three criteria effectively measure story comprehension in videos.

03 The 3D map serves as a metric for evaluating AI's story understanding.

Abstract

Video understanding is emerging as a new paradigm for studying human-like AI. Question answering (Q&A) is used as a general benchmark to measure the level of intelligence for video understanding. While several previous studies have proposed datasets for video Q&A tasks, they did not really incorporate story-level understanding, resulting in datasets that are highly biased and lack variance in the degree of question difficulty. In this paper, we propose a hierarchical method for building Q&A datasets, i.e. one with hierarchical difficulty levels. We introduce three criteria for video story understanding: memory capacity, logical complexity, and the DIKW (Data-Information-Knowledge-Wisdom) pyramid. We discuss how a three-dimensional map constructed from these criteria can be used as a metric for evaluating the levels of intelligence relating to video story understanding.
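One way to picture the three-dimensional map described in the abstract is as a grid whose cells are indexed by the three criteria, with a model's accuracy recorded per cell. The sketch below is illustrative only and is not the authors' implementation: the names `QAItem` and `accuracy_map` are hypothetical, and the integer scales for memory capacity and logical complexity are assumptions, since the abstract does not say how many levels each criterion has. Only the four DIKW levels follow directly from the acronym.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import IntEnum


class DIKW(IntEnum):
    """The four DIKW levels, ordered from lowest to highest."""
    DATA = 1
    INFORMATION = 2
    KNOWLEDGE = 3
    WISDOM = 4


@dataclass(frozen=True)
class QAItem:
    """A question annotated on the three criteria (hypothetical schema)."""
    question: str
    memory_level: int  # assumed ordinal scale for memory capacity
    logic_level: int   # assumed ordinal scale for logical complexity
    dikw_level: DIKW
    correct: bool      # whether the evaluated model answered correctly


def accuracy_map(items):
    """Return {(memory, logic, dikw): accuracy} for every populated cell."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for item in items:
        cell = (item.memory_level, item.logic_level, int(item.dikw_level))
        totals[cell] += 1
        hits[cell] += int(item.correct)
    return {cell: hits[cell] / totals[cell] for cell in totals}
```

Reading such a map cell by cell would show at which difficulty levels a model's accuracy starts to drop, rather than collapsing everything into a single score.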

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling