NEWSKVQA: Knowledge-Aware News Video Question Answering

Pranay Gupta; Manish Gupta

arXiv:2202.04015 · cs.CV · February 9, 2022 · 1 citation

Open Access

TL;DR

This paper introduces NEWSKVQA, a dataset and an approach for answering knowledge-based questions about news videos through multi-modal reasoning.

Contribution

The paper presents a publicly available dataset of 12K news videos (156 hours) with 1M multiple-choice question-answer pairs covering 8263 unique entities, together with a multi-modal approach to knowledge-aware video question answering.

Findings

01 The dataset contains 12K news videos spanning 156 hours and 1M multiple-choice question-answer pairs covering 8263 unique entities.

02 The proposed method achieves strong baseline performance.

03 The dataset is publicly available for research use.

Abstract

Answering questions in the context of videos can be helpful in video indexing, video retrieval systems, video summarization, learning management systems and surveillance video analysis. Although there exists a large body of work on visual question answering, work on video question answering (1) is limited to domains like movies, TV shows, gameplay, or human activity, and (2) is mostly based on common sense reasoning. In this paper, we explore a new frontier in video question answering: answering knowledge-based questions in the context of news videos. To this end, we curate a new dataset of 12K news videos spanning 156 hours with 1M multiple-choice question-answer pairs covering 8263 unique entities. We make the dataset publicly available. Using this dataset, we propose a novel approach, NEWSKVQA (Knowledge-Aware News Video Question Answering), which performs multi-modal…

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization