ECIS-VQG: Generation of Entity-centric Information-seeking Questions from Videos
Arpan Phukan, Manish Gupta, Asif Ekbal

TL;DR
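This paper introduces ECIS-VQG, the task of generating entity-centric information-seeking questions from videos, together with a new dataset (VideoQuestions) and a model for it. It addresses a gap in video question generation research, where prior work has mostly targeted common objects and attributes.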
Contribution
The work provides what is, to the authors' knowledge, the first large-scale dataset for entity-centric video question generation, along with a Transformer-based model that uses multimodal signals and a contrastive loss.
Findings
Achieved high scores on BLEU, ROUGE, CIDEr, and METEOR metrics.
Demonstrated the model's practical usability for generating relevant questions.
Published the dataset and code for future research.
Abstract
Previous studies on question generation from videos have mostly focused on generating questions about common objects and attributes and hence are not entity-centric. In this work, we focus on the generation of entity-centric information-seeking questions from videos. Such a system could be useful for video-based learning, recommending "People Also Ask" questions, video-based chatbots, and fact-checking. Our work addresses three key challenges: identifying question-worthy information, linking it to entities, and effectively utilizing multimodal signals. Further, to the best of our knowledge, there does not exist a large-scale dataset for this task. Most video question generation datasets are on TV shows, movies, or human activities or lack entity-centric information-seeking questions. Hence, we contribute a diverse dataset of YouTube videos, VideoQuestions, consisting of 411 videos…
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Natural Language Processing Techniques
