Enhancing the Learning Experience: Using Vision-Language Models to Generate Questions for Educational Videos
Markos Stamatakis, Joshua Berger, Christian Wartena, Ralph Ewerth, and Anett Hoppe

TL;DR
This paper explores how vision-language models can generate educational questions from videos. It assesses their out-of-the-box performance, the effects of fine-tuning, and the remaining challenges in question relevance and diversity, with the aim of improving learner engagement.
Contribution
It provides a comprehensive evaluation of current vision-language models for question generation in educational videos and highlights the need for fine-tuning and better datasets.
Findings
Models can generate relevant questions out-of-the-box but need fine-tuning for content specificity.
Fine-tuning improves question relevance and answerability.
Challenges remain in ensuring question diversity and maintaining relevance.
Abstract
Web-based educational videos offer flexible learning opportunities and are becoming increasingly popular. However, improving user engagement and knowledge retention remains a challenge. Automatically generated questions can activate learners and support their knowledge acquisition. Further, they can help teachers and learners assess their understanding. While large language and vision-language models have been employed in various tasks, their application to question generation for educational videos remains underexplored. In this paper, we investigate the capabilities of current vision-language models for generating learning-oriented questions for educational video content. We assess (1) out-of-the-box models' performance; (2) fine-tuning effects on content-specific question generation; (3) the impact of different video modalities on question quality; and (4) in a qualitative study,…
Taxonomy
Topics: Online and Blended Learning
