Saliency Detection in Educational Videos: Analyzing the Performance of Current Models, Identifying Limitations and Advancement Directions
Evelyn Navarrete, Ralph Ewerth, Anett Hoppe

TL;DR
This paper evaluates current saliency detection models on educational videos, revealing their limitations and suggesting directions for improvement to better support learning applications.
Contribution
It provides the first comprehensive evaluation of saliency detection models specifically on educational videos, highlighting challenges and potential areas for advancement.
Findings
Educational videos pose unique challenges for saliency detection.
Current models perform poorly on educational content.
The study identifies scenarios in which the models fail and suggests directions for improvement.
Abstract
Identifying the regions of a learning resource that a learner pays attention to is crucial for assessing the material's impact and improving its design and related support systems. Saliency detection in videos addresses the automatic recognition of attention-drawing regions in single frames. In educational settings, the recognition of pertinent regions in a video's visual stream can enhance content accessibility and information retrieval tasks such as video segmentation, navigation, and summarization. Such advancements can pave the way for the development of advanced AI-assisted technologies that support learning with greater efficacy. However, this task becomes particularly challenging for educational videos due to the combination of unique characteristics such as text, voice, illustrations, animations, and more. To the best of our knowledge, there is currently no study that evaluates…
Taxonomy
Topics: Media Influence and Health
Methods: Softmax · Attention Is All You Need
