Attend to what I say: Highlighting relevant content on slides

Megha Mariam K M; C. V. Jawahar

arXiv:2601.10244·cs.CV·January 16, 2026

Open Access

TL;DR

This paper presents a method that analyzes the speaker's narration to automatically identify and highlight the relevant slide regions during a presentation, so that viewers can follow along more easily and keep the spoken and visual content in sync.

Contribution

It introduces a novel approach that matches spoken content with slide elements to highlight relevant regions, enhancing understanding in content-rich multimedia presentations.
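
To make the idea of matching spoken content with slide elements concrete, here is a minimal sketch. It is not the authors' method: it swaps in plain Jaccard token overlap as a stand-in for whatever matching model the paper uses, and it assumes each slide element already comes with extracted text and a bounding box. The names `SlideRegion`, `tokenize`, `jaccard`, and `most_relevant_region` are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class SlideRegion:
    """Hypothetical slide element: its extracted text and bounding box."""
    text: str
    box: tuple  # (x, y, width, height) in pixels, assumed layout


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_relevant_region(spoken_sentence, regions):
    """Return the region whose text overlaps most with the sentence, or None."""
    spoken = tokenize(spoken_sentence)
    best, best_score = None, 0.0
    for region in regions:
        score = jaccard(spoken, tokenize(region.text))
        if score > best_score:
            best, best_score = region, score
    return best


# Illustrative data only; not taken from the paper.
regions = [
    SlideRegion("Dataset statistics and splits", (40, 120, 400, 90)),
    SlideRegion("Model architecture overview", (40, 230, 400, 90)),
    SlideRegion("Training loss curve", (480, 120, 360, 200)),
]
match = most_relevant_region("Let me walk through the model architecture", regions)
print(match.text if match else "no match")
```

A lexical overlap score like this only works when the speaker reuses words that appear on the slide. Handling paraphrase or purely graphical elements would need a learned text or image representation, which this sketch deliberately leaves out.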

Findings

01 Effective identification of relevant slide regions based on speech analysis

02 Improved synchronization between spoken narration and slide content

03 Assessment of different solutions with success and failure cases

Abstract

Imagine sitting in a presentation, trying to follow the speaker while simultaneously scanning the slides for relevant information. While the entire slide is visible, identifying the relevant regions can be challenging. As you focus on one part of the slide, the speaker moves on to a new sentence, leaving you scrambling to catch up visually. This constant back-and-forth creates a disconnect between what is being said and the most important visual elements, making it hard to absorb key details, especially in fast-paced or content-heavy presentations such as conference talks. This requires an understanding of slides, including text, graphics, and layout. We introduce a method that automatically identifies and highlights the most relevant slide regions based on the speaker's narrative. By analyzing spoken content and matching it with textual or graphical elements in the slides, our approach…

Taxonomy

Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Subtitles and Audiovisual Media