Speaker-following Video Subtitles
Yongtao Hu, Jan Kautz, Yizhou Yu, Wenping Wang

TL;DR
This paper introduces a novel on-screen subtitle placement method that tracks speakers in videos to improve viewer experience and reduce eyestrain by positioning subtitles near speakers instead of fixed screen locations.
Contribution
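The paper presents speaker identification algorithms that combine audio and visual information, together with a subtitle placement method based on global optimization that positions subtitles next to the respective speakers, improving viewing comfort over conventional fixed-position subtitles and a previous dynamic subtitling method.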
Findings
Outperformed both conventional fixed-position subtitling and a previous dynamic subtitling method in a usability study.
Reduced eyestrain compared to conventional fixed-position subtitles.
Enhanced overall viewing experience.
Abstract
We propose a new method for improving the presentation of subtitles in video (e.g. TV and movies). With conventional subtitles, the viewer has to constantly look away from the main viewing area to read the subtitles at the bottom of the screen, which disrupts the viewing experience and causes unnecessary eyestrain. Our method places on-screen subtitles next to the respective speakers to allow the viewer to follow the visual content while simultaneously reading the subtitles. We use novel identification algorithms to detect the speakers based on audio and visual information. Then the placement of the subtitles is determined using global optimization. A comprehensive usability study indicated that our subtitle placement method outperformed both conventional fixed-position subtitling and another previous dynamic subtitling method in terms of enhancing the overall viewing experience and…
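The abstract states that subtitle placement is determined by global optimization, but this summary does not give the cost function. The sketch below is one possible illustration, not the authors' formulation. It assumes each subtitle has a small set of candidate positions and a known speaker position, and it uses dynamic programming to pick one candidate per subtitle so that the total of (distance to the speaker) plus (a weighted jump between consecutive subtitles) is minimal. The names `place_subtitles` and `smooth_weight`, the candidate sets, and both cost terms are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two screen positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def place_subtitles(
    candidates: List[List[Point]],
    speakers: List[Point],
    smooth_weight: float = 0.5,
) -> List[Point]:
    """Choose one candidate position per subtitle with minimal total cost.

    Assumed (hypothetical) cost: distance from the chosen position to the
    speaker, plus smooth_weight times the jump from the previous subtitle.
    Solved exactly over the whole sequence with dynamic programming.
    """
    if not candidates:
        return []

    # best[i][j]: minimal cost of subtitles 0..i with subtitle i at candidate j
    best = [[dist(c, speakers[0]) for c in candidates[0]]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor of (i, j)

    for i in range(1, len(candidates)):
        prev = candidates[i - 1]
        row, ptr = [], []
        for c in candidates[i]:
            # Cost of arriving at c from each candidate of the previous subtitle
            costs = [
                best[i - 1][k] + smooth_weight * dist(prev[k], c)
                for k in range(len(prev))
            ]
            k_best = min(range(len(prev)), key=costs.__getitem__)
            row.append(dist(c, speakers[i]) + costs[k_best])
            ptr.append(k_best)
        best.append(row)
        back.append(ptr)

    # Trace the optimal path back from the cheapest final candidate
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)]
```

With a small `smooth_weight`, each subtitle follows its own speaker. With a large one, the chosen positions stay put even when the speaker changes, which trades proximity for fewer eye movements. A real system would also need terms this sketch omits, such as avoiding overlap with faces or other important image content.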
