Speaker-following Video Subtitles

Yongtao Hu; Jan Kautz; Yizhou Yu; Wenping Wang

arXiv:1407.5145 · cs.HC · July 20, 2015

TL;DR

This paper introduces a subtitle placement method that identifies the speaker in a video and places each subtitle next to that speaker rather than at a fixed screen location, with the aim of improving the viewing experience and reducing eyestrain.

Contribution

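The paper presents speaker identification algorithms based on audio and visual information, together with a placement step that uses global optimization to position subtitles next to their speakers. In a usability study, this approach outperformed both conventional fixed-position subtitling and a previous dynamic subtitling method.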

Findings

01

Outperformed both conventional fixed-position subtitling and a previous dynamic subtitling method in a usability study.

02

Reduced eyestrain compared to traditional methods.

03

Enhanced overall viewing experience.

Abstract

We propose a new method for improving the presentation of subtitles in video (e.g. TV and movies). With conventional subtitles, the viewer has to constantly look away from the main viewing area to read the subtitles at the bottom of the screen, which disrupts the viewing experience and causes unnecessary eyestrain. Our method places on-screen subtitles next to the respective speakers to allow the viewer to follow the visual content while simultaneously reading the subtitles. We use novel identification algorithms to detect the speakers based on audio and visual information. Then the placement of the subtitles is determined using global optimization. A comprehensive usability study indicated that our subtitle placement method outperformed both conventional fixed-position subtitling and another previous dynamic subtitling method in terms of enhancing the overall viewing experience and…
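The abstract says only that subtitle placement is "determined using global optimization"; it does not give the cost function or the solver. As a rough illustration of what such a step could look like, the sketch below picks one position per subtitle from a small set of candidates around each speaker, minimizing a total cost over the whole sequence with dynamic programming. Everything here is an assumption made for illustration and not the authors' formulation: the function name `place_subtitles`, the candidate `offsets`, the two cost terms (distance to the speaker, and jump distance from the previous subtitle), and the `jump_weight` parameter are all hypothetical.

```python
from math import hypot


def place_subtitles(speakers, offsets, jump_weight=0.5):
    """Choose one candidate position per subtitle, minimizing total cost.

    Hypothetical cost (an assumption, not taken from the paper):
      distance(subtitle, its speaker)
      + jump_weight * distance(subtitle, previous subtitle)

    speakers: list of (x, y) speaker positions, one per subtitle, in order.
    offsets:  non-empty list of candidate (dx, dy) displacements from the speaker.
    """
    if not speakers:
        return []

    # candidates[i][k] is the k-th candidate position for subtitle i.
    candidates = [[(sx + dx, sy + dy) for dx, dy in offsets]
                  for sx, sy in speakers]
    n, m = len(speakers), len(offsets)

    def local_cost(i, k):
        (cx, cy), (sx, sy) = candidates[i][k], speakers[i]
        return hypot(cx - sx, cy - sy)

    # best[i][k]: lowest cost of any placement of subtitles 0..i
    # that puts subtitle i at candidate k. back[i][k] records the
    # candidate chosen for subtitle i - 1 on that lowest-cost path.
    best = [[0.0] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    for k in range(m):
        best[0][k] = local_cost(0, k)

    for i in range(1, n):
        for k in range(m):
            cx, cy = candidates[i][k]
            costs = [
                best[i - 1][j]
                + jump_weight * hypot(cx - candidates[i - 1][j][0],
                                      cy - candidates[i - 1][j][1])
                for j in range(m)
            ]
            j = min(range(m), key=costs.__getitem__)
            best[i][k] = costs[j] + local_cost(i, k)
            back[i][k] = j

    # Trace the lowest-cost path back from the last subtitle.
    k = min(range(m), key=best[-1].__getitem__)
    path = [k]
    for i in range(n - 1, 0, -1):
        k = back[i][k]
        path.append(k)
    path.reverse()
    return [candidates[i][k] for i, k in enumerate(path)]
```

With two speakers side by side and candidates to the left and right of each, the jump term in this toy cost pulls consecutive subtitles toward each other: `place_subtitles([(100, 100), (300, 100)], [(-50, 0), (50, 0)])` places the first subtitle to the right of the first speaker and the second to the left of the second speaker. A real system would also need terms the abstract does not describe, such as avoiding occlusion of important content.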
