From Benedict Cumberbatch to Sherlock Holmes: Character Identification in TV series without a Script

Arsha Nagrani; Andrew Zisserman

arXiv:1801.10442 · cs.CV · February 1, 2018 · 5 cites

Open Access

TL;DR

This paper introduces a novel semi-supervised method for automatic character identification in TV series and films using only cast lists and web images, effectively handling occlusions and pose variations.

Contribution

It presents a semi-supervised learning approach that adapts actor faces to character faces, builds voice models for characters, and combines face context with speaker identification.

Findings

01

Achieved state-of-the-art results on the Casablanca benchmark.

02

Successfully identified characters with occlusions and extreme poses.

03

Surpassed previous methods that rely on transcript-based supervision.

Abstract

The goal of this paper is the automatic identification of characters in TV and feature film material. In contrast to standard approaches to this task, which rely on the weak supervision afforded by transcripts and subtitles, we propose a new method requiring only a cast list. This list is used to obtain images of actors from freely available sources on the web, providing a form of partial supervision for this task. In using images of actors to recognize characters, we make the following three contributions: (i) We demonstrate that an automated semi-supervised learning approach is able to adapt from the actor's face to the character's face, including the face context of the hair; (ii) By building voice models for every character, we provide a bridge between frontal faces (for which there is plenty of actor-level supervision) and profile (for which there is very little or none); and (iii)…
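
As a rough illustration of contribution (i), the sketch below shows one generic way to adapt from actor images to character faces by self-training. It is not the authors' implementation: the function name `self_train`, the `threshold` and `rounds` parameters, and the nearest-centroid classifier are all assumptions made for this example, and the face embeddings are assumed to have been computed beforehand by some face descriptor.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def self_train(actor_feats, actor_labels, track_feats, n_classes,
               threshold=0.8, rounds=5):
    """Hypothetical self-training loop (an assumption, not the paper's method).

    actor_feats:  embeddings of web images of the actors, one row per image.
    actor_labels: character index for each actor image (every class must
                  have at least one image).
    track_feats:  embeddings of face tracks from the video, one row per track.

    Returns an int array with a character index per track, or -1 where no
    character reached the similarity threshold.
    """
    actor_feats = l2_normalize(np.asarray(actor_feats, dtype=float))
    track_feats = l2_normalize(np.asarray(track_feats, dtype=float))
    actor_labels = np.asarray(actor_labels)
    track_labels = np.full(len(track_feats), -1)

    for _ in range(rounds):
        # Pool the actor images with the tracks pseudo-labelled so far.
        known = track_labels >= 0
        feats = np.vstack([actor_feats, track_feats[known]])
        labels = np.concatenate([actor_labels, track_labels[known]])

        # One unit-length centroid per character; it drifts from the
        # actor's web appearance towards the in-video appearance.
        centroids = l2_normalize(
            np.stack([feats[labels == c].mean(axis=0) for c in range(n_classes)])
        )

        # Re-label every track; keep only confident assignments.
        sims = track_feats @ centroids.T
        new_labels = np.where(sims.max(axis=1) >= threshold,
                              sims.argmax(axis=1), -1)
        if np.array_equal(new_labels, track_labels):
            break  # labels are stable, stop early
        track_labels = new_labels

    return track_labels
```

In this sketch a track that is too far from the actor images in the first round can still be labelled later, once confidently labelled tracks have pulled the centroid towards the character's on-screen appearance. Tracks that never reach the threshold stay at -1; in the paper's setting these would be candidates for the voice-based bridge described in contribution (ii).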

Taxonomy

Topics: Face recognition and analysis · Video Analysis and Summarization · Music and Audio Processing