Self-attention aggregation network for video face representation and recognition

Ihor Protsenko; Taras Lehinevych; Dmytro Voitekh; Ihor Kroosh; Nick Hasty; Anthony Johnson

arXiv:2010.05340 · cs.CV · October 13, 2020

TL;DR

This paper introduces a self-attention aggregation network (SAAN) for video face recognition. It handles videos with single or multiple identities and outperforms average pooling on the public IJB-C dataset.

Contribution

The paper presents what the authors believe, to the best of their knowledge, is the first aggregation approach that accounts for multiple identities in a video. It uses self-attention to build the aggregated face representation.

Findings

1. SAAN outperforms average pooling on the IJB-C dataset.
2. SAAN handles videos that contain multiple identities.
3. The authors introduce a new multi-identity video dataset.

Abstract

Models based on self-attention mechanisms have been successful in analyzing temporal data and have been widely used in the natural language domain. We propose a new model architecture for video face representation and recognition based on a self-attention mechanism. Our approach can be used for videos with single or multiple identities. To the best of our knowledge, no one has explored aggregation approaches that consider videos with multiple identities. The proposed approach uses existing models, e.g., ArcFace and MobileFaceNet, to get the face representation for each video frame, and the aggregation module produces the aggregated face representation vector for the video by taking into consideration the order of frames and their quality scores. We demonstrate empirical results on a public dataset for video face recognition called IJB-C to indicate that the self-attention…
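The aggregation step described in the abstract can be illustrated with a small sketch. This is not the authors' SAAN architecture: it assumes per-frame embeddings are already available (for example from a face model such as ArcFace), uses a single attention head with identity projections instead of learned weights, omits any encoding of frame order, and treats the quality scores as given inputs. All function names here are hypothetical. It only contrasts plain average pooling with a self-attention pass followed by quality-weighted pooling.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def average_pool(frames):
    """Baseline: mean of the frame embeddings, then L2 normalization."""
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return l2_normalize(mean)


def self_attention(frames):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the raw embeddings
    (identity projections); a trained model would learn these projections.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    attended = []
    for query in frames:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in frames]
        weights = softmax(scores)
        attended.append(
            [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
        )
    return attended


def attention_aggregate(frames, quality):
    """Self-attention over frames, then pooling weighted by softmax(quality)."""
    attended = self_attention(frames)
    weights = softmax(quality)
    dim = len(frames[0])
    pooled = [sum(w * a[d] for w, a in zip(weights, attended)) for d in range(dim)]
    return l2_normalize(pooled)


if __name__ == "__main__":
    # Toy data: two consistent frames and one low-quality outlier frame.
    frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    quality = [2.0, 2.0, -2.0]  # hypothetical per-frame quality scores
    print("average pooling:      ", average_pool(frames))
    print("attention aggregation:", attention_aggregate(frames, quality))
```

In this toy input the low-quality outlier pulls the averaged vector away from the two consistent frames, while the quality-weighted variant stays closer to them. That matches the motivation in the abstract, but it says nothing about the reported IJB-C results.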

Code & Models

Repositories

lehinevych/SAAN (official, TensorFlow)

Taxonomy

Topics: Face recognition and analysis · Face and Expression Recognition · Video Surveillance and Tracking Methods

Methods: Additive Angular Margin Loss