Unique Faces Recognition in Videos
Jiahao Huo, Terence L van Zyl

TL;DR
This paper compares three neural network architectures (Inception, 3D CNN, 2D LSTM) and two loss functions (contrastive, triplet) for face recognition in videos. It finds that 3D CNNs and 2D LSTMs with triplet loss outperform Inception with triplet loss in top-n retrievals, especially when combined with SVMs.
Contribution
The study systematically evaluates different architectures and loss functions for video face recognition and shows that 3D CNNs and 2D LSTMs with triplet loss outperform Inception with triplet loss.
Findings
3D CNNs and 2D LSTMs with contrastive loss do not outperform Inception with contrastive loss.
3D CNNs and 2D LSTMs with triplet loss outperform Inception with triplet loss.
Feature representations from 2D LSTM with triplet loss are most effective for facial identification.
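The comparisons above are reported for top-n rank retrieval. As a rough illustration of what that means for learned embeddings, here is a minimal sketch that ranks a gallery by Euclidean distance to a query and checks whether the correct identity appears among the n closest entries. The function names, the choice of Euclidean distance, and the toy vectors are assumptions made for illustration; they are not taken from the paper.

```python
import math

def top_n_retrieval(query, gallery, n=5):
    """Return indices of the n gallery embeddings closest to the query.

    Assumes Euclidean distance; the paper's exact ranking metric is not
    stated in this summary.
    """
    dists = [math.dist(query, g) for g in gallery]
    order = sorted(range(len(gallery)), key=lambda i: dists[i])
    return order[:n]

def top_n_hit(query, query_label, gallery, gallery_labels, n=5):
    """True if any of the n nearest gallery entries shares the query's identity."""
    nearest = top_n_retrieval(query, gallery, n)
    return any(gallery_labels[i] == query_label for i in nearest)

# Toy 2-D embeddings (hypothetical values, for illustration only).
gallery = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.1, 0.0]]
labels = ["alice", "bob", "carol", "dave"]
nearest_two = top_n_retrieval([0.0, 0.0], gallery, n=2)
```

Averaging `top_n_hit` over many queries gives a top-n retrieval accuracy, which is the kind of figure used to compare the models.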
Abstract
This paper tackles face recognition in videos using metric learning methods and similarity ranking models. The paper compares the Siamese network with contrastive loss and the Triplet Network with triplet loss, implemented with the following architectures: the Google/Inception architecture, a 3D Convolutional Network (C3D), and a 2-D Long Short-Term Memory (LSTM) recurrent neural network. We use still images and sequences from videos to train the networks and compare the performance of the above architectures. The dataset used was the YouTube Face Database, designed for investigating the problem of face recognition in videos. The contribution of this paper is two-fold: first, the experiments establish that 3-D Convolutional networks and 2-D LSTMs with the contrastive loss on image sequences do not outperform the Google/Inception architecture with contrastive loss…
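To make the two loss functions named in the abstract concrete, the sketch below shows one common formulation of each. The margin values, the use of Euclidean distance, and the function names are assumptions for illustration; the paper's exact formulation and hyperparameters are not given in this summary.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def contrastive_loss(u, v, same_identity, margin=1.0):
    """Pairwise loss used with a Siamese network.

    Pulls same-identity pairs together and pushes different-identity
    pairs apart until they are at least `margin` away.
    The margin value is an assumed default.
    """
    d = euclidean(u, v)
    if same_identity:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Loss over (anchor, positive, negative) used with a Triplet Network.

    Zero once the negative is farther from the anchor than the positive
    by at least `margin` in squared distance. The margin value is an
    assumed default.
    """
    d_pos = euclidean(anchor, positive) ** 2
    d_neg = euclidean(anchor, negative) ** 2
    return max(0.0, d_pos - d_neg + margin)
```

The contrastive loss scores each pair in isolation, while the triplet loss only constrains the relative ordering of distances within a triplet.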
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Video Surveillance and Tracking Methods
Methods: Triplet Loss · Siamese Network · Convolution · Sigmoid Activation · Tanh Activation · Long Short-Term Memory
