Trunk-Branch Ensemble Convolutional Neural Networks for Video-based Face Recognition
Changxing Ding, Dacheng Tao

TL;DR
This paper introduces a novel CNN framework, TBE-CNN, for video-based face recognition that addresses challenges like blur, pose variation, and occlusion, achieving state-of-the-art results on multiple datasets.
Contribution
The paper proposes a comprehensive CNN-based framework with blur-robust training, a trunk-branch ensemble architecture, and an improved triplet loss for enhanced video face recognition.
Findings
Achieves state-of-the-art performance on PaSC, COX Face, and YouTube Faces datasets.
First place in BTAS 2016 Video Person Recognition Evaluation.
Effective handling of blur, pose variation, and occlusion in video face recognition.
Abstract
Human faces in surveillance videos often suffer from severe image blur, dramatic pose variations, and occlusion. In this paper, we propose a comprehensive framework based on Convolutional Neural Networks (CNN) to overcome challenges in video-based face recognition (VFR). First, to learn blur-robust face representations, we artificially blur training data composed of clear still images to account for a shortfall in real-world video training data. Using training data composed of both still images and artificially blurred data, CNN is encouraged to learn blur-insensitive features automatically. Second, to enhance robustness of CNN features to pose variations and occlusion, we propose a Trunk-Branch Ensemble CNN model (TBE-CNN), which extracts complementary information from holistic face images and patches cropped around facial components. TBE-CNN is an end-to-end model that extracts…
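The blur-augmentation idea described in the abstract (pairing clear still images with artificially blurred copies) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The kernel types, kernel sizes, sigma values, the 50/50 choice between Gaussian and motion blur, and all function names below are assumptions chosen for illustration only.

```python
import numpy as np

def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Normalized 2D Gaussian kernel (size is assumed odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def motion_kernel(length: int) -> np.ndarray:
    """Normalized horizontal motion-blur kernel (length is assumed odd)."""
    k = np.zeros((length, length))
    k[length // 2, :] = 1.0
    return k / k.sum()

def filter2d_same(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Same-size 2D correlation with edge padding.

    For the symmetric kernels used here this equals convolution.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def augment_with_blur(images, rng, sigmas=(1.0, 2.0), lengths=(5, 9)):
    """Return the clear stills followed by one blurred copy of each.

    images: float array of shape (N, H, W). The blur parameters are
    hypothetical defaults, not values taken from the paper.
    """
    blurred = []
    for img in images:
        if rng.random() < 0.5:
            k = gaussian_kernel(9, float(rng.choice(sigmas)))
        else:
            k = motion_kernel(int(rng.choice(lengths)))
        blurred.append(filter2d_same(img, k))
    return np.concatenate([images, np.stack(blurred)], axis=0)
```

Training on the combined array exposes the network to both a sharp and a degraded view of each face, which is the mechanism the abstract credits for blur-insensitive features.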
