Pose-Selective Max Pooling for Measuring Similarity
Xiang Xiang, Trac D. Tran

TL;DR
This paper introduces a pose-selective max pooling method that improves video-based face recognition by selecting pose-representative frames, reducing computational load, and maintaining high accuracy in identity verification.
Contribution
It proposes a novel pose-aware frame selection technique and a max correlation similarity measure for more efficient and accurate face recognition in videos.
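The summary names a max correlation similarity measure without defining it. As a minimal sketch, assuming each video is a bag of frame-wise feature vectors and that "correlation" means cosine similarity between one feature from each bag, the measure could look like the following. The function name `max_correlation` and this exact definition are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def max_correlation(bag_a, bag_b):
    """Largest cosine similarity over all cross-video feature pairs.

    ASSUMPTION: "max correlation" is read here as the maximum pairwise
    cosine similarity; the paper may define it differently.
    bag_a: array-like of shape (Ka, d), bag_b: array-like of shape (Kb, d).
    """
    a = np.asarray(bag_a, dtype=float)
    b = np.asarray(bag_b, dtype=float)
    # L2-normalise each feature so the dot product equals cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    # (Ka, Kb) matrix of all pairwise similarities; keep the best match.
    return float((a @ b.T).max())
```

With K selected frames per video, this compares only K × K feature pairs instead of every frame pair, which is where the reduced cost would come from under this reading.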
Findings
Achieves comparable accuracy to VGG-face using fewer frames.
Effectively captures pose diversity to improve recognition.
Reduces computational complexity in video-based face verification.
Abstract
In this paper, we deal with two challenges for measuring the similarity of subject identities in practical video-based face recognition: the variation of head pose in uncontrolled environments and the computational expense of processing videos. Since the frame-wise feature mean is unable to characterize the pose diversity among frames, we define and preserve the overall pose diversity and closeness in a video. Then, identity will be the only source of variation across videos, since the pose varies even within a single video. Instead of simply using all the frames, we select those faces whose pose point is closest to the centroid of the K-means cluster containing that pose point. Then, we represent a video as a bag of frame-wise deep face features, with the number of features reduced from hundreds to K. Since the video representation can well represent the identity, now…
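The frame selection step described in the abstract can be sketched roughly as follows: cluster the per-frame pose points with K-means, then keep, for each cluster, the frame whose pose lies closest to that cluster's centroid. This is an illustration only. The function name `select_pose_frames`, the plain NumPy K-means loop, the random initialisation, and the representation of pose as a small numeric vector per frame (for example rotation angles) are all assumptions, not the authors' implementation.

```python
import numpy as np


def select_pose_frames(poses, k, n_iter=50, seed=0):
    """Return sorted indices of the frames nearest to each K-means centroid.

    poses: array-like of shape (n_frames, d), one pose point per frame
    (ASSUMPTION: pose is already estimated and given as a numeric vector).
    """
    poses = np.asarray(poses, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct frames.
    centroids = poses[rng.choice(len(poses), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every pose point to every centroid: (n_frames, k).
        dists = np.linalg.norm(poses[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its old centroid.
        updated = np.array([
            poses[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Final assignment against the converged centroids.
    dists = np.linalg.norm(poses[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size:
            # Keep the cluster member closest to its own centroid.
            selected.append(int(members[dists[members, j].argmin()]))
    return sorted(selected)
```

The returned indices would then pick out which frame-wise deep features go into the video's bag, so a video with hundreds of frames is represented by at most K features.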
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Biometric Identification and Security
