KeyPoint Relative Position Encoding for Face Recognition

Minchul Kim; Yiyang Su; Feng Liu; Anil Jain; Xiaoming Liu

arXiv:2403.14852·cs.CV·March 25, 2024·2 cites

TL;DR

This paper introduces KP-RPE, a keypoint-based relative position encoding for ViT models that improves robustness to unseen affine transformations in face recognition, particularly when image alignment fails.

Contribution

The paper proposes KP-RPE, a new position encoding technique that incorporates facial keypoints to improve ViT robustness against affine transformations.

Findings

01. Improves face recognition accuracy on low-quality images.

02. Enhances model resilience to scale, translation, and pose variations.

03. Effective in face and gait recognition scenarios.

Abstract

In this paper, we address the challenge of making ViT models more robust to unseen affine transformations. Such robustness becomes useful in various recognition tasks such as face recognition when image alignment failures occur. We propose a novel method called KP-RPE, which leverages key points (e.g., facial landmarks) to make ViT more resilient to scale, translation, and pose variations. We begin with the observation that Relative Position Encoding (RPE) is a good way to bring affine transform generalization to ViTs. RPE, however, can only inject the model with prior knowledge that nearby pixels are more important than far pixels. Keypoint RPE (KP-RPE) is an extension of this principle, where the significance of pixels is not solely dictated by their proximity but also by their relative positions to specific keypoints within the image. By anchoring the significance of pixels around…
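
The contrast the abstract draws between plain RPE and a keypoint-aware bias can be illustrated with a small NumPy sketch. This is a simplified illustration under stated assumptions, not the paper's formulation: the function names (`patch_grid`, `rpe_bias`, `kp_rpe_bias`, `attention`), the distance-based keypoint term, and the weight vector `w` are all hypothetical choices made for this example. Plain RPE adds a bias that depends only on the offset between two patches; the keypoint-conditioned variant here additionally lets the bias depend on where the key patch sits relative to each keypoint.

```python
import numpy as np

def patch_grid(n):
    """Return an (n*n, 2) array of patch centres as normalised (y, x) in [0, 1]."""
    ys, xs = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return (np.stack([ys, xs], axis=-1).reshape(-1, 2) + 0.5) / n

def rpe_bias(n, table):
    """Plain RPE: bias[i, j] depends only on the grid offset between patches i and j.

    `table` has shape (2n-1, 2n-1), one (learnable) scalar per possible offset.
    """
    ys, xs = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    coords = np.stack([ys, xs], axis=-1).reshape(-1, 2)       # integer grid coords
    off = coords[:, None, :] - coords[None, :, :] + (n - 1)   # shift offsets to >= 0
    return table[off[..., 0], off[..., 1]]                    # (N, N)

def kp_rpe_bias(n, table, keypoints, w):
    """Hypothetical keypoint-conditioned bias (an assumption, not the paper's method).

    Adds to the offset bias a term that penalises key patches by their weighted
    distance to each keypoint. `keypoints` is (K, 2) in normalised (y, x);
    `w` is a (K,) vector of non-negative weights.
    """
    centres = patch_grid(n)                                   # (N, 2)
    dist = np.linalg.norm(centres[:, None, :] - keypoints[None, :, :], axis=-1)  # (N, K)
    kp_term = -(dist @ w)                                     # (N,), one value per key patch
    return rpe_bias(n, table) + kp_term[None, :]              # broadcast over queries

def attention(q, k, v, bias):
    """Single-head scaled dot-product attention with an additive position bias."""
    logits = q @ k.T / np.sqrt(q.shape[-1]) + bias
    logits -= logits.max(axis=-1, keepdims=True)              # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `w` set to zeros the keypoint term vanishes and `kp_rpe_bias` reduces to `rpe_bias`. With non-zero `w`, moving the keypoints changes the bias even though the patch grid is fixed, which is the property that lets the attention pattern follow the face rather than the image frame.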

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques