SCE-MAE: Selective Correspondence Enhancement with Masked Autoencoder for Self-Supervised Landmark Estimation

Kejia Yin; Varshanth R. Rao; Ruowei Jiang; Xudong Liu; Parham Aarabi; David B. Lindell

arXiv:2405.18322·cs.CV·May 29, 2024

Open Access

TL;DR

SCE-MAE introduces a self-supervised framework using masked autoencoders and a novel correspondence refinement method to improve facial landmark estimation without annotated data.

Contribution

The paper proposes SCE-MAE, a new self-supervised landmark estimation method that leverages masked autoencoders and a local correspondence refinement strategy.

Findings

01 Outperforms state-of-the-art methods by roughly 20-44% on landmark matching.

02 Improves on state-of-the-art methods by roughly 9-15% on landmark detection.

03 Demonstrates robustness and effectiveness across extensive experiments.

Abstract

Self-supervised landmark estimation is a challenging task that demands the formation of locally distinct feature representations to identify sparse facial landmarks in the absence of annotated data. To tackle this task, existing state-of-the-art (SOTA) methods (1) extract coarse features from backbones that are trained with instance-level self-supervised learning (SSL) paradigms, which neglect the dense prediction nature of the task, (2) aggregate them into memory-intensive hypercolumn formations, and (3) supervise lightweight projector networks to naively establish full local correspondences among all pairs of spatial features. In this paper, we introduce SCE-MAE, a framework that (1) leverages the MAE, a region-level SSL method that naturally better suits the landmark prediction task, (2) operates on the vanilla feature map instead of on expensive hypercolumns, and (3) employs a…
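To make the "full local correspondences among all pairs of spatial features" step concrete, here is a minimal sketch of the naive all-pairs matching that the abstract attributes to existing methods. It is not the authors' implementation and it does not show SCE-MAE's own refinement step. The feature maps are random toy data, and the helper names (`cosine`, `full_correspondence`, `toy_feature_map`), the patch count, and the feature dimension are all assumptions made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def full_correspondence(features_a, features_b):
    """Similarity between every spatial feature in A and every one in B.

    For N features per image this builds an N x N matrix, so the cost
    grows quadratically with the number of spatial locations.
    """
    return [[cosine(fa, fb) for fb in features_b] for fa in features_a]


def toy_feature_map(num_patches, dim, seed):
    """Hypothetical stand-in for a backbone's flattened feature map."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_patches)]


# Two toy "images", each with a 4x4 grid of patches and 8-dim features.
feats_a = toy_feature_map(num_patches=16, dim=8, seed=0)
feats_b = toy_feature_map(num_patches=16, dim=8, seed=1)

sim = full_correspondence(feats_a, feats_b)

# Nearest-neighbour match in image B for each patch of image A.
best_match = [max(range(len(row)), key=row.__getitem__) for row in sim]
```

With 16 patches per image the matrix already has 256 entries, which illustrates why treating every pair of locations equally becomes costly on larger feature maps.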

Taxonomy

Topics: AI in cancer detection · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis

Methods: Masked autoencoder