LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition

Jinghan You; Shanglin Li; Yuanrui Sun; Jiangchuan Wei; Mingyu Guo; Chao Feng; Jiao Ran

arXiv:2501.13420·cs.CV·August 18, 2025

Open Access · 1 model

TL;DR

LVFace introduces a ViT-based face recognition model with Progressive Cluster Optimization, achieving state-of-the-art results and demonstrating scalability and robustness in large-scale and real-world scenarios.

Contribution

The paper presents LVFace, a novel ViT-based face recognition framework with PCO, improving performance and stability over CNN-inspired training methods.

Findings

01. LVFace surpasses leading face recognition models on multiple benchmarks.

02. LVFace achieves first place in the ICCV 2021 Masked Face Recognition Challenge.

03. LVFace demonstrates scalability and compatibility with mainstream vision and language models.

Abstract

Vision Transformers (ViTs) have revolutionized large-scale visual modeling, yet remain underexplored in face recognition (FR), where CNNs still dominate. We identify a critical bottleneck: CNN-inspired training paradigms fail to unlock ViT's potential, leading to suboptimal performance and convergence instability. To address this challenge, we propose LVFace, a ViT-based FR model that integrates Progressive Cluster Optimization (PCO) to achieve superior results. Specifically, PCO proceeds in three sequential stages: negative class sub-sampling (NCS) for robust and fast feature alignment from random initialization, feature expectation penalties for centroid stabilization, and cluster boundary refinement through full-batch training without NCS constraints. LVFace establishes a new state-of-the-art face recognition baseline, surpassing leading approaches such as UniFace and TopoFR across multiple…
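
To make the first PCO stage more concrete, the following is a minimal NumPy sketch of negative class sub-sampling as it is commonly understood: each step keeps the class centers of the identities present in the batch and only a random fraction of the remaining (negative) centers, then computes a softmax loss over that reduced set. This is an illustration under stated assumptions, not the authors' implementation. The function name `ncs_softmax_loss`, the `sample_rate` and `scale` parameters, and the plain cosine-softmax loss are all hypothetical choices; the abstract does not specify the loss, the sampling rate, or the form of the feature expectation penalty, so none of those are reproduced here.

```python
import numpy as np

def ncs_softmax_loss(features, labels, centers, sample_rate=0.25, scale=32.0, rng=None):
    """Cosine-softmax loss over positive centers plus a random subset of negatives.

    features: (batch, dim) embeddings; labels: (batch,) integer class ids;
    centers: (num_classes, dim) class centers. All names are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_classes = centers.shape[0]

    # Classes that appear in the batch must always be kept.
    positives = np.unique(labels)
    num_keep = max(len(positives), int(round(sample_rate * num_classes)))

    # Fill the remaining slots with randomly chosen negative classes.
    negatives = np.setdiff1d(np.arange(num_classes), positives)
    sampled = rng.choice(negatives, size=num_keep - len(positives), replace=False)
    kept = np.sort(np.concatenate([positives, sampled]))

    # Map original class ids to their positions in the reduced center set.
    remapped = np.searchsorted(kept, labels)

    # Cosine similarity between normalized features and kept centers.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = centers[kept] / np.linalg.norm(centers[kept], axis=1, keepdims=True)
    logits = scale * (f @ w.T)

    # Numerically stable log-softmax and mean cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss = -log_probs[np.arange(len(labels)), remapped].mean()
    return loss, kept
```

In this sketch, setting `sample_rate=1.0` keeps every class center, which corresponds loosely to the final stage the abstract describes as training "without NCS constraints".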

Code & Models

Models

🤗 bytedance-research/LVFace (model · 979 downloads · 27 likes)

Taxonomy

Topics: Face recognition and analysis