Face, Whole-Person, and Object Classification in a Unified Space Via The Interleaved Multi-Domain Identity Curriculum
Thomas M Metz, Matthew Q Hill, Alice J O'Toole

TL;DR
This paper introduces the Interleaved Multi-Domain Identity Curriculum (IMIC), a training method enabling foundation models to perform object, face, and person recognition tasks simultaneously in a shared embedding space without catastrophic forgetting.
Contribution
The paper presents IMIC, an interleaved training schedule that fine-tunes a foundation model on several recognition tasks concurrently while maintaining generalization; the authors report that it outperforms prior methods.
Findings
IMIC enables multi-task recognition in a single embedding space.
EVA-02 and CLIP models performed comparably with domain experts on all four tasks concurrently.
The approach preserves out-of-distribution generalization.
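To make "a single embedding space" concrete, here is a minimal sketch of how one similarity function could serve face, body, and object queries once every image maps to the same vector space. The three-dimensional vectors, the labels, and the helper names `cosine_similarity` and `nearest_label` are illustrative assumptions and do not come from the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def nearest_label(query, gallery):
    """Return the label of the gallery entry most similar to the query."""
    best = max(gallery, key=lambda item: cosine_similarity(query, item[1]))
    return best[0]

# Hypothetical gallery: made-up embeddings standing in for the output of
# one shared backbone applied to a face, a whole-body image, and an object.
gallery = [
    ("face:alice", [0.9, 0.1, 0.0]),
    ("body:bob", [0.1, 0.9, 0.1]),
    ("object:mug", [0.0, 0.1, 0.9]),
]

match = nearest_label([0.8, 0.2, 0.0], gallery)
```

Because all entries share one space, the same lookup handles every task; no task-specific head or separate index is assumed here.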
Abstract
Vision foundation models can perform generalized object classification in zero-shot mode, and face/person recognition when they are fine-tuned. However, fine-tuned models suffer from catastrophic forgetting. We create models that perform four tasks (object recognition, face recognition from high- and low-quality images, and person recognition from whole-body images) in a single embedding space -- without incurring substantial catastrophic forgetting. To accomplish this, we introduce two variants of the Interleaved Multi-Domain Identity Curriculum (IMIC): a gradient-coupled, interleaving training schedule that fine-tunes a foundation backbone simultaneously on all four tasks. The IMIC method proved effective with three foundation model bases: DINOv3, CLIP, and EVA-02. Two of these (EVA-02 and CLIP) performed comparably with domain experts on all four tasks concurrently and were more…
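The contrast between sequential fine-tuning and a gradient-coupled update can be illustrated with a toy problem. The sketch below is not the authors' IMIC implementation, and the listing does not describe the two variants; it assumes a single scalar parameter, one quadratic loss per task, and made-up targets and hyperparameters, purely to show why combining gradients from all tasks before each update avoids overwriting earlier tasks.

```python
def grad(w, target):
    """Gradient of the toy per-task loss (w - target)**2 with respect to w."""
    return 2.0 * (w - target)

def train_sequential(targets, w=0.0, lr=0.05, steps_per_task=200):
    """Fine-tune on one task at a time; later tasks overwrite earlier ones."""
    for target in targets.values():
        for _ in range(steps_per_task):
            w -= lr * grad(w, target)
    return w

def train_coupled(targets, w=0.0, lr=0.05, updates=200):
    """Average the gradients of all tasks, then take one step per update."""
    for _ in range(updates):
        total = sum(grad(w, target) for target in targets.values())
        w -= lr * total / len(targets)
    return w

def total_loss(w, targets):
    """Sum of the toy losses over every task."""
    return sum((w - target) ** 2 for target in targets.values())

# Hypothetical per-task optima standing in for the four tasks in the abstract.
targets = {
    "object": 1.0,
    "face_high_quality": 2.0,
    "face_low_quality": 3.0,
    "whole_body": 4.0,
}

w_seq = train_sequential(targets)
w_joint = train_coupled(targets)
```

In this toy setting the sequential run ends at the optimum of the last task only, while the coupled run settles at a compromise with a lower summed loss across tasks. Real backbones have millions of parameters and tasks that conflict less directly, so this is an analogy for the forgetting problem rather than a model of it.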
Taxonomy
Topics: Face recognition and analysis · Face Recognition and Perception · Face and Expression Recognition
