Integrating ConvNeXt and Vision Transformers for Enhancing Facial Age Estimation

Gaby Maroun; Salah Eddine Bekhouche; Fadi Dornaika

arXiv:2511.00123 · cs.CV · November 4, 2025

Open Access

TL;DR

This paper introduces a hybrid model combining ConvNeXt and Vision Transformers for facial age estimation, achieving superior accuracy on benchmark datasets by leveraging the strengths of both architectures.

Contribution

The study presents a novel ConvNeXt-ViT hybrid architecture that enhances age estimation accuracy through integrated CNN and transformer features, with extensive evaluation and ablation studies.

Findings

01 Outperforms traditional age estimation methods
02 Achieves lower mean absolute error on benchmark datasets
03 Highlights importance of adapted attention mechanisms
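
Since the findings are stated in terms of mean absolute error (MAE), a small worked example may help make the metric concrete. The sketch below computes MAE as the average absolute gap, in years, between true and predicted ages. The function name and the sample ages are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute gap, in years, between true and predicted ages."""
    if len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must have the same length")
    if not true_ages:
        raise ValueError("inputs must be non-empty")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical ages for illustration only (not data from the paper).
true_ages = [25, 40, 31, 58]
predicted_ages = [27, 37, 31, 60]

# Absolute errors are 2, 3, 0 and 2 years, so the mean is 7 / 4.
mae = mean_absolute_error(true_ages, predicted_ages)
print(f"MAE: {mae:.2f} years")
```

A lower MAE means predictions sit closer to the true ages on average, which is the sense in which the paper reports improved performance.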

Abstract

Age estimation from facial images is a complex and multifaceted challenge in computer vision. In this study, we present a novel hybrid architecture that combines ConvNeXt, a state-of-the-art advancement of convolutional neural networks (CNNs), with Vision Transformers (ViT). While each model independently delivers excellent performance on a variety of tasks, their integration leverages the complementary strengths of the CNN's localized feature extraction and the Transformer's global attention mechanisms. Our proposed ConvNeXt-ViT hybrid solution was thoroughly evaluated on benchmark age estimation datasets, including MORPH II, CACD, and AFAD, and achieved superior performance in terms of mean absolute error (MAE). To address computational constraints, we leverage pre-trained models and systematically explore different configurations, using linear layers and advanced…
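
To illustrate the general idea of pairing local feature extraction with global attention, here is a toy, dependency-free sketch. It is not the authors' implementation: the function names, the hand-written patch statistics standing in for a convolutional stage, the single attention head with identity projections, the sequential local-then-global ordering, and the head weights are all assumptions made for illustration. A real system would presumably use pre-trained ConvNeXt and ViT models instead.

```python
import math


def local_stage(image, patch=2):
    """Toy stand-in for a convolutional stage: summarize each
    patch x patch block as a small feature vector (one token per block)."""
    height, width = len(image), len(image[0])
    tokens = []
    for r in range(0, height - patch + 1, patch):
        for c in range(0, width - patch + 1, patch):
            block = [image[r + i][c + j]
                     for i in range(patch) for j in range(patch)]
            mean = sum(block) / len(block)
            spread = max(block) - min(block)
            tokens.append([mean, spread])
    return tokens


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_stage(tokens):
    """Toy stand-in for a Transformer stage: one head of scaled
    dot-product self-attention with identity Q/K/V projections."""
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        mixed.append([sum(w * value[i] for w, value in zip(weights, tokens))
                      for i in range(dim)])
    return mixed


def regression_head(tokens, weights, bias):
    """Mean-pool the tokens, then apply one linear layer to get a scalar."""
    dim = len(tokens[0])
    pooled = [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]
    return sum(w * x for w, x in zip(weights, pooled)) + bias


# Hypothetical 4x4 grayscale "image" and arbitrary, untrained head weights.
image = [
    [0.1, 0.2, 0.8, 0.9],
    [0.1, 0.3, 0.7, 0.9],
    [0.5, 0.5, 0.2, 0.1],
    [0.4, 0.6, 0.2, 0.3],
]
local_tokens = local_stage(image)          # local features, one per patch
mixed_tokens = global_stage(local_tokens)  # each token attends to all others
predicted_age = regression_head(mixed_tokens, weights=[30.0, 10.0], bias=20.0)
print(f"Toy predicted age: {predicted_age:.2f}")
```

The point of the sketch is the data flow: the first stage only sees one patch at a time, while the attention stage lets every patch representation draw on every other before a linear head produces the age estimate.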

Taxonomy

Topics: Face recognition and analysis · Facial Rejuvenation and Surgery Techniques · Facial Nerve Paralysis Treatment and Research