A Flatter Loss for Bias Mitigation in Cross-dataset Facial Age Estimation

Ali Akbari; Muhammad Awais; Zhen-Hua Feng; Ammarah Farooq; Josef Kittler

arXiv:2010.10368·cs.CV·October 28, 2020

TL;DR

This paper introduces a new loss function for neural network training that mitigates learning-algorithm bias and improves generalization in cross-dataset facial age estimation, where training and test images differ in their characteristics.

Contribution

It proposes a novel, smoother loss function that enhances training stability and performance in cross-dataset age estimation tasks.

Findings

01 Outperforms state-of-the-art methods in accuracy

02 Demonstrates better generalization across datasets

03 Facilitates more stable neural network training

Abstract

Most existing studies in facial age estimation assume that training and test images are captured under similar shooting conditions. However, this assumption rarely holds in real-world applications, where training and test sets usually have different characteristics. In this paper, we advocate a cross-dataset protocol for age estimation benchmarking. To improve cross-dataset age estimation performance, we mitigate the inherent bias caused by the learning algorithm itself. To this end, we propose a novel loss function that is more effective for neural network training. The relative smoothness of the proposed loss function is its advantage with regard to the optimisation process performed by stochastic gradient descent (SGD). Compared with existing loss functions, the lower gradient of the proposed loss function leads to the convergence of SGD to a better optimum point, and…
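The cross-dataset protocol mentioned in the abstract can be illustrated with a minimal sketch: train on one dataset, then score on every other dataset rather than on a held-out split of the same one. The abstract does not define the evaluation metric or the proposed loss, so everything below is an assumption made for illustration: mean absolute error (MAE) as the score, the helper names `mae`, `cross_dataset_mae`, and `fit_mean_baseline`, and the toy data, which are placeholders and not results from the paper.

```python
from statistics import mean


def mae(predicted, actual):
    """Mean absolute error, in years, between predicted and true ages."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def cross_dataset_mae(fit, datasets):
    """Train on each dataset and test on every *other* dataset.

    `fit(features, ages)` must return a callable mapping features to
    predicted ages. `datasets` maps a name to a (features, ages) pair.
    """
    scores = {}
    for train_name, (train_x, train_y) in datasets.items():
        predict = fit(train_x, train_y)
        for test_name, (test_x, test_y) in datasets.items():
            if test_name == train_name:
                continue  # skip the intra-dataset case
            scores[(train_name, test_name)] = mae(predict(test_x), test_y)
    return scores


def fit_mean_baseline(features, ages):
    """Hypothetical stand-in model: always predicts the mean training age."""
    avg = mean(ages)
    return lambda xs: [avg for _ in xs]


# Toy placeholder data (assumed for illustration, not benchmark values).
toy = {
    "set_a": ([0, 1, 2], [20, 30, 40]),
    "set_b": ([0, 1], [50, 70]),
}
scores = cross_dataset_mae(fit_mean_baseline, toy)
```

In practice `fit` would wrap the training of an age-estimation network; the baseline here only keeps the sketch self-contained. Each key in `scores` is a (train, test) pair, so the gap between datasets is visible per direction.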

Taxonomy

Methods: Stochastic Gradient Descent