When a Zero-Shooter Cheats: Improving Age Estimation via Activation Steering
Erik Imgrund, Pia Hanfeld, Klim Kireev, Konrad Rieck

TL;DR
This paper identifies a shortcut in vision-language models for age estimation, where models rely on identity recognition rather than visual features, and proposes an activation steering method to improve accuracy.
Contribution
The paper introduces an activation steering technique to mitigate the identity shortcut in VLMs, enhancing age estimation accuracy on various benchmarks.
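For readers unfamiliar with the term, the following is a minimal sketch of one common form of activation steering: projecting a direction out of a hidden activation vector. It is not the authors' method, whose details are not given in this summary. The difference-of-means direction, the toy three-dimensional activations, and the names `steering_direction`, `steer`, and `alpha` are all assumptions made for illustration.

```python
# Generic activation-steering sketch (illustrative assumption, NOT the paper's method).
# Activations are plain Python lists standing in for a model's hidden states.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    """Scale a non-zero vector to unit length."""
    norm = dot(v, v) ** 0.5
    return [a / norm for a in v]

def mean_vector(rows):
    """Column-wise mean of a list of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def steering_direction(acts_with, acts_without):
    """Hypothetical direction: unit-norm difference of the two group means."""
    mean_with = mean_vector(acts_with)
    mean_without = mean_vector(acts_without)
    return unit([a - b for a, b in zip(mean_with, mean_without)])

def steer(hidden, direction, alpha=1.0):
    """Subtract alpha times the component of `hidden` along `direction`."""
    coeff = dot(hidden, direction)
    return [h - alpha * coeff * d for h, d in zip(hidden, direction)]

# Toy data: the first coordinate plays the role of the unwanted signal.
acts_with = [[2.0, 1.0, 0.0], [4.0, -1.0, 0.0]]
acts_without = [[0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]

direction = steering_direction(acts_with, acts_without)
steered = steer([2.0, 1.0, 0.0], direction)
```

With `alpha=1.0` the component along the direction is removed entirely; smaller values only attenuate it. In a real VLM the same arithmetic would be applied to a chosen layer's activations during the forward pass, and which layer and direction to use is exactly the part this sketch leaves open.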
Findings
Activation steering reduces mean absolute error by up to 25%.
On celebrity images, which dominate popular benchmarks, the shortcut makes VLMs appear deceptively robust to noise and adversarial perturbations.
When non-celebrities are misidentified as celebrities, the shortcut leads to substantially incorrect age predictions.
Abstract
Different age-related regulations have been proposed to protect minors from harmful content and interactions online. Automated age estimation is central to enforcing such regulations, and vision-language models (VLMs) achieve state-of-the-art performance on this task. However, we find that the zero-shot nature of VLM-based age estimation produces an unexpected side effect we call the identity shortcut: Instead of estimating age from visual features, VLMs tend to identify the depicted person and infer their age from memorized knowledge. This phenomenon leads to substantially incorrect predictions when non-celebrities are misidentified as celebrities. It also produces deceptively high robustness to noise and adversarial perturbations on celebrity images, which dominate popular benchmarks. To mitigate this, we propose an activation steering method that suppresses the shortcut by…
