FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition
Dexter Neo, Tsuhan Chen

TL;DR
This paper introduces FER-C, a soft-label benchmark for calibrating facial expression recognition (FER) models. It argues that soft ground-truth labels reflect expression ambiguity under out-of-distribution (OOD) shifts better than hard labels do, and shows that calibration benefits FER models.
Contribution
It proposes a novel soft calibration benchmark for FER that accounts for OOD shifts and introduces soft labels to better reflect facial expression ambiguity.
Findings
Calibration benefits five state-of-the-art FER algorithms evaluated on the benchmark
Soft labels better capture expression ambiguity
Benchmark highlights OOD shift challenges
Abstract
We present a soft benchmark for calibrating facial expression recognition (FER). While prior works have focused on identifying affective states, we find that FER models are uncalibrated. This is particularly true when out-of-distribution (OOD) shifts further exacerbate the ambiguity of facial expressions. While most OOD benchmarks provide hard labels, we argue that the ground-truth labels for evaluating FER models should be soft in order to better reflect the ambiguity behind facial behaviours. Our framework proposes soft labels that closely approximate the average information loss based on different types of OOD shifts. Finally, we show the benefits of calibration on five state-of-the-art FER algorithms tested on our benchmark.
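To make the two ideas in the abstract concrete, here is a minimal sketch of what "uncalibrated" and "soft label" mean in practice. It is not the paper's method: the binned expected calibration error (ECE) is a standard metric swapped in for illustration, and the function names, the three-class setup, and all numbers are hypothetical assumptions.

```python
import math


def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE (illustrative, not the FER-C metric).

    confidences: top-class probability per sample, each in [0, 1].
    correct: 1 if the top-class prediction was right, else 0.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


def soft_cross_entropy(pred, soft_label, eps=1e-12):
    """Cross-entropy of a predicted distribution against a soft label."""
    return -sum(q * math.log(p + eps) for p, q in zip(pred, soft_label))


# Hypothetical model that reports 90% confidence but is right 3 times out of 4.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])

# Hypothetical ambiguous face: annotators split across three expressions.
soft_label = [0.6, 0.3, 0.1]
overconfident = [0.98, 0.01, 0.01]
matched = [0.6, 0.3, 0.1]
loss_overconfident = soft_cross_entropy(overconfident, soft_label)
loss_matched = soft_cross_entropy(matched, soft_label)
```

Under a hard label, both predictions above pick the correct top class and look equally good. Scored against the soft label, the overconfident prediction is penalized more, which is the kind of distinction a soft calibration benchmark is meant to expose.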
Taxonomy
Topics: Emotion and Mood Recognition · Face and Expression Recognition · EEG and Brain-Computer Interfaces
