Gender Bias in Emotion Recognition by Large Language Models
Maureen Herbert, Katie Sun, Angelica Lim, Yasaman Etesam

TL;DR
This paper investigates gender bias in emotion recognition by large language models and finds that training-based debiasing strategies are more effective than prompt engineering alone.
Contribution
It proposes and evaluates several debiasing strategies for LLM emotion recognition, comparing training-based interventions with inference-time prompting.
Findings
Training-based interventions yield meaningful reductions in gender bias.
Prompt engineering alone is insufficient for bias mitigation.
Debiasing improves fairness in emotion recognition by LLMs.
Abstract
The rapid advancement of large language models (LLMs) and their growing integration into daily life underscore the importance of evaluating and ensuring their fairness. In this work, we examine fairness within the domain of emotional theory of mind, investigating whether LLMs exhibit gender biases when presented with a description of a person and their environment and asked, "How does this person feel?" Furthermore, we propose and evaluate several debiasing strategies, demonstrating that achieving meaningful reductions in bias requires training-based interventions rather than relying solely on inference-time, prompt-based approaches such as prompt engineering.
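The evaluation setup described in the abstract (same scene description, different gender, same question) can be illustrated with a small counterfactual probe. This is a minimal sketch under stated assumptions, not the authors' protocol: the templates, the `toy_predict` stand-in for an LLM call, and the total variation distance used as the bias measure are all hypothetical choices made for illustration. Only the question string comes from the abstract.

```python
from collections import Counter

QUESTION = "How does this person feel?"  # question quoted in the abstract

# Hypothetical scene templates; the paper's actual descriptions are not shown here.
TEMPLATES = [
    "A {person} is sitting alone on a park bench at dusk.",
    "A {person} is laughing with friends at a birthday party.",
]


def build_prompt(template: str, gender_term: str) -> str:
    """Fill in the gendered term and append the emotion question."""
    return f"{template.format(person=gender_term)} {QUESTION}"


def toy_predict(prompt: str) -> str:
    """Deliberately biased stand-in for an LLM (assumption, for illustration only)."""
    if "woman" in prompt and "alone" in prompt:
        return "sadness"
    if "alone" in prompt:
        return "calm"
    return "joy"


def label_distribution(predict, templates, gender_term):
    """Relative frequency of each predicted emotion label for one gender term."""
    counts = Counter(predict(build_prompt(t, gender_term)) for t in templates)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two label distributions (0 = identical)."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)


dist_woman = label_distribution(toy_predict, TEMPLATES, "woman")
dist_man = label_distribution(toy_predict, TEMPLATES, "man")
gap = total_variation(dist_woman, dist_man)
print(dist_woman, dist_man, gap)
```

With a real model, `toy_predict` would be replaced by a call that returns one emotion label per prompt. A gap of 0 means the two gendered variants received identical label distributions; larger values indicate that the predicted emotions shift with the gender term alone.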
Taxonomy
Topics: Emotion and Mood Recognition · Multimodal Machine Learning Applications · Topic Modeling
