Facial-R1: Aligning Reasoning and Recognition for Facial Emotion Analysis

Jiulong Wu; Yucheng Shen; Lingyong Yan; Haixin Sun; Deguo Xia; Jizhou Huang; Min Cao

arXiv:2511.10254 · cs.CV · November 14, 2025

TL;DR

Facial-R1 is a novel three-stage framework that improves facial emotion analysis by aligning reasoning with recognition, reducing hallucinations, and enhancing interpretability using minimal supervision and a new large-scale dataset.

Contribution

The paper introduces Facial-R1, a three-stage alignment framework with a new dataset, addressing hallucinated reasoning and misalignment issues in facial emotion analysis.

Findings

01. Achieves state-of-the-art performance on FEA benchmarks
02. Demonstrates strong generalization across datasets
03. Provides robust interpretability of emotion reasoning

Abstract

Facial Emotion Analysis (FEA) extends traditional facial emotion recognition by incorporating explainable, fine-grained reasoning. The task integrates three subtasks: emotion recognition, facial Action Unit (AU) recognition, and AU-based emotion reasoning to model affective states jointly. While recent approaches leverage Vision-Language Models (VLMs) and achieve promising results, they face two critical limitations: (1) hallucinated reasoning, where VLMs generate plausible but inaccurate explanations due to insufficient emotion-specific knowledge; and (2) misalignment between emotion reasoning and recognition, caused by fragmented connections between observed facial features and final labels. We propose Facial-R1, a three-stage alignment framework that effectively addresses both challenges with minimal supervision. First, we employ instruction fine-tuning to establish basic emotional…
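To make the three subtasks concrete, here is a minimal sketch of how one FEA prediction could be represented, together with a toy check of whether the predicted Action Units support the predicted emotion label. Everything in it is an assumption for illustration: the `FEAPrediction` class, the `alignment_score` function, and the small `PROTOTYPE_AUS` table (commonly cited FACS-style AU combinations) are hypothetical and do not come from the paper, and the score is only a rough proxy for the reasoning/recognition misalignment the abstract describes, not the authors' method.

```python
from dataclasses import dataclass
from typing import Set

# Illustrative prototype AU sets for a few emotions (commonly cited
# FACS-style combinations). Assumption for this sketch, not from the paper.
PROTOTYPE_AUS = {
    "happiness": {6, 12},
    "sadness": {1, 4, 15},
    "surprise": {1, 2, 5, 26},
}


@dataclass
class FEAPrediction:
    """One model output covering the three FEA subtasks (hypothetical schema)."""
    emotion: str            # emotion recognition
    action_units: Set[int]  # facial Action Unit (AU) recognition
    reasoning: str          # AU-based emotion reasoning, as free text


def alignment_score(pred: FEAPrediction) -> float:
    """Fraction of the label's prototype AUs that were actually predicted.

    Returns 0.0 when the emotion has no entry in the toy table.
    """
    proto = PROTOTYPE_AUS.get(pred.emotion)
    if not proto:
        return 0.0
    return len(proto & pred.action_units) / len(proto)


aligned = FEAPrediction("happiness", {6, 12, 25}, "Cheeks raised and lip corners pulled.")
misaligned = FEAPrediction("sadness", {6, 12}, "Cheeks raised and lip corners pulled.")
print(alignment_score(aligned), alignment_score(misaligned))
```

In this toy setup, a low score flags a prediction whose observed facial features do not support its final label, which is the kind of fragmented connection the abstract lists as the second limitation.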

Code & Models

Datasets

QBiscuits/FEA-20K
dataset · 34 downloads

Videos

Facial-R1: Aligning Reasoning and Recognition for Facial Emotion Analysis

Taxonomy

Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Explainable Artificial Intelligence (XAI)