FED-Bench: A Cross-Granular Benchmark for Disentangled Evaluation of Facial Expression Editing

Fengjian Xue; Xuecheng Wu; Heli Sun; Yunyun Shi; Shi Chen; Liangyu Fu; Jinheng Xie; Dingkang Yang; Hao Wang; Junxiao Xue; Liang He

arXiv:2603.29697 · cs.CV · April 1, 2026

TL;DR

FED-Bench introduces a benchmark and a multi-dimensional evaluation protocol for facial expression editing. It targets two gaps: existing editing benchmarks lack high-quality facial images and matching editing instructions, and current metrics are biased toward lazy or overfit editing.

Contribution

The paper provides a scalable benchmark with a detailed evaluation suite, and demonstrates the benchmark's utility by showing that additional training data improves model performance.

Findings

01. Current models struggle with high-fidelity, accurate expression editing.

02. FED-Score effectively disentangles evaluation dimensions, reducing bias.

03. Fine-grained instruction following is the main bottleneck in current approaches.

Abstract

Facial expression image editing requires fine-grained control to strictly preserve human identity and background while precisely manipulating expression. However, existing editing benchmarks primarily focus on general scenarios, lacking high-quality facial images and corresponding editing instructions. Furthermore, current evaluation metrics exhibit systemic biases in this task, often favoring lazy editing or overfit editing. To bridge these gaps, we propose FED-Bench, a comprehensive benchmark featuring rigorous testing and an accurate evaluation suite. First, we carefully construct a benchmark of 747 triplets through a cascaded and scalable pipeline, each comprising an original image, an editing instruction, and a ground-truth image for precise evaluation. Second, we introduce FED-Score, a cross-granularity evaluation protocol that disentangles assessment into three dimensions:…
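
The triplet structure described in the abstract (original image, editing instruction, ground-truth image) can be illustrated with a small sketch. This is only an assumption about how such records might be held in memory, not the authors' data format: the class name `EditTriplet`, the field names, the helper `build_triplets`, and the use of file paths are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical field names; the paper's actual schema is not given in the abstract.
REQUIRED_FIELDS = ("original_image", "instruction", "ground_truth_image")


@dataclass(frozen=True)
class EditTriplet:
    """One benchmark item: source image, instruction, ground-truth image."""
    original_image: str      # assumed: path to the unedited face image
    instruction: str         # natural-language editing instruction
    ground_truth_image: str  # assumed: path to the reference edited image


def build_triplets(records):
    """Convert plain dicts into EditTriplet objects, rejecting incomplete records."""
    triplets = []
    for index, record in enumerate(records):
        # A record is unusable for evaluation if any of the three parts is absent.
        missing = [name for name in REQUIRED_FIELDS if not record.get(name)]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")
        triplets.append(EditTriplet(*(record[name] for name in REQUIRED_FIELDS)))
    return triplets
```

Keeping the three parts in one immutable record makes it harder to pair an instruction with the wrong reference image when a model's output is later compared against the ground truth.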
