Beyond Overlap Metrics: Rewarding Reasoning and Preferences for Faithful Multi-Role Dialogue Summarization

Xiaoyong Mei; Tingting Zuo; Da Chen; Guangyu Hu; Xiangyu Wen; Chao Duan; Mingyan Zhang; Fudan Zheng

arXiv:2604.17188·cs.CL·April 29, 2026

TL;DR

This paper introduces a framework for multi-role dialogue summarization that combines distilled reasoning traces with reward-based optimization, aiming to improve factual faithfulness and alignment with human preferences rather than overlap metrics alone.

Contribution

It couples explicit reasoning traces with reward optimization to enhance faithfulness and preference alignment in multi-role dialogue summarization.

Findings

01

Matches strong baselines on ROUGE and BERTScore

02

Improves factual faithfulness on SAMSum dataset

03

Demonstrates stable semantic consistency on CSDS

Abstract

Multi-role dialogue summarization requires modeling complex interactions among multiple speakers while preserving role-specific information and factual consistency. However, most existing methods optimize for automatic metrics such as ROUGE and BERTScore, which favor surface-level imitation of references rather than genuine gains in faithfulness or alignment with human preferences. We propose a novel framework that couples explicit cognitive-style reasoning with reward-based optimization for multi-role dialogue summarization. Our method first distills structured reasoning traces (e.g., step-by-step inferences and intermediate reflections) from a large teacher model and uses them as auxiliary supervision to initialize a reasoning-aware summarizer via staged supervised fine-tuning. It then applies GRPO with a dual-principle reward that blends metric-based signals with human-aligned…
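To make the reward stage concrete, here is a minimal sketch of how a blended reward could feed GRPO-style group-relative advantages. It is an illustration under stated assumptions, not the authors' implementation: the abstract is truncated before it specifies the reward, so the unigram-F1 metric stand-in, the `preference_score` inputs, the linear blend, and the weight `alpha` are all hypothetical. The one part taken from GRPO as generally described is that each sampled summary's reward is normalized against the mean and standard deviation of its own group.

```python
from collections import Counter
from statistics import mean, pstdev


def unigram_f1(candidate: str, reference: str) -> float:
    """ROUGE-1-style unigram F1 on whitespace tokens (simplified stand-in metric)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def blended_reward(metric_score: float, preference_score: float, alpha: float = 0.5) -> float:
    """Hypothetical linear blend; alpha weights the metric-based signal."""
    return alpha * metric_score + (1.0 - alpha) * preference_score


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its own group's mean and std (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    reference = "amanda will bring the cake to the party"
    # A group of summaries sampled for the same dialogue (made-up examples).
    candidates = [
        "amanda will bring the cake",
        "the party has a cake",
        "jerry will bring the cake to the party",
    ]
    # Assumed scores in [0, 1] from some preference or faithfulness judge.
    preference_scores = [0.9, 0.4, 0.1]

    rewards = [
        blended_reward(unigram_f1(c, reference), p)
        for c, p in zip(candidates, preference_scores)
    ]
    for cand, adv in zip(candidates, group_relative_advantages(rewards)):
        print(f"{adv:+.3f}  {cand}")
```

In this toy group, the third candidate overlaps heavily with the reference but names the wrong speaker, so a metric-only reward would rank it highly. The assumed preference score pulls its blended reward down, which is the kind of gap between overlap and faithfulness the abstract describes.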

Code & Models

Repositories

https://huggingface.co/collections/NebulaPixel/summorchestra-multirole-summary
