Learning to Refine with Fine-Grained Natural Language Feedback

Manya Wadhwa; Xinyu Zhao; Junyi Jessy Li; Greg Durrett

arXiv:2407.02397 · cs.CL · June 23, 2025 · 1 citation

TL;DR

This paper introduces a three-step refinement method, Detect, Critique, Refine (DCR), that uses fine-grained natural language feedback to improve the factual accuracy of LLM-generated summaries and outperforms existing refinement approaches.

Contribution

The paper proposes the DCR framework, which decomposes refinement into detection, critique, and refinement steps and assigns them to separate models to improve feedback quality and the factuality of the refined output.
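The three-step decomposition can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation: the function names, the `DCRResult` type, and the word-overlap toy models are all hypothetical stand-ins. In the setup the abstract describes, the detector would be a discriminative model and the critic and refiner would be prompted or fine-tuned LLMs.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interfaces for the three steps (names are assumptions, not from the paper).
Detector = Callable[[str, str], bool]            # (document, summary) -> True if an error is detected
Critic = Callable[[str, str], List[str]]         # (document, summary) -> fine-grained critiques
Refiner = Callable[[str, str, List[str]], str]   # (document, summary, critiques) -> revised summary


@dataclass
class DCRResult:
    refined: str
    critiques: List[str]


def detect_critique_refine(
    document: str,
    summary: str,
    detect: Detector,
    critique: Critic,
    refine: Refiner,
) -> DCRResult:
    """Run detection first; critique and refine only when an error is flagged."""
    if not detect(document, summary):
        # Step 1 found nothing to fix, so the summary passes through unchanged.
        return DCRResult(refined=summary, critiques=[])
    critiques = critique(document, summary)           # Step 2: natural language feedback
    refined = refine(document, summary, critiques)    # Step 3: edit using that feedback
    return DCRResult(refined=refined, critiques=critiques)


# Toy stand-ins based on word overlap, used only to make the sketch runnable.
def toy_detect(document: str, summary: str) -> bool:
    doc_words = set(document.lower().split())
    return any(w not in doc_words for w in summary.lower().split())


def toy_critique(document: str, summary: str) -> List[str]:
    doc_words = set(document.lower().split())
    return [
        f"'{w}' is not supported by the document"
        for w in summary.lower().split()
        if w not in doc_words
    ]


def toy_refine(document: str, summary: str, critiques: List[str]) -> str:
    # Drops unsupported words; a real refiner would rewrite using the critiques.
    doc_words = set(document.lower().split())
    return " ".join(w for w in summary.split() if w.lower() in doc_words)


if __name__ == "__main__":
    result = detect_critique_refine(
        "the cat sat on the mat",
        "the dog sat on the mat",
        toy_detect,
        toy_critique,
        toy_refine,
    )
    print(result.critiques)
    print(result.refined)
```

Passing the three steps in as separate callables mirrors the idea of offloading each competency to its own model, so any one of them can be swapped without touching the others.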

Findings

01. DCR outperforms existing refinement methods in factual consistency.

02. Separate critique models provide more effective feedback than end-to-end approaches.

03. Refinement benefits from offloading discrimination to specialized models.

Abstract

Recent work has explored the capability of large language models (LLMs) to identify and correct errors in LLM-generated responses. These refinement approaches frequently evaluate what sizes of models are able to do refinement for what problems, but less attention is paid to what effective feedback for refinement looks like. In this work, we propose looking at refinement with feedback as a composition of three distinct LLM competencies: (1) detection of bad generations; (2) fine-grained natural language critique generation; (3) refining with fine-grained feedback. The first step can be implemented with a high-performing discriminative model and steps 2 and 3 can be implemented either via prompted or fine-tuned LLMs. A key property of the proposed Detect, Critique, Refine ("DCR") method is that the step 2 critique model can give fine-grained feedback about errors, made possible by…

Code & Models

Repositories

manyawadhwa/dcr (Official)

Datasets

wadhma/dcr_data
dataset · 7 dl

Videos

Learning to Refine with Fine-Grained Natural Language Feedback

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Softmax · Attention Is All You Need