Learning to Refine with Fine-Grained Natural Language Feedback
Manya Wadhwa, Xinyu Zhao, Junyi Jessy Li, Greg Durrett

TL;DR
This paper introduces Detect, Critique, Refine (DCR), a three-step refinement method that uses fine-grained natural language feedback to improve the factual consistency of LLM-generated summaries and outperforms existing refinement approaches.
Contribution
The paper proposes the DCR framework, which decomposes refinement into three steps (detection, critique, and refinement) and assigns them to separate models in order to improve feedback quality and factual consistency.
Findings
DCR outperforms existing refinement methods in factual consistency.
Separate critique models provide more effective feedback than end-to-end approaches.
Refinement benefits from offloading discrimination to specialized models.
Abstract
Recent work has explored the capability of large language models (LLMs) to identify and correct errors in LLM-generated responses. These refinement approaches frequently evaluate what sizes of models are able to do refinement for what problems, but less attention is paid to what effective feedback for refinement looks like. In this work, we propose looking at refinement with feedback as a composition of three distinct LLM competencies: (1) detection of bad generations; (2) fine-grained natural language critique generation; (3) refining with fine-grained feedback. The first step can be implemented with a high-performing discriminative model and steps 2 and 3 can be implemented either via prompted or fine-tuned LLMs. A key property of the proposed Detect, Critique, Refine ("DCR") method is that the step 2 critique model can give fine-grained feedback about errors, made possible by…
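The three-step composition described in the abstract can be sketched as a small pipeline. This is an illustrative sketch only, not the authors' implementation: the function names, the callable signatures, the sentence-level granularity of detection, and the toy stand-ins for the three models are all assumptions made for the example. In practice the detector would be a discriminative model and the critic and refiner would be prompted or fine-tuned LLMs.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interfaces (assumptions; the source does not specify signatures).
Detector = Callable[[str, str], bool]            # (document, sentence) -> True if flagged
Critic = Callable[[str, str], str]               # (document, sentence) -> critique text
Refiner = Callable[[str, str, List[str]], str]   # (document, response, critiques) -> revision


@dataclass
class DCRResult:
    refined: str
    critiques: List[str]


def split_sentences(text: str) -> List[str]:
    # Naive period-based splitter, for illustration only.
    return [s.strip() + "." for s in text.split(".") if s.strip()]


def detect_critique_refine(
    document: str,
    response: str,
    detect: Detector,
    critique: Critic,
    refine: Refiner,
) -> DCRResult:
    # Step 1 (detect): flag sentences that look unsupported (granularity assumed).
    flagged = [s for s in split_sentences(response) if detect(document, s)]
    if not flagged:
        # Nothing flagged: return the response unchanged, skipping steps 2 and 3.
        return DCRResult(refined=response, critiques=[])
    # Step 2 (critique): one natural language critique per flagged sentence.
    critiques = [critique(document, s) for s in flagged]
    # Step 3 (refine): revise the response using the critiques as feedback.
    return DCRResult(refined=refine(document, response, critiques), critiques=critiques)


# Toy stand-ins for the three models (hypothetical, not from the paper).
def toy_detect(document: str, sentence: str) -> bool:
    # Flags a sentence if any of its words never appears in the document.
    doc_words = set(document.lower().replace(".", "").split())
    return any(w not in doc_words for w in sentence.lower().replace(".", "").split())


def toy_critique(document: str, sentence: str) -> str:
    return f"Not supported by the document: {sentence!r}"


def toy_refine(document: str, response: str, critiques: List[str]) -> str:
    # Drops every sentence that is quoted in some critique.
    kept = [s for s in split_sentences(response) if not any(s in c for c in critiques)]
    return " ".join(kept)


if __name__ == "__main__":
    doc = "The cat sat on the mat."
    result = detect_critique_refine(
        doc, "The cat sat on the mat. The cat flew.", toy_detect, toy_critique, toy_refine
    )
    print(result.refined)
    print(result.critiques)
```

Keeping the detector separate means the critic and refiner are only invoked on flagged content, which mirrors the idea of offloading discrimination to a specialized model.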
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Softmax · Attention Is All You Need
