REFINER: Reasoning Feedback on Intermediate Representations

Debjit Paul, Mete Ismayilzada, Maxime Peyrard, Beatriz Borges, Antoine Bosselut, Robert West, and Boi Faltings

arXiv:2304.01904 · cs.CL · February 6, 2024 · 31 citations

TL;DR

REFINER is a framework that enhances reasoning in language models by using a critic to provide automated feedback on intermediate steps, leading to improved accuracy without extensive human data.

Contribution

The paper introduces REFINER, a novel method for training language models to generate better intermediate reasoning steps through critic feedback, improving reasoning performance.

Findings

01

Significant improvements over baseline LMs of comparable scale on three diverse reasoning tasks.

02

With GPT-3.5 or ChatGPT as the reasoner, the trained critic improves reasoning without finetuning the reasoner.

03

Critic trained without human-in-the-loop data.

Abstract

Language models (LMs) have recently shown remarkable performance on reasoning tasks by explicitly generating intermediate inferences, e.g., chain-of-thought prompting. However, these intermediate inference steps may be inappropriate deductions from the initial context and lead to incorrect final predictions. Here we introduce REFINER, a framework for finetuning LMs to explicitly generate intermediate reasoning steps while interacting with a critic model that provides automated feedback on the reasoning. Specifically, the critic provides structured feedback that the reasoning LM uses to iteratively improve its intermediate arguments. Empirical evaluations of REFINER on three diverse reasoning tasks show significant improvements over baseline LMs of comparable scale. Furthermore, when using GPT-3.5 or ChatGPT as the reasoner, the trained critic significantly improves reasoning without…
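The interaction described in the abstract (a generator proposes intermediate steps, a critic returns structured feedback, and the generator revises) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `refine`, `toy_generator`, and `toy_critic` are hypothetical names, not the API of the debjitpaul/refiner repository, and the two models are replaced by hand-written stubs for a single arithmetic problem. In the paper the generator is a finetuned LM and the critic is a trained model.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Steps = List[str]            # intermediate reasoning steps, e.g. "3 * 4 = 12"
Feedback = Dict[str, Any]    # structured feedback: which step, and what is wrong


def _evaluate(a: str, op: str, b: str) -> int:
    """Evaluate a single 'a op b' expression (only + and * in this toy)."""
    return int(a) + int(b) if op == "+" else int(a) * int(b)


def toy_critic(problem: str, steps: Steps) -> Optional[Feedback]:
    """Stub critic (assumption): flags the first step whose arithmetic is wrong."""
    for i, step in enumerate(steps):
        lhs, rhs = step.split("=")
        a, op, b = lhs.split()
        if int(rhs) != _evaluate(a, op, b):
            return {"step": i, "message": f"the result in step {i + 1} is incorrect"}
    return None  # no feedback means the critic accepts the reasoning


def toy_generator(problem: str, previous: Optional[Steps],
                  feedback: Optional[Feedback]) -> Steps:
    """Stub generator (assumption): first draft is deliberately flawed,
    later drafts repair only the step the critic pointed at."""
    if previous is None or feedback is None:
        return ["3 * 4 = 12", "2 + 12 = 15"]
    i = feedback["step"]
    a, op, b = previous[i].split("=")[0].split()
    repaired = list(previous)
    repaired[i] = f"{a} {op} {b} = {_evaluate(a, op, b)}"
    return repaired


def refine(problem: str,
           generate: Callable[[str, Optional[Steps], Optional[Feedback]], Steps],
           critique: Callable[[str, Steps], Optional[Feedback]],
           max_iters: int = 3) -> Tuple[Steps, List[Steps]]:
    """Iterate generate -> critique -> regenerate until the critic is satisfied
    or the iteration budget is spent. Returns the final steps and all drafts."""
    steps = generate(problem, None, None)
    history = [steps]
    for _ in range(max_iters):
        feedback = critique(problem, steps)
        if feedback is None:
            break
        steps = generate(problem, steps, feedback)
        history.append(steps)
    return steps, history


final_steps, drafts = refine("2 + 3 * 4", toy_generator, toy_critic)
print(final_steps)
```

The loop keeps the critic separate from the generator, so in principle either component could be swapped out; the `max_iters` budget is an illustrative choice, not a value taken from the paper.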

Code & Models

Repositories

debjitpaul/refiner (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Cosine Annealing · Weight Decay · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Linear Warmup With Cosine Annealing · Adam