Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning
Debjit Paul, Robert West, Antoine Bosselut, Boi Faltings

TL;DR
This paper investigates the faithfulness of reasoning steps in large language models and introduces FRODO, a framework that improves the accuracy and robustness of reasoning by training models to generate and rely on correct intermediate inferences.
Contribution
The paper presents FRODO, a novel framework that enhances reasoning faithfulness and robustness in language models through causal and counterfactual training methods.
Findings
FRODO outperforms four baselines in reasoning tasks.
FRODO improves out-of-distribution generalization.
FRODO's rationales are more faithful to its final answer predictions than those produced by standard supervised fine-tuning.
Abstract
Large language models (LLMs) have been shown to perform better when asked to reason step-by-step before answering a question. However, it is unclear to what degree the model's final answer is faithful to the stated reasoning steps. In this paper, we perform a causal mediation analysis on twelve LLMs to examine how intermediate reasoning steps generated by the LLM influence the final outcome and find that LLMs do not reliably use their intermediate reasoning steps when generating an answer. To address this issue, we introduce FRODO, a framework to tailor small-sized LMs to generate correct reasoning steps and robustly reason over these steps. FRODO consists of an inference module that learns to generate correct reasoning steps using an implicit causal reward function and a reasoning module that learns to faithfully reason over these intermediate inferences using a counterfactual and…
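As a rough illustration of the question the abstract raises (does the final answer actually depend on the stated reasoning steps?), the sketch below holds the question fixed, swaps in a counterfactual reasoning chain, and counts how often the answer changes. Everything in it is an assumption made for illustration: `answer_fn`, `indirect_effect`, and the two toy models are hypothetical names, and this is not the paper's implementation or its exact metric.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interface: maps (question, reasoning_chain) to a final answer.
AnswerFn = Callable[[str, str], str]


def indirect_effect(
    answer_fn: AnswerFn,
    examples: Sequence[Tuple[str, str, str]],
) -> float:
    """Fraction of examples whose answer changes when only the reasoning
    chain is swapped for a counterfactual one (question held fixed)."""
    if not examples:
        raise ValueError("need at least one example")
    changed = 0
    for question, original_chain, counterfactual_chain in examples:
        base = answer_fn(question, original_chain)
        intervened = answer_fn(question, counterfactual_chain)
        changed += base != intervened
    return changed / len(examples)


# Toy stand-ins for a model (assumptions, for demonstration only).
def faithful_model(question: str, chain: str) -> str:
    # Reads the answer off the last token of the reasoning chain.
    return chain.split()[-1]


def unfaithful_model(question: str, chain: str) -> str:
    # Ignores the chain entirely and answers from the question alone.
    return "4" if "2 + 2" in question else "unknown"


examples = [
    ("What is 2 + 2?", "2 plus 2 equals 4", "2 plus 2 equals 5"),
    ("What is 3 + 3?", "3 plus 3 equals 6", "3 plus 3 equals 7"),
]

faithful_score = indirect_effect(faithful_model, examples)
unfaithful_score = indirect_effect(unfaithful_model, examples)
```

Under these toy assumptions, a score near 0 would suggest the answer ignores the reasoning chain, while a score near 1 would suggest the chain mediates the answer.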
Taxonomy
Topics: Complex Systems and Decision Making · Islamic Finance and Banking Studies · Advanced Text Analysis Techniques
