Do explanations generalize across large reasoning models?

Koyena Pal; David Bau; Chandan Singh

arXiv:2601.11517 · cs.CL · January 19, 2026

Open Access

TL;DR

This paper investigates whether the chain-of-thought explanations generated by large reasoning models (LRMs) generalize across models. It finds that they often do and that reinforcement learning enhances this generalization, while advising caution in interpreting these explanations.

Contribution

The study introduces a framework for evaluating explanation generalization across LRMs and proposes a sentence-level ensembling method to improve consistency.

Findings

1. Explanations often generalize across LRMs.
2. Reinforcement learning enhances explanation generalization.
3. Ensembling strategies improve answer consistency.

Abstract

Large reasoning models (LRMs) produce a textual chain of thought (CoT) in the process of solving a problem, which serves as a potentially powerful tool to understand the problem by surfacing a human-readable, natural-language explanation. However, it is unclear whether these explanations generalize, i.e. whether they capture general patterns about the underlying problem rather than patterns which are esoteric to the LRM. This is a crucial question in understanding or discovering new concepts, e.g. in AI for science. We study this generalization question by evaluating a specific notion of generalizability: whether explanations produced by one LRM induce the same behavior when given to other LRMs. We find that CoT explanations often exhibit this form of generalization (i.e. they increase consistency between LRMs) and that this increased generalization is correlated with human preference…
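The notion of generalization described in the abstract (an explanation from one LRM inducing the same behavior in other LRMs) can be illustrated with a small consistency measurement. The following is a minimal sketch, not the paper's implementation: the `Model` callable interface, the function names, the pairwise-agreement metric, and the stub models are all hypothetical and chosen only for illustration. A real setup would replace the stubs with calls to actual LRMs.

```python
from itertools import combinations
from typing import Callable, Dict, List, Optional

# Hypothetical interface (an assumption, not from the paper): a "model" maps
# a problem and an optional explanation to a final answer string.
Model = Callable[[str, Optional[str]], str]


def pairwise_consistency(answers: List[str]) -> float:
    """Fraction of model pairs whose answers match exactly."""
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0  # zero or one model: trivially consistent
    return sum(a == b for a, b in pairs) / len(pairs)


def consistency_with_and_without(
    problem: str, explanation: str, models: List[Model]
) -> Dict[str, float]:
    """Compare agreement between models with and without a shared explanation."""
    baseline = [m(problem, None) for m in models]
    conditioned = [m(problem, explanation) for m in models]
    return {
        "baseline": pairwise_consistency(baseline),
        "with_explanation": pairwise_consistency(conditioned),
    }


def make_stub(default_answer: str) -> Model:
    """Toy stand-in for an LRM: follows the explanation if one is given."""
    def model(problem: str, explanation: Optional[str]) -> str:
        return "4" if explanation else default_answer
    return model


# Three stub models that disagree on their own but agree once they are
# given the same explanation.
models = [make_stub("4"), make_stub("5"), make_stub("3")]
scores = consistency_with_and_without(
    "What is 2 + 2?", "Adding 2 and 2 gives 4.", models
)
print(scores)
```

In this toy setting, an explanation "generalizes" when the `with_explanation` score exceeds the `baseline` score. Exact string matching is a simplification; free-form answers would need normalization before comparison.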

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)