Beyond Language: Format-Agnostic Reasoning Subspaces in Large Language Models
Aojie Yuan, Zhiyuan Su

TL;DR
This paper identifies a shared, format-agnostic reasoning subspace in large language models, demonstrating its role in cross-form reasoning and its potential as a core internal substrate.
Contribution
The authors introduce the Format-Agnostic Reasoning Subspace (FARS), a common reasoning space shared across diverse symbolic formats in LLMs, and support it with experiments on five models (1.6B-8B) from three architecture families.
Findings
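Concept-centroid PCA extracts FARS as a 10-dimensional subspace; replacing only these 10 dimensions during cross-form patching preserves 90-96% of model output, versus 44-56% for full activation replacement.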
Replacing FARS dimensions disrupts reasoning, confirming their causal role.
FARS generalizes across concepts and architectures, supporting the Platonic Representation Hypothesis.
Abstract
Large language models represent the same reasoning in vastly different surface forms -- English prose, Python code, mathematical notation -- yet whether they share a common internal substrate across these symbolic systems remains unknown. We introduce the TriForm Benchmark (18 concepts x 6 forms x 3 instances = 324 stimuli) and study five LLMs (1.6B-8B) across three architecture families. Using permutation-corrected RSA, cross-form probing, and activation patching, we find converging evidence for a Format-Agnostic Reasoning Subspace (FARS) in middle layers. We make FARS concrete: concept-centroid PCA extracts a 10-dimensional subspace that amplifies concept structure 3x while suppressing form information to near zero. Replacing only these 10 dimensions during cross-form patching preserves 90-96% of model output -- far exceeding both full activation replacement (44-56%) and…
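The abstract's two concrete operations, concept-centroid PCA and subspace-only patching, can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `concept_centroid_subspace` and `patch_subspace`, the centering step, and the use of SVD to obtain principal directions are all choices made here for illustration, and the activations are assumed to be one hidden-state vector per stimulus taken from a single layer.

```python
import numpy as np


def concept_centroid_subspace(acts, concept_ids, k=10):
    """Return an orthonormal (k, d_model) basis from PCA over concept centroids.

    acts:        (n_stimuli, d_model) activations from one layer (assumed input).
    concept_ids: (n_stimuli,) integer concept label per stimulus.
    Hypothetical helper: averaging over all forms and instances of a concept
    is one way to wash out form-specific variation before running PCA.
    """
    concepts = np.unique(concept_ids)
    # One centroid per concept, averaged over every form and instance.
    centroids = np.stack([acts[concept_ids == c].mean(axis=0) for c in concepts])
    # Center the centroids so PCA captures between-concept variance.
    centered = centroids - centroids.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]


def patch_subspace(target, source, basis):
    """Swap in the source's coordinates within the subspace only.

    Everything orthogonal to the subspace is left as it was in `target`.
    """
    projector = basis.T @ basis  # (d_model, d_model) orthogonal projector
    return target + (source - target) @ projector
```

With the benchmark's 18 concepts, the centered centroid matrix has rank at most 17, so requesting k=10 directions is well defined. In an actual patching experiment, `target` and `source` would be activations for the same concept in two different forms, and the patched vector would be written back into the forward pass at the chosen layer; that hook is model-specific and is left out here.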
