Thinking in Different Spaces: Domain-Specific Latent Geometry Survives Cross-Architecture Translation
Marcus Armstrong, Navid Ayoobi, Arjun Mukherjee

TL;DR
This paper explores whether independently trained language models share compatible latent geometries and demonstrates that a linear projection can align their internal representations, enabling behavior correction without weight updates across diverse architectures and reasoning tasks.
Contribution
It introduces a method to align and intervene in different language models' internal states via linear projections, revealing domain-specific geometric properties and enabling cross-architecture behavioral correction.
Findings
Linear projections achieve moderate geometric alignment across heterogeneous models (R^2 = 0.50 on verbal reasoning, R^2 = 0.40 on mathematical reasoning).
Behavioral correction rates range from 8.5% to 50% across reasoning tasks.
Projection matrices collapse when transferred across reasoning domains, indicating domain-specific geometry.
Abstract
We investigate whether independently trained language models converge to geometrically compatible latent representations, and whether this compatibility can be exploited to correct model behavior at inference time without any weight updates. We learn a linear projection matrix that maps activation vectors from a large teacher model into the coordinate system of a smaller student model, then intervene on the student's residual stream during generation by substituting its internal state with the translated teacher representation. Across a fully crossed experimental matrix of 20 heterogeneous teacher-student pairings spanning mixture-of-experts, dense, code-specialized, and synthetically trained architectures, the Ridge projection consistently achieves R^2 = 0.50 on verbal reasoning and R^2 = 0.40 on mathematical reasoning, collapsing to R^2 = -0.22 under permutation control and R^2 = 0.01…
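The projection-fitting step described in the abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function names, the regularization strength `alpha`, the dimensions, and the noise level are all assumptions chosen for the example. It fits a closed-form ridge map from "teacher" activations to "student" activations, scores it with R^2 on held-out rows, and repeats the fit with shuffled pairings as a permutation control.

```python
import numpy as np

def fit_ridge_projection(teacher, student, alpha=1.0):
    """Closed-form ridge solution W = (T^T T + alpha I)^-1 T^T S.

    teacher: (n, d_teacher) activations, student: (n, d_student) activations.
    alpha is a hypothetical regularization strength, not a value from the paper.
    """
    d_teacher = teacher.shape[1]
    gram = teacher.T @ teacher + alpha * np.eye(d_teacher)
    return np.linalg.solve(gram, teacher.T @ student)

def r2_score(target, predicted):
    """Coefficient of determination pooled over all output dimensions."""
    ss_res = np.sum((target - predicted) ** 2)
    ss_tot = np.sum((target - target.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-ins for paired activations (assumed sizes, zero-mean data).
rng = np.random.default_rng(0)
n, d_teacher, d_student = 2000, 64, 32
teacher = rng.normal(size=(n, d_teacher))
true_map = rng.normal(size=(d_teacher, d_student))
student = teacher @ true_map + 8.0 * rng.normal(size=(n, d_student))

train, test = slice(0, 1500), slice(1500, None)

# Fit on paired rows, evaluate on held-out rows.
W = fit_ridge_projection(teacher[train], student[train])
r2_aligned = r2_score(student[test], teacher[test] @ W)

# Permutation control: break the teacher-student pairing before fitting.
perm = rng.permutation(1500)
W_perm = fit_ridge_projection(teacher[train][perm], student[train])
r2_permuted = r2_score(student[test], teacher[test] @ W_perm)

# A translated teacher state, i.e. the vector that would be substituted
# into the student's residual stream in the intervention step.
translated_state = teacher[test][0] @ W

print(f"aligned R^2:  {r2_aligned:.2f}")
print(f"permuted R^2: {r2_permuted:.2f}")
```

In this toy setup the aligned fit recovers a substantial share of the variance while the permuted fit does not, which is the qualitative pattern the abstract reports. With real models, `teacher` and `student` would be activations collected at chosen layers on the same prompts, and the substitution itself would need a model-specific hook.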
Taxonomy
Topics: Topic Modeling · Intelligent Tutoring Systems and Adaptive Learning · Multimodal Machine Learning Applications
