
TL;DR
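This paper develops a geometric theory of neural representations. It argues that probes should be stable under the affine changes of hidden coordinates that leave the final readout unchanged, and that cross-model probe transfer should go through a shared probe-visible quotient rather than the full hidden state.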
Contribution
It introduces a symmetry-aware framework for neural probing, identifying a hierarchy of stable probes and proposing quotient-based transfer for cross-model analysis.
Findings
Linear probes are the degree-1 (simplest) member of the hierarchy of coordinate-stable probes.
Degree-2 probes capture more structure than linear probes.
Quotient-based transfer improves cross-model probe portability.
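A small numerical sketch may help make the first two findings concrete. It assumes the symmetry is an invertible affine change of hidden coordinates, h' = A h + b, as the abstract describes for the final readout layer, and checks that a degree-2 probe q(h) = hᵀQh + wᵀh + c can be rewritten as another degree-2 probe in the new coordinates. The function names and the random test data are hypothetical and are not taken from the paper.

```python
import numpy as np

def eval_quadratic(Q, w, c, h):
    """Evaluate the degree-2 probe q(h) = h^T Q h + w^T h + c."""
    return h @ Q @ h + w @ h + c

def transport_quadratic_probe(Q, w, c, A, b):
    """Re-express a degree-2 probe in the coordinates h' = A h + b.

    Assumption: A is invertible. Returns (Q', w', c') such that
    q'(A h + b) == q(h) for every h.
    """
    A_inv = np.linalg.inv(A)
    Q_new = A_inv.T @ Q @ A_inv          # quadratic part
    v = A_inv.T @ w                      # linear part before the shift by b
    w_new = v - (Q_new + Q_new.T) @ b    # shift mixes quadratic into linear
    c_new = c - v @ b + b @ Q_new @ b    # and linear/quadratic into constant
    return Q_new, w_new, c_new

rng = np.random.default_rng(0)
d = 5
Q = rng.normal(size=(d, d))
w = rng.normal(size=d)
c = 0.7
A = rng.normal(size=(d, d)) + d * np.eye(d)  # shifted to keep A well-conditioned
b = rng.normal(size=d)

h = rng.normal(size=d)
h_new = A @ h + b

Q_new, w_new, c_new = transport_quadratic_probe(Q, w, c, A, b)
value_old = eval_quadratic(Q, w, c, h)
value_new = eval_quadratic(Q_new, w_new, c_new, h_new)

# Degree-1 special case: a probe with no quadratic part stays linear.
Q_lin, w_lin, c_lin = transport_quadratic_probe(np.zeros((d, d)), w, c, A, b)
```

Under these assumptions the two probe values agree, and a probe with no quadratic part stays linear after the change of coordinates. This is the sense in which each degree level is closed under the affine symmetry; it does not by itself reproduce the paper's uniqueness result.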
Abstract
Neural representations are not unique objects. Even when two systems realize the same downstream computation, their hidden coordinates may differ by reparameterization. A probe family intended to reveal structure already present in a representation should therefore be stable under the relevant representation symmetries rather than be tied to a particular basis. We study this group action in the tractable exact setting of the final readout layer, where equivalent realizations induce affine changes of hidden coordinates. The resulting symmetry principle singles out a unique hierarchy of shallow coordinate-stable probes, with linear probes as its degree-1 member. We also show that a natural object for cross-model probe transfer is a shared probe-visible quotient (the representation modulo directions invisible to the probe family) rather than the full hidden state. Experiments on synthetic…
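The two ideas in the abstract can be illustrated with a toy example, offered as a sketch under stated assumptions rather than as the paper's construction. First, for an affine readout y = W h + c, the coordinates h' = A h + b with readout W' = W A⁻¹, c' = c − W A⁻¹ b give the same output. Second, if the probe family is assumed to be the linear span of the rows of a matrix P, then directions in the null space of P are invisible to it, and projecting onto the row space of P gives coordinates on the quotient. The names `reparameterize_readout`, `visible_coordinates`, and `P` are hypothetical.

```python
import numpy as np

def reparameterize_readout(W, c, A, b):
    """Return (W', c') acting on h' = A h + b with W' h' + c' == W h + c.

    Assumption: A is invertible.
    """
    W_new = W @ np.linalg.inv(A)
    c_new = c - W_new @ b
    return W_new, c_new

def visible_coordinates(P, h):
    """Coordinates of h modulo directions invisible to the rows of P.

    Assumption: the probe family is the linear span of the rows of P,
    so the invisible directions are exactly the null space of P.
    """
    _, s, Vt = np.linalg.svd(P, full_matrices=False)
    rank = int(np.sum(s > 1e-10 * s.max()))
    return Vt[:rank] @ h  # orthonormal basis of the row space of P

rng = np.random.default_rng(0)
d, k, m = 6, 3, 2

# Readout equivalence under an affine change of hidden coordinates.
W = rng.normal(size=(k, d))
c = rng.normal(size=k)
A = rng.normal(size=(d, d)) + d * np.eye(d)  # shifted to keep A well-conditioned
b = rng.normal(size=d)
h = rng.normal(size=d)

W_new, c_new = reparameterize_readout(W, c, A, b)
y_old = W @ h + c
y_new = W_new @ (A @ h + b) + c_new

# Quotient by invisible directions: moving h along the null space of P
# leaves its visible coordinates unchanged.
P = rng.normal(size=(m, d))
null_dir = np.linalg.svd(P)[2][-1]  # last right singular vector lies in null(P)
z = visible_coordinates(P, h)
z_shifted = visible_coordinates(P, h + 3.0 * null_dir)
```

In this toy setting `y_old` and `y_new` coincide, and `z` equals `z_shifted`, so two hidden states that differ only along an invisible direction are identified in the quotient.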
