Mechanistic Evidence for Spectral Structures in Prior-Data Fitted Networks

Kaustubh Sharma, Srijan Tiwari, Ojasva Nema, and Parikshit Pareek

arXiv:2601.21731 · cs.LG · May 14, 2026

TL;DR

This paper demonstrates that Prior-Data Fitted Networks learn structured spectral representations that can be explicitly extracted as kernels, enabling efficient Bayesian inference.

Contribution

The paper provides mechanistic evidence of spectral structures in PFNs and introduces a decoder that recovers explicit kernels from PFN latents.

Findings

01. Spectral information is linearly decodable from the latent attention score.

02. Spectral directions are causally used for prediction: interventions along them are an order of magnitude more effective than interventions along random directions.

03. Reconstructed kernels support GP regression with a single forward pass.
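To make the third finding concrete, the following is a minimal sketch of GP regression once an explicit stationary kernel is available. It is not the paper's decoder: the spectrum (`freqs`, `weights`), the cosine-mixture kernel form, the noise variance, and the function names `spectral_kernel` and `gp_posterior` are all assumptions chosen for illustration. The spectrum is supplied by hand here, where the paper would obtain the kernel from PFN latents.

```python
import numpy as np

def spectral_kernel(x1, x2, freqs, weights):
    """Stationary kernel k(tau) = sum_i w_i * cos(2*pi*f_i*tau), with tau = x1 - x2.

    Assumed form for illustration: a discrete spectrum with non-negative weights.
    """
    tau = x1[:, None] - x2[None, :]
    phases = 2.0 * np.pi * freqs[:, None, None] * tau[None, :, :]
    return np.sum(weights[:, None, None] * np.cos(phases), axis=0)

def gp_posterior(x_train, y_train, x_test, freqs, weights, noise_var=1e-2):
    """Standard GP posterior mean and covariance via a Cholesky factorization."""
    K = spectral_kernel(x_train, x_train, freqs, weights)
    K += noise_var * np.eye(len(x_train))           # observation noise keeps K positive definite
    K_s = spectral_kernel(x_test, x_train, freqs, weights)
    K_ss = spectral_kernel(x_test, x_test, freqs, weights)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v
    return mean, cov

# Toy data: a single sinusoid whose frequency matches the hand-picked spectrum.
freqs = np.array([3.0])      # hypothetical "decoded" frequency
weights = np.array([1.0])    # hypothetical "decoded" spectral weight
x_train = np.linspace(0.0, 1.0, 20)
y_train = np.sin(2.0 * np.pi * 3.0 * x_train)
x_test = np.linspace(0.0, 1.0, 50)

mean, cov = gp_posterior(x_train, y_train, x_test, freqs, weights)
```

When the kernel's spectrum matches the data, the posterior mean tracks the underlying function closely, which is the sense in which an explicit kernel makes the downstream GP computation routine.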

Abstract

Prior-Data Fitted Networks (PFNs) enable amortized Bayesian inference in a single forward pass, yet their internal representations remain opaque. It is unknown whether PFNs encode identifiable Bayesian structure or merely memorize input-output mappings. We provide mechanistic evidence that PFNs learn structured spectral representations and that these can be extracted as explicit kernels. First, probing experiments across three architectures, including the publicly released TabPFN, show that spectral information is linearly decodable from the latent attention score and organized along a dominant principal axis. Activation patching and targeted subspace interventions establish that this information is causally used for prediction and concentrated in a low-dimensional subspace, with spectral directions an order of magnitude more effective than random ones. Crucially, these properties hold…
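The probing and subspace-intervention logic described in the abstract can be sketched on synthetic data. This is a toy illustration only, not the paper's experimental setup: the latents are random vectors with a planted direction rather than PFN attention scores, and the names `fit_probe`, `r2`, and `ablate` are hypothetical helpers. The sketch fits a linear probe for a spectral target, then compares projecting out the probe direction against projecting out a random direction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 32

# Synthetic stand-in for latents: a "frequency" target planted along one direction.
freq = rng.uniform(1.0, 5.0, size=n)
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
latents = np.outer(freq, direction) + 0.1 * rng.normal(size=(n, d))

def fit_probe(H, y):
    """Least-squares linear probe with an intercept (last coefficient)."""
    H1 = np.column_stack([H, np.ones(len(H))])
    w, *_ = np.linalg.lstsq(H1, y, rcond=None)
    return w

def r2(H, y, w):
    """Coefficient of determination of the probe's predictions on (H, y)."""
    pred = np.column_stack([H, np.ones(len(H))]) @ w
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

def ablate(H, u):
    """Project the direction u out of every row of H."""
    u = u / np.linalg.norm(u)
    return H - np.outer(H @ u, u)

w = fit_probe(latents, freq)
probe_dir = w[:-1]                 # direction the probe reads from
rand_dir = rng.normal(size=d)      # control: a random direction

r2_clean = r2(latents, freq, w)
r2_probe_ablate = r2(ablate(latents, probe_dir), freq, w)
r2_rand_ablate = r2(ablate(latents, rand_dir), freq, w)
```

In this toy setting, removing the probe direction leaves the probe with only its intercept, so decodability collapses, while removing a random direction costs comparatively little. A causal claim about a real network would additionally require measuring the effect on the model's predictions, which this sketch does not do.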
