K-Way Energy Probes for Metacognition Reduce to Softmax in Discriminative Predictive Coding Networks
Jon-Paul Cacioli

TL;DR
This study shows that K-way energy probes in discriminative predictive coding networks essentially reduce to softmax, challenging the idea that they read richer signals than softmax in this context.
Contribution
The paper gives an approximate reduction showing that the K-way energy margin is a monotone function of the log-softmax margin plus a residual, and supports it with empirical evidence across multiple conditions.
Findings
The K-way energy margin decomposes into a monotone function of the log-softmax margin plus a residual that is not trained to correlate with correctness.
Empirical tests show the probe consistently sits below softmax across conditions.
Final-state and trajectory-integrated training yield nearly identical AUROC_2 values.
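The decomposition in the first finding can be written schematically as follows. The notation is an assumption introduced here for illustration, not taken from the paper: E_k is the settled energy with class k clamped as target, p_k is the softmax probability of class k, k_1 and k_2 are the two lowest-energy hypotheses, g is some monotone function, and \rho is the residual.

```latex
\Delta E \;=\; E_{k_2} - E_{k_1}
\;=\; g\!\left(\log p_{k_1} - \log p_{k_2}\right) \;+\; \rho
```

Under this reading, any metacognitive signal carried by \Delta E beyond the softmax margin would have to come from \rho, which the training objective does not tie to correctness.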
Abstract
We present this as a negative result with an explanatory mechanism, not as a formal upper bound. Predictive coding networks (PCNs) admit a K-way energy probe in which each candidate class is fixed as a target, inference is run to settling, and the per-hypothesis settled energies are compared. The probe appears to read a richer signal source than softmax, since the per-hypothesis energy depends on the entire generative chain. We argue this appearance is misleading under the standard Pinchetti-style discriminative PC formulation. We present an approximate reduction showing that with target-clamped CE-energy training and effectively-feedforward latent dynamics, the K-way energy margin decomposes into a monotone function of the log-softmax margin plus a residual that is not trained to correlate with correctness. The decomposition predicts that the structural probe should track softmax…
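The reduction described in the abstract can be illustrated with a minimal numerical sketch of its limiting case. It assumes that latents stay exactly at their feedforward values, so the settled energy for hypothesis k is just the cross-entropy term at the output layer, optionally plus an untrained residual. The function names (`k_way_energies`, `margin_top2`) and the random logits are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def k_way_energies(logits, residual=None):
    # Energy for each clamped class hypothesis k.
    # ASSUMPTION: purely feedforward latents, so the energy is the
    # cross-entropy term -log softmax_k, plus an optional residual.
    energies = -log_softmax(logits)
    if residual is not None:
        energies = energies + residual
    return energies

def margin_top2(scores):
    # Gap between the best and second-best score (higher score = better).
    ordered = np.sort(scores, axis=-1)
    return ordered[..., -1] - ordered[..., -2]

rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 10))  # 5 inputs, K = 10 classes (toy data)

softmax_margin = margin_top2(log_softmax(logits))
energy_margin = margin_top2(-k_way_energies(logits))

# With a residual, the energy margin is the softmax margin plus noise
# that carries no information about correctness in this toy setup.
residual = 0.1 * rng.normal(size=logits.shape)
noisy_energy_margin = margin_top2(-k_way_energies(logits, residual))
```

With a zero residual the two margins coincide in this sketch, which is the sense in which the probe "reduces to softmax"; a nonzero residual only perturbs the margin.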
