Model-Centric Diagnostics: A Framework for Internal State Readouts

Fangzheng Wu; Brian Summa

arXiv:2601.16874·cs.CV·February 3, 2026

Open Access

TL;DR

This paper introduces a unified, model-centric diagnostic framework that interprets training states through internal readouts like gradients and entropy, aiding checkpointing and early stopping decisions.

Contribution

The paper unifies various internal diagnostics under a single geometric view of the model's training state, supporting more informed checkpointing and early-stopping decisions.

Findings

01. Preliminary experiments on ImageNet and COCO show potential benefits.

02. Different readouts provide complementary insights into training progress.

Abstract

We present a model-centric diagnostic framework that treats training state as a latent variable and unifies a family of internal readouts -- head-gradient norms, confidence, entropy, margin, and related signals -- as anchor-relative projections of that state. A preliminary version of this work introduced a head-gradient probe for checkpoint selection. In this version, we focus on the unifying perspective and structural diagnostics; full algorithmic details, theoretical analysis, and experimental validation will appear in a forthcoming paper. We outline the conceptual scaffold: any prediction head induces a local loss landscape whose geometry (gradient magnitude, curvature, sharpness) reflects how well the upstream features are aligned with the task. Different readout choices -- gradient norms, softmax entropy, predictive margin -- correspond to different projections of this geometry,…
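To make the readouts named in the abstract concrete, the sketch below computes confidence, entropy, margin, and a head-gradient norm for a single example. It is an illustration under stated assumptions, not the authors' method: it assumes a bias-free linear softmax head trained with cross-entropy, and the names `softmax` and `head_readouts` are hypothetical. Under those assumptions the gradient of the loss with respect to the head weights is the outer product of (probabilities minus one-hot label) and the feature vector, so its Frobenius norm factorizes into the product of the two vector norms.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_readouts(features, weights, label):
    """Readouts for one example under an assumed linear softmax head.

    features: list of floats (upstream feature vector h)
    weights:  list of rows, one per class (head matrix W, no bias)
    label:    integer index of the true class
    """
    # Logits of the assumed head: z = W h
    logits = [sum(w * h for w, h in zip(row, features)) for row in weights]
    probs = softmax(logits)

    ranked = sorted(probs, reverse=True)
    confidence = ranked[0]              # top-1 probability
    margin = ranked[0] - ranked[1]      # top-1 minus top-2 probability
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)

    # For cross-entropy, d(loss)/d(logits) = probs - onehot(label).
    residual = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    # d(loss)/dW = outer(residual, features), so the Frobenius norm
    # is ||residual|| * ||features||.
    residual_norm = math.sqrt(sum(r * r for r in residual))
    feature_norm = math.sqrt(sum(h * h for h in features))

    return {
        "confidence": confidence,
        "margin": margin,
        "entropy": entropy,
        "head_grad_norm": residual_norm * feature_norm,
    }


if __name__ == "__main__":
    # Toy head with all-zero weights: predictions are uniform over 3 classes.
    toy = head_readouts([1.0, 2.0], [[0.0, 0.0]] * 3, label=0)
    for name, value in toy.items():
        print(f"{name}: {value:.4f}")
```

All four quantities are functions of the same logits and features, which is one simple way to read the abstract's claim that different readouts are different projections of a shared underlying state. Note that the gradient norm here requires a label, whereas confidence, entropy, and margin do not.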

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis