Diagnosing Generalization Failures from Representational Geometry Markers

Chi-Ning Chou; Artem Kirsanov; Yao-Yuan Yang; SueYeon Chung

arXiv:2603.01879 · cs.LG · March 3, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces a top-down approach using representational geometry markers to predict and diagnose generalization failures in AI models, especially for out-of-distribution scenarios, improving model robustness and interpretability.

Contribution

It proposes a novel geometric marker-based method to forecast model performance on unseen data, moving beyond mechanistic explanations to system-level indicators.

Findings

1. Geometric properties of in-distribution object manifolds predict out-of-distribution performance.
2. Reductions in effective manifold dimensionality and utility forecast weaker OOD generalization.
3. Geometric patterns outperform ID accuracy in predicting transfer learning success.
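To make "effective manifold dimensionality" concrete, here is a minimal sketch that computes the participation ratio of a point cloud. This is an assumption made for illustration: the paper's own manifold measures are not reproduced here, and the participation ratio is a simpler, widely used proxy swapped in for them. The function name `participation_ratio` and the synthetic data are hypothetical.

```python
import numpy as np


def participation_ratio(points: np.ndarray) -> float:
    """Effective dimensionality of a point cloud (rows are samples).

    PR = (sum_i lam_i)^2 / sum_i lam_i^2, where lam_i are the eigenvalues
    of the sample covariance. PR ranges from 1 (all variance along one
    direction) up to the rank of the centered data.
    """
    centered = points - points.mean(axis=0, keepdims=True)
    # Squared singular values of the centered data are proportional to the
    # covariance eigenvalues; the constant cancels in the ratio.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    lam = singular_values ** 2
    return float(lam.sum() ** 2 / (lam ** 2).sum())


rng = np.random.default_rng(0)

# Hypothetical "class manifold": 500 feature vectors in 64-D that vary
# along only 3 latent directions.
low_dim = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 64))

# Comparison cloud that spreads variance over all 64 directions.
full_dim = rng.normal(size=(500, 64))

pr_low = participation_ratio(low_dim)
pr_full = participation_ratio(full_dim)
print(f"low-dimensional cloud: PR = {pr_low:.2f}")
print(f"isotropic cloud:       PR = {pr_full:.2f}")
```

In a diagnostic setting one would compute such a quantity per class on in-distribution features and track how it changes across models or training runs; whether this particular proxy tracks OOD performance is not something this sketch establishes.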

Abstract

Generalization, the ability to perform well beyond the training context, is a hallmark of biological and artificial intelligence, yet anticipating unseen failures remains a central challenge. Conventional approaches often take a "bottom-up" mechanistic route by reverse-engineering interpretable features or circuits to build explanatory models. While insightful, these methods often struggle to provide the high-level, predictive signals needed to anticipate failure in real-world deployment. Here, we propose a "top-down" approach to studying generalization failures inspired by medical biomarkers: identifying system-level measurements that serve as robust indicators of a model's future performance. Rather than mapping out detailed internal mechanisms, we systematically design and test network markers to probe structure-function links, identify prognostic indicators, and validate…

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 10 · Confidence 4

Strengths

* Fantastic scientific exposition of the idea, the design, and the results.
* The candidate OOD failure markers that were found are interesting and non-trivial. They may therefore trigger future research on the underlying mechanism, beyond their contribution to identifying OOD failures, leading to a breakthrough in our understanding of the issue.

Weaknesses

* The manuscript performs what the statistical literature calls "a fishing expedition" for markers. For a fixed set of datasets, had the authors tested thousands of markers, they could have reported selectively on the most promising ones, thus finding markers that succeed by chance and do not generalize to other datasets. I don't suspect the authors' ethics -- but they need to take measures against such a mistake. The authors can safeguard against random markers by calculating the number
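The selective-reporting concern in this review can be illustrated with a max-statistic permutation test, one standard safeguard when many candidate markers are screened against the same outcome. This is a sketch under stated assumptions: it is not necessarily the procedure the reviewer goes on to describe (the review text is cut off), and the function names, sample sizes, and synthetic data are all hypothetical.

```python
import numpy as np


def max_abs_corr(markers: np.ndarray, target: np.ndarray) -> float:
    """Largest |Pearson r| between any marker column and the target."""
    m = markers - markers.mean(axis=0)
    t = target - target.mean()
    denom = np.linalg.norm(m, axis=0) * np.linalg.norm(t)
    return float(np.max(np.abs(m.T @ t) / denom))


def max_stat_p_value(markers, target, n_perm=2000, seed=0):
    """Permutation p-value for the best marker, corrected for the search.

    The null distribution is the *maximum* |r| over all markers after
    shuffling the target, so it accounts for having tried every marker.
    """
    rng = np.random.default_rng(seed)
    observed = max_abs_corr(markers, target)
    null = np.array(
        [max_abs_corr(markers, rng.permutation(target)) for _ in range(n_perm)]
    )
    # The +1 terms keep the estimate strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, float(p_value)


rng = np.random.default_rng(1)
n_models, n_markers = 20, 1000

# Hypothetical setup: 1000 pure-noise markers measured on 20 models.
noise_markers = rng.normal(size=(n_models, n_markers))
ood_acc = rng.normal(size=n_models)

best_r, p_noise = max_stat_p_value(noise_markers, ood_acc)
print(f"best |r| among noise markers: {best_r:.2f}, corrected p = {p_noise:.3f}")

# Add one marker that genuinely tracks the outcome.
real_marker = ood_acc + 0.05 * rng.normal(size=n_models)
with_real = np.column_stack([noise_markers, real_marker])
_, p_real = max_stat_p_value(with_real, ood_acc)
print(f"corrected p with a genuine marker: {p_real:.4f}")
```

With only 20 models, the best of 1000 noise markers can look strongly correlated with the outcome, which is why the corrected p-value compares it against the best noise marker under shuffling and not against a single-marker null.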

Reviewer 02 · Rating 4 · Confidence 4

Strengths

1. The paper is extremely well-written, clear, and argues its central claims well. Despite relying heavily on prior work in GLUE, and therefore having little space to go over the theory, the authors do a good job of providing the necessary intuition for various concepts.
2. The empirical evaluation is comprehensive, covering many model architectures, datasets, and hyper-parameter configurations. Not only does this go a long way towards bolstering the authors' claims, I believe the existence of th

Weaknesses

1. While I appreciate the novelty of the medical framing, ultimately, none of the technical aspects of this framing translate into the framework. Instead of providing new insight, this perspective only seemed to confuse me. For example, the connection to "biomarkers" is far less important than emphasizing that measures of performance need to be task-relevant *as well as* descriptive of underlying mechanisms. The framing is not a deal-breaker, but I believe the paper would be stronger if it spent

Reviewer 03 · Rating 4 · Confidence 4

Strengths

1. The topic is of great importance: being able to reliably predict a model's OOD performance without access to the OOD dataset would be highly valuable.
2. The presentation of the whole work is clear and interesting, with some room for improvement regarding the technical details (which I'll discuss later).
3. I particularly appreciate the authors' debating current trends which overly focus on studying models using tools developed in mathematics or physics. The idea to draw more inspiration

Weaknesses

1. The main issue I see with this work is the lack of comparison to previous works studying the problem. While the authors do reference several papers in related works, they seem to miss the core works that study the same questions. For instance, [1] introduced the Tunnel Effect Hypothesis, showing that the drop in OOD performance is strongly correlated with the numerical rank of representations. Further, [2] refined the Tunnel Hypothesis, showing how the Tunnel Effect (and thus OOD performance)
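For readers unfamiliar with the "numerical rank of representations" mentioned in this review, the following is a minimal sketch of one common way to estimate it: count the singular values of the centered feature matrix above a relative threshold. The threshold `rel_tol`, the function name, and the synthetic features are assumptions for illustration; the exact definition used in the works the reviewer cites may differ.

```python
import numpy as np


def numerical_rank(features: np.ndarray, rel_tol: float = 1e-3) -> int:
    """Count singular values above rel_tol times the largest one.

    `features` has one row per input and one column per unit. The relative
    threshold is an illustrative choice, not a canonical value.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(singular_values > rel_tol * singular_values[0]))


rng = np.random.default_rng(0)

# Hypothetical "collapsed" layer: 300 inputs, 128 units, but activations
# span only 5 directions, plus negligible numerical noise.
collapsed = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 128))
collapsed += 1e-6 * rng.normal(size=collapsed.shape)

# Hypothetical layer whose activations use all 128 directions.
spread = rng.normal(size=(300, 128))

rank_collapsed = numerical_rank(collapsed)
rank_spread = numerical_rank(spread)
print(rank_collapsed, rank_spread)
```

Applying such a function layer by layer gives a rank profile over depth, which is the kind of quantity a comparison against rank-based baselines would need.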

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Cell Image Analysis Techniques · Explainable Artificial Intelligence (XAI)