Revisiting the Relation Between Robustness and Universality

M. Klabunde; L. Caspari; F. Lemmerich

arXiv:2510.19427·cs.LG·October 23, 2025

TL;DR

This paper revisits the modified universality hypothesis for adversarially robust models. High representational similarity holds in specific settings but is not consistent across datasets, and predictive behavior does not converge as robustness increases. Retraining the classifiers yields more universal predictions.

Contribution

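The study tests the generality of the modified universality hypothesis, finds only partial universality in specific settings, traces differing predictions to the classification layer, and shows that simple retraining of the classifiers makes predictive behavior more universal.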

Findings

01 High representational similarity in specific settings

02 Predictive behavior does not converge with increasing robustness

03 Retraining classifiers can make predictive behavior more universal

Abstract

The modified universality hypothesis proposed by Jones et al. (2022) suggests that adversarially robust models trained for a given task are highly similar. We revisit the hypothesis and test its generality. While we verify Jones et al.'s main claim of high representational similarity in specific settings, results are not consistent across different datasets. We also discover that predictive behavior does not converge with increasing robustness and thus is not universal. We find that differing predictions originate in the classification layer, but show that more universal predictive behavior can be achieved with simple retraining of the classifiers. Overall, our work points towards partial universality of neural networks in specific settings and away from notions of strict universality.
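
The abstract distinguishes two kinds of similarity between models: similarity of internal representations and similarity of predictions. As a rough illustration of that distinction, the sketch below computes one common representational similarity measure (linear CKA) and a simple prediction agreement rate. The abstract does not say which measures the authors use, so the choice of linear CKA, the agreement rate, and all function names here are assumptions made for illustration only.

```python
import math


def center_columns(m):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for row-major matrices a (n x p) and b (n x q)."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA between two representations of the same n inputs (assumed measure).

    Raises ZeroDivisionError if either representation is constant across inputs.
    """
    x, y = center_columns(x), center_columns(y)
    return cross_frobenius_sq(x, y) / math.sqrt(
        cross_frobenius_sq(x, x) * cross_frobenius_sq(y, y)
    )


def prediction_agreement(preds_a, preds_b):
    """Fraction of inputs on which two models predict the same label."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have equal length")
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


# Hypothetical toy data: reps_b is reps_a with its two features swapped and scaled by 2.
reps_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
reps_b = [[0.0, 2.0], [2.0, 0.0], [2.0, 2.0], [0.0, 0.0]]
print(linear_cka(reps_a, reps_b))
print(prediction_agreement([0, 1, 1, 2], [0, 1, 2, 2]))
```

Linear CKA is invariant to feature permutation and isotropic scaling, so the two toy representations count as the same even though their raw values differ. The two measures are independent: models can score high on one and low on the other, which is the kind of gap between representations and predictions that the abstract describes.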

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)