Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini, Adel Javanmard, Murat A. Erdogdu

TL;DR
This paper shows that in high-dimensional multi-index models, adversarially robust learning can be achieved by first performing standard feature learning and then robustly tuning a linear readout layer on top of the learned features; the additional sample complexity required for robustness is independent of the ambient dimension.
Contribution
It proves that the hidden directions of a multi-index model provide a Bayes optimal low-dimensional projection for adversarial robustness, so robust learning requires few additional samples beyond standard learning.
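For concreteness, the setting can be written as below. The notation ($U$, $g$, $h$, $\varepsilon$, $\mathrm{AR}$) is assumed here for illustration and may differ from the paper's own; the precise conditions are those stated in the paper.

```latex
% Multi-index target: the label depends on x only through k hidden directions (rows of U).
\mathbb{E}[y \mid x] = g(Ux), \qquad U \in \mathbb{R}^{k \times d}, \quad k \ll d

% Adversarial risk under the squared loss with norm-bounded perturbations.
\mathrm{AR}(f) = \mathbb{E}\Big[ \max_{\|\delta\|_2 \le \varepsilon} \big( f(x + \delta) - y \big)^2 \Big]

% The contribution, informally: under the paper's assumptions, it suffices to search over
% predictors that see the input only through the hidden directions.
\min_{f} \mathrm{AR}(f) = \min_{h} \mathrm{AR}\big( x \mapsto h(Ux) \big)
```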
Findings
Adversarially robust feature learning is shown to be as easy as standard feature learning.
The additional sample complexity for robustness does not depend on the ambient dimension.
Hidden directions serve as optimal projections for adversarial robustness.
Abstract
Recently, there have been numerous studies on feature learning with neural networks, specifically on learning single- and multi-index models where the target is a function of a low-dimensional projection of the input. Prior works have shown that in high dimensions, the majority of the compute and data resources are spent on recovering the low-dimensional projection; once this subspace is recovered, the remainder of the target can be learned independently of the ambient dimension. However, implications of feature learning in adversarial settings remain unexplored. In this work, we take the first steps towards understanding adversarially robust feature learning with neural networks. Specifically, we prove that the hidden directions of a multi-index model offer a Bayes optimal low-dimensional projection for robustness against ℓ2-bounded adversarial perturbations under the squared…
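The two-stage recipe summarized in the TL;DR (keep standard features, then robustly tune only a linear readout) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not the paper's algorithm: the ground-truth projection `U` stands in for the output of standard feature learning, the link function in `make_data` is hypothetical, the features are random ReLU units, and the inner maximization uses a simple projected gradient ascent under an l2 budget `eps`.

```python
import numpy as np

def features(X, U, W, b):
    """Random ReLU features of the projected input U x, scaled by 1/sqrt(m)."""
    return np.maximum(X @ U.T @ W.T + b, 0.0) / np.sqrt(W.shape[0])

def predict(X, U, W, b, a):
    """Linear readout a on top of the fixed features."""
    return features(X, U, W, b) @ a

def pgd_l2(X, y, U, W, b, a, eps, steps=10):
    """Search for per-sample perturbations with l2 norm <= eps that raise the squared loss."""
    m = W.shape[0]
    step = 2.5 * eps / steps
    delta = np.zeros_like(X)
    best_delta = delta.copy()
    best_loss = (predict(X, U, W, b, a) - y) ** 2
    for _ in range(steps):
        pre = (X + delta) @ U.T @ W.T + b
        resid = np.maximum(pre, 0.0) @ a / np.sqrt(m) - y
        # Gradient of the squared loss w.r.t. the input; it lies in the row span of U.
        grad = 2.0 * resid[:, None] * (((pre > 0) * a) @ W @ U) / np.sqrt(m)
        gnorm = np.linalg.norm(grad, axis=1, keepdims=True)
        delta = delta + step * grad / np.maximum(gnorm, 1e-12)
        # Project back onto the l2 ball of radius eps.
        dnorm = np.linalg.norm(delta, axis=1, keepdims=True)
        delta = delta * np.minimum(1.0, eps / np.maximum(dnorm, 1e-12))
        # Keep, per sample, the perturbation with the largest loss seen so far.
        loss = (predict(X + delta, U, W, b, a) - y) ** 2
        improved = loss > best_loss
        best_delta[improved] = delta[improved]
        best_loss = np.where(improved, loss, best_loss)
    return best_delta

def robust_readout(X, y, U, W, b, eps, epochs=200, lr=0.2):
    """Adversarially train only the linear readout; U, W, b stay fixed."""
    a = np.zeros(W.shape[0])
    for _ in range(epochs):
        delta = pgd_l2(X, y, U, W, b, a, eps)
        Phi = features(X + delta, U, W, b)
        a -= lr * 2.0 * Phi.T @ (Phi @ a - y) / len(y)
    return a

def make_data(n, d, k, rng):
    """Toy multi-index data (hypothetical link): y depends on x only through U x; needs k >= 2."""
    U = np.linalg.qr(rng.standard_normal((d, k)))[0].T  # k orthonormal rows
    X = rng.standard_normal((n, d))
    Z = X @ U.T
    y = np.tanh(Z[:, 0]) + 0.5 * Z[:, 1]
    return X, y, U

rng = np.random.default_rng(0)
X, y, U = make_data(n=200, d=20, k=2, rng=rng)
W = rng.standard_normal((64, 2)) / np.sqrt(2)
b = rng.standard_normal(64)
a = robust_readout(X, y, U, W, b, eps=0.1)
delta = pgd_l2(X, y, U, W, b, a, eps=0.1)
adv_loss = np.mean((predict(X + delta, U, W, b, a) - y) ** 2)
print(f"adversarial training loss: {adv_loss:.4f}")
```

Because the predictor in this sketch sees the input only through `U x`, every perturbation found by `pgd_l2` lies in the span of the hidden directions, so the inner maximization is effectively a k-dimensional problem regardless of the ambient dimension `d`.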
Peer Reviews
Decision: ICLR 2025 Poster
- The problem of obtaining theoretical guarantees on adversarial risk and the dependence on the structure of the data is well-motivated and of interest to the community.
- The direction of studying the interplay between feature learning and adversarial robustness appears to be novel.
- The paper is well written and describes the setup and the literature well.
- The proofs are easy to follow.
My main criticism of the paper is that the extent of the theoretical novelty in the analysis and unexpectedness of the result is unclear in the paper's presentation. The paper would improve by emphasizing the central difficulties in combining the finite-dimensional guarantees with feature learning results. Specifically, it would be useful to know what parts of the analysis would change if one were directly studying the setup of random features on finite dimensions. It appears that the role of th
The paper’s first contribution is that the ground-truth latent features are in fact the optimal features in terms of adversarial robustness. This is not especially surprising, but it plays an important role in informing the kinds of algorithms one should use to achieve an adversarially robust model. They then show how to characterize the sample complexity and model complexity for adversarially training just the second layer for a broad class of target functions when given access to a feature learni
The major weakness is the lack of extensive experiments. I think that the first row of Figure 1 is not a fair comparison, as it compares a model trained entirely with adversarial training against one where only the last layer is adversarially trained and the first layer is initialized to the ground-truth representation. In the second row, all models are forced to learn representations, which is a more realistic scenario. However, based on the high-level result that adversarial
- Although I do not have any particular knowledge in robust training, I found the paper accessible and well explained. The stakes are clearly presented in the introduction. Additionally, the setup and the assumptions are generally well motivated.
- Despite not having read the proofs in detail, I appreciated that the proof techniques originate from diverse ideas and prior works including statistical learning, with generalization bounds provided in Appendix C1, geometrical analysis from the low-d
- Theorems 4, 5, 6 and 7 rely on specific activations (ReLU / polynomial). What is the technical challenge arising when considering more general activations?
- Although it is interesting to include some proofs in the main text, the proof of Theorem 1 could easily be replaced with an outline of the proof ideas.
- On the other hand, it would be helpful for the reader to have a paragraph or two summarizing the main steps or ideas of the proofs of the main theorems.
- One of the main claims of thi
Taxonomy
Topics: Face and Expression Recognition
