Limitations of SGD for Multi-Index Models Beyond Statistical Queries

Daniel Barzilai; Ohad Shamir

arXiv:2602.05704·cs.LG·February 6, 2026

TL;DR

This paper studies the limitations of standard (vanilla) SGD for learning multi-index models. It argues that the Statistical Queries (SQ) framework is only loosely connected to SGD and can give incorrect predictions, and it develops a non-SQ framework whose results extend to neural networks.

Contribution

The paper introduces a non-SQ framework for analyzing the limitations of vanilla SGD on multi-index models, going beyond prior SQ-based analyses.

Findings

01 SQ framework can be misleading for SGD analysis

02 Standard SGD faces fundamental limitations in multi-index models

03 Results apply to deep neural network architectures

Abstract

Understanding the limitations of gradient methods, and stochastic gradient descent (SGD) in particular, is a central challenge in learning theory. To that end, a commonly used tool is the Statistical Queries (SQ) framework, which studies performance limits of algorithms based on noisy interaction with the data. However, it is known that the formal connection between the SQ framework and SGD is tenuous: Existing results typically rely on adversarial or specially-structured gradient noise that does not reflect the noise in standard SGD, and (as we point out here) can sometimes lead to incorrect predictions. Moreover, many analyses of SGD for challenging problems rely on non-trivial algorithmic modifications, such as restricting the SGD trajectory to the sphere or using very small learning rates. To address these shortcomings, we develop a new, non-SQ framework to study the limitations of…
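
To make the abstract's distinction concrete, here is a minimal sketch contrasting a vanilla online SGD step with the sphere-restricted variant it mentions as a common algorithmic modification. Everything specific here is an assumption chosen for illustration and not the paper's construction: the link function g(z) = z_1 * z_2, the two-layer ReLU student with fixed output weights, the dimensions, and the names `target`, `student`, `sgd_step`, and `run`.

```python
import numpy as np

def target(X, U):
    """Hypothetical multi-index target: depends on X only through the k projections X @ U.T."""
    Z = X @ U.T                # shape (n, k)
    return Z[:, 0] * Z[:, 1]   # illustrative link function g(z) = z_1 * z_2 (an assumption)

def student(X, W, a):
    """Two-layer ReLU network: first-layer weights W (m, d), fixed output weights a (m,)."""
    return np.maximum(X @ W.T, 0.0) @ a

def sgd_step(W, a, x, y, lr, project=False):
    """One online SGD step on the squared loss 0.5 * (f(x) - y)**2 with respect to W."""
    pre = W @ x                                   # pre-activations, shape (m,)
    err = np.maximum(pre, 0.0) @ a - y            # scalar residual f(x) - y
    grad = err * (a * (pre > 0))[:, None] * x[None, :]   # gradient, shape (m, d)
    W = W - lr * grad                             # vanilla SGD update
    if project:
        # The modification: renormalize each neuron back onto the unit sphere.
        W = W / np.linalg.norm(W, axis=1, keepdims=True)
    return W

def run(d=20, k=2, m=16, steps=200, lr=0.01, project=False, seed=0):
    rng = np.random.default_rng(seed)
    # Hidden index directions: k orthonormal rows in R^d.
    U = np.linalg.qr(rng.standard_normal((d, k)))[0].T
    W = rng.standard_normal((m, d)) / np.sqrt(d)
    a = rng.choice([-1.0, 1.0], size=m) / m
    for _ in range(steps):
        x = rng.standard_normal(d)                # fresh Gaussian sample each step
        y = target(x[None, :], U)[0]
        W = sgd_step(W, a, x, y, lr, project)
    return W, U
```

Calling `run(project=False)` follows the unmodified trajectory, while `run(project=True)` keeps every first-layer row at unit norm. The sketch only shows where the two procedures differ mechanically; it does not reproduce any result from the paper.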

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks · Generative Adversarial Networks and Image Synthesis