
TL;DR
This paper revisits the informativeness of gradients in deep learning. It gives a general bound on gradient variance and applies it to the Learning with Errors (LWE) problem, with implications for how efficiently gradient-based methods can learn and for deep learning attacks on LWE.
Contribution
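It introduces a general bound on the variance of the gradient, stated in terms of the pairwise independence of the target function class and the collision entropy of the input distribution, which clarifies when gradients carry too little information for gradient-based learning to succeed.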
Findings
The variance bound depends on the collision entropy of the input distribution and on a parameter measuring the pairwise independence of the target function class.
Application to Learning with Errors (LWE) shows practical relevance.
Experiments analyze deep learning attacks on LWE.
Abstract
In the past decade, gradient-based deep learning has revolutionized several applications. However, this rapid advancement has highlighted the need for a deeper theoretical understanding of its limitations. Research has shown that, in many practical learning tasks, the information contained in the gradient is so minimal that gradient-based methods require an exceedingly large number of iterations to achieve success. The informativeness of the gradient is typically measured by its variance with respect to the random selection of a target function from a hypothesis class. We use this framework and give a general bound on the variance in terms of a parameter related to the pairwise independence of the target function class and the collision entropy of the input distribution. Our bound scales as , where $ \tilde{\mathcal{O}}…
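To make the variance framework concrete, the following is a minimal sketch that computes the variance of the gradient over a random target drawn from a hypothesis class. Everything in it is an illustrative assumption rather than the paper's setup: the hypothesis class is the set of all parity functions on {-1, +1}^n (a standard pairwise-independent class under uniform inputs), the model is a single tanh unit with squared loss, and the function name `gradient_variance_over_parities` is hypothetical. The comparison value it returns is the classical Parseval-style bound for parities, not the bound stated in the abstract.

```python
import itertools
import numpy as np


def gradient_variance_over_parities(n=8, seed=0):
    """Exact gradient variance over all parity targets (illustrative sketch).

    Assumed setup, not taken from the paper: uniform inputs on {-1, +1}^n,
    model p_w(x) = tanh(w . x), loss 0.5 * (p_w(x) - f(x))^2.
    """
    rng = np.random.default_rng(seed)

    # Enumerate all 2^n inputs; the input distribution is uniform over them.
    X = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))

    # One parity target per subset S: chi_S(x) = product of x_i for i in S.
    masks = np.array(list(itertools.product([0, 1], repeat=n)), dtype=bool)
    targets = np.array([np.prod(X[:, m], axis=1) for m in masks])

    # Fixed model parameters at which the gradient is evaluated.
    w = rng.normal(size=n) / np.sqrt(n)
    preds = np.tanh(X @ w)
    # Gradient of the model output with respect to w, one row per input.
    grad_p = (1.0 - preds ** 2)[:, None] * X

    # Population gradient of the loss for every target f:
    # g_f = E_x[(p_w(x) - f(x)) * grad_w p_w(x)].
    grads = ((preds[None, :] - targets) @ grad_p) / X.shape[0]

    # Variance of g_f over a uniformly random target f.
    centered = grads - grads.mean(axis=0)
    variance = np.mean(np.sum(centered ** 2, axis=1))

    # Parseval-style upper bound for parities: E_x ||grad p||^2 / |H|.
    bound = np.mean(np.sum(grad_p ** 2, axis=1)) / targets.shape[0]
    return variance, bound


if __name__ == "__main__":
    var, bound = gradient_variance_over_parities(n=8)
    print(f"gradient variance over targets: {var:.3e}")
    print(f"Parseval-style upper bound:     {bound:.3e}")
```

In this toy setting the variance is at most the average squared gradient norm divided by the number of parities, so it shrinks exponentially in n. That is the sense in which a gradient can be nearly uninformative about which target generated the data.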
