More data speeds up training time in learning halfspaces over sparse vectors
Amit Daniely, Nati Linial, Shai Shalev-Shwartz

TL;DR
This paper demonstrates that more data can reduce training time in learning halfspaces over sparse vectors, revealing a tradeoff between sample size and computational complexity under certain hardness assumptions.
Contribution
It introduces a novel, non-cryptographic methodology for establishing computational-statistical gaps and shows that examples beyond the sample complexity limit make efficient learning of halfspaces over sparse vectors possible.
Findings
More data speeds up learning of sparse halfspaces.
Computational-statistical gaps are established under the assumption that refuting random formulas is hard.
Efficient learning becomes possible when significantly more examples are available than the sample complexity bound requires.
Abstract
The increased availability of data in recent years has led several authors to ask whether it is possible to use data as a computational resource. That is, if more data is available, beyond the sample complexity limit, is it possible to use the extra examples to speed up the computation time required to perform the learning task? We give the first positive answer to this question for a natural supervised learning problem: we consider agnostic PAC learning of halfspaces over 3-sparse vectors in {-1, 1, 0}^n. This class is inefficiently learnable using O(n/ε^2) examples. Our main contribution is a novel, non-cryptographic methodology for establishing computational-statistical gaps, which allows us to show that, under a widely believed assumption that refuting random 3CNF formulas is hard, it is impossible to efficiently learn this class…
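To make the learning problem concrete, here is a minimal sketch of the setup only, not of the paper's algorithm or its hardness reduction. It takes sparsity k = 3 over {-1, 0, 1}^n as the illustrative setting; the function names, the sparse dict representation, and the random label-flip noise model are all assumptions made for this example.

```python
import random

def sample_sparse_vector(n, k=3, rng=random):
    """Return a k-sparse vector in {-1, 0, 1}^n as a dict {index: sign}.

    Assumption: the support is chosen uniformly at random; the source
    does not fix any particular distribution.
    """
    support = rng.sample(range(n), k)
    return {i: rng.choice((-1, 1)) for i in support}

def halfspace_predict(w, b, x):
    """Label x by sign(<w, x> + b), touching only the k nonzero coordinates."""
    score = b + sum(w[i] * s for i, s in x.items())
    return 1 if score >= 0 else -1

def empirical_error(w, b, examples):
    """Fraction of labeled examples (x, y) that the halfspace misclassifies."""
    mistakes = sum(1 for x, y in examples if halfspace_predict(w, b, x) != y)
    return mistakes / len(examples)

def make_dataset(n, m, w_true, b_true, noise=0.1, seed=0):
    """Draw m examples labeled by a reference halfspace, flipping each label
    with probability `noise` (a hypothetical stand-in for agnostic labels)."""
    rng = random.Random(seed)
    data = []
    for _ in range(m):
        x = sample_sparse_vector(n, rng=rng)
        y = halfspace_predict(w_true, b_true, x)
        if rng.random() < noise:
            y = -y
        data.append((x, y))
    return data
```

In the agnostic setting the labels need not agree with any halfspace, so a learner is judged by how close its error comes to that of the best halfspace in the class. `empirical_error` is the quantity such a learner would try to minimize on its sample; doing so efficiently with few examples is what the abstract says is believed to be hard.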
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Imbalanced Data Classification Techniques
