Agnostic Sample Compression Schemes for Regression

Idan Attias; Steve Hanneke; Aryeh Kontorovich; Menachem Sadigurschi

arXiv:1810.01864 · cs.LG · February 6, 2024

TL;DR

This paper introduces new sample compression schemes for agnostic regression with various losses, demonstrating their size bounds and limitations, especially for linear models and specific loss functions.

Contribution

It constructs the first bounded-size approximate compression schemes for agnostic regression with the $ℓ_{p}$ loss, $p \in [1, \infty]$, including linear regression, gives exact schemes for specific losses, and establishes limitations for the others.

Findings

01

Approximate compression of size linear in dimension for linear regression.

02

Exact compression schemes of size linear in dimension for the $ℓ_{1}$ and $ℓ_{\infty}$ losses.

03

Non-existence of bounded-size exact schemes for every $ℓ_{p}$ loss with $p \in (1, \infty)$, refining a previous negative result for the $ℓ_{2}$ loss.

Abstract

We obtain the first positive results for bounded sample compression in the agnostic regression setting with the $ℓ_{p}$ loss, where $p \in [1, \infty]$. We construct a generic approximate sample compression scheme for real-valued function classes exhibiting exponential size in the fat-shattering dimension but independent of the sample size. Notably, for linear regression, an approximate compression of size linear in the dimension is constructed. Moreover, for $ℓ_{1}$ and $ℓ_{\infty}$ losses, we can even exhibit an efficient exact sample compression scheme of size linear in the dimension. We further show that for every other $ℓ_{p}$ loss, $p \in (1, \infty)$, there does not exist an exact agnostic compression scheme of bounded size. This refines and generalizes a negative result of David, Moran, and Yehudayoff for the $ℓ_{2}$ loss. We close by posing general open questions: for agnostic…
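To make "exact compression" concrete, here is a toy sketch in the simplest case: one-dimensional linear predictors through the origin, $h_w(x) = wx$, under the $ℓ_{1}$ loss. It is not the paper's construction. It only relies on the standard fact that the empirical $ℓ_{1}$ loss is convex and piecewise linear in $w$, so some minimizer passes exactly through a sample point. The function names (`l1_loss`, `compress`, `reconstruct`) and the restriction to through-origin predictors are assumptions made for illustration.

```python
def l1_loss(w, sample):
    """Average absolute error of the predictor h(x) = w * x on the sample."""
    return sum(abs(y - w * x) for x, y in sample) / len(sample)


def compress(sample):
    """Keep a single labeled point whose slope y / x minimizes the l1 loss.

    The loss is piecewise linear in w with breakpoints at the ratios y_i / x_i,
    so checking those ratios is enough to find an exact minimizer.
    """
    candidates = [(x, y) for x, y in sample if x != 0]
    if not candidates:
        # Every w has the same loss when all x are zero; keep nothing.
        return []
    best = min(candidates, key=lambda p: l1_loss(p[1] / p[0], sample))
    return [best]


def reconstruct(kept):
    """Recover the predictor's slope from the compressed set alone."""
    if not kept:
        return 0.0
    x, y = kept[0]
    return y / x


# Illustrative data (hypothetical): two points on the line y = x plus an outlier.
sample = [(1.0, 1.0), (2.0, 2.0), (1.0, 5.0)]
kept = compress(sample)
w = reconstruct(kept)
```

The compressed set has one point regardless of the sample size, and the reconstructed slope attains the minimum empirical $ℓ_{1}$ loss over all slopes. This mirrors, in dimension one only, the abstract's claim of exact compression of size linear in the dimension.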

Taxonomy

Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Markov Chains and Monte Carlo Methods

Methods: Linear Regression