Distributed Source Coding for Parametric and Non-Parametric Regression
Jiahui Wei, Elsa Dupraz, Philippe Mary

TL;DR
This paper studies the fundamental limits of compressing data for machine learning regression tasks. It establishes achievable rate-generalization error regions and shows that there is no trade-off between data reconstruction and regression performance, in both the asymptotic and finite block-length regimes.
Contribution
The paper extends conventional Wyner-Ziv coding schemes to study the convergence of the generalization error, analyzing the interplay between data compression and regression accuracy and providing new theoretical insights for goal-oriented communication systems.
Findings
No trade-off between data reconstruction and regression in the asymptotic regime
Achievable rate-generalization error regions are characterized for both parametric and non-parametric regression
Finite-length analysis confirms the absence of a trade-off in practical scenarios
Abstract
The design of communication systems dedicated to machine learning tasks is one key aspect of goal-oriented communications. In this framework, this article investigates the interplay between data reconstruction and learning from the same compressed observations, particularly focusing on the regression problem. We establish achievable rate-generalization error regions for both parametric and non-parametric regression, where the generalization error measures the regression performance on previously unseen data. The analysis covers both asymptotic and finite block-length regimes, providing fundamental results and practical insights for the design of coding schemes dedicated to regression. The asymptotic analysis relies on conventional Wyner-Ziv coding schemes which we extend to study the convergence of the generalization error. The finite-length analysis uses the notions of information…
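As background for the conventional Wyner-Ziv coding mentioned in the abstract, the following is a minimal Python sketch of the classical textbook Wyner-Ziv rate-distortion function for a jointly Gaussian source X and decoder side information Y under squared-error distortion. It is not the paper's rate-generalization error region, and the function names and parameters (var_x, rho, distortion) are hypothetical, chosen here only for illustration.

```python
import math


def conditional_variance(var_x, rho):
    """Variance of X given Y for jointly Gaussian (X, Y) with correlation rho."""
    return var_x * (1.0 - rho ** 2)


def wyner_ziv_rate_gaussian(var_x, rho, distortion):
    """Textbook quadratic-Gaussian Wyner-Ziv rate, in bits per sample.

    R(D) = max(0, 0.5 * log2(var(X|Y) / D)), where Y is available
    at the decoder only. Illustrative background, not the paper's result.
    """
    if distortion <= 0:
        raise ValueError("distortion must be positive")
    cond_var = conditional_variance(var_x, rho)
    if cond_var <= distortion:
        # Side information alone already meets the distortion target.
        return 0.0
    return 0.5 * math.log2(cond_var / distortion)


def rate_without_side_info(var_x, distortion):
    """Gaussian rate-distortion function with no side information."""
    if distortion <= 0:
        raise ValueError("distortion must be positive")
    if var_x <= distortion:
        return 0.0
    return 0.5 * math.log2(var_x / distortion)


if __name__ == "__main__":
    # Assumed example values: unit-variance source, correlation 0.5.
    for d in (0.75, 0.1875, 0.05):
        r_wz = wyner_ziv_rate_gaussian(1.0, 0.5, d)
        r_plain = rate_without_side_info(1.0, d)
        print(f"D={d}: with side info {r_wz:.3f} bits, without {r_plain:.3f} bits")
```

The comparison with rate_without_side_info only illustrates how correlated side information at the decoder lowers the rate needed for a given reconstruction distortion; how this rate relates to the generalization error of a regressor is the subject of the paper itself.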
Taxonomy
Topics: Neural Networks and Applications · Advanced Data Compression Techniques
