You Only Compress Once: Optimal Data Compression for Estimating Linear Models
Jeffrey Wong, Eskil Forsell, Randall Lewis, Tobias Mao, Matthew Wardrop

TL;DR
This paper introduces a data compression method that allows for efficient and accurate estimation of linear models directly from compressed data, facilitating faster research and deployment in decision-making systems.
Contribution
It proposes a unified compression and estimation strategy, conditionally sufficient statistics, that recovers the same parameter estimates and covariances from compressed data as from the original data, even when errors are autocorrelated within clusters.
Findings
Estimates from compressed data match those from original data.
Method handles autocorrelated errors within clusters.
Supports both interactive model development by researchers and deployment in engineering systems.
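The first finding can be illustrated with a minimal sketch. It assumes the simplest setting only: ordinary least squares with homoskedastic errors and discrete features, so that many rows share the same feature vector. Each group of identical rows is reduced to a count, a sum of outcomes, and a sum of squared outcomes. The function names (`compress`, `ols_from_compressed`, `ols_full`) and the simulated data are hypothetical and are not taken from the paper; the clustered, autocorrelated case is not covered here.

```python
import numpy as np

def compress(X, y):
    """Group identical feature rows; keep count, sum(y), sum(y^2) per group."""
    Xu, inv = np.unique(X, axis=0, return_inverse=True)
    inv = inv.ravel()  # ensure a 1-D index array across NumPy versions
    n = np.bincount(inv, minlength=len(Xu)).astype(float)
    s = np.bincount(inv, weights=y, minlength=len(Xu))
    ss = np.bincount(inv, weights=y ** 2, minlength=len(Xu))
    return Xu, n, s, ss

def ols_from_compressed(Xu, n, s, ss):
    """OLS coefficients and homoskedastic covariance from group statistics."""
    XtX = Xu.T @ (Xu * n[:, None])   # equals X'X on the full data
    Xty = Xu.T @ s                   # equals X'y on the full data
    beta = np.linalg.solve(XtX, Xty)
    # Residual sum of squares: y'y - 2 b'X'y + b'X'X b
    rss = ss.sum() - 2 * beta @ Xty + beta @ XtX @ beta
    dof = n.sum() - Xu.shape[1]
    cov = rss / dof * np.linalg.inv(XtX)
    return beta, cov

def ols_full(X, y):
    """Reference OLS fit on the uncompressed data."""
    XtX = X.T @ X
    beta = np.linalg.solve(XtX, X.T @ y)
    resid = y - X @ beta
    cov = resid @ resid / (len(y) - X.shape[1]) * np.linalg.inv(XtX)
    return beta, cov

# Simulated experiment (assumption): a binary treatment and a 3-level segment.
rng = np.random.default_rng(0)
N = 10_000
treat = rng.integers(0, 2, N)
seg = rng.integers(0, 3, N)
X = np.column_stack([np.ones(N), treat, seg == 1, seg == 2]).astype(float)
y = X @ np.array([1.0, 0.5, -0.2, 0.3]) + rng.normal(size=N)

Xu, n, s, ss = compress(X, y)
beta_c, cov_c = ols_from_compressed(Xu, n, s, ss)
beta_f, cov_f = ols_full(X, y)
print(len(y), "rows compressed to", len(Xu))
print(np.allclose(beta_c, beta_f), np.allclose(cov_c, cov_f))
```

In this sketch the compressed table has one row per distinct treatment and segment combination, and the coefficient and covariance estimates agree with the full-data fit up to floating-point error. Keeping the sum of squared outcomes is what allows the residual variance, and therefore the covariance, to be recovered without the raw rows.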
Abstract
Linear models are used in online decision making, such as in machine learning, policy algorithms, and experimentation platforms. Many engineering systems that use linear models achieve computational efficiency through distributed systems and expert configuration. While there are strengths to this approach, it is still difficult to have an environment that enables researchers to interactively iterate and explore data and models, as well as leverage analytics solutions from the open source community. Consequently, innovation can be blocked. Conditionally sufficient statistics is a unified data compression and estimation strategy that is useful for the model development process, as well as the engineering deployment process. The strategy estimates linear models from compressed data without loss on the estimated parameters and their covariances, even when errors are autocorrelated within…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Distributed Sensor Networks and Detection Algorithms
