The Interplay of Statistics and Noisy Optimization: Learning Linear Predictors with Random Data Weights
Gabriel Clara, Yazan Mash'al

TL;DR
This paper provides a unified analysis of gradient descent with random data weights in linear regression, revealing how different weighting schemes affect convergence, regularization, and statistical properties of estimators.
Contribution
It introduces a comprehensive framework for analyzing the effects of arbitrary continuous data weightings on gradient descent in linear models, connecting implicit regularization with weighted regression.
Findings
Characterizes implicit regularization from random weights.
Derives non-asymptotic convergence bounds.
Shows how weighting choices impact statistical performance.
Abstract
We analyze gradient descent with randomly weighted data points in a linear regression model, under a generic weighting distribution. This includes various forms of stochastic gradient descent and importance sampling, but also extends to weighting distributions with arbitrary continuous values, thereby providing a unified framework to analyze the impact of various kinds of noise on the training trajectory. We characterize the implicit regularization induced through the random weighting, connect it with weighted linear regression, and derive non-asymptotic bounds for convergence in first and second moments. Leveraging geometric moment contraction, we also investigate the stationary distribution induced by the added noise. Based on these results, we discuss how specific choices of weighting distribution influence both the underlying optimization problem and statistical properties of the…
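To make the setting in the abstract concrete, here is a minimal sketch of gradient descent on a least-squares objective where every data point receives a fresh random weight at each step. The update rule, the function names (`weighted_gd`, `minibatch_weights`, `uniform_weights`), the step size, and the synthetic data are all assumptions chosen for illustration; the paper's exact parameterization of the weights may differ. Under this assumed update, weights that are zero outside a random subset give mini-batch SGD, while continuous weights give a noisier relative of full-batch gradient descent.

```python
import numpy as np


def weighted_gd(X, y, step, n_iter, sample_weights, rng):
    """Gradient descent on least squares with fresh random data weights each step.

    Assumed update (illustrative, not taken from the paper):
        beta <- beta + step * X^T diag(w_k) (y - X beta),
    where w_k = sample_weights(rng, n) is drawn independently at every iteration.
    """
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        w = sample_weights(rng, n)      # one nonnegative weight per data point
        residual = y - X @ beta
        beta = beta + step * X.T @ (w * residual)
    return beta


def minibatch_weights(batch_size):
    """Weights selecting a uniform random mini-batch: n/batch_size on the batch, 0 elsewhere."""
    def sample(rng, n):
        w = np.zeros(n)
        idx = rng.choice(n, size=batch_size, replace=False)
        w[idx] = n / batch_size         # rescale so each weight has mean 1
        return w
    return sample


def uniform_weights(rng, n):
    """Continuous weights, uniform on [0, 2], so each weight has mean 1."""
    return rng.uniform(0.0, 2.0, size=n)


# Synthetic, noiseless data (hypothetical example values).
rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true

# Conservative step size relative to the largest eigenvalue of X^T X.
step = 0.1 / np.linalg.eigvalsh(X.T @ X).max()

beta_uniform = weighted_gd(X, y, step, 2000, uniform_weights, rng)
beta_sgd = weighted_gd(X, y, step, 2000, minibatch_weights(10), rng)
```

In this sketch the responses are noiseless, so every weight draw leaves `beta_true` as a fixed point and both runs should approach it. With noisy responses the iterates would typically keep fluctuating rather than settle, which is the kind of behaviour the abstract's stationary-distribution analysis addresses. Because the weights are drawn independently of the current iterate, the expected iterate under this assumed update follows plain gradient descent on a weighted least-squares problem whose weights are the mean weights.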
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods
