Online Updating Statistics for Heterogenous Updating Regressions via Homogenization Techniques
Lin Lu, Lu Jun, Li Weiyu

TL;DR
This paper introduces a homogenization strategy for online updating in heterogenous regression models with changing variable sets, enabling efficient parameter estimation and statistical inference in big data streams.
Contribution
It proposes a novel homogenization technique to represent heterogenous models, allowing for consistent online updating and theoretical analysis without constraints on data batch size.
Findings
Achieves estimation efficiency and the oracle property.
Provides asymptotic properties for online statistics.
Validated through simulation experiments.
Abstract
In big data streams, it is common for the variable set of a model to change according to the condition of the data streams. In this paper, we propose a homogenization strategy to represent the heterogenous models that are gradually updated in the process of data streams. With the homogenized representations, we can easily construct various online updating statistics such as parameter estimation, residual sum of squares and F-statistic for the heterogenous updating regression models. The main difference from the classical scenarios is that the artificial covariates in the homogenized models are not identically distributed as the natural covariates in the original models; consequently, the related theoretical properties are distinct from the classical ones. The asymptotic properties of the online updating statistics are established, which show that the…
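For orientation, the sketch below illustrates the classical online updating idea that the abstract builds on: parameter estimates, the residual sum of squares, and a nested-model F-statistic are all recovered from accumulated cross-products, so no batch needs to be stored. It covers only the homogeneous case with a fixed variable set and is not the paper's homogenization technique. The class name `OnlineOLS` and its methods are hypothetical names chosen for illustration.

```python
import numpy as np


class OnlineOLS:
    """Classical online least squares for a FIXED variable set.

    Assumption: this is the standard sufficient-statistics approach,
    not the homogenized estimator proposed in the paper.
    """

    def __init__(self, p):
        self.p = p
        self.xtx = np.zeros((p, p))  # running sum of X'X
        self.xty = np.zeros(p)       # running sum of X'y
        self.yty = 0.0               # running sum of y'y
        self.n = 0                   # observations seen so far

    def update(self, X, y):
        """Fold one batch into the running cross-products."""
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.yty += float(y @ y)
        self.n += X.shape[0]

    def coef(self, cols=None):
        """Least-squares coefficients, optionally using only columns `cols`."""
        idx = np.arange(self.p) if cols is None else np.asarray(cols)
        return np.linalg.solve(self.xtx[np.ix_(idx, idx)], self.xty[idx])

    def rss(self, cols=None):
        """Residual sum of squares, using RSS = y'y - b'X'y."""
        idx = np.arange(self.p) if cols is None else np.asarray(cols)
        return self.yty - self.coef(idx) @ self.xty[idx]

    def f_statistic(self, keep):
        """F-statistic for the reduced model with columns `keep` vs. the full model."""
        q = self.p - len(keep)  # number of coefficients set to zero
        rss_full = self.rss()
        rss_reduced = self.rss(keep)
        return ((rss_reduced - rss_full) / q) / (rss_full / (self.n - self.p))
```

Because every statistic is a function of the accumulated `xtx`, `xty`, and `yty`, the result after the last batch matches a single fit on the pooled data, whatever the batch sizes. This sketch does not handle a variable set that changes between batches, which is the setting the paper addresses.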
Taxonomy
Topics: Statistical Methods and Inference · Advanced Bandit Algorithms Research · Bayesian Methods and Mixture Models
