Additive Partially Linear Models for Massive Heterogeneous Data
Binhuan Wang, Yixin Fang, Heng Lian, Hua Liang

TL;DR
This paper develops an additive partially linear modeling framework for massive heterogeneous data, enabling simultaneous extraction of common features and exploration of heterogeneity across sub-populations with optimal estimators and hypothesis tests.
Contribution
It introduces aggregation estimators with optimal asymptotic properties for common parameters and constructs heterogeneity and homogeneity tests within a flexible additive model.
Findings
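The aggregation estimators for the commonality parameters attain the asymptotically optimal bounds and have the same asymptotic distributions as if there were no heterogeneity, provided the number of sub-populations does not grow too fast and the tuning parameters are chosen carefully.
The plug-in estimator for the heterogeneity parameter has the same asymptotic distribution as if the commonality information were available.
A heterogeneity test is developed for the linear components and a homogeneity test for the non-linear components.
Simulation studies and real-data analysis illustrate the methods' practical performance.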
Abstract
We consider an additive partially linear framework for modelling massive heterogeneous data. The major goal is to extract multiple common features simultaneously across all sub-populations while exploring heterogeneity of each sub-population. We propose an aggregation type of estimators for the commonality parameters that possess the asymptotic optimal bounds and the asymptotic distributions as if there were no heterogeneity. This oracle result holds when the number of sub-populations does not grow too fast and the tuning parameters are selected carefully. A plug-in estimator for the heterogeneity parameter is further constructed, and shown to possess the asymptotic distribution as if the commonality information were available. Furthermore, we develop a heterogeneity test for the linear components and a homogeneity test for the non-linear components accordingly. The performance of the…
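The aggregate-then-plug-in idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the linear coefficients differ by sub-population while a single additive function g is shared, g is approximated with a truncated-power cubic basis, aggregation is a plain average of the per-sub-population basis coefficients, and the function names (`spline_basis`, `fit_subpopulation`, `aggregate_and_plug_in`, `simulate`) are hypothetical.

```python
import numpy as np


def spline_basis(z, knots):
    """Cubic truncated-power basis: 1, z, z^2, z^3, (z - k)_+^3 (illustrative choice)."""
    cols = [np.ones_like(z), z, z**2, z**3]
    cols += [np.clip(z - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)


def fit_subpopulation(x, z, y, knots):
    """Least-squares fit of y on [x, basis(z)] within one sub-population."""
    basis = spline_basis(z, knots)
    design = np.column_stack([x, basis])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    p = x.shape[1]
    return coef[:p], coef[p:]  # (linear part, basis coefficients)


def aggregate_and_plug_in(data, knots):
    """Average the common-part coefficients, then refit each linear part."""
    gammas = [fit_subpopulation(x, z, y, knots)[1] for x, z, y in data]
    gamma_bar = np.mean(gammas, axis=0)  # assumed aggregation rule: simple average
    betas = []
    for x, z, y in data:
        # Plug in the aggregated common component and regress the remainder on x.
        remainder = y - spline_basis(z, knots) @ gamma_bar
        beta, *_ = np.linalg.lstsq(x, remainder, rcond=None)
        betas.append(beta)
    return gamma_bar, np.array(betas)


def simulate(s=5, n=400, seed=0):
    """Toy data: s sub-populations, own beta each, shared g(z) = sin(2*pi*z)."""
    rng = np.random.default_rng(seed)
    true_betas = rng.normal(size=(s, 2))
    data = []
    for j in range(s):
        x = rng.normal(size=(n, 2))
        z = rng.uniform(size=n)
        noise = 0.2 * rng.normal(size=n)
        y = x @ true_betas[j] + np.sin(2 * np.pi * z) + noise
        data.append((x, z, y))
    return data, true_betas
```

Calling `simulate()` and passing its data to `aggregate_and_plug_in` with, say, `np.linspace(0.1, 0.9, 5)` as knots returns one averaged coefficient vector for the shared component and one linear coefficient vector per sub-population. The sketch omits tuning-parameter selection and the tests described in the abstract.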
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference
