Online Data Collection for Efficient Semiparametric Inference

Shantanu Gupta; Zachary C. Lipton; David Childers

arXiv:2411.03195 · stat.ML · November 6, 2024



Open Access · 1 Repo

TL;DR

This paper introduces online data collection strategies for semiparametric inference, enabling sequential, cost-effective data gathering from multiple sources to improve estimation accuracy under budget constraints.

Contribution

The paper formalizes the online moment selection problem and proposes two policies, both proven to achieve zero asymptotic regret, advancing adaptive data collection for semiparametric models.

Findings

01 The proposed policies outperform fixed data collection methods.

02 Both policies achieve zero asymptotic MSE regret.

03 The policies are validated on synthetic and real-world causal inference tasks.

Abstract

While many works have studied statistical data fusion, they typically assume that the various datasets are given in advance. However, in practice, estimation requires difficult data collection decisions like determining the available data sources, their costs, and how many samples to collect from each source. Moreover, this process is often sequential because the data collected at a given time can improve collection decisions in the future. In our setup, given access to multiple data sources and budget constraints, the agent must sequentially decide which data source to query to efficiently estimate a target parameter. We formalize this task using Online Moment Selection, a semiparametric framework that applies to any parameter identified by a set of moment conditions. Interestingly, the optimal budget allocation depends on the (unknown) true parameters. We present two online data…
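To make concrete why the optimal budget allocation depends on unknown parameters, here is a minimal Python sketch of a toy case. It is not the paper's method: the target (a sum of two source means), the function names, and the simple explore-then-commit heuristic are all assumptions chosen for illustration. With per-sample costs c_i and standard deviations sigma_i, minimizing sum_i sigma_i^2 / n_i subject to sum_i c_i n_i = B gives n_i proportional to sigma_i / sqrt(c_i), so the ideal split requires the unknown sigma_i and has to be estimated from data collected along the way.

```python
import numpy as np


def optimal_budget_fractions(sigmas, costs):
    """Budget shares minimizing sum_i sigma_i^2 / n_i s.t. sum_i c_i * n_i = B.

    The minimizer has n_i proportional to sigma_i / sqrt(c_i), so the budget
    share c_i * n_i is proportional to sigma_i * sqrt(c_i).
    """
    weights = np.asarray(sigmas, dtype=float) * np.sqrt(np.asarray(costs, dtype=float))
    return weights / weights.sum()


def explore_then_commit(samplers, costs, budget, explore_frac=0.2):
    """Hypothetical toy policy: explore evenly, then commit to a plug-in split."""
    costs = np.asarray(costs, dtype=float)
    k = len(samplers)
    # Exploration: split the exploration budget evenly (at least 2 draws each).
    n_explore = np.maximum(2, np.floor(explore_frac * budget / (k * costs))).astype(int)
    data = [samplers[i](n_explore[i]) for i in range(k)]
    sigma_hat = [np.std(d, ddof=1) for d in data]
    # Commit: spend what is left according to the estimated optimal shares.
    remaining = max(budget - float(np.sum(n_explore * costs)), 0.0)
    fractions = optimal_budget_fractions(sigma_hat, costs)
    n_commit = np.floor(remaining * fractions / costs).astype(int)
    data = [np.concatenate([data[i], samplers[i](n_commit[i])]) for i in range(k)]
    estimate = sum(d.mean() for d in data)  # target: sum of the source means
    counts = np.array([len(d) for d in data])
    return estimate, counts


# Usage: two synthetic sources with equal cost but different noise levels.
rng = np.random.default_rng(0)
samplers = [
    lambda n: rng.normal(1.0, 1.0, size=n),  # low-variance source
    lambda n: rng.normal(2.0, 3.0, size=n),  # high-variance source
]
costs = [1.0, 1.0]
estimate, counts = explore_then_commit(samplers, costs, budget=1000)
```

In this toy setup the noisier source should receive most of the budget after exploration, which a fixed even split would not do.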


Code & Models

Repositories

shantanu95/online-moment-selection (Official)


Taxonomy

Topics: Statistical Methods and Inference · Gaussian Processes and Bayesian Inference

Methods: Sparse Evolutionary Training