Stochastic Learning of Semiparametric Monotone Index Models with Large Sample Size
Qingsong Yao

TL;DR
This paper introduces a stochastic, subsample-based estimation method for semiparametric monotone index models. The method significantly reduces computational time while maintaining statistical accuracy, which makes it suitable for very large datasets.
Contribution
It generalizes mini-batch gradient descent to semiparametric models, providing a fast, scalable estimation procedure with theoretical guarantees.
Findings
Reduces computational time by a factor of roughly $n$, the full sample size, compared with full-sample methods.
Achieves $1/\sqrt{n}$-consistency and asymptotic normality of the averaged estimator.
Maintains estimation accuracy comparable to traditional methods.
Abstract
I study the estimation of semiparametric monotone index models in the scenario where the number of observation points $n$ is extremely large and conventional approaches fail to work due to heavy computational burdens. Motivated by the mini-batch gradient descent algorithm (MBGD) that is widely used as a stochastic optimization tool in the machine learning field, I propose a novel subsample- and iteration-based estimation procedure. In particular, starting from any initial guess of the true parameter, I progressively update the parameter using a sequence of subsamples randomly drawn from the data set whose sample size is much smaller than $n$. The update is based on the gradient of some well-chosen loss function, where the nonparametric component is replaced with its Nadaraya-Watson kernel estimator based on subsamples. My proposed algorithm essentially generalizes the MBGD algorithm to the…
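The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state the loss function or the tuning choices, so everything specific here is an assumption made for illustration. The sketch assumes a squared-error-type gradient of the form (fitted link minus outcome) times covariates, a Gaussian kernel with a leave-one-out Nadaraya-Watson fit, a constant step size, a scale normalization that fixes the first coefficient at 1, and averaging of the iterates after a burn-in period. The function names `nw_leave_one_out` and `stochastic_index_fit` and all parameter names are hypothetical.

```python
import numpy as np


def nw_leave_one_out(index, y, bandwidth):
    """Leave-one-out Nadaraya-Watson fit of y on a scalar index (Gaussian kernel)."""
    u = (index[:, None] - index[None, :]) / bandwidth
    weights = np.exp(-0.5 * u**2)
    np.fill_diagonal(weights, 0.0)  # exclude each point from its own fit
    denom = weights.sum(axis=1)
    numer = weights @ y
    isolated = denom < 1e-10
    # Points with no effective neighbours fall back to their own y,
    # so they contribute a zero residual to the gradient.
    return np.where(isolated, y, numer / np.where(isolated, 1.0, denom))


def stochastic_index_fit(X, y, batch_size=200, n_iter=1000, burn_in=300,
                         step=0.5, bandwidth=0.2, seed=0):
    """Subsample-based updates for an index model E[y | x] = G(x'beta), G unknown.

    Returns the average of the iterates produced after `burn_in` steps.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    beta[0] = 1.0  # assumed scale normalization: first coefficient fixed at 1
    beta_sum = np.zeros(d)
    for k in range(n_iter):
        # Draw a subsample much smaller than n, without replacement.
        batch = rng.choice(n, size=batch_size, replace=False)
        Xb, yb = X[batch], y[batch]
        # Replace the unknown link function by its kernel estimate on the subsample.
        g_hat = nw_leave_one_out(Xb @ beta, yb, bandwidth)
        # Assumed gradient: mean of (fitted value - outcome) * covariates.
        grad = Xb.T @ (g_hat - yb) / batch_size
        grad[0] = 0.0  # keep the normalized coefficient fixed
        beta = beta - step * grad
        if k >= burn_in:
            beta_sum += beta
    return beta_sum / (n_iter - burn_in)
```

Each iteration costs on the order of `batch_size` squared kernel evaluations regardless of $n$, which is where the computational saving over a full-sample kernel fit comes from in this sketch. Calling `stochastic_index_fit(X, y)` with an `(n, d)` covariate array and a length-`n` outcome array returns the averaged coefficient vector.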
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Statistical Methods and Inference
