Semiparametric regression in testicular germ cell data
Anastasia Voulgaraki, Benjamin Kedem, Barry I. Graubard

TL;DR
This paper introduces a semiparametric regression method that combines multiple data sources for more efficient estimation, demonstrated on testicular germ cell data to analyze effects of height and age on weight.
Contribution
It develops a semiparametric density ratio model for multivariate data integration, improving kernel density estimation efficiency over traditional single-sample methods.
Findings
The proposed estimator outperforms traditional kernel density estimators in efficiency.
Applied to testicular germ cell data, the method quantifies the effect of height and age on weight.
Diagnostic tools are provided for model validation.
Abstract
It is possible to approach regression analysis with random covariates from a semiparametric perspective where information is combined from multiple multivariate sources. The approach assumes a semiparametric density ratio model where multivariate distributions are "regressed" on a reference distribution. A kernel density estimator can be constructed from many data sources in conjunction with the semiparametric model. The estimator is shown to be more efficient than the traditional single-sample kernel density estimator, and its optimal bandwidth is discussed in some detail. Each multivariate distribution and the corresponding conditional expectation (regression) of interest are estimated from the combined data using all sources. Graphical and quantitative diagnostic tools are suggested to assess model validity. The method is applied in quantifying the effect of height and age on weight…
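The idea of a kernel density estimator built from several samples under a density ratio model can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption: two univariate samples, an exponential tilt of the form g1(x) = exp(alpha + beta*x) * g0(x) with g0 as the reference density, parameters fitted through the equivalent logistic regression, a Gaussian kernel, and a fixed illustrative bandwidth rather than the optimal one the paper discusses. The function names and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_density_ratio(x0, x1, n_iter=50):
    """Fit the assumed tilt g1(x) = exp(alpha + beta*x) * g0(x).

    Uses the two-sample equivalence with logistic regression:
    the logistic intercept equals alpha + log(n1/n0).
    """
    t = np.concatenate([x0, x1])                      # pooled sample
    y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])
    X = np.column_stack([np.ones_like(t), t])
    theta = np.zeros(2)
    for _ in range(n_iter):                           # Newton-Raphson steps
        pi = 1.0 / (1.0 + np.exp(-X @ theta))
        grad = X.T @ (y - pi)
        hess = X.T @ (X * (pi * (1.0 - pi))[:, None])
        theta = theta + np.linalg.solve(hess, grad)
    rho = len(x1) / len(x0)
    alpha = theta[0] - np.log(rho)
    beta = theta[1]
    return t, alpha, beta, rho

def reference_weights(t, alpha, beta, rho, n0):
    """Probability mass assigned to each pooled point under the reference g0."""
    return 1.0 / (n0 * (1.0 + rho * np.exp(alpha + beta * t)))

def weighted_kde(grid, t, w, bandwidth):
    """Gaussian kernel density estimate using all pooled points with weights w."""
    z = (grid[:, None] - t[None, :]) / bandwidth
    k = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return (k * w[None, :]).sum(axis=1) / bandwidth

# Simulated data (hypothetical): reference N(0, 1) and a shifted sample N(1, 1).
rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, size=200)
x1 = rng.normal(1.0, 1.0, size=300)

t, alpha, beta, rho = fit_density_ratio(x0, x1)
w0 = reference_weights(t, alpha, beta, rho, n0=len(x0))   # weights for g0
w1 = w0 * np.exp(alpha + beta * t)                        # weights for g1

grid = np.linspace(-6.0, 7.0, 1000)
dx = grid[1] - grid[0]
bandwidth = 0.3                                           # illustrative choice only
dens0 = weighted_kde(grid, t, w0, bandwidth)
dens1 = weighted_kde(grid, t, w1, bandwidth)
```

Both density estimates use all 500 pooled observations rather than only the sample drawn from that distribution, which is the sense in which the sources are combined. In this sketch the weights for each distribution sum to one once the fit has converged.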
