A Benchmark Study of Classical and Dual Polynomial Regression (DPR)-Based Probability Density Estimation Technique
Shantanu Sarkar, Mousumi Sinha, Dexter Cahoy

TL;DR
This paper introduces a computationally efficient Dual Polynomial Regression (DPR) method for probability density estimation, leveraging piecewise modeling and GPU-accelerated kernel density estimation to better capture asymmetries in real-world data.
Contribution
The paper proposes a novel DPR approach that improves density estimation accuracy and efficiency by combining piecewise polynomial fitting with GPU-accelerated kernel density estimation (KDE) and histogram estimation (HDE).
Findings
DPR with tKDE achieves high accuracy in modeling asymmetric unimodal distributions.
GPU-based KDE (tKDE) and histogram estimation (tHDE), both implemented in TensorFlow, outperform Python SciPy's KDE.
DPR order 4 balances accuracy and computational efficiency on real-world data.
Abstract
The probability density function (PDF) plays a central role in statistical and machine learning modeling. Real-world data often deviates from Gaussian assumptions, exhibiting skewness and exponential decay. To evaluate how well different density estimation methods capture such irregularities, we generated six unimodal datasets from diverse distributions that reflect real-world anomalies. These were compared using parametric methods (Pearson Type I and Normal distribution) as well as non-parametric approaches, including histograms, kernel density estimation (KDE), and our proposed method. To accelerate computation, we implemented GPU-based versions of KDE (tKDE) and histogram estimation (tHDE) in TensorFlow, both of which outperform Python SciPy's KDE. Prior work demonstrated the use of piecewise modeling for density estimation, such as local polynomial regression; however, these methods…
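The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' code and does not implement DPR, tKDE, or tHDE. It assumes a toy exponential sample as the skewed dataset, a plain Gaussian-kernel KDE with Scott's rule-of-thumb bandwidth, an equal-width histogram, and a Normal fit, and it scores each estimate by an approximate integrated squared error against the known density. All function names, the sample size, the bin count, and the grid are illustrative assumptions.

```python
import math
import random
import statistics


def gaussian_kde(sample, bandwidth=None):
    """Return a function x -> Gaussian-kernel density estimate."""
    n = len(sample)
    if bandwidth is None:
        # Assumption: Scott's rule of thumb for 1-D data, h = sigma * n^(-1/5).
        bandwidth = statistics.stdev(sample) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return pdf


def histogram_density(sample, bins=30):
    """Return a function x -> equal-width histogram density estimate."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in sample:
        # Clamp so the maximum value falls into the last bin.
        counts[min(int((s - lo) / width), bins - 1)] += 1
    n = len(sample)

    def pdf(x):
        if x < lo or x > hi:
            return 0.0
        return counts[min(int((x - lo) / width), bins - 1)] / (n * width)

    return pdf


def integrated_squared_error(estimate, truth, grid):
    """Riemann-sum approximation of the integral of (estimate - truth)^2."""
    step = grid[1] - grid[0]
    return step * sum((estimate(x) - truth(x)) ** 2 for x in grid)


def true_pdf(x):
    """Density of the toy Exponential(1) distribution."""
    return math.exp(-x) if x >= 0 else 0.0


# Toy skewed data with exponential decay (illustrative, not from the paper).
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(1000)]

estimators = {
    "normal": statistics.NormalDist.from_samples(sample).pdf,  # parametric fit
    "histogram": histogram_density(sample),
    "kde": gaussian_kde(sample),
}

grid = [i * 0.05 for i in range(-20, 161)]  # uniform grid from -1 to 8
errors = {
    name: integrated_squared_error(f, true_pdf, grid)
    for name, f in estimators.items()
}

for name, err in errors.items():
    print(f"{name:>9}: ISE ~ {err:.4f}")
```

On data this skewed, the symmetric Normal fit would be expected to show a larger error than the two non-parametric estimates, which is the sort of mismatch with Gaussian assumptions that the abstract points to. The pure-Python KDE costs one kernel evaluation per sample per grid point, which gives a rough sense of why the authors turn to GPU implementations for larger workloads.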
Taxonomy
Topics: Statistical Methods and Inference · Machine Learning in Healthcare · Machine Learning and Data Classification
