A Bayesian Hurdle Quantile Regression Model for Citation Analysis with Mass Points at Lower Values
Marzieh Shahmandi, Paul Wilson, and Mike Thelwall

TL;DR
This paper introduces an improved Bayesian hurdle quantile regression model tailored for citation count data with significant mass points at low values, enhancing accuracy in estimating citation effects.
Contribution
It extends the King and Song (2019) hurdle model by shifting the hurdle point beyond the main mass points, improving quantile regression accuracy for moderately to highly cited articles.
Findings
More accurate quantile estimates near mass points.
Effective modeling of low citation probability factors.
Validated with simulated and real citation data.
Abstract
Quantile regression presents a complete picture of the effects on the location, scale, and shape of the dependent variable at all points, not just the mean. We focus on two challenges for citation count analysis by quantile regression: discontinuity and substantial mass points at lower counts. A Bayesian hurdle quantile regression model for count data with a substantial mass point at zero was proposed by King and Song (2019). It uses quantile regression for modeling the nonzero data and logistic regression for modeling the probability of zeros versus nonzeros. We show that substantial mass points for low citation counts will almost certainly also affect parameter estimation in the quantile regression part of the model, similar to a mass point at zero. We update the King and Song model by shifting the hurdle point past the main mass points. This model delivers more accurate quantile…
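The two-part structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and omits the Bayesian fitting entirely. The function names, the toy citation counts, and the hurdle value of 1 are all hypothetical. It shows only how a hurdle splits the data into a binary response (for the logistic part) and the counts above the hurdle (for the quantile regression part), together with the standard quantile regression check loss.

```python
import numpy as np

def split_at_hurdle(y, hurdle):
    """Split counts at `hurdle` (hypothetical helper, not from the paper).

    Returns a 0/1 indicator of y > hurdle, usable as the response of the
    logistic part, and the unshifted counts above the hurdle, usable as
    the response of the quantile regression part.
    """
    y = np.asarray(y)
    above = y > hurdle
    return above.astype(int), y[above]

def check_loss(residual, tau):
    """Standard check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    residual = np.asarray(residual, dtype=float)
    return residual * np.where(residual < 0, tau - 1.0, tau)

# Toy citation counts with mass points at 0 and 1 (made-up data).
citations = np.array([0, 0, 1, 1, 2, 3, 5, 8, 13, 40])

# Assumption: the hurdle is placed at 1, i.e. past both mass points.
above, tail = split_at_hurdle(citations, hurdle=1)

print(above)  # binary response for the logistic part
print(tail)   # counts modeled by the quantile regression part
print(check_loss(tail - 5, tau=0.5).sum())  # total loss of the constant fit 5
```

With the hurdle at 0, as in a zero-hurdle model, the two counts equal to 1 would stay in the quantile regression part. Moving the hurdle to 1 hands them to the logistic part instead.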
