A survey of a hurdle model for heavy-tailed data based on the generalized lambda distribution

Diego Marcondes; Cláudia Peixoto; Ana Carolina Maia

arXiv:1712.02183·stat.AP·January 4, 2019

TL;DR

This paper surveys the Generalized Lambda Distribution (GLD) and its use in hurdle models for heavy-tailed data with excess zeros, and reports that GLD-based models fit yearly healthcare expense data better than models based on the Generalized Pareto Distribution (GPD).

Contribution

It introduces a flexible hurdle model based on the GLD for heavy-tailed, zero-inflated data, and compares its effectiveness with existing models.

Findings

01

GLD-based hurdle models outperform Generalized Pareto Distribution (GPD) models on yearly healthcare expense data.

02

The proposed models effectively handle heavy tails and excess zeros.

03

Empirical results show better fit with GLD models.

Abstract

In this survey we present an extensive review of the vast literature on the Generalized Lambda Distribution (GLD) and propose a hurdle, or two-way, model whose associated distribution is the GLD, in order to meet the demand for a highly flexible model of heavy-tailed data with excess zeros. We apply the developed models to a dataset consisting of yearly healthcare expenses, a typical example of heavy-tailed data with excess zeros. The fitted models are compared with models based on the Generalized Pareto Distribution, and the GLD models are found to perform best.
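
To make the hurdle structure described in the abstract concrete, here is a minimal simulation sketch. It is not the authors' implementation. It assumes the FKML parameterization of the GLD quantile function, which may differ from the parameterization used in the paper, and it models the zero part with a single fixed probability. The function names, the parameter values, and `p_zero` are all hypothetical and chosen only so that the positive part has support starting at zero with a heavy right tail.

```python
import random


def gld_quantile(u, lam1, lam2, lam3, lam4):
    """GLD quantile function in the FKML form (an assumption of this sketch).

    Requires lam2 > 0 and nonzero lam3, lam4; u must lie in [0, 1).
    """
    left = (u ** lam3 - 1.0) / lam3
    right = ((1.0 - u) ** lam4 - 1.0) / lam4
    return lam1 + (left - right) / lam2


def sample_hurdle_gld(n, p_zero, lam, seed=0):
    """Draw n values from a hurdle model.

    With probability p_zero the value is exactly 0 (the hurdle is not crossed);
    otherwise it is drawn from the GLD by inverse-transform sampling.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < p_zero:
            draws.append(0.0)
        else:
            draws.append(gld_quantile(rng.random(), *lam))
    return draws


# Hypothetical parameters: lam3 > 0 bounds the support below at
# lam1 - 1 / (lam2 * lam3) = 0, and lam4 < 0 gives a heavy right tail.
LAM = (2.0, 1.0, 0.5, -0.3)
expenses = sample_hurdle_gld(10_000, p_zero=0.4, lam=LAM)
zero_share = sum(x == 0.0 for x in expenses) / len(expenses)
```

Because the GLD is defined through its quantile function, inverse-transform sampling needs no numerical inversion, which is one reason the distribution is convenient for simulation. Fitting such a model to real data would additionally require estimating `p_zero` and the four lambda parameters, which this sketch does not attempt.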
