Accounting for missing data when modelling block maxima
Emma S. Simpson, Paul J. Northrop

TL;DR
This paper extends the GEV block maxima model to explicitly handle missing data, improving the accuracy of extreme value estimates in datasets with incomplete observations.
Contribution
It introduces a likelihood-based method that accounts for the proportion of missing values in each block within the GEV framework, reducing the bias in parameter estimates that arises when missingness is ignored.
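The idea of letting the proportion of observed values in each block enter the GEV likelihood can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the summary only says that the proportion of missing values in each block is accounted for within the GEV model. The sketch assumes that values are missing independently of their size, so that the maximum of a block with a fraction p_i of its values observed approximately follows G^{p_i}, where G is the GEV distribution of a complete block maximum. The function names, starting values, and example data are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def adjusted_gev_nll(params, maxima, prop_observed):
    """Negative log-likelihood with block i's maximum modelled as G^{p_i}.

    ASSUMPTION (not stated in the source): G^{p} is again a GEV with
    location mu + sigma * (p**xi - 1) / xi, scale sigma * p**xi, shape xi.
    params = (mu, log_sigma, xi); prop_observed must lie in (0, 1].
    """
    mu, log_sigma, xi = params
    sigma = np.exp(log_sigma)  # log scale keeps sigma positive
    x = np.asarray(maxima, dtype=float)
    p = np.asarray(prop_observed, dtype=float)
    if abs(xi) < 1e-8:
        # Gumbel limit: only the location shifts, by sigma * log(p)
        z = (x - (mu + sigma * np.log(p))) / sigma
        return np.sum(np.log(sigma) + z + np.exp(-z))
    mu_i = mu + sigma * (p**xi - 1.0) / xi
    sigma_i = sigma * p**xi
    t = 1.0 + xi * (x - mu_i) / sigma_i
    if np.any(t <= 0):
        return np.inf  # outside the support of the GEV
    return np.sum(np.log(sigma_i) + (1.0 + 1.0 / xi) * np.log(t) + t ** (-1.0 / xi))


def fit_adjusted_gev(maxima, prop_observed):
    """Hypothetical helper: maximise the adjusted likelihood numerically."""
    x = np.asarray(maxima, dtype=float)
    sigma0 = np.std(x, ddof=1) * np.sqrt(6.0) / np.pi  # Gumbel moment start
    mu0 = np.mean(x) - 0.5772 * sigma0
    res = minimize(adjusted_gev_nll, x0=[mu0, np.log(sigma0), 0.1],
                   args=(x, prop_observed), method="Nelder-Mead")
    return {"mu": res.x[0], "sigma": float(np.exp(res.x[1])),
            "xi": res.x[2], "nll": float(res.fun)}


# Made-up block maxima and observed proportions, for illustration only
example_maxima = [10.2, 11.5, 9.8, 13.1, 12.0, 10.9, 14.3, 11.1]
example_props = [1.0, 0.9, 0.5, 1.0, 0.8, 1.0, 0.95, 0.7]
estimates = fit_adjusted_gev(example_maxima, example_props)
```

Setting every proportion to 1 recovers the standard GEV likelihood, so under this assumption the adjustment only changes the contribution of blocks with missing values. Blocks with no observed values carry no information and would need to be dropped before fitting.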
Findings
The method performs well in simulations, with results comparable to those obtained from complete data.
An application to real-world data demonstrates its practical utility.
It reduces bias and improves accuracy in return level estimation.
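To see why ignoring missing values tends to underestimate return levels, the short sketch below computes the T-block return level from the standard GEV quantile formula and compares it with the level implied when each block has only a fraction p of its values observed. It assumes, as an illustration rather than as the paper's stated model, that the maximum of such a block follows G^p. The parameter values and the function name are hypothetical.

```python
import numpy as np


def gev_return_level(mu, sigma, xi, T):
    """Level exceeded by a block maximum with probability 1/T."""
    y = -np.log1p(-1.0 / T)  # equals -log(1 - 1/T)
    if abs(xi) < 1e-8:
        return mu - sigma * np.log(y)  # Gumbel limit
    return mu + sigma * (y ** (-xi) - 1.0) / xi


# Illustrative parameter values, not taken from the paper
mu, sigma, xi = 10.0, 2.0, 0.1
p = 0.7  # assumed fraction of each block that is observed

# ASSUMPTION: maxima of partially observed blocks follow G^p, which is a
# GEV with the shifted location and rescaled scale below
mu_p = mu + sigma * (p**xi - 1.0) / xi
sigma_p = sigma * p**xi

full_level = gev_return_level(mu, sigma, xi, T=100)
naive_level = gev_return_level(mu_p, sigma_p, xi, T=100)
```

Here `naive_level` is what a standard fit to the partially observed maxima would target, and it falls below `full_level` whenever p is less than 1.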
Abstract
Modelling block maxima using the generalised extreme value (GEV) distribution is a classical and widely used method for studying univariate extremes. It allows for theoretically motivated estimation of return levels, including extrapolation beyond the range of observed data. A frequently overlooked challenge in applying this methodology comes from handling datasets containing missing values. In this case, one cannot be sure whether the true maximum has been recorded in each block, and simply ignoring the issue can lead to biased parameter estimators and, crucially, underestimated return levels. We propose an extension of the standard block maxima approach to overcome such missing data issues. This is achieved by explicitly accounting for the proportion of missing values in each block within the GEV model. Inference is carried out using likelihood-based techniques, and we propose an…
Taxonomy
Topics: Hydrology and Drought Analysis · Statistical Distribution Estimation and Applications · Financial Risk and Volatility Modeling
