The Problem with Assessing Statistical Methods
Abigail Arnold, Jason Loeppky

TL;DR
This paper highlights the issues in evaluating statistical methods through simulation studies, emphasizing the importance of proper summaries like the Empirical Cumulative Distribution Function for accurate assessment.
Contribution
It demonstrates that the Empirical Cumulative Distribution Function is an effective tool for assessing statistical methods in large-scale simulations.
Findings
The Empirical CDF gives a clear view of method performance across a test set.
Qualitative assessments of poorly chosen summaries can lead to misleading conclusions.
Well-chosen graphical summaries improve the accuracy of the assessment.
Abstract
In this paper, we investigate the problem of assessing statistical methods and effectively summarizing results from simulations. Specifically, we consider problems of the type where multiple methods are compared on a reasonably large test set of problems. These simulation studies are typically used to provide advice on an effective method for analyzing future untested problems. Most of these simulation studies never apply statistical methods to find which method(s) are expected to perform best. Instead, conclusions are based on a qualitative assessment of poorly chosen graphical and numerical summaries of the results. We illustrate that the Empirical Cumulative Distribution Function, when used appropriately, is an extremely effective tool for assessing what matters in large-scale statistical simulations.
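As a rough illustration of the idea in the abstract, the sketch below computes an Empirical Cumulative Distribution Function for each method's scores over a test set and reads off the fraction of problems solved within a given error threshold. It is not the authors' procedure. The method names, the lognormal error scores, and the thresholds are all hypothetical and are generated here only so the example runs.

```python
from bisect import bisect_right
import random


def ecdf(values):
    """Return F, where F(t) is the fraction of values less than or equal to t."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("ecdf needs at least one value")

    def F(t):
        # bisect_right counts how many sorted values are <= t
        return bisect_right(xs, t) / n

    return F


# Hypothetical data: one error score per test problem (lower is better).
# These numbers are simulated for illustration and are not from the paper.
rng = random.Random(0)
errors = {
    "method_a": [rng.lognormvariate(0.0, 0.5) for _ in range(200)],
    "method_b": [rng.lognormvariate(0.2, 1.0) for _ in range(200)],
}

# Compare methods by the share of test problems at or below each threshold.
for name, scores in errors.items():
    F = ecdf(scores)
    shares = ", ".join(f"F({t}) = {F(t):.2f}" for t in (0.5, 1.0, 2.0))
    print(f"{name}: {shares}")
```

Reading the curves at several thresholds, rather than comparing a single mean or a boxplot, shows how often each method performs acceptably and how often it fails badly across the test set.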
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Probabilistic and Robust Engineering Design · Optimal Experimental Design Methods
