Using the Gini coefficient to characterize the shape of computational chemistry error distributions

Pascal Pernot; Andreas Savin

arXiv:2012.09589·physics.chem-ph·February 19, 2021

TL;DR

This paper investigates the use of the Gini coefficient to characterize error distributions in computational chemistry, proposing a mode-centered approach to improve diagnostic clarity and complement existing benchmarking methods.

Contribution

It introduces a mode-centered Gini coefficient method to better diagnose error distribution shapes in computational chemistry benchmarking.

Findings

01. The Gini coefficient provides a global view of error distributions

02. The mode-centered Gini coefficient reduces ambiguity in diagnostics

03. Both are complementary to traditional benchmarking statistics

Abstract

The distribution of errors is a central object in the assessment and benchmarking of computational chemistry methods. The popular and often blind use of the mean unsigned error as a benchmarking statistic leads one to ignore distribution features that affect the reliability of the tested methods. We explore how the Gini coefficient offers a global representation of the error distribution but, except for extreme values, does not enable an unambiguous diagnostic. We propose to resolve this ambiguity by applying the Gini coefficient to mode-centered error distributions. This version can usefully complement benchmarking statistics and flag error sets with potentially problematic shapes.
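To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes the Gini coefficient of a set of absolute errors using the standard rank-based formula, and a mode-centered variant that subtracts an estimate of the mode from the signed errors before taking absolute values. The function names (`gini`, `half_sample_mode`, `mode_centered_gini`) are hypothetical, and the choice of a half-sample mode estimator is an assumption: the abstract does not say which mode estimator the paper uses.

```python
import math
import random


def gini(values):
    """Gini coefficient of non-negative values (e.g. absolute errors).

    Uses the rank-based formula G = sum_i (2i - n - 1) x_i / (n * sum_i x_i)
    on the sorted sample, with i running from 1 to n.
    """
    x = sorted(values)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        return 0.0  # convention chosen here for empty or all-zero input
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(x, start=1))
    return weighted / (n * total)


def half_sample_mode(values):
    """Rough mode estimate: repeatedly keep the densest half of the sample.

    ASSUMPTION: this estimator is an illustrative choice, not necessarily
    the one used in the paper.
    """
    x = sorted(values)
    if not x:
        raise ValueError("need at least one value")
    while len(x) > 2:
        h = math.ceil(len(x) / 2)
        # Start index of the narrowest window containing h points.
        j = min(range(len(x) - h + 1), key=lambda k: x[k + h - 1] - x[k])
        x = x[j:j + h]
    return sum(x) / len(x)


def mode_centered_gini(errors):
    """Gini coefficient of |e - mode(e)| for signed errors e."""
    m = half_sample_mode(errors)
    return gini([abs(e - m) for e in errors])


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic signed errors with a large systematic bias (illustrative only).
    errors = [rng.gauss(10.0, 1.0) for _ in range(1000)]
    print("Gini of |e|:        ", gini([abs(e) for e in errors]))
    print("mode-centered Gini: ", mode_centered_gini(errors))
```

In this synthetic example the plain Gini coefficient of the absolute errors is small because the bias dominates every value, while the mode-centered version reflects the spread around the most probable error. This is one illustration of why centering can change the diagnostic; the synthetic data carry no chemical meaning.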
