An extended note on the multibin logarithmic score used in the FluSight competitions

Johannes Bracher

arXiv:1910.07084·stat.AP·June 8, 2022

Open Access

TL;DR

This paper examines the multibin logarithmic score used in the CDC's FluSight influenza forecasting competitions. It shows that the score is not a proper scoring rule and may therefore incentivize dishonest forecasts, and it illustrates the practical consequences with forecasts from the 2016/17 competition.

Contribution

It critically analyzes the properties and practical implications of the multibin logarithmic score, showing how its impropriety can affect the evaluation of influenza forecasts.

Findings

01 The multibin logarithmic score is not a proper scoring rule.

02 Because it is improper, it may encourage dishonest forecasts.

03 The practical consequences are illustrated with forecasts from the 2016/17 FluSight competition.

Abstract

In recent years the Centers for Disease Control and Prevention (CDC) have organized FluSight influenza forecasting competitions. To evaluate the participants' forecasts a multibin logarithmic score has been created, which is a non-standard variant of the established logarithmic score. Unlike the original log score, the multibin version is not proper and may thus encourage dishonest forecasting. We explore the practical consequences this may have, using forecasts from the 2016/17 FluSight competition for illustration.
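The impropriety described in the abstract can be illustrated numerically. The following is a minimal sketch, assuming the multibin score is the logarithm of the total probability placed on all bins within `d` bins of the observed one; this page does not state the definition, so treat that form as an assumption. The belief and forecast vectors, the tolerance `d=1`, and all function names are hypothetical and are not taken from the paper or from the 2016/17 FluSight data.

```python
import math


def safe_log(p):
    """Natural log that returns -inf instead of raising for p == 0."""
    return math.log(p) if p > 0 else -math.inf


def log_score(forecast, observed):
    """Standard log score: log of the probability on the observed bin."""
    return safe_log(forecast[observed])


def multibin_log_score(forecast, observed, d=1):
    """Assumed multibin variant: log of the total probability on all bins
    within d bins of the observed one (window truncated at the ends)."""
    lo = max(0, observed - d)
    hi = min(len(forecast), observed + d + 1)
    return safe_log(sum(forecast[lo:hi]))


def expected_score(score, belief, forecast):
    """Expected score of `forecast` when outcomes are drawn from `belief`."""
    return sum(p * score(forecast, k) for k, p in enumerate(belief) if p > 0)


# Hypothetical true belief over nine bins (illustrative, not FluSight data).
belief = [0.0, 0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05, 0.0]
# Hypothetical "dishonest" forecast: same mode, but sharper than the belief.
sharpened = [0.0, 0.02, 0.08, 0.20, 0.40, 0.20, 0.08, 0.02, 0.0]

# Expected scores of reporting the belief honestly vs. reporting the sharpened version.
honest_log = expected_score(log_score, belief, belief)
sharp_log = expected_score(log_score, belief, sharpened)
honest_mb = expected_score(multibin_log_score, belief, belief)
sharp_mb = expected_score(multibin_log_score, belief, sharpened)

print("log score:      honest", round(honest_log, 4), "sharpened", round(sharp_log, 4))
print("multibin score: honest", round(honest_mb, 4), "sharpened", round(sharp_mb, 4))
```

With these assumed numbers, the standard log score gives the higher expected value to the honest forecast, while the multibin score gives it to the sharpened one. That reversal is the kind of incentive to misreport that the abstract warns about.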

Taxonomy

Topics: Data-Driven Disease Surveillance · COVID-19 epidemiological studies · Influenza Virus Research Studies