# A Comparison of Flare Forecasting Methods. II. Benchmarks, Metrics and Performance Results for Operational Solar Flare Forecasting Systems

**Authors:** K.D. Leka, Sung-Hong Park, Kanya Kusano, Jesse Andries, Graham Barnes, Suzy Bingham, D. Shaun Bloomfield, Aoife E. McCloskey, Veronique Delouille, David Falconer, Peter T. Gallagher, Manolis K. Georgoulis, Yuki Kubo, Kangjin Lee, Sangwoo Lee, Vasily Lobzin, JunChul Mun, Sophie A. Murray, Tarek A.M. Hamad Nageem, Rami Qahwaji, Michael Sharpe, Rob Steenburgh, Graham Steward, Michael Terkildsen

arXiv: 1907.02905 · 2019-09-04

## TL;DR

This paper compares various operational solar flare forecasting methods using multiple metrics, establishing performance benchmarks and evaluation methodologies to assess and improve forecasting accuracy.

## Contribution

It provides the first direct comparison of operational solar flare forecasting methods with a robust evaluation framework and performance benchmarks.

## Key findings

- Multiple methods consistently outperform the "no skill" baseline
- No single method is universally best; which method ranks highest depends on the flare event definition and the metric used
- Established a comprehensive evaluation methodology for operational forecasting
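
To make the "no skill" baseline concrete, here is a minimal Python sketch of two skill scores that are widely used in forecast verification: the Brier Skill Score measured against a climatological reference, and the True Skill Statistic computed from a thresholded contingency table. This summary does not list the metrics the paper actually uses, so the choice of scores, the function names, the 0.5 threshold, and the toy data are all assumptions made for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """Skill relative to always forecasting the observed event rate.

    0 means no better than climatology ("no skill"); 1 is a perfect forecast.
    """
    climatology = sum(outcomes) / len(outcomes)
    reference = brier_score([climatology] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("reference score is zero: outcomes are all identical")
    return 1.0 - brier_score(probs, outcomes) / reference


def true_skill_statistic(probs, outcomes, threshold=0.5):
    """Hit rate minus false alarm rate after thresholding the probabilities.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    hits = misses = false_alarms = correct_nulls = 0
    for p, o in zip(probs, outcomes):
        predicted = p >= threshold
        if o and predicted:
            hits += 1
        elif o:
            misses += 1
        elif predicted:
            false_alarms += 1
        else:
            correct_nulls += 1
    if hits + misses == 0 or false_alarms + correct_nulls == 0:
        raise ValueError("need at least one event and one non-event")
    return hits / (hits + misses) - false_alarms / (false_alarms + correct_nulls)


# Toy data (hypothetical): one flare day out of four.
outcomes = [1, 0, 0, 0]
forecast = [0.8, 0.1, 0.2, 0.1]

print(round(brier_skill_score(forecast, outcomes), 3))
print(true_skill_statistic(forecast, outcomes))
```

A forecast that always issues the observed event rate scores exactly zero on the Brier Skill Score, which is the sense in which a method sits at the "no skill" level. Because the two scores reward different properties (probability calibration versus categorical discrimination), the same set of forecasts can rank differently under each, which is consistent with the finding that no single method wins on every metric.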

## Abstract

Solar flares are extremely energetic phenomena in our Solar System. Their impulsive, often drastic radiative increases, in particular at short wavelengths, bring immediate impacts that motivate solar physics and space weather research to understand solar flares to the point of being able to forecast them. As data and algorithms improve dramatically, questions must be asked concerning how well the forecasting performs; crucially, we must ask how to rigorously measure performance in order to critically gauge any improvements. Building upon earlier-developed methodology (Barnes et al, 2016, Paper I), international representatives of regional warning centers and research facilities assembled in 2017 at the Institute for Space-Earth Environmental Research, Nagoya University, Japan to - for the first time - directly compare the performance of operational solar flare forecasting methods. Multiple quantitative evaluation metrics are employed, with focus and discussion on evaluation methodologies given the restrictions of operational forecasting. Numerous methods performed consistently above the "no skill" level, although which method scored top marks is decisively a function of flare event definition and the metric used; there was no single winner. Following in this paper series we ask why the performances differ by examining implementation details (Leka et al. 2019, Paper III), and then we present a novel analysis method to evaluate temporal patterns of forecasting errors in (Park et al. 2019, Paper IV). With these works, this team presents a well-defined and robust methodology for evaluating solar flare forecasting methods in both research and operational frameworks, and today's performance benchmarks against which improvements and new methods may be compared.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.02905/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1907.02905/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/1907.02905/full.md

---
Source: https://tomesphere.com/paper/1907.02905