# Field-aware Calibration: A Simple and Empirically Strong Method for Reliable Probabilistic Predictions

**Authors:** Feiyang Pan, Xiang Ao, Pingzhong Tang, Min Lu, Dapeng Liu, Lei Xiao, Qing He

arXiv: 1905.10713 · 2020-01-28

## TL;DR

This paper introduces Neural Calibration, a field-aware post-hoc calibration method that makes probabilistic predictions more reliable by correcting miscalibration on specific subsets of data. The method is evaluated on five large-scale datasets.

## Contribution

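The paper introduces field-level calibration error, a metric for prediction bias over a chosen input field, and proposes Neural Calibration, a post-hoc calibration method that uses field-aware information from the validation set to improve both calibration and non-calibration metrics such as AUC.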

## Key findings

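- Neural Calibration significantly reduces field-level calibration error relative to uncalibrated predictions.
- It also improves negative log-likelihood, Brier score, and AUC.
- Existing post-hoc calibration methods yield only limited improvement in the field-level metric and in AUC across the five large-scale datasets studied.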

## Abstract

It is often observed that the probabilistic predictions given by a machine learning model can disagree with averaged actual outcomes on specific subsets of data, which is also known as the issue of miscalibration. It is responsible for the unreliability of practical machine learning systems. For example, in online advertising, an ad can receive a click-through rate prediction of 0.1 over some population of users where its actual click rate is 0.15. In such cases, the probabilistic predictions have to be fixed before the system can be deployed.

In this paper, we first introduce a new evaluation metric named field-level calibration error that measures the bias in predictions over the sensitive input field that the decision-maker concerns. We show that existing post-hoc calibration methods have limited improvements in the new field-level metric and other non-calibration metrics such as the AUC score. To this end, we propose Neural Calibration, a simple yet powerful post-hoc calibration method that learns to calibrate by making full use of the field-aware information over the validation set. We present extensive experiments on five large-scale datasets. The results showed that Neural Calibration significantly improves against uncalibrated predictions in common metrics such as the negative log-likelihood, Brier score and AUC, as well as the proposed field-level calibration error.
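To make the idea of a field-level metric concrete, here is a minimal Python sketch. It assumes one plausible formalization: sum the residuals `label - prediction` within each value of the chosen field, take the absolute value per field value, and divide the total by the number of samples. The abstract does not spell out the formula, so the function name `field_level_calibration_error`, the toy data, and the formula itself are illustrative assumptions; the paper's exact definition is in the full text.

```python
from collections import defaultdict


def field_level_calibration_error(fields, preds, labels):
    """Assumed formalization: (1/N) * sum over field values z of
    |sum of (label - prediction) over samples whose field equals z|."""
    if not (len(fields) == len(preds) == len(labels)):
        raise ValueError("fields, preds and labels must have equal length")
    if not fields:
        raise ValueError("at least one sample is required")
    residual_by_field = defaultdict(float)
    for z, p, y in zip(fields, preds, labels):
        residual_by_field[z] += y - p
    return sum(abs(r) for r in residual_by_field.values()) / len(fields)


# Toy data (hypothetical): two ads with 20 impressions and 3 clicks each,
# so both have an actual click rate of 0.15.
# "ad_a" is predicted at 0.1 (under-predicted, as in the abstract's example);
# "ad_b" is predicted at 0.2 (over-predicted).
fields = ["ad_a"] * 20 + ["ad_b"] * 20
preds = [0.1] * 20 + [0.2] * 20
labels = ([1] * 3 + [0] * 17) * 2

# The two biases cancel when averaged over the whole dataset...
global_bias = abs(sum(labels) - sum(preds)) / len(labels)

# ...but grouping by the field exposes them.
err = field_level_calibration_error(fields, preds, labels)

print(f"global bias: {global_bias:.4f}")
print(f"field-level calibration error: {err:.4f}")
```

On this toy data the dataset-wide bias is essentially zero while the field-level error is about 0.05, which illustrates why a model can look calibrated on average and still be unreliable for individual ads.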

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.10713/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1905.10713/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/1905.10713/full.md

---
Source: https://tomesphere.com/paper/1905.10713