An Alternative Probabilistic Interpretation of the Huber Loss

Gregory P. Meyer

arXiv:1911.02088·stat.ML·November 20, 2020

TL;DR

This paper introduces an alternative probabilistic interpretation of the Huber loss: minimizing the loss corresponds to minimizing an upper bound on the KL divergence between two Laplace distributions. Under this view, the loss's hyper-parameters can be chosen from an estimate of the noise in the data.

Contribution

The paper proposes a probabilistic perspective that relates the transition point of the Huber loss to the parameters of the noise distributions, which makes the hyper-parameter easier to select.

Findings

01. The new interpretation relates the transition point to the noise in the data.

02. It enables intuitive hyper-parameter tuning based on an estimate of that noise.

03. The approach is demonstrated on object detection models.

Abstract

The Huber loss is a robust loss function used for a wide range of regression tasks. To utilize the Huber loss, a parameter that controls the transitions from a quadratic function to an absolute value function needs to be selected. We believe the standard probabilistic interpretation that relates the Huber loss to the Huber density fails to provide adequate intuition for identifying the transition point. As a result, a hyper-parameter search is often necessary to determine an appropriate value. In this work, we propose an alternative probabilistic interpretation of the Huber loss, which relates minimizing the loss to minimizing an upper-bound on the Kullback-Leibler divergence between Laplace distributions, where one distribution represents the noise in the ground-truth and the other represents the noise in the prediction. In addition, we show that the parameters of the Laplace…
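As a minimal sketch of the two quantities the abstract refers to, the Python below implements the standard Huber loss with a transition point and the standard closed-form KL divergence between two Laplace distributions. The function names `huber_loss` and `laplace_kl` and the parameter names `delta`, `mu_p`, `b_p`, `mu_q`, `b_q` are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's upper bound or its rule for choosing the transition point.

```python
import math


def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Standard Huber loss for a single residual.

    Quadratic for |residual| <= delta, linear beyond it; `delta` is the
    transition point (name assumed here for illustration).
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    abs_r = abs(residual)
    if abs_r <= delta:
        return 0.5 * abs_r ** 2
    return delta * (abs_r - 0.5 * delta)


def laplace_kl(mu_p: float, b_p: float, mu_q: float, b_q: float) -> float:
    """Closed-form KL(Laplace(mu_p, b_p) || Laplace(mu_q, b_q)).

    This is the textbook expression, not the paper's upper bound.
    """
    if b_p <= 0 or b_q <= 0:
        raise ValueError("scale parameters must be positive")
    d = abs(mu_p - mu_q)
    return math.log(b_q / b_p) + (d + b_p * math.exp(-d / b_p)) / b_q - 1.0


if __name__ == "__main__":
    # Small residuals fall in the quadratic region, large ones in the linear region.
    for r in (0.5, 3.0):
        print(f"huber_loss({r}, delta=1.0) = {huber_loss(r, 1.0)}")
    # The divergence is zero when both Laplace distributions coincide.
    print(f"laplace_kl(0, 1, 0, 1) = {laplace_kl(0.0, 1.0, 0.0, 1.0)}")
```

With equal scales, the location-dependent part of the KL expression grows roughly quadratically for small offsets and roughly linearly for large ones, which is the same qualitative shape as the Huber loss.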

Taxonomy

Methods: Huber loss · Region Proposal Network · Softmax · Convolution · RoIPool · Faster R-CNN