Cross-Entropy Loss Functions: Theoretical Analysis and Applications

Anqi Mao; Mehryar Mohri; Yutao Zhong

arXiv:2304.07288 · cs.LG · June 21, 2023 · 214 cites

TL;DR

This paper provides a comprehensive theoretical analysis of cross-entropy and related loss functions, introduces new bounds and loss variants, and empirically demonstrates their effectiveness for adversarial robustness and accuracy.

Contribution

It introduces the first H-consistency bounds for comp-sum losses, proposes smooth adversarial comp-sum losses, and develops new adversarial robustness algorithms with empirical validation.

Findings

01. H-consistency bounds are tight and non-asymptotic.

02. Smooth adversarial comp-sum losses improve adversarial robustness.

03. The proposed algorithms outperform the state of the art in experiments.

Abstract

Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss applied to the outputs of a neural network, when the softmax is used. But, what guarantees can we rely on when using cross-entropy as a surrogate loss? We present a theoretical analysis of a broad family of loss functions, comp-sum losses, that includes cross-entropy (or logistic loss), generalized cross-entropy, the mean absolute error and other cross-entropy-like loss functions. We give the first $H$-consistency bounds for these loss functions. These are non-asymptotic guarantees that upper bound the zero-one loss estimation error in terms of the estimation error of a surrogate loss, for the specific hypothesis set $H$ used. We further show that our bounds are tight. These bounds depend on quantities called minimizability gaps. To make them more explicit, we give a specific analysis of…
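
As a rough illustration of the loss family the abstract names, the sketch below computes cross-entropy, generalized cross-entropy, and mean absolute error from the softmax probability of the true class, and checks numerically that cross-entropy equals the multi-class logistic loss on the raw scores. The function names, the parameter `q`, the form `(1 - p^q) / q` for generalized cross-entropy, and the scaling of the mean absolute error as `1 - p` are assumptions made for this example; the text above does not spell out the paper's own parametrization.

```python
import math


def softmax(scores):
    """Softmax over a list of real-valued scores, shifted for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, y):
    """Negative log of the softmax probability assigned to the true class y."""
    return -math.log(softmax(scores)[y])


def logistic_loss(scores, y):
    """Multi-class logistic loss: log(1 + sum_{y' != y} exp(s_{y'} - s_y))."""
    u = sum(math.exp(s - scores[y]) for j, s in enumerate(scores) if j != y)
    return math.log1p(u)


def generalized_cross_entropy(scores, y, q=0.5):
    """Assumed form (1 - p^q) / q with 0 < q <= 1, p = softmax prob of y."""
    p = softmax(scores)[y]
    return (1.0 - p ** q) / q


def mean_absolute_error(scores, y):
    """1 - p, i.e. the L1 distance to the one-hot label up to a factor of 2."""
    return 1.0 - softmax(scores)[y]


# Hypothetical scores for a 3-class problem; the true class is index 0.
scores = [2.0, 0.5, -1.0]
y = 0

ce = cross_entropy(scores, y)
logistic = logistic_loss(scores, y)
gce_small_q = generalized_cross_entropy(scores, y, q=1e-6)
gce_q_one = generalized_cross_entropy(scores, y, q=1.0)
mae = mean_absolute_error(scores, y)

print("cross-entropy          :", ce)
print("logistic on raw scores :", logistic)
print("GCE with q near 0      :", gce_small_q)
print("GCE with q = 1         :", gce_q_one)
print("mean absolute error    :", mae)
```

Under these assumed forms, generalized cross-entropy interpolates between the two endpoints: it approaches cross-entropy as `q` tends to 0 and equals the `1 - p` version of the mean absolute error at `q = 1`.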

Videos

Cross-Entropy Loss Functions: Theoretical Analysis and Applications · slideslive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Integrated Circuits and Semiconductor Failure Analysis

Methods: Softmax