On Loss Functions for Deep Neural Networks in Classification
Katarzyna Janocha, Wojciech Marian Czarnecki

TL;DR
This paper investigates how different loss functions impact deep neural network classification performance, robustness, and learning dynamics, providing theoretical insights and experimental comparisons on classical datasets.
Contribution
It offers a comprehensive analysis of various loss functions, gives the L1 and L2 losses a probabilistic interpretation as classification objectives, and introduces two losses that are not typically used as deep net objectives, showing that they are viable alternatives.
Findings
L1 and L2 losses are justified classification objectives, with a probabilistic interpretation in terms of expected misclassification.
Different loss functions significantly affect learning dynamics and robustness.
Two losses not typically used as deep net objectives are shown to be viable alternatives.
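To make the first finding concrete, here is a minimal sketch in plain Python, not the authors' code. It assumes a softmax output layer and a one-hot target, and it takes the L1 loss as the sum of absolute differences and the L2 loss as the sum of squared differences between target and output. The function names, logits, and label are illustrative choices. Under these assumptions, the L1 loss is exactly twice the probability mass the model puts on the wrong classes, which is one way to read it as a misclassification objective.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label, num_classes):
    """One-hot target vector for an integer class label."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def log_loss(probs, label):
    """Log loss (cross-entropy) against a one-hot target."""
    return -math.log(probs[label])


def l1_loss(probs, label):
    """Sum of absolute differences between one-hot target and output."""
    target = one_hot(label, len(probs))
    return sum(abs(t - p) for t, p in zip(target, probs))


def l2_loss(probs, label):
    """Sum of squared differences between one-hot target and output."""
    target = one_hot(label, len(probs))
    return sum((t - p) ** 2 for t, p in zip(target, probs))


# Illustrative values only (not taken from the paper).
logits = [2.0, 0.5, -1.0]
label = 0

probs = softmax(logits)
p_wrong = 1.0 - probs[label]  # probability mass on incorrect classes

print("log loss:", log_loss(probs, label))
print("L1 loss: ", l1_loss(probs, label))
print("L2 loss: ", l2_loss(probs, label))
print("2 * P(wrong class):", 2.0 * p_wrong)
```

The identity holds because the true class contributes 1 - p_true to the L1 sum, and the remaining classes contribute their probabilities, which also sum to 1 - p_true.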
Abstract
Deep neural networks are currently among the most commonly used classifiers. Besides easily achieving very good performance, one of the best selling points of these models is their modular design: one can conveniently adapt their architecture to specific needs, change connectivity patterns, attach specialised layers, and experiment with a large number of activation functions, normalisation schemes and many others. While one can find an impressively wide spread of configurations of almost every aspect of deep nets, one element is, in the authors' opinion, underrepresented: when solving classification problems, the vast majority of papers and applications simply use log loss. In this paper we investigate how particular choices of loss functions affect deep models and their learning dynamics, as well as the resulting classifiers' robustness to various effects. We perform experiments on…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Neural Networks and Applications · Explainable Artificial Intelligence (XAI)
