Regularizing cross entropy loss via minimum entropy and K-L divergence
Abdulrahman Oladipupo Ibraheem

TL;DR
This paper introduces two new loss functions, MIX-ENT and MIN-ENT, that regularize cross entropy with entropy and K-L divergence terms, improving classification accuracy on EMNIST-Letters with deep neural networks.
Contribution
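The paper proposes two regularized loss functions for deep learning classification. MIX-ENT adds to cross entropy a regularizer equivalent to the sum of a minimum entropy term and a K-L divergence term in which the roles of the target and hypothesis probabilities are swapped. MIN-ENT adds only a minimum entropy regularizer to standard cross entropy.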
Findings
MIX-ENT and MIN-ENT outperform standard cross entropy on EMNIST-Letters.
VGG with MIN-ENT reaches 95.933% accuracy, surpassing previous models.
Models and code are publicly available at the provided GitHub link.
Abstract
I introduce two novel loss functions for classification in deep learning. The two loss functions extend standard cross entropy loss by regularizing it with minimum entropy and Kullback-Leibler (K-L) divergence terms. The first of the two novel loss functions is termed mixed entropy loss (MIX-ENT for short), while the second one is termed minimum entropy regularized cross-entropy loss (MIN-ENT for short). The MIX-ENT function introduces a regularizer that can be shown to be equivalent to the sum of a minimum entropy term and a K-L divergence term. However, it should be noted that the K-L divergence term here is different from that in the standard cross-entropy loss function, in the sense that it swaps the roles of the target probability and the hypothesis probability. The MIN-ENT function simply adds a minimum entropy regularizer to the standard cross entropy loss function. In both…
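The structure described in the abstract can be illustrated with a short NumPy sketch. It is not the author's implementation. The regularization weight `lam`, the label smoothing parameter `alpha`, and all function names are assumptions introduced here for illustration. Smoothing the target is also an assumption: it keeps the swapped K-L term finite, since a strictly one-hot target would put zeros inside the logarithm. Under these assumptions, the sum of the entropy of the hypothesis q and the swapped divergence KL(q || p) collapses to a cross entropy with the roles of p and q exchanged, which is one way to read the "mixed entropy" name.

```python
import numpy as np

EPS = 1e-12  # small constant to keep logarithms finite


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def smooth_one_hot(labels, num_classes, alpha=0.1):
    """Label-smoothed targets (assumption: keeps log p finite)."""
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - alpha) * one_hot + alpha / num_classes


def entropy(q):
    """Shannon entropy H(q) of each row."""
    return -np.sum(q * np.log(q + EPS), axis=-1)


def cross_entropy(p, q):
    """Cross entropy H(p, q) = -sum_k p_k log q_k, target p, hypothesis q."""
    return -np.sum(p * np.log(q + EPS), axis=-1)


def kl(a, b):
    """K-L divergence KL(a || b) of each row."""
    return np.sum(a * (np.log(a + EPS) - np.log(b + EPS)), axis=-1)


def min_ent_loss(p, q, lam=0.1):
    """Sketch of MIN-ENT: cross entropy plus an entropy penalty on q."""
    return cross_entropy(p, q) + lam * entropy(q)


def mix_ent_loss(p, q, lam=0.1):
    """Sketch of MIX-ENT: cross entropy plus entropy of q plus KL(q || p)."""
    return cross_entropy(p, q) + lam * (entropy(q) + kl(q, p))


# Hypothetical batch: two examples, three classes.
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
labels = np.array([0, 2])
q = softmax(logits)
p = smooth_one_hot(labels, num_classes=3)

print("cross entropy:", cross_entropy(p, q))
print("MIN-ENT      :", min_ent_loss(p, q))
print("MIX-ENT      :", mix_ent_loss(p, q))
```

In this sketch the entropy penalty is non-negative, so minimizing it pushes the predicted distribution toward lower entropy, meaning more confident predictions. The swapped K-L term additionally penalizes probability mass that q places where the (smoothed) target p is small.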
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Statistical Mechanics and Entropy · Model Reduction and Neural Networks
Methods: Max Pooling · Dense Connections · Convolution · Dropout · Softmax
