Chi-square Loss for Softmax: an Echo of Neural Network Structure
Zeyu Wang, Meiqing Wang

TL;DR
This paper introduces a chi-square based loss function for Softmax classification. The loss is statistically grounded, sensitive to neural network structure, and must be used with label smoothing, offering new insights into model training dynamics.
Contribution
It proposes a novel chi-square loss for Softmax, analyzes its statistical properties, and explores its relationship with neural network structure and label smoothing effects.
Findings
Chi-square loss is unbiased in optimization.
Distribution of chi-square loss reflects neural network structure.
Performance degrades with many classes due to its strictness.
Abstract
Softmax combined with cross-entropy is widely used in classification; cross-entropy evaluates the similarity between two discrete distributions (predictions and true labels). Inspired by the chi-square test, we designed a new loss function, called chi-square loss, which also works with Softmax. Chi-square loss has a statistical background. We proved that it is unbiased in optimization and clarified its conditions of use (its formula requires that it be used with label smoothing). In addition, we studied the sample distribution of this loss function by visualization and found that the distribution is related to the neural network structure, which distinguishes it from cross-entropy. In the past, the influence of structure was often ignored in visualization. Chi-square loss can detect changes in neural network structure because it is very strict, and we explained the reason for this…
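The abstract does not state the loss formula, so the following is only a minimal sketch under a stated assumption: that the loss takes the form of Pearson's chi-square statistic, with the smoothed label distribution q as the expected distribution and the Softmax output p as the observed one, i.e. sum over classes of (p_k - q_k)^2 / q_k. The function names, the smoothing scheme (uniform, with parameter eps), and the pure-Python style are all illustrative choices, not taken from the paper.

```python
import math

def softmax(logits):
    # Subtract the maximum logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smooth_labels(true_class, num_classes, eps=0.1):
    # Uniform label smoothing (assumed scheme): eps / K on every class,
    # plus the remaining 1 - eps on the true class.
    off = eps / num_classes
    return [off + (1.0 - eps if k == true_class else 0.0)
            for k in range(num_classes)]

def chi_square_loss(logits, true_class, eps=0.1):
    # ASSUMED form: Pearson chi-square statistic with the smoothed label q
    # as the expected distribution. The paper's exact formula may differ.
    p = softmax(logits)
    q = smooth_labels(true_class, len(logits), eps)
    return sum((pk - qk) ** 2 / qk for pk, qk in zip(p, q))

def cross_entropy_loss(logits, true_class, eps=0.1):
    # Cross-entropy against the same smoothed label, for comparison.
    p = softmax(logits)
    q = smooth_labels(true_class, len(logits), eps)
    return -sum(qk * math.log(pk) for qk, pk in zip(q, p))
```

Under this assumed form, every q_k appears in a denominator, so eps = 0 (a one-hot label) leads to division by zero. That is consistent with the abstract's remark that the formula requires label smoothing. The loss is zero exactly when the Softmax output equals the smoothed label.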
Taxonomy
Topics: Image and Signal Denoising Methods · Neural Networks and Applications · Image Enhancement Techniques
Methods: Label Smoothing · Softmax
