# Improving Generalization of Deep Neural Networks by Leveraging Margin Distribution

**Authors:** Shen-Huan Lyu, Lu Wang, Zhi-Hua Zhou

arXiv: 1812.10761 · 2024-07-10

## TL;DR

This paper proves a generalization bound for DNNs that depends on statistics of the entire margin distribution rather than on the minimum margin alone, and validates it with experiments that optimize the margin ratio (the margin standard deviation divided by the expected margin).

## Contribution

It presents a generalization upper bound dominated by statistics of the entire margin distribution, and applies a convex margin distribution loss function to DNNs to improve generalization.
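This summary does not reproduce the paper's loss, so the following is only an illustrative sketch of what a convex margin distribution loss can look like. It assumes a piecewise-quadratic form in the style of optimal-margin-distribution learning: margins below `r - theta` and above `r + theta` are both penalized, which pulls margins toward a target `r` and so shrinks their spread. The function name and the parameters `r`, `theta`, and `mu` are hypothetical and may differ from the paper's definition.

```python
def margin_distribution_loss(margin, r=1.0, theta=0.5, mu=0.1):
    """Illustrative convex loss on a single margin (assumed form, not the paper's).

    r     -- hypothetical target margin
    theta -- hypothetical half-width of the zero-loss band, with 0 < theta < r
    mu    -- hypothetical weight for margins that overshoot the band
    """
    if not 0 < theta < r:
        raise ValueError("need 0 < theta < r")
    lower, upper = r - theta, r + theta
    if margin < lower:
        # Margin too small: quadratic penalty, normalized by the lower edge.
        return ((lower - margin) / lower) ** 2
    if margin > upper:
        # Margin too large: a milder quadratic penalty that limits the spread.
        return mu * ((margin - upper) / upper) ** 2
    # Inside the band [r - theta, r + theta] there is no penalty.
    return 0.0


def mean_loss(margins, **kwargs):
    """Average the per-sample loss over a batch of margins."""
    return sum(margin_distribution_loss(m, **kwargs) for m in margins) / len(margins)
```

Penalizing both tails is the design choice that distinguishes this kind of loss from a hinge-style loss, which only pushes small margins up and ignores how widely the remaining margins are scattered.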

## Key findings

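- The ratio of the margin standard deviation to the expected margin (the margin ratio) is an important measure for controlling complexity in the bound.
- Optimizing the margin ratio reduces the generalization gap.
- Experiments and visualizations confirm the correlation between the generalization gap and the margin ratio.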

## Abstract

Recent research has used margin theory to analyze the generalization performance of deep neural networks (DNNs). Existing results are almost all based on the spectrally-normalized minimum margin. However, optimizing the minimum margin ignores a great deal of information about the entire margin distribution, which is crucial to generalization performance. In this paper, we prove a generalization upper bound dominated by the statistics of the entire margin distribution. Compared with the minimum margin bounds, our bound highlights an important measure for controlling the complexity, which is the ratio of the margin standard deviation to the expected margin. We apply a convex margin distribution loss function to deep neural networks to validate our theoretical results by optimizing the margin ratio. Experiments and visualizations confirm the effectiveness of our approach and the correlation between generalization gap and margin ratio.
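The margin ratio described in the abstract can be computed directly from a model's output scores. The sketch below is a minimal illustration, assuming the common multiclass definition of margin (true-class score minus the largest other-class score) and the population standard deviation. The function names are hypothetical, and no spectral normalization is applied, so the values are not directly comparable with those in the paper.

```python
from statistics import fmean, pstdev


def multiclass_margins(scores, labels):
    """Per-sample margin: true-class score minus the best competing score."""
    margins = []
    for row, y in zip(scores, labels):
        best_other = max(s for k, s in enumerate(row) if k != y)
        margins.append(row[y] - best_other)
    return margins


def margin_ratio(margins):
    """Ratio of the margin standard deviation to the expected (mean) margin."""
    mean = fmean(margins)
    if mean <= 0:
        # The ratio is only meaningful when the average margin is positive.
        raise ValueError("expected margin must be positive")
    return pstdev(margins) / mean


# Toy scores for three samples and three classes (made-up numbers).
scores = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.1, 0.4, 3.0]]
labels = [0, 1, 2]
print(margin_ratio(multiclass_margins(scores, labels)))
```

A smaller ratio means the margins are both large on average and tightly concentrated, which is the regime the bound favors.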

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.10761/full.md

## Figures

25 figures with captions in the complete paper: https://tomesphere.com/paper/1812.10761/full.md

## References

87 references — full list in the complete paper: https://tomesphere.com/paper/1812.10761/full.md

---
Source: https://tomesphere.com/paper/1812.10761