Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu, Jian Li

TL;DR
This paper shows that gradient descent implicitly maximizes the normalized margin of homogeneous neural networks, combining theoretical analysis with empirical validation on MNIST and CIFAR-10, with implications for model robustness.
Contribution
It generalizes previous margin maximization results to broader classes of neural networks and offers quantitative convergence analysis under weaker assumptions.
Findings
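Once the training loss falls below a certain threshold, a smoothed version of the normalized margin increases over time.
Both the normalized margin and its smoothed version converge to the objective value at a KKT point of a constrained optimization problem related to margin maximization.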
Empirical validation on MNIST and CIFAR-10 datasets.
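For orientation, the two objects named in the findings can be written out for the binary case. This is a sketch in assumed notation that does not appear in the summary itself: f(θ; x) is the network output, y_n ∈ {±1} are the labels, and L is the order of homogeneity, meaning f(cθ; x) = c^L f(θ; x) for every c > 0.

```latex
% Normalized margin (assumed notation, binary labels y_n \in \{\pm 1\})
\bar{\gamma}(\theta) \;=\; \frac{\min_{n}\, y_n f(\theta; x_n)}{\|\theta\|_2^{L}}

% A margin-maximization problem of the kind the findings refer to
\min_{\theta}\ \tfrac{1}{2}\,\|\theta\|_2^{2}
\quad \text{s.t.} \quad y_n f(\theta; x_n) \ge 1 \quad \text{for all } n
```

Because f is homogeneous, rescaling θ leaves the normalized margin unchanged, so only the direction of θ matters for it.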
Abstract
In this paper, we study the implicit regularization of the gradient descent algorithm in homogeneous neural networks, including fully-connected and convolutional neural networks with ReLU or LeakyReLU activations. In particular, we study the gradient descent or gradient flow (i.e., gradient descent with infinitesimal step size) optimizing the logistic loss or cross-entropy loss of any homogeneous model (possibly non-smooth), and show that if the training loss decreases below a certain threshold, then we can define a smoothed version of the normalized margin which increases over time. We also formulate a natural constrained optimization problem related to margin maximization, and prove that both the normalized margin and its smoothed version converge to the objective value at a KKT point of the optimization problem. Our results generalize the previous results for logistic regression with…
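The homogeneity property the abstract relies on can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a bias-free two-layer ReLU network (homogeneous of order 2), random toy data, and hypothetical helper names (`two_layer_relu`, `param_norm`, `normalized_margin`). It only illustrates that scaling all parameters by c > 0 scales the output by c^2 and leaves the normalized margin unchanged.

```python
import numpy as np

def two_layer_relu(params, X):
    """Bias-free two-layer ReLU network: f(theta; x) = w2 . relu(W1 x)."""
    W1, w2 = params
    return np.maximum(X @ W1.T, 0.0) @ w2

def param_norm(params):
    """Euclidean norm of all parameters stacked into one vector."""
    return np.sqrt(sum(np.sum(p ** 2) for p in params))

def normalized_margin(params, X, y, order=2):
    """min_n y_n f(theta; x_n) divided by ||theta||^order."""
    margins = y * two_layer_relu(params, X)
    return margins.min() / param_norm(params) ** order

# Toy data and random parameters (illustrative assumptions only).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
params = (rng.normal(size=(4, 3)), rng.normal(size=4))

# Scale every parameter by the same positive constant.
c = 3.0
scaled = tuple(c * p for p in params)

# Homogeneity of order 2: f(c * theta; x) = c^2 * f(theta; x).
assert np.allclose(two_layer_relu(scaled, X), c ** 2 * two_layer_relu(params, X))

# The normalized margin is invariant under this rescaling.
assert np.isclose(normalized_margin(scaled, X, y), normalized_margin(params, X, y))
```

The normalized margin of a random initialization may well be negative; the check concerns scale invariance only, not the increase over training described in the abstract.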
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
Methods: Logistic Regression
