On the Decision Boundary of Deep Neural Networks
Yu Li, Lizhong Ding, Xin Gao

TL;DR
This paper shows, both theoretically and empirically, that the last weight layer of a deep neural network converges to a linear SVM trained on the output of the last hidden layer. The result characterizes the network's decision boundary and has implications for generalization and robustness.
Contribution
It establishes a theoretical and empirical link between the last weight layer of a neural network and a linear SVM, for both the binary and the multi-class case with cross-entropy loss.
Findings
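The last weight layer converges to a linear SVM under weak assumptions on the training data and the model
Training the entire network, rather than fine-tuning only the last weight layer, may yield a better bias constant for that layer, which matters for generalization (empirical result)
The results are applicable to addressing catastrophic forgetting and adversarial attacks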
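The first finding can be illustrated with a minimal sketch. It is not the authors' experiment: the toy "hidden layer outputs", the learning rate, and the names `FEATURES`, `LABELS`, `SVM_MARGIN`, `normalized_margin`, and `train_last_layer` are all assumptions made for illustration, and the linear layer here has no bias term. The sketch trains a linear layer with logistic (binary cross-entropy) loss on fixed features and tracks its normalized margin, which should drift toward the hard-margin SVM optimum (known analytically for this data).

```python
import math

# Toy "last hidden layer" outputs h_i with labels y_i in {-1, +1}.
# These values are invented for illustration; they are not from the paper.
FEATURES = [(2.0, 0.0), (3.0, 10.0), (-2.0, 0.0), (-3.0, -10.0)]
LABELS = [1, 1, -1, -1]

# For this data, the hard-margin linear SVM without a bias term has
# direction (1, 0) and geometric margin 2; the points (2, 0) and (-2, 0)
# are its support vectors.
SVM_MARGIN = 2.0


def normalized_margin(w):
    """Smallest value of y_i * <w, h_i> / ||w|| over the data."""
    norm = math.hypot(w[0], w[1])
    scores = (y * (w[0] * h[0] + w[1] * h[1]) for h, y in zip(FEATURES, LABELS))
    return min(scores) / norm


def train_last_layer(steps, lr=0.1):
    """Gradient descent on the mean logistic loss of a bias-free linear layer."""
    w = [0.0, 0.0]
    history = []
    n = len(FEATURES)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for h, y in zip(FEATURES, LABELS):
            score = y * (w[0] * h[0] + w[1] * h[1])
            # d/dw log(1 + exp(-score)) = -y * h * sigmoid(-score)
            coeff = -y / (1.0 + math.exp(score))
            grad[0] += coeff * h[0] / n
            grad[1] += coeff * h[1] / n
        w[0] -= lr * grad[0]
        w[1] -= lr * grad[1]
        history.append(normalized_margin(w))
    return w, history


weights, margins = train_last_layer(steps=5000)
print(f"normalized margin after 1 step:     {margins[0]:.3f}")
print(f"normalized margin after 5000 steps: {margins[-1]:.3f}")
print(f"hard-margin SVM optimum:            {SVM_MARGIN:.3f}")
```

The first gradient step points along the class-mean difference, which is far from the SVM direction for this data. As training continues, the weight norm grows and the direction rotates toward the max-margin solution, though only slowly, so the margin approaches the optimum without reaching it in a finite number of steps.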
Abstract
While deep learning models and techniques have achieved great empirical success, our understanding of the source of success in many aspects remains very limited. In an attempt to bridge the gap, we investigate the decision boundary of a production deep learning architecture with weak assumptions on both the training data and the model. We demonstrate, both theoretically and empirically, that the last weight layer of a neural network converges to a linear SVM trained on the output of the last hidden layer, for both the binary case and the multi-class case with the commonly used cross-entropy loss. Furthermore, we show empirically that training a neural network as a whole, instead of only fine-tuning the last weight layer, may result in a better bias constant for the last weight layer, which is important for generalization. In addition to facilitating the understanding of deep learning, our…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
Methods: Support Vector Machine
