Pairwise Margin Maximization for Deep Neural Networks

Berry Weinstein; Shai Fine; Yacov Hel-Or

arXiv:2110.04519·cs.LG·October 12, 2021

TL;DR

This paper introduces Pairwise Margin Maximization (PMM), a regularization method for deep neural networks that improves generalization by maximizing the minimal displacement an instance needs before its predicted class changes. The authors report that it outperforms traditional weight decay.

Contribution

The paper proposes PMM, a new regularization scheme tailored for deep networks, addressing limitations of the maximum margin principle in multi-class classification.

Findings

01 PMM leads to substantial performance improvements over standard regularization.

02 Implementing PMM in the deep feature space enhances training stability.

03 Empirical results show better generalization with PMM.

Abstract

The weight decay regularization term is widely used during training to constrain expressivity, avoid overfitting, and improve generalization. Historically, this concept was borrowed from the SVM maximum margin principle and extended to multi-class deep networks. Carefully inspecting this principle reveals that it is not optimal for multi-class classification in general, and in particular when using deep neural networks. In this paper, we explain why this commonly used principle is not optimal and propose a new regularization scheme, called Pairwise Margin Maximization (PMM), which measures the minimal amount of displacement an instance should take until its predicted classification is switched. In deep neural networks, PMM can be implemented in the vector space before the network's output layer, i.e., in the deep feature space, where we add an additional normalization term to…
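
The "minimal displacement" idea in the abstract can be illustrated with a small sketch. It assumes the output layer is linear over the deep features, so that the boundary between classes i and j is a hyperplane and the distance from a feature vector to it is the score gap divided by the norm of the weight difference. The function name `pairwise_margin` and its arguments are hypothetical, chosen for illustration. This is not the authors' implementation, and it omits the normalization term whose description is truncated in the abstract above.

```python
import math

def pairwise_margin(features, weights, biases, true_class):
    """Signed distance from `features` to the nearest pairwise decision
    boundary between `true_class` and any other class.

    Assumes a linear output layer: score(c) = weights[c] . features + biases[c].
    A positive result means the instance is classified as `true_class`;
    a negative result means some other class scores higher.
    """
    def score(c):
        return sum(w * x for w, x in zip(weights[c], features)) + biases[c]

    best = math.inf
    for c in range(len(weights)):
        if c == true_class:
            continue
        # The boundary between true_class and c has normal vector w_true - w_c.
        diff = [a - b for a, b in zip(weights[true_class], weights[c])]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            # Identical weight vectors: the score gap does not depend on the
            # features, so no displacement can change it. Skip this pair.
            continue
        best = min(best, (score(true_class) - score(c)) / norm)
    return best

# Toy example: three classes over two-dimensional features (made-up numbers).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
biases = [0.0, 0.0, 0.0]
margin = pairwise_margin([3.0, 1.0], weights, biases, true_class=0)
print(margin)  # about 1.414, i.e. sqrt(2): class 1 is the nearest competitor
```

Note that the distance depends on the norm of the difference between two class weight vectors, not on the norm of each weight vector alone, which is the quantity weight decay penalizes.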

Code & Models

Repositories

berryweinst/pairwise-margin-maximization
PyTorch · Official

Taxonomy

Methods: Weight Decay · Support Vector Machine