Memorization in Deep Neural Networks: Does the Loss Function matter?

Deep Patel; P.S. Sastry

arXiv:2107.09957 · cs.LG · July 23, 2021

TL;DR

This paper investigates how the choice of loss function affects a deep neural network's tendency to memorize randomly labelled data, and shows that symmetric losses make the network more resistant to such overfitting.

Contribution

The paper provides empirical evidence, together with a theoretical explanation, that symmetric loss functions make deep neural networks more resistant to memorization.

Findings

01

On MNIST and CIFAR-10, a symmetric loss function resists overfitting to randomly labelled data significantly better than cross-entropy or squared error loss.

02

Earlier empirical studies have shown that none of the standard regularization techniques mitigate this memorization.

03

A formal definition of robustness to memorization is given, along with a theoretical explanation of why symmetric losses provide it.
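The memorization setting behind these findings relies on training data whose labels carry no information about the inputs. The sketch below shows one generic way such labels could be produced. It is not taken from the paper: the function name `randomize_labels`, the `noise_rate` parameter, and the choice of uniform resampling over all classes are assumptions made for illustration.

```python
import numpy as np

def randomize_labels(labels, num_classes, noise_rate, seed=0):
    """Resample a fraction of labels uniformly at random (hypothetical helper).

    Each example is selected independently with probability `noise_rate`;
    a selected example gets a label drawn uniformly from all classes, so the
    new label may happen to equal the original one.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    noisy = labels.copy()
    selected = rng.random(labels.shape[0]) < noise_rate
    noisy[selected] = rng.integers(0, num_classes, size=int(selected.sum()))
    return noisy, selected

# Toy stand-in for a 10-class label vector such as MNIST or CIFAR-10 targets.
clean = np.arange(1000) % 10
fully_random, selected_all = randomize_labels(clean, num_classes=10, noise_rate=1.0)
untouched, selected_none = randomize_labels(clean, num_classes=10, noise_rate=0.0)
```

With `noise_rate=1.0` every label is resampled, which corresponds to the fully random-label case; intermediate values leave part of the label set clean.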

Abstract

Deep Neural Networks, often owing to the overparameterization, are shown to be capable of exactly memorizing even randomly labelled data. Empirical studies have also shown that none of the standard regularization techniques mitigate such overfitting. We investigate whether the choice of the loss function can affect this memorization. We empirically show, with benchmark data sets MNIST and CIFAR-10, that a symmetric loss function, as opposed to either cross-entropy or squared error loss, results in significant improvement in the ability of the network to resist such overfitting. We then provide a formal definition for robustness to memorization and provide a theoretical explanation as to why the symmetric losses provide this robustness. Our results clearly bring out the role loss functions alone can play in this phenomenon of memorization.
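The abstract does not spell out what "symmetric" means. In the label-noise literature, a loss L is usually called symmetric when the sum of L(f(x), k) over all K classes k is the same constant for every prediction f(x). The sketch below checks that property numerically, assuming this standard definition and using absolute error on softmax outputs as an illustrative symmetric loss; which symmetric losses the paper actually evaluates is not stated in this summary. All function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def cross_entropy(probs, label):
    return -np.log(probs[label])

def squared_error(probs, label):
    one_hot = np.eye(len(probs))[label]
    return np.sum((probs - one_hot) ** 2)

def absolute_error(probs, label):
    one_hot = np.eye(len(probs))[label]
    return np.sum(np.abs(probs - one_hot))

def sum_over_labels(loss, probs):
    # Total loss when the same prediction is scored against every possible label.
    return sum(loss(probs, k) for k in range(len(probs)))

num_classes = 10
rng = np.random.default_rng(0)
losses = {
    "cross_entropy": cross_entropy,
    "squared_error": squared_error,
    "absolute_error": absolute_error,
}
totals = {name: [] for name in losses}
for _ in range(5):
    probs = softmax(rng.normal(size=num_classes))
    for name, loss in losses.items():
        totals[name].append(sum_over_labels(loss, probs))

# A symmetric loss gives the same total for every prediction (spread near zero).
for name, values in totals.items():
    print(f"{name}: spread across predictions = {max(values) - min(values):.4f}")
```

For absolute error on a probability vector the total works out to 2K - 2 regardless of the prediction, whereas the totals for cross-entropy and squared error depend on the predicted probabilities, so neither of those is symmetric in this sense.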

Code & Models

Repositories

dbp1994/masters_thesis_codes/tree/main/memorization_and_overparam (official)

Taxonomy

Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications