Reducing Overfitting in Deep Networks by Decorrelating Representations

Michael Cogswell; Faruk Ahmed; Ross Girshick; Larry Zitnick; Dhruv Batra

arXiv:1511.06068·cs.LG·June 13, 2016·ICLR·78 cites

TL;DR

This paper introduces DeCov, a new regularizer that reduces overfitting in deep neural networks by decorrelating hidden layer representations, leading to improved generalization across various datasets and architectures.

Contribution

The paper proposes DeCov, a regularizer that minimizes the cross-covariance of hidden activations to encourage diverse, non-redundant representations. According to the authors, this idea had been explored in earlier work but never applied as a regularizer in supervised learning.

Findings

01. DeCov significantly reduces overfitting in deep networks.

02. DeCov often outperforms Dropout in generalization performance.

03. DeCov improves model robustness across multiple datasets and architectures.

Abstract

One major challenge in training Deep Neural Networks is preventing overfitting. Many techniques such as data augmentation and novel regularizers such as Dropout have been proposed to prevent overfitting without requiring a massive amount of training data. In this work, we propose a new regularizer called DeCov which leads to significantly reduced overfitting (as indicated by the difference between train and val performance), and better generalization. Our regularizer encourages diverse or non-redundant representations in Deep Neural Networks by minimizing the cross-covariance of hidden activations. This simple intuition has been explored in a number of past works but surprisingly has never been applied as a regularizer in supervised learning. Experiments across a range of datasets and network architectures show that this loss always reduces overfitting while almost always maintaining or…
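To make "minimizing the cross-covariance of hidden activations" concrete, here is a minimal NumPy sketch of such a penalty. It is an illustration under stated assumptions, not the authors' implementation: the function name `decov_loss`, the division by the batch size, and the 1/2 scaling are choices made for this example. It computes the covariance matrix of one layer's activations over a batch and penalizes only the off-diagonal entries, so a unit's own variance is left alone while covariance between different units is pushed toward zero.

```python
import numpy as np


def decov_loss(h):
    """Cross-covariance penalty for one hidden layer (illustrative sketch).

    h: array of shape (batch, features) holding the layer's activations.
    Returns 0.5 * (sum of squared off-diagonal covariance entries).
    The 1/batch normalisation and the 0.5 factor are assumptions here.
    """
    h = np.asarray(h, dtype=np.float64)
    n = h.shape[0]
    # Centre each feature using the batch mean.
    centered = h - h.mean(axis=0, keepdims=True)
    # (features, features) covariance matrix estimated over the batch.
    cov = centered.T @ centered / n
    # Zero the diagonal so per-unit variances are not penalised.
    off_diag = cov - np.diag(np.diag(cov))
    return 0.5 * np.sum(off_diag ** 2)


# Two identical features: fully redundant, so the penalty is large.
redundant = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
# Two features with zero covariance over the batch: no penalty.
decorrelated = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])

print(decov_loss(redundant))     # 1.5625
print(decov_loss(decorrelated))  # 0.0
```

In training, a term like this would be added to the task loss with a weighting coefficient and differentiated by an autodiff framework. The NumPy version only evaluates the value of the penalty.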

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Dropout