Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations

Haohan Wang; Zeyi Huang; Xindi Wu; Eric P. Xing

arXiv:2011.13052·cs.LG·November 30, 2020

TL;DR

This paper investigates regularization techniques for neural network embeddings using data augmentation, highlighting the effectiveness of squared $\ell_2$ norm regularization in learning invariant representations across multiple tasks.

Contribution

The paper provides a general analysis of the choices for regularizing embeddings, emphasizing the benefits of squared $\ell_2$ norm regularization for robustness and invariance when training with augmented data.

Findings

01 Squared $\ell_2$ norm regularization outperforms recent specialized methods.

02 Regularization matters beyond accuracy: non-regularized approaches are limited in learning invariance, despite equally high accuracy.

03 The proposed method is simpler and more effective across multiple tasks.

Abstract

Data augmentation is one of the most popular techniques for improving the robustness of neural networks. In addition to directly training the model with original samples and augmented samples, a torrent of methods regularizing the distance between embeddings/representations of the original samples and their augmented counterparts have been introduced. In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings. Our analysis suggests the ideal choices of regularization correspond to various assumptions. With an invariance test, we argue that regularization is important if the model is to be used in a broader context than the accuracy-driven setting because non-regularized approaches are limited in learning the concept of invariance, despite equally high accuracy. Finally, we also show that the…
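The regularization the abstract describes, penalizing the distance between the embeddings of an original sample and its augmented counterpart, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the choice of logits as the regularized embedding, and the weight `lam` are all assumptions made for this example.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy over a batch, for integer class labels."""
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def squared_l2_consistency(emb_clean, emb_aug):
    """Batch mean of the squared l2 distance between paired embeddings."""
    diff = emb_clean - emb_aug
    return np.sum(diff ** 2, axis=1).mean()


def total_loss(logits_clean, logits_aug, labels, lam=1.0):
    """Classification loss on both views plus a weighted consistency term.

    Assumption: the logits serve as the regularized embedding, and `lam`
    is a hypothetical hyperparameter trading accuracy against consistency.
    """
    ce = softmax_cross_entropy(logits_clean, labels)
    ce += softmax_cross_entropy(logits_aug, labels)
    return ce + lam * squared_l2_consistency(logits_clean, logits_aug)
```

When the two views produce identical embeddings the consistency term vanishes, so the penalty only acts on samples whose representation changes under augmentation.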

Code & Models

Repositories

jyanln/alignreg (tf)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning · Model Reduction and Neural Networks