On Regularization Properties of Artificial Datasets for Deep Learning

Karol Antczak

arXiv:1908.07005·cs.LG·August 21, 2019

TL;DR

This paper explores how artificial datasets, generated by injecting noise into high-level features, serve as a form of deep regularization in neural networks, especially useful when real data is scarce.

Contribution

It introduces the concept of using artificial data as a deep regularizer, linking data generation techniques to existing regularization methods in deep learning.

Findings

01

Artificial data generation mimics regularization effects.

02

Artificial data can compensate for real data shortages.

03

Deep regularization via artificial data improves training stability.

Abstract

The paper discusses the regularization properties of artificial data for deep learning. Artificial datasets make it possible to train neural networks when real data is scarce. It is demonstrated that the artificial data generation process, described as injecting noise into high-level features, bears several similarities to existing regularization methods for deep neural networks. One can treat this property of artificial data as a kind of "deep" regularization. It is thus possible to regularize hidden layers of the network by generating the training data in a certain way.
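
The generation process described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `generate_artificial_dataset`, the class prototypes, the random `decoder_weights`, and the `tanh` "renderer" are all assumptions chosen only to show the idea of perturbing high-level features before they are turned into network inputs.

```python
import numpy as np

def generate_artificial_dataset(prototypes, decoder_weights, n_per_class, noise_std, rng):
    """Hypothetical generator: perturb high-level features, then render them as inputs."""
    inputs, labels = [], []
    for label, prototype in enumerate(prototypes):
        # Inject Gaussian noise into the high-level feature vector of this class.
        noise = rng.normal(0.0, noise_std, size=(n_per_class, prototype.shape[0]))
        noisy_features = prototype + noise
        # Map the noisy high-level features to low-level inputs with a fixed
        # nonlinear "renderer" (an assumption standing in for a real generator).
        inputs.append(np.tanh(noisy_features @ decoder_weights))
        labels.append(np.full(n_per_class, label))
    return np.concatenate(inputs), np.concatenate(labels)

rng = np.random.default_rng(0)
# Two classes, each described by three high-level features (illustrative values).
prototypes = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
# Fixed map from 3 high-level features to 8 input dimensions.
decoder_weights = rng.normal(size=(3, 8))

X, y = generate_artificial_dataset(
    prototypes, decoder_weights, n_per_class=5, noise_std=0.1, rng=rng
)
```

A network trained on `X` and `y` never sees the noise directly at its input layer. The perturbation lives in the feature space the hidden layers are expected to recover, which is the sense in which the abstract speaks of regularizing hidden layers through the way the data is generated.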

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Algorithms · Advanced Data Processing Techniques