On Regularization Properties of Artificial Datasets for Deep Learning
Karol Antczak

TL;DR
This paper explores how artificial datasets, generated by injecting noise into high-level features, serve as a form of deep regularization in neural networks, especially useful when real data is scarce.
Contribution
It introduces the concept of using artificial data as a deep regularizer, linking data generation techniques to existing regularization methods in deep learning.
Findings
Artificial data generation mimics regularization effects.
Artificial data can compensate for real data shortages.
Deep regularization via artificial data improves training stability.
Abstract
The paper discusses the regularization properties of artificial data for deep learning. Artificial datasets make it possible to train neural networks when real data is scarce. The paper demonstrates that the artificial data generation process, described as injecting noise into high-level features, bears several similarities to existing regularization methods for deep neural networks. This property of artificial data can be treated as a kind of "deep" regularization. It is thus possible to regularize the hidden layers of a network by generating the training data in a certain way.
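As a rough illustration of what "injecting noise into high-level features" could look like, the sketch below perturbs a small feature vector and then renders it into low-level data. Everything in it is an assumption made for illustration and is not taken from the paper: the circle parameters standing in for high-level features, the hypothetical `render_circle` and `generate_artificial_samples` helpers, and the choice of Gaussian noise.

```python
import random

def render_circle(cx, cy, r, size=16):
    """Hypothetical renderer: maps high-level features (center, radius)
    to low-level data, here a size x size binary pixel grid."""
    return [[1 if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2 else 0
             for x in range(size)]
            for y in range(size)]

def generate_artificial_samples(features, n, sigma, seed=0):
    """Generate n artificial samples by adding Gaussian noise to the
    high-level features BEFORE rendering (assumed noise model)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Noise is applied in feature space, not in pixel space.
        noisy = tuple(f + rng.gauss(0.0, sigma) for f in features)
        samples.append(render_circle(*noisy))
    return samples

# One prototype (center x, center y, radius) expanded into 100 variants.
base_features = (8.0, 8.0, 4.0)
dataset = generate_artificial_samples(base_features, n=100, sigma=0.5)
```

The point of the sketch is only where the noise enters: a network trained on `dataset` sees inputs whose variation comes from the feature level, which is the property the abstract relates to regularizing hidden layers.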
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Algorithms · Advanced Data Processing Techniques
