Mitigating deep double descent by concatenating inputs

John Chen; Qihan Wang; Anastasios Kyrillidis

arXiv:2107.00797 · cs.LG · July 5, 2021

Open Access

TL;DR

This paper proposes a dataset augmentation method to mitigate the double descent phenomenon in deep neural networks, resulting in smoother performance curves across model sizes and training epochs.

Contribution

It introduces a novel data augmentation technique that empirically reduces double descent effects in deep learning models.

Findings

01. Mitigates the double descent curve in deep neural networks

02. Yields smoother performance across model sizes

03. The smooth descent also holds with respect to the number of training epochs

Abstract

The double descent curve is one of the most intriguing properties of deep neural networks. It contrasts the classical bias-variance curve with the behavior of modern neural networks, occurring where the number of samples nears the number of parameters. In this work, we explore the connection between the double descent phenomenon and the number of samples in the deep neural network setting. In particular, we propose a construction which augments the existing dataset by artificially increasing the number of samples. This construction empirically mitigates the double descent curve in this setting. We reproduce existing work on deep double descent, and observe a smooth descent into the overparameterized region for our construction. This occurs both with respect to the model size and with respect to the number of epochs.
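
The abstract does not spell out the construction, so the following is only a minimal sketch. It assumes, as the title suggests, that new samples are formed by concatenating pairs of existing inputs, and it further assumes that each pair's target is the average of the two one-hot labels. Under those assumptions the sample count grows from n to n². The helper name `concat_all_pairs` and the label-averaging rule are illustrative and are not confirmed by the text above.

```python
from itertools import product


def concat_all_pairs(inputs, labels, num_classes):
    """Hypothetical helper: pair every sample with every sample (n -> n * n)."""
    pairs = []
    for (xi, yi), (xj, yj) in product(zip(inputs, labels), repeat=2):
        # Concatenate the two input vectors into one longer input.
        x_pair = list(xi) + list(xj)
        # Assumption: the target is the average of the two one-hot labels.
        y_pair = [0.0] * num_classes
        y_pair[yi] += 0.5
        y_pair[yj] += 0.5
        pairs.append((x_pair, y_pair))
    return pairs


# Toy data: 3 samples with 2 features each, 2 classes.
inputs = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
labels = [0, 1, 1]

augmented = concat_all_pairs(inputs, labels, num_classes=2)
print(len(augmented))  # 9 pairs from 3 samples
```

Enumerating all pairs is quadratic in the dataset size, so for a real dataset one would more plausibly sample pairs on the fly for each minibatch. That is a design choice of this sketch, not something stated in the abstract.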

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms