Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
Xuran Meng, Difan Zou, Yuan Cao

TL;DR
This paper shows that over-parameterized two-layer ReLU CNNs trained by gradient descent can learn XOR-type classification tasks with label-flipping noise, reaching near Bayes-optimal accuracy under a condition on the sample complexity and signal-to-noise ratio. It also gives a matching lower bound for the case where that condition is not satisfied.
Contribution
The paper offers the first theoretical analysis of CNNs learning XOR-type problems with label-flipping noise, establishing a condition for near Bayes-optimal learning and a lower bound on performance when that condition is not met.
Findings
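Under a condition on the sample complexity and signal-to-noise ratio, an over-parameterized ReLU CNN trained by gradient descent reaches near Bayes-optimal accuracy on XOR-type data.
A matching lower bound shows that the CNN's performance is limited when that condition is not satisfied.
The study extends the understanding of benign overfitting to a setting where the Bayes-optimal classifier is not linear.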
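To make the XOR setting concrete, here is a minimal sketch of a toy XOR-type dataset with label-flipping noise. It is an illustration only, not the paper's data model: the function names, the signal directions `a` and `b`, the four-cluster layout, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def make_xor_data(n, d=4, flip_prob=0.1, noise_std=0.0, seed=0):
    """Toy XOR-type data (hypothetical layout, requires d >= 2).

    Clusters at +(a+b) and -(a+b) get label +1; clusters at +(a-b)
    and -(a-b) get label -1. Each label is then flipped independently
    with probability flip_prob.
    """
    rng = np.random.default_rng(seed)
    a = np.zeros(d); a[0] = 1.0  # assumed signal direction a
    b = np.zeros(d); b[1] = 1.0  # assumed signal direction b, orthogonal to a
    centers = np.stack([a + b, -(a + b), a - b, -(a - b)])
    center_labels = np.array([1, 1, -1, -1])

    idx = rng.integers(0, 4, size=n)  # pick one of the four clusters per sample
    X = centers[idx] + noise_std * rng.standard_normal((n, d))
    y_clean = center_labels[idx]

    flips = rng.random(n) < flip_prob  # label-flipping noise
    y_noisy = np.where(flips, -y_clean, y_clean)
    return X, y_clean, y_noisy

def xor_rule(X):
    """Product-of-signs rule; matches the clean labels when noise_std is 0."""
    return np.sign(X[:, 0] * X[:, 1]).astype(int)

def linear_accuracy(X, y, w):
    """Accuracy of the linear classifier sign(X @ w) with no bias term."""
    return float(np.mean(np.sign(X @ w) == y))

X, y_clean, y_noisy = make_xor_data(n=2000, flip_prob=0.1)
print("XOR rule accuracy on clean labels:", np.mean(xor_rule(X) == y_clean))
print("Linear classifier accuracy:", linear_accuracy(X, y_clean, np.array([1.0, 0.3, 0.0, 0.0])))
```

In this toy layout each class consists of two antipodal clusters, so a linear classifier through the origin labels one cluster of each class correctly and the other incorrectly, which leaves it near chance. A nonlinear rule such as the product of signs separates the classes, which is the sense in which the Bayes-optimal classifier here is not linear.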
Abstract
Modern deep learning models are usually highly over-parameterized so that they can overfit the training data. Surprisingly, such overfitting neural networks can usually still achieve high prediction accuracy. To study this "benign overfitting" phenomenon, a line of recent works has theoretically studied the learning of linear models and two-layer neural networks. However, most of these analyses are still limited to the very simple learning problems where the Bayes-optimal classifier is linear. In this work, we investigate a class of XOR-type classification tasks with label-flipping noises. We show that, under a certain condition on the sample complexity and signal-to-noise ratio, an over-parameterized ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy. Moreover, we also establish a matching lower bound result showing that when the previous condition is not…
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Neural Networks and Applications
