The Foes of Neural Network's Data Efficiency Among Unnecessary Input Dimensions
Vanessa D'Amario, Sanjana Srivastava, Tomotake Sasaki, Xavier Boix

TL;DR
This paper investigates how unnecessary input dimensions in datasets, such as background in object recognition, impair the data efficiency of deep neural networks, emphasizing the importance of removing task-unrelated features for better learning efficiency.
Contribution
The paper shows that task-unrelated input dimensions significantly reduce the data efficiency of DNNs, and it highlights the need for mechanisms that eliminate such dimensions.
Findings
Unnecessary input dimensions degrade data efficiency.
Removing task-unrelated features can improve learning performance.
Extra parameters appear to hurt more at the input layer than in hidden layers, where DNNs are known to be robust to added parameters.
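The first finding can be illustrated with a toy experiment. The sketch below is not the paper's setup: it assumes a noise-free linear regression task instead of a DNN, appends Gaussian task-unrelated dimensions to the input, and fits a minimum-norm least-squares model. The function name `heldout_mse` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def heldout_mse(n_train, n_unneeded, n_task=10, n_test=1000, seed=0):
    """Held-out error of a least-squares fit when the input carries
    `n_unneeded` extra dimensions that do not affect the label.
    Toy linear model (an assumption), not the paper's DNN experiments."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=n_task)

    def sample(n):
        x_task = rng.normal(size=(n, n_task))        # dimensions that determine y
        x_extra = rng.normal(size=(n, n_unneeded))   # task-unrelated dimensions
        y = x_task @ w_true                          # label ignores x_extra
        return np.hstack([x_task, x_extra]), y

    x_tr, y_tr = sample(n_train)
    x_te, y_te = sample(n_test)
    # lstsq returns the minimum-norm solution when the system is underdetermined
    w_hat, *_ = np.linalg.lstsq(x_tr, y_tr, rcond=None)
    return float(np.mean((x_te @ w_hat - y_te) ** 2))

if __name__ == "__main__":
    for n_train in (20, 100, 400):
        print(n_train, heldout_mse(n_train, 0), heldout_mse(n_train, 200))
```

Comparing the two columns for each training-set size gives a rough picture of how many more examples the model needs once task-unrelated dimensions are present.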
Abstract
Datasets often contain input dimensions that are unnecessary to predict the output label, e.g., background in object recognition, which lead to more trainable parameters. Deep Neural Networks (DNNs) are robust to increasing the number of parameters in the hidden layers, but it is unclear whether this holds true for the input layer. In this letter, we investigate the impact of unnecessary input dimensions on a central issue of DNNs: their data efficiency, i.e., the number of examples needed to achieve a certain generalization performance. Our results show that unnecessary input dimensions that are task-unrelated substantially degrade data efficiency. This highlights the need for mechanisms that remove task-unrelated dimensions to enable data efficiency gains.
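The abstract's remark that unnecessary input dimensions "lead to more trainable parameters" can be made concrete with a small count. The sketch below assumes a fully connected first layer with a bias term; the helper `first_layer_params` and the example sizes (784 task-relevant inputs, 128 hidden units) are hypothetical and do not come from the paper.

```python
def first_layer_params(n_task, n_unneeded, n_hidden):
    """Trainable parameters in a dense first layer (weights plus biases)
    when the input has n_task useful and n_unneeded extra dimensions.
    Assumes a fully connected layer; sizes are illustrative only."""
    n_inputs = n_task + n_unneeded
    return n_inputs * n_hidden + n_hidden

if __name__ == "__main__":
    base = first_layer_params(784, 0, 128)
    padded = first_layer_params(784, 784, 128)
    print(base, padded)
```

Under this assumption, every extra input dimension adds one weight per hidden unit, so the first layer's parameter count grows linearly with the number of unnecessary dimensions.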
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Machine Learning and Data Classification
