The geometry of invariant learning: an information-theoretic analysis of data augmentation and generalization
Abdelali Bouyahia, Frédéric LeBlanc, Mario Marchand

TL;DR
This paper introduces an information-theoretic framework to analyze how data augmentation influences generalization and invariance in machine learning, emphasizing the role of augmentation geometry and stability.
Contribution
It develops a new generalization bound based on mutual information and augmentation group geometry, linking invariance, stability, and data fidelity.
Findings
The bounds accurately predict the generalization gap in experiments.
Group diameter effectively controls the trade-off between stability and bias.
Augmentation geometry impacts model robustness and generalization.
Abstract
Data augmentation is one of the most widely used techniques to improve generalization in modern machine learning, often justified by its ability to promote invariance to label-irrelevant transformations. However, its theoretical role remains only partially understood. In this work, we propose an information-theoretic framework that systematically accounts for the effect of augmentation on generalization and invariance learning. Our approach builds upon mutual information-based bounds, which relate the generalization gap to the amount of information a learning algorithm retains about its training data. We extend this framework by modeling the augmented distribution as a composition of the original data distribution with a distribution over transformations, which naturally induces an orbit-averaged loss function. Under mild sub-Gaussian assumptions on the loss function and the…
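The orbit-averaged loss mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's construction: the linear predictor, the squared loss, the four-element rotation set `C4`, and the uniform distribution over transformations are all assumptions chosen to keep the example small. The idea shown is only that the loss on an example is replaced by its average over transformed copies of that example.

```python
import math

def rotate(point, angle):
    """Rotate a 2-D point counter-clockwise by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def squared_loss(w, point, label):
    """Squared error of a linear predictor w on a single example."""
    pred = w[0] * point[0] + w[1] * point[1]
    return (pred - label) ** 2

def orbit_averaged_loss(w, point, label, angles):
    """Average the loss over the transformed copies of `point`.

    Assumes a uniform distribution over a finite set of rotations
    (an illustrative choice, not taken from the paper).
    """
    losses = [squared_loss(w, rotate(point, a), label) for a in angles]
    return sum(losses) / len(losses)

# Hypothetical transformation set: rotations by multiples of 90 degrees.
C4 = [k * math.pi / 2 for k in range(4)]

w = (1.0, 0.0)            # illustrative predictor weights
x, y = (1.0, 0.0), 1.0    # illustrative example and label

plain = squared_loss(w, x, y)
averaged = orbit_averaged_loss(w, x, y, C4)
# Because C4 is closed under composition, the averaged loss is unchanged
# when the input is first moved along its own orbit.
averaged_shifted = orbit_averaged_loss(w, rotate(x, math.pi / 2), y, C4)
```

In this toy setting the predictor fits the untransformed example exactly but is penalized under orbit averaging, since it is not rotation-invariant. The averaged loss takes the same value at every point of the orbit.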
Taxonomy
Topics: Face and Expression Recognition · Stochastic Gradient Optimization Techniques · Machine Learning and Data Classification
