The geometry of invariant learning: an information-theoretic analysis of data augmentation and generalization

Abdelali Bouyahia; Frédéric LeBlanc; Mario Marchand

arXiv:2602.14423·cs.LG·February 17, 2026

Open Access

TL;DR

This paper introduces an information-theoretic framework to analyze how data augmentation influences generalization and invariance in machine learning, emphasizing the role of augmentation geometry and stability.

Contribution

It develops a new generalization bound based on mutual information and augmentation group geometry, linking invariance, stability, and data fidelity.
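
For orientation, the classical mutual-information bound that this line of work builds on (Xu and Raginsky, 2017) can be written as below. This is the standard baseline form, not the paper's new bound, and the notation is assumed here for illustration: W is the learned hypothesis, S is a training sample of n i.i.d. points from μ, L_μ and L_S are the population and empirical risks, and σ is the sub-Gaussian parameter of the loss.

```latex
% Baseline bound, assuming the loss \ell(w, Z) is \sigma-sub-Gaussian
% under Z \sim \mu for every hypothesis w (notation is illustrative).
\[
  \left| \mathbb{E}\big[ L_\mu(W) - L_S(W) \big] \right|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(S; W)}
\]
```

The less information the output W retains about the sample S, the smaller the expected generalization gap this bound allows.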

Findings

01. The bounds accurately predict the generalization gap in experiments.

02. Group diameter effectively controls the trade-off between stability and bias.

03. Augmentation geometry impacts model robustness and generalization.

Abstract

Data augmentation is one of the most widely used techniques to improve generalization in modern machine learning, often justified by its ability to promote invariance to label-irrelevant transformations. However, its theoretical role remains only partially understood. In this work, we propose an information-theoretic framework that systematically accounts for the effect of augmentation on generalization and invariance learning. Our approach builds upon mutual information-based bounds, which relate the generalization gap to the amount of information a learning algorithm retains about its training data. We extend this framework by modeling the augmented distribution as a composition of the original data distribution with a distribution over transformations, which naturally induces an orbit-averaged loss function. Under mild sub-Gaussian assumptions on the loss function and the…
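
The orbit-averaged loss mentioned in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the transformations are planar rotations, the distribution over them is uniform on four angles, the model is a linear predictor with squared loss, and the names `rotate`, `squared_loss`, and `orbit_averaged_loss` are hypothetical.

```python
import numpy as np


def rotate(x, angle):
    """Rotate a 2-D point x counter-clockwise by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]]) @ x


def squared_loss(w, x, y):
    """Squared error of the linear predictor w on the example (x, y)."""
    return float((w @ x - y) ** 2)


def orbit_averaged_loss(w, x, y, angles):
    """Average the loss over transformed copies of x, one per angle.

    Assumption: a uniform distribution over the finite set `angles`
    stands in for the distribution over transformations.
    """
    return float(np.mean([squared_loss(w, rotate(x, a), y) for a in angles]))


# Illustrative values only.
w = np.array([1.0, 0.0])
x = np.array([1.0, 0.0])
y = 1.0
angles = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])

plain = squared_loss(w, x, y)                      # loss on the original point
averaged = orbit_averaged_loss(w, x, y, angles)    # loss averaged over the orbit
print(plain, averaged)
```

In this toy setup the predictor fits the original point exactly but is not rotation-invariant, so the averaged loss is larger than the plain loss. The gap between the two gives a rough sense of how far a model is from being invariant under the chosen transformations.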

Taxonomy

Topics: Face and Expression Recognition · Stochastic Gradient Optimization Techniques · Machine Learning and Data Classification