PGrad: Learning Principal Gradients For Domain Generalization
Zhe Wang, Jake Grigsby, Yanjun Qi

TL;DR
PGrad introduces a novel training strategy that learns principal gradient directions to enhance neural network robustness and generalization across unseen domains, effectively handling out-of-distribution challenges.
Contribution
The paper proposes PGrad, a new gradient aggregation method that improves domain generalization by focusing on main parameter dynamics and reducing domain-specific noise.
Findings
PGrad achieves competitive results across seven datasets.
It yields a smoothly decreasing loss curve on the training domains.
It is robust to both synthetic and real-world distribution shifts.
Abstract
Machine learning models fail to perform when facing out-of-distribution (OOD) domains, a challenging task known as domain generalization (DG). In this work, we develop a novel DG training strategy, which we call PGrad, to learn a robust gradient direction that improves models' generalization ability on unseen domains. The proposed gradient aggregates the principal directions of a sampled roll-out optimization trajectory that measures the training dynamics across all training domains. PGrad's gradient design forces DG training to ignore domain-dependent noise signals and to update all training domains with a robust direction covering the main components of the parameter dynamics. We further improve PGrad via bijection-based computational refinement and directional plus length-based calibrations. Our theoretical proof connects PGrad to the spectral analysis of the Hessian in training neural networks.…
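The procedure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name principal_update, the parameters lr and k, the toy quadratic domain losses, and the exact form of the calibrations (sign-aligning each principal direction with the roll-out displacement, then rescaling to the displacement's norm) are all assumptions made for illustration. The Gram-matrix step is only one plausible reading of the "bijection-based computational refinement": it obtains principal directions from a small (n+1) x (n+1) matrix instead of a d x d covariance.

```python
import numpy as np

def principal_update(theta, domain_grad_fns, lr=0.1, k=2):
    """Hypothetical sketch: aggregate principal directions of a roll-out trajectory."""
    # Roll-out: take one sequential gradient step per training domain
    # and record every intermediate parameter vector.
    trajectory = [theta.copy()]
    current = theta.copy()
    for grad_fn in domain_grad_fns:
        current = current - lr * grad_fn(current)
        trajectory.append(current.copy())
    T = np.stack(trajectory)                  # shape (n + 1, d)
    centered = T - T.mean(axis=0)

    # Assumed refinement: eigendecompose the small Gram matrix and map its
    # eigenvectors back to parameter space, avoiding a d x d covariance.
    gram = centered @ centered.T              # shape (n + 1, n + 1)
    eigvals, eigvecs = np.linalg.eigh(gram)   # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]

    displacement = T[-1] - T[0]               # net movement over the roll-out
    update = np.zeros_like(theta)
    for i in top:
        if eigvals[i] <= 1e-12:               # skip numerically null directions
            continue
        v = centered.T @ eigvecs[:, i]
        v = v / np.linalg.norm(v)
        # Assumed directional calibration: make each principal direction
        # agree in sign with the overall roll-out displacement.
        if v @ displacement < 0:
            v = -v
        update += v

    # Assumed length calibration: match the norm of the roll-out displacement.
    norm = np.linalg.norm(update)
    if norm > 0:
        update *= np.linalg.norm(displacement) / norm
    return update

# Toy usage: three "domains", each a quadratic loss 0.5 * ||theta - c||^2.
centers = [np.array([1.0, 0.0, 0.0]),
           np.array([1.0, 1.0, 0.0]),
           np.array([1.0, 0.5, 1.0])]
grad_fns = [lambda th, c=c: th - c for c in centers]
theta0 = np.zeros(3)
step = principal_update(theta0, grad_fns)
theta1 = theta0 + step                        # apply the aggregated update
```

In this sketch the returned vector is added to the parameters directly, since the roll-out steps already include the learning rate. Setting k below the number of domains drops the lowest-variance trajectory directions, which is the sense in which such an update could discard domain-specific noise.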
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis · Multimodal Machine Learning Applications
