Beyond Losses Reweighting: Empowering Multi-Task Learning via the Generalization Perspective
Hoang Phan, Lam Tran, Quyen Tran, Ngoc N. Tran, Tuan Truong, Qi Lei, Nhat Ho, Dinh Phung, Trung Le

TL;DR
This paper proposes a new multi-task learning framework that uses weight perturbation to regulate gradient norms, reducing conflicts and overfitting, thereby enhancing generalization and performance across tasks.
Contribution
It introduces a novel weight perturbation approach for gradient norm regulation in multi-task learning, improving generalization and task performance.
Findings
Outperforms existing gradient-based MTL methods in various applications.
Reduces gradient conflicts and overfitting through adaptive weight perturbation.
Theoretically links gradient norm control to improved generalization.
Abstract
Multi-task learning (MTL) trains deep neural networks to optimize several objectives simultaneously using a shared backbone, which leads to reduced computational costs, improved data efficiency, and enhanced performance through cross-task knowledge sharing. Although recent gradient manipulation techniques aim to find a common descent direction that benefits all tasks, conventional empirical loss minimization still leaves models vulnerable to overfitting and gradient conflicts. To address this, we introduce a novel MTL framework that leverages weight perturbation to regulate gradient norms, thus improving generalization. By adaptively modulating weight perturbations, our approach harmonizes task-specific gradients, reducing conflicts and encouraging more robust learning across tasks. Theoretical insights reveal that controlling the gradient norm through weight perturbation directly…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDomain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Machine Learning and ELM
