Gradient Similarity Surgery in Multi-Task Deep Learning
Thomas Borsani, Andrea Rosani, Giuseppe Nicosia, Giuseppe Di Fatta

TL;DR
This paper introduces SAM-GS, a novel gradient surgery method for multi-task deep learning that uses gradient similarity to improve training stability and convergence by addressing conflicting gradients.
Contribution
The paper proposes SAM-GS, a scalable gradient surgery technique based on gradient magnitude similarity, enhancing multi-task learning optimization.
Findings
SAM-GS improves convergence speed in multi-task learning.
Gradient similarity regularizes gradient aggregation effectively.
Experimental results show SAM-GS outperforms existing methods.
Abstract
The multi-task learning (MTL) paradigm aims to simultaneously learn multiple tasks within a single model, capturing higher-level, more general hidden patterns that are shared by the tasks. In deep learning, a significant challenge in the backpropagation training process is the design of advanced optimisers to improve the convergence speed and stability of the gradient descent learning rule. In particular, in multi-task deep learning (MTDL), the multitude of tasks may generate potentially conflicting gradients that would hinder the concurrent convergence of the diverse loss functions. This challenge arises when the gradients of the task objectives have either different magnitudes or opposite directions, causing one or a few to dominate or to interfere with each other, thus degrading the training process. Gradient surgery methods address the problem by explicitly dealing with conflicting…
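The two failure modes named in the abstract, opposite directions and mismatched magnitudes, can be illustrated with a small sketch. This is not the SAM-GS algorithm itself, whose details are not given in this summary. It only computes two standard diagnostics for a pair of task gradients: cosine similarity and the gradient magnitude similarity 2‖g_i‖‖g_j‖ / (‖g_i‖² + ‖g_j‖²). The function names and the example gradient values are hypothetical, chosen for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equally sized gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm of a gradient vector."""
    return math.sqrt(dot(u, u))


def cosine_similarity(g_i, g_j):
    """Direction agreement in [-1, 1]; a negative value means the gradients conflict.

    Assumes both gradients are non-zero (otherwise this divides by zero).
    """
    return dot(g_i, g_j) / (norm(g_i) * norm(g_j))


def magnitude_similarity(g_i, g_j):
    """2*|g_i|*|g_j| / (|g_i|^2 + |g_j|^2).

    Equals 1 when the norms match and approaches 0 when one gradient dominates.
    Assumes at least one gradient is non-zero.
    """
    n_i, n_j = norm(g_i), norm(g_j)
    return 2.0 * n_i * n_j / (n_i ** 2 + n_j ** 2)


# Hypothetical flattened gradients of two task losses w.r.t. the shared parameters.
g_task_a = [3.0, 4.0]   # large gradient
g_task_b = [-0.3, 0.0]  # small gradient pointing partly against task A

cos_ab = cosine_similarity(g_task_a, g_task_b)
mag_ab = magnitude_similarity(g_task_a, g_task_b)

print(f"cosine similarity:    {cos_ab:.3f}")
print(f"magnitude similarity: {mag_ab:.3f}")
```

In this example the cosine similarity is negative, so the two tasks pull the shared parameters in partly opposite directions, and the magnitude similarity is close to zero, so a plain sum of the two gradients would be dominated by task A.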
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
