GDOD: Effective Gradient Descent using Orthogonal Decomposition for Multi-Task Learning
Xin Dong, Ruize Wu, Chao Xiong, Hai Li, Lei Cheng, Yong He, Shiyou Qian, Jian Cao, Linjian Mo

TL;DR
GDOD introduces an orthogonal decomposition-based optimization method for multi-task learning, effectively managing gradient conflicts and improving performance across multiple datasets.
Contribution
The paper proposes GDOD, a novel gradient manipulation technique using orthogonal basis decomposition to enhance multi-task learning optimization.
Findings
GDOD significantly outperforms existing MTL models in AUC and Logloss.
GDOD effectively decomposes gradients into shared and conflicting components.
Theoretical convergence of GDOD is proven under convex and non-convex conditions.
Abstract
Multi-task learning (MTL) aims to solve multiple related tasks simultaneously and has grown rapidly in recent years. However, MTL models often suffer performance degradation from negative transfer when several tasks are learned at once. Some related work attributes the problem to conflicting gradients. In this case, useful gradient updates must be selected carefully for all tasks. To this end, we propose a novel optimization approach for MTL, named GDOD, which manipulates the gradient of each task using an orthogonal basis decomposed from the span of all task gradients. GDOD explicitly decomposes gradients into task-shared and task-conflict components and adopts a general update rule that avoids interference across all task gradients. This allows the update directions to be guided by the task-shared components. Moreover, we prove the…
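The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the orthogonal basis is built or how a component is classified, so the sketch assumes an SVD-based basis and a sign-agreement test (a basis direction counts as task-shared when every task's coefficient along it has the same sign). The function names `decompose_gradients` and `shared_update` and the tolerance `tol` are hypothetical.

```python
import numpy as np


def decompose_gradients(grads, tol=1e-10):
    """Split per-task gradients into shared and conflicting parts.

    grads: array of shape (T, d), one flattened gradient per task.
    Returns (shared, conflict), each of shape (T, d), with
    shared + conflict equal to grads.
    """
    G = np.asarray(grads, dtype=float)

    # Assumption: build an orthonormal basis of span{g_1..g_T} via SVD.
    # The paper may construct its basis differently.
    _, s, vt = np.linalg.svd(G, full_matrices=False)
    basis = vt[s > tol * s.max()]          # (r, d), rows are orthonormal

    # Coordinates of every task gradient in that basis.
    coeffs = G @ basis.T                   # (T, r)

    # Assumption: a direction is "shared" when all tasks agree in sign.
    all_nonneg = np.all(coeffs >= -tol, axis=0)
    all_nonpos = np.all(coeffs <= tol, axis=0)
    is_shared = all_nonneg | all_nonpos    # (r,) boolean mask

    shared = (coeffs * is_shared) @ basis
    conflict = (coeffs * ~is_shared) @ basis
    return shared, conflict


def shared_update(grads):
    """Hypothetical update direction: sum of task-shared components only."""
    shared, _ = decompose_gradients(grads)
    return shared.sum(axis=0)


if __name__ == "__main__":
    aligned = np.array([[1.0, 2.0], [1.0, 2.0]])
    opposed = np.array([[1.0, 2.0], [-1.0, -2.0]])
    print(shared_update(aligned))   # both gradients are kept
    print(shared_update(opposed))   # fully conflicting, so nothing is kept
```

Because the two parts live on disjoint basis directions, each task's shared and conflicting components are orthogonal and add back up to the original gradient. Which directions end up labelled shared depends on the chosen basis, so a different basis construction can give a different split.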
