Proactive Gradient Conflict Mitigation in Multi-Task Learning: A Sparse Training Perspective

Zhi Zhang; Jiayi Shen; Congfeng Cao; Gaole Dai; Shiji Zhou; Qizhe Zhang; Shanghang Zhang; Ekaterina Shutova

arXiv:2411.18615·cs.LG·November 28, 2024

TL;DR

This paper introduces a sparse training approach that reduces gradient conflicts in multi-task learning, improving task performance while remaining compatible with existing gradient manipulation methods.

Contribution

It proposes a novel sparse training strategy that mitigates gradient conflicts in multi-task learning, enhancing task performance and compatibility with other optimization techniques.

Findings

01

Sparse training reduces gradient conflicts effectively.

02

Sparse training (ST) improves multi-task learning performance.

03

Sparse training is compatible with existing gradient manipulation methods.

Abstract

Advancing towards generalist agents necessitates the concurrent processing of multiple tasks using a unified model, thereby underscoring the growing significance of simultaneous model training on multiple downstream tasks. A common issue in multi-task learning is the occurrence of gradient conflict, which leads to potential competition among different tasks during joint training. This competition often results in improvements in one task at the expense of deterioration in another. Although several optimization methods have been developed to address this issue by manipulating task gradients for better task balancing, they cannot decrease the incidence of gradient conflict. In this paper, we systematically investigate the occurrence of gradient conflict across different methods and propose a strategy to reduce such conflicts through sparse training (ST), wherein only a portion of the…
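
To make the two ideas in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the common convention that two task gradients conflict when their cosine similarity is negative, and it models sparse training as an SGD step restricted by a binary mask. The function names, the hand-picked mask, and the toy gradient values are all hypothetical; how the trainable subset is actually chosen is not stated in the visible part of the abstract.

```python
import math


def cosine(g_a, g_b):
    """Cosine similarity between two flattened task gradients."""
    dot = sum(a * b for a, b in zip(g_a, g_b))
    norm_a = math.sqrt(sum(a * a for a in g_a))
    norm_b = math.sqrt(sum(b * b for b in g_b))
    return dot / (norm_a * norm_b)


def in_conflict(g_a, g_b):
    """Assumed criterion: gradients conflict when cosine similarity < 0."""
    return cosine(g_a, g_b) < 0.0


def masked_sgd_step(params, task_grads, mask, lr=0.1):
    """One SGD step on the summed task gradients.

    Entries where mask == 0 are frozen; only mask == 1 entries are updated.
    """
    total = [sum(gs) for gs in zip(*task_grads)]
    return [p - lr * m * g for p, m, g in zip(params, mask, total)]


# Toy values, made up for illustration only.
params = [0.0, 0.0, 0.0, 0.0]
g_task1 = [1.0, 0.5, -2.0, 0.3]
g_task2 = [0.8, 0.4, 1.5, 0.2]
mask = [1, 1, 0, 1]  # hypothetical mask: the third parameter is frozen

# Conflict measured over all parameters.
full_conflict = in_conflict(g_task1, g_task2)

# Conflict measured only over the parameters that are actually trained.
sub1 = [g for g, m in zip(g_task1, mask) if m]
sub2 = [g for g, m in zip(g_task2, mask) if m]
sparse_conflict = in_conflict(sub1, sub2)

new_params = masked_sgd_step(params, [g_task1, g_task2], mask)

print("conflict over all parameters:", full_conflict)
print("conflict over trained subset:", sparse_conflict)
print("updated parameters:", new_params)
```

In this toy case the disagreement between the two tasks sits entirely in the frozen coordinate, so the gradients over the trained subset no longer conflict. That outcome follows from the hand-picked numbers and mask and is not evidence about the paper's results.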

Taxonomy

Topics: Psychological and Educational Research Studies