ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning

Junguang Jiang; Baixu Chen; Junwei Pan; Ximei Wang; Dapeng Liu; Jie Jiang; Mingsheng Long

arXiv:2301.12618 · cs.LG · November 17, 2023 · 5 cites

TL;DR

ForkMerge is a novel method that addresses negative transfer in auxiliary-task learning by dynamically managing model branches and task weights, leading to improved target task performance.

Contribution

The paper introduces ForkMerge, an approach that mitigates negative transfer by periodically forking the model into multiple branches, searching for task weights, and merging the branches according to target validation error.
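The fork, search, and merge cycle described above can be illustrated with a toy sketch. This is not the authors' implementation: the one-parameter quadratic losses, the two fixed auxiliary weights, the grid search over an interpolation coefficient, and every function name (`fork_merge`, `train_branch`, and so on) are assumptions made purely for illustration.

```python
# Toy sketch of a fork/merge loop on a single scalar parameter.
# All losses and names here are hypothetical, chosen only for illustration.

def target_grad(theta):
    # Gradient of a toy target training loss (theta - 1.0)^2.
    return 2.0 * (theta - 1.0)

def aux_grad(theta):
    # Gradient of a toy auxiliary loss (theta - 3.0)^2.
    return 2.0 * (theta - 3.0)

def target_val_loss(theta):
    # Toy target validation loss; its optimum (1.5) differs from the
    # training optimum (1.0), so some auxiliary signal helps generalization.
    return (theta - 1.5) ** 2

def train_branch(theta, aux_weight, steps, lr):
    # Plain gradient descent on target loss + aux_weight * auxiliary loss.
    for _ in range(steps):
        theta -= lr * (target_grad(theta) + aux_weight * aux_grad(theta))
    return theta

def fork_merge(theta, aux_weights=(0.0, 1.0), rounds=5, steps=20, lr=0.05, grid=11):
    for _ in range(rounds):
        # Fork: train one branch per auxiliary weight from the same start point.
        branches = [train_branch(theta, w, steps, lr) for w in aux_weights]
        # Merge: interpolate the two branches and keep the interpolation
        # with the lowest target validation loss (simple grid search).
        candidates = []
        for i in range(grid):
            lam = i / (grid - 1)
            merged = (1.0 - lam) * branches[0] + lam * branches[1]
            candidates.append((target_val_loss(merged), merged))
        theta = min(candidates)[1]
    return theta

if __name__ == "__main__":
    merged = fork_merge(0.0)
    target_only = train_branch(0.0, 0.0, 100, 0.05)
    joint = train_branch(0.0, 1.0, 100, 0.05)
    print("merged validation loss:     ", target_val_loss(merged))
    print("target-only validation loss:", target_val_loss(target_only))
    print("joint validation loss:      ", target_val_loss(joint))
```

In this toy setting, the target-only branch and the equally weighted joint branch each settle away from the validation optimum, while the merged parameter lands between them. A real model would interpolate full parameter vectors and evaluate on held-out target data.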

Findings

01 Outperforms existing methods on auxiliary-task learning benchmarks.

02 Effectively mitigates negative transfer in multi-task learning.

03 Improves target task accuracy by managing task conflicts dynamically.

Abstract

Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks. Occasionally, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, which is known as negative transfer. This problem is often attributed to the gradient conflicts among tasks, and is frequently tackled by coordinating the task gradients in previous works. However, these optimization-based methods largely overlook the auxiliary-target generalization capability. To better understand the root cause of negative transfer, we experimentally investigate it from both optimization and generalization perspectives. Based on our findings, we introduce ForkMerge, a novel approach that periodically forks the model into multiple branches, automatically searches the varying task weights by minimizing target validation…
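The abstract notes that negative transfer is often attributed to gradient conflicts among tasks. One common way to make that notion concrete, assumed here rather than taken from the paper, is to call two task gradients conflicting when their cosine similarity is negative. The helper names below are hypothetical.

```python
import math

def cosine_similarity(g1, g2):
    # Cosine of the angle between two flattened gradient vectors.
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    return dot / (norm1 * norm2)

def gradients_conflict(g_target, g_aux):
    # Assumed definition: the gradients conflict when they point in
    # opposing directions, i.e. their cosine similarity is negative.
    return cosine_similarity(g_target, g_aux) < 0.0

if __name__ == "__main__":
    print(gradients_conflict([1.0, 0.0], [-1.0, 0.5]))  # opposing directions
    print(gradients_conflict([1.0, 0.0], [1.0, 1.0]))   # broadly aligned
```

A check like this captures only the optimization view; the abstract argues that such methods largely overlook how well the auxiliary tasks generalize to the target task.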

Code & Models

Repositories

thuml/forkmerge
Official

Datasets

tanganke/nyuv2
dataset · 491 dl

Videos

ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Multimodal Machine Learning Applications