MaxGNR: A Dynamic Weight Strategy via Maximizing Gradient-to-Noise Ratio for Multi-Task Learning

Caoyun Fan; Wenqing Chen; Jidong Tian; Yitian Li; Hao He; Yaohui Jin

arXiv:2302.09352 · cs.CV · February 21, 2023

Open Access

TL;DR

This paper introduces MaxGNR, a dynamic weight strategy that maximizes each task's gradient-to-noise ratio to mitigate inter-task gradient noise in multi-task learning, improving performance on the NYUv2 and Cityscapes datasets.

Contribution

The paper proposes MaxGNR, an algorithm that addresses the insufficient training problem in multi-task learning by explicitly maximizing the gradient-to-noise ratio of each task.

Findings

01

MaxGNR outperforms baseline methods on NYUv2 and Cityscapes datasets.

02

Maximizing GNR reduces inter-task gradient noise interference.

03

MaxGNR improves training stability and task performance in MTL.

Abstract

When modeling related tasks in computer vision, Multi-Task Learning (MTL) can outperform Single-Task Learning (STL) due to its ability to capture intrinsic relatedness among tasks. However, MTL may encounter the insufficient training problem, i.e., some tasks in MTL may encounter a non-optimal situation compared with STL. A series of studies point out that too much gradient noise leads to performance degradation in STL. In the MTL scenario, Inter-Task Gradient Noise (ITGN) is an additional source of gradient noise for each task, which can also affect the optimization process. In this paper, we identify ITGN as a key factor leading to the insufficient training problem. We define the Gradient-to-Noise Ratio (GNR) to measure the relative magnitude of gradient noise and design the MaxGNR algorithm to alleviate the ITGN interference of each task by maximizing the GNR of each…
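The abstract is truncated before it gives the formal definition of GNR, so the following is only a minimal sketch of what a gradient-to-noise ratio could look like, not the paper's formula. It assumes a generic signal-to-noise estimate: the squared norm of the mean gradient divided by the average squared deviation of per-sample gradients from that mean. The function name `gradient_to_noise_ratio` and the synthetic gradients are hypothetical and chosen purely for illustration.

```python
import numpy as np


def gradient_to_noise_ratio(per_sample_grads):
    """Illustrative GNR estimate (an assumption, not the paper's definition).

    per_sample_grads: array of shape (num_samples, num_params).
    Returns ||mean gradient||^2 divided by the mean squared deviation
    of the per-sample gradients from that mean.
    """
    g = np.asarray(per_sample_grads, dtype=float)
    mean_grad = g.mean(axis=0)
    signal = float(mean_grad @ mean_grad)
    noise = float(((g - mean_grad) ** 2).sum(axis=1).mean())
    return signal / (noise + 1e-12)  # small constant avoids division by zero


# Synthetic example: same underlying gradient, two different noise levels.
rng = np.random.default_rng(0)
true_grad = np.ones(8)
low_noise = true_grad + 0.1 * rng.standard_normal((256, 8))
high_noise = true_grad + 2.0 * rng.standard_normal((256, 8))

gnr_low = gradient_to_noise_ratio(low_noise)
gnr_high = gradient_to_noise_ratio(high_noise)
print(gnr_low > gnr_high)
```

Under this assumed definition, noisier gradients yield a lower ratio, which matches the abstract's description of GNR as a measure of the relative magnitude of gradient noise. How MaxGNR turns such a ratio into per-task weights is not recoverable from the truncated abstract and is not sketched here.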

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Machine Learning and ELM