TBGC: Task-level Backbone-Oriented Gradient Clip for Multi-Task Foundation Model Learning
Zelun Zhang, Xue Pan

TL;DR
This paper introduces a task-level gradient clipping method for multi-task learning that clips each task's gradients independently and rescales the backbone gradients from each task to a common norm to reduce bias, along with a multi-branch data augmentation strategy, achieving top results in the CVPR2023 Foundation Model Challenge.
Contribution
The paper proposes a novel task-level gradient clipping paradigm and a multi-branch data augmentation strategy to improve multi-task foundation model training.
Findings
Relieves gradient bias in multi-task learning.
Achieves 1st place on Leaderboard A and 2nd place on Leaderboard B of the CVPR2023 Foundation Model Challenge.
Effective in handling conflicting task gradients.
Abstract
The AllInOne training paradigm squeezes a wide range of tasks into a unified model in a multi-task learning manner. However, optimization in multi-task learning is more challenging than in single-task learning, as the gradient norms from different tasks may vary greatly, making the backbone overly biased towards one specific task. To address this issue, we propose the task-level backbone-oriented gradient clip paradigm. Compared with the vanilla gradient clip method, it has two points of emphasis: 1) gradient clipping is performed independently for each task; 2) backbone gradients generated from each task are rescaled to the same norm scale. Based on the experimental results, we argue that the task-level backbone-oriented gradient clip paradigm can relieve the gradient bias problem to some extent. We also propose a novel multi-branch data augmentation strategy where conflicting augmentations are…
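The two points of emphasis in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how clipping and rescaling are ordered or combined, so the sketch assumes each task's backbone gradient is first clipped on its own and then rescaled to a shared norm before the task gradients are summed. The function names, the `max_norm` and `target_norm` parameters, and the task names are all hypothetical, and gradients are represented as plain Python lists for readability.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(x * x for x in vec))


def clip_by_norm(vec, max_norm):
    """Vanilla norm clipping: shrink vec only if its norm exceeds max_norm."""
    norm = l2_norm(vec)
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [x * scale for x in vec]


def rescale_to_norm(vec, target_norm, eps=1e-12):
    """Scale vec so its norm equals target_norm (near-zero vectors are left as-is)."""
    norm = l2_norm(vec)
    if norm < eps:
        return list(vec)
    scale = target_norm / norm
    return [x * scale for x in vec]


def combine_backbone_grads(task_grads, max_norm=1.0, target_norm=1.0):
    """Assumed combination rule: clip per task, rescale per task, then sum."""
    size = len(next(iter(task_grads.values())))
    combined = [0.0] * size
    for grad in task_grads.values():
        clipped = clip_by_norm(grad, max_norm)            # point 1: per-task clipping
        rescaled = rescale_to_norm(clipped, target_norm)  # point 2: common norm scale
        combined = [c + r for c, r in zip(combined, rescaled)]
    return combined


# Hypothetical backbone gradients from two tasks with very different norms.
task_grads = {
    "detection": [30.0, 40.0],      # norm 50
    "classification": [0.0, 0.5],   # norm 0.5
}

# Plain summation lets the large-norm task dominate the update direction.
naive = [a + b for a, b in zip(*task_grads.values())]

# After per-task clipping and rescaling, both tasks contribute equally in norm.
balanced = combine_backbone_grads(task_grads)
print(naive, balanced)
```

In this toy case the naive sum points almost entirely along the detection gradient, while the balanced sum gives both tasks equal weight in norm, which is the kind of gradient bias the abstract says the paradigm relieves.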
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · COVID-19 diagnosis using AI
Methods: Contrastive Language-Image Pre-training
