Auxiliary Task Update Decomposition: The Good, The Bad and The Neutral
Lucio M. Dery, Yann Dauphin, David Grangier

TL;DR
This paper introduces a model-agnostic framework for decomposing auxiliary task gradients into helpful, harmful, or neutral directions, enabling more effective multitask learning especially with out-of-distribution data.
Contribution
It proposes a novel, scalable algorithm for fine-grained gradient manipulation that improves multitask learning by selectively weighting auxiliary task updates.
Findings
Outperforms strong baselines on text and image classification tasks.
Effectively leverages out-of-distribution data.
Provides a generic framework encompassing prior methods.
Abstract
While deep learning has been very beneficial in data-rich settings, tasks with smaller training sets often resort to pre-training or multitask learning to leverage data from other tasks. In this case, careful consideration is needed to select tasks and model parameterizations such that updates from the auxiliary tasks actually help the primary task. We seek to alleviate this burden by formulating a model-agnostic framework that performs fine-grained manipulation of the auxiliary task gradients. We propose to decompose auxiliary updates into directions which help, damage or leave the primary task loss unchanged. This allows weighting the update directions differently depending on their impact on the problem of interest. We present a novel and efficient algorithm for that purpose and show its advantage in practice. Our method leverages efficient automatic differentiation procedures and…
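The decomposition described in the abstract can be illustrated with a small NumPy sketch. This is one plausible reading under stated assumptions, not the authors' algorithm: it assumes per-example primary task gradients are available as a dense matrix, builds an orthonormal basis of their span with a full SVD, and labels each in-span component of the auxiliary gradient by whether its sign agrees with the averaged primary gradient. The function names `decompose_aux_gradient` and `combine`, and the weights `w_help`, `w_harm`, `w_neutral`, are hypothetical.

```python
import numpy as np

def decompose_aux_gradient(primary_grads, aux_grad, tol=1e-10):
    """Split aux_grad into (helpful, harmful, neutral) parts.

    primary_grads: (n, d) array, one primary task gradient per example.
    aux_grad:      (d,) auxiliary task gradient.
    Assumption: a dense SVD is affordable; this is only a toy illustration.
    """
    # Orthonormal basis (rows) for the span of the primary gradients.
    _, s, vt = np.linalg.svd(primary_grads, full_matrices=False)
    rank = int(np.sum(s > tol))
    basis = vt[:rank]                                  # (rank, d)

    # Coordinates of both gradients in that basis.
    aux_coords = basis @ aux_grad
    prim_coords = basis @ primary_grads.mean(axis=0)

    # Same sign: a descent step along this component also lowers the
    # primary loss to first order. Otherwise it is counted as harmful.
    agree = aux_coords * prim_coords > 0
    helpful = basis.T @ np.where(agree, aux_coords, 0.0)
    harmful = basis.T @ np.where(agree, 0.0, aux_coords)

    # Whatever lies outside the span is orthogonal to every primary
    # gradient, so it leaves the primary loss unchanged to first order.
    neutral = aux_grad - basis.T @ aux_coords
    return helpful, harmful, neutral

def combine(helpful, harmful, neutral, w_help=1.0, w_harm=0.0, w_neutral=0.5):
    """Reweight the three parts into one update direction (hypothetical weights)."""
    return w_help * helpful + w_harm * harmful + w_neutral * neutral

rng = np.random.default_rng(0)
primary_grads = rng.normal(size=(4, 10))   # 4 examples, 10 parameters
aux_grad = rng.normal(size=10)
helpful, harmful, neutral = decompose_aux_gradient(primary_grads, aux_grad)
update = combine(helpful, harmful, neutral)
```

Setting `w_harm=0.0` simply discards the conflicting part, while `w_help=w_harm=w_neutral=1.0` recovers the unmodified auxiliary gradient; other weightings are a design choice to be tuned per task.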
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
