Mitigating Negative Transfer in Multi-Task Learning with Exponential Moving Average Loss Weighting Strategies
Anish Lakkapragada, Essam Sleiman, Saimourya Surabhi, Dennis P. Wall

TL;DR
This paper introduces exponential moving average-based loss weighting strategies to mitigate negative transfer in multi-task learning, demonstrating competitive performance on benchmark datasets.
Contribution
The paper proposes novel loss balancing techniques using exponential moving averages, offering a simpler alternative to existing complex methods for negative transfer mitigation in MTL.
Findings
Achieves performance comparable to or better than current loss balancing methods.
Effective in balancing task losses based on observed magnitudes.
Applicable across multiple benchmark datasets.
Abstract
Multi-Task Learning (MTL) is a growing subject of interest in deep learning because it can train models on multiple tasks more efficiently than a group of conventional single-task models. However, MTL can be impractical: certain tasks can dominate training and hurt performance on others, so some tasks perform better in a single-task model than in a multi-task one. Such problems are broadly classified as negative transfer, and many prior approaches in the literature attempt to mitigate them. One current approach to alleviating negative transfer is to weight each of the losses so that they are on the same scale. Whereas current loss balancing approaches rely on either optimization or complex numerical analysis, none directly scale the losses based on their observed magnitudes. We propose multiple techniques for loss balancing based on…
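The idea of scaling each loss by its observed magnitude can be illustrated with a minimal sketch. The abstract is cut off before it states the exact formulation, so everything below is an assumption made for illustration and not the authors' method: the class name `EMALossBalancer`, the decay `beta`, the stabilizer `eps`, and the choice of weight = 1 / EMA(loss) are all hypothetical.

```python
class EMALossBalancer:
    """Hypothetical sketch: weight each task loss by the inverse of an
    exponential moving average (EMA) of its observed magnitude.
    This is an illustrative assumption, not the paper's exact method."""

    def __init__(self, num_tasks, beta=0.9, eps=1e-8):
        self.beta = beta              # EMA decay; closer to 1 means slower updates
        self.eps = eps                # avoids division by zero
        self.ema = [None] * num_tasks # one running average per task

    def update(self, losses):
        """Update the running averages and return one weight per task."""
        for i, loss in enumerate(losses):
            value = abs(float(loss))  # float() treats the magnitude as a constant
            if self.ema[i] is None:
                self.ema[i] = value   # initialize with the first observation
            else:
                self.ema[i] = self.beta * self.ema[i] + (1 - self.beta) * value
        return [1.0 / (m + self.eps) for m in self.ema]

    def combine(self, losses):
        """Return the weighted sum of the task losses."""
        weights = self.update(losses)
        return sum(w * l for w, l in zip(weights, losses))


# Two tasks whose raw losses differ by four orders of magnitude.
balancer = EMALossBalancer(num_tasks=2)
for step_losses in ([100.0, 0.01], [50.0, 0.02], [40.0, 0.015]):
    weights = balancer.update(step_losses)
    scaled = [w * l for w, l in zip(weights, step_losses)]
    print(scaled)  # both entries stay on a similar scale
```

Because the weights are computed from plain floats, they would act as constants rather than trainable parameters if the losses were tensors in an autograd framework. That is one possible design choice; a learned or gradient-based weighting would be a different technique.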
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
Methods: None
