FAMO: Fast Adaptive Multitask Optimization
Bo Liu, Yihao Feng, Peter Stone, Qiang Liu

TL;DR
FAMO is a multitask optimization method that adaptively balances task losses using constant space and time per step, regardless of the number of tasks, while performing comparably to or better than existing gradient manipulation techniques that become costly in large-scale scenarios.
Contribution
Introduces FAMO, a dynamic weighting approach for multitask learning that decreases task losses in a balanced way using constant space and time, independent of the number of tasks.
Findings
FAMO achieves comparable or better performance than state-of-the-art methods.
FAMO significantly reduces computational and memory overhead relative to gradient manipulation methods, which must store and compute every task gradient.
Experimental results cover supervised and reinforcement learning tasks.
Abstract
One of the grand enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL). However, in practice, applying gradient descent (GD) on the average loss across all tasks may yield poor multitask performance due to severe under-optimization of certain tasks. Previous approaches that manipulate task gradients for a more balanced loss decrease require storing and computing all task gradients ($\mathcal{O}(k)$ space and time, where $k$ is the number of tasks), limiting their use in large-scale scenarios. In this work, we introduce Fast Adaptive Multitask Optimization (FAMO), a dynamic weighting method that decreases task losses in a balanced way using $\mathcal{O}(1)$ space and time. We conduct an extensive set of experiments covering multi-task supervised and reinforcement learning problems. Our results indicate that…
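To make the idea of dynamic weighting concrete, here is a minimal sketch on toy quadratic tasks. It is not the authors' implementation: the function names (`task_losses`, `run_dynamic_weighting`), the hyperparameters, and the specific update rule (softmax weights over task logits, adjusted from the observed change in log-loss) are assumptions chosen for illustration. The point it tries to show is that the only per-task state carried between steps is one scalar logit and one scalar loss, rather than a stored gradient per task.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def task_losses(theta, targets):
    """Hypothetical toy tasks: loss_i = 0.5 * ||theta - target_i||^2 + 1e-3.

    The small offset keeps every loss strictly positive so log(loss) is defined.
    """
    diffs = theta[None, :] - targets
    return 0.5 * np.sum(diffs ** 2, axis=1) + 1e-3


def run_dynamic_weighting(targets, steps=200, lr=0.05, weight_lr=0.025, decay=1e-3):
    """Illustrative dynamic loss weighting (assumed update rule, not from the source).

    Keeps one scalar logit per task; no per-task gradient is stored across steps.
    """
    k, d = targets.shape
    theta = np.zeros(d)      # shared parameters
    logits = np.zeros(k)     # one scalar per task
    for _ in range(steps):
        z = softmax(logits)
        losses = task_losses(theta, targets)
        # Gradient of sum_i z_i * log(loss_i). With autodiff this would be a single
        # backward pass; here it is written out analytically for the toy losses.
        grad = ((z / losses)[:, None] * (theta[None, :] - targets)).sum(axis=0)
        c = 1.0 / np.sum(z / losses)  # normalizer so the gradient weights sum to 1
        theta = theta - lr * c * grad
        new_losses = task_losses(theta, targets)
        # Observed per-task improvement in log-loss drives the weight update.
        delta = np.log(losses) - np.log(new_losses)
        jac = np.diag(z) - np.outer(z, z)  # Jacobian of softmax (symmetric)
        logits = logits - weight_lr * (jac @ delta + decay * logits)
    return theta, softmax(logits), task_losses(theta, targets)


# Example usage on three toy tasks in two dimensions.
example_targets = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
theta, weights, final_losses = run_dynamic_weighting(example_targets)
print("weights:", weights)
print("final losses:", final_losses)
```

In this sketch the weights are updated from scalar loss values alone, which is what keeps the extra bookkeeping independent of the model size. A gradient manipulation method would instead need each task's gradient separately at every step.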
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Advanced Neural Network Applications
