Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning
Vishakh Padmakumar, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He, George Karypis

TL;DR
This paper investigates how the number and relatedness of tasks in multi-task learning influence the quality of learned representations and downstream performance, highlighting the trade-offs between scale and task similarity.
Contribution
It disentangles the effects of task scale and task relatedness in multi-task learning, showing that larger task sets yield better representations on average, while smaller sets of related tasks can be competitive when the target tasks are known in advance.
Findings
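Increasing the number of training tasks improves the learned representations on average.
When the target tasks are known ahead of time, training on a smaller set of related tasks is competitive with large-scale multi-task training at a reduced computational cost.
Scale helps on average across target tasks, while relatedness matters most when the targets are known in advance.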
Abstract
Recent work has found that multi-task training with a large number of diverse tasks can uniformly improve downstream performance on unseen target tasks. In contrast, literature on task transferability has established that the choice of intermediate tasks can heavily affect downstream task performance. In this work, we aim to disentangle the effect of scale and relatedness of tasks in multi-task representation learning. We find that, on average, increasing the scale of multi-task learning, in terms of the number of tasks, indeed results in better learned representations than smaller multi-task setups. However, if the target tasks are known ahead of time, then training on a smaller set of related tasks is competitive with large-scale multi-task training at a reduced computational cost.
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Stochastic Gradient Optimization Techniques
