Exceeding the Limits of Visual-Linguistic Multi-Task Learning

Cameron R. Wolfe; Keld T. Lundgaard

arXiv:2107.13054·cs.AI·July 29, 2021·1 cites

Open Access

TL;DR

This paper shows that a single multi-modal transformer can be trained with multi-task learning on 1000 classification tasks built from e-commerce product data (text and images), using best practices identified on a 100-task dataset and new heuristics such as DyPa for allocating task-specific capacity.

Contribution

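It introduces a methodology for multi-task learning that scales to 1000 tasks, combining established best practices with new proposals such as DyPa, a simple heuristic that automatically allocates task-specific parameters to tasks that could benefit from extra capacity.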

Findings

01. Successful training of a single model on 1000 tasks

02. Identification of best practices for large-scale MTL

03. Introduction of DyPa heuristic for task-specific capacity

Abstract

By leveraging large amounts of product data collected across hundreds of live e-commerce websites, we construct 1000 unique classification tasks that share similarly-structured input data, comprised of both text and images. These classification tasks focus on learning the product hierarchy of different e-commerce websites, causing many of them to be correlated. Adopting a multi-modal transformer model, we solve these tasks in unison using multi-task learning (MTL). Extensive experiments are presented over an initial 100-task dataset to reveal best practices for "large-scale MTL" (i.e., MTL with more than 100 tasks). From these experiments, a final, unified methodology is derived, which is composed of both best practices and new proposals such as DyPa, a simple heuristic for automatically allocating task-specific parameters to tasks that could benefit from extra capacity. Using our…

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques