Multi-task learning for natural language processing in the 2020s: where are we going?

Joseph Worsham; Jugal Kalita

arXiv:2007.16008·cs.CL·August 3, 2020

TL;DR

This paper surveys recent advances in multi-task learning for NLP, highlighting progress, challenges, and future directions in leveraging shared models for improved language understanding.

Contribution

It provides a comprehensive overview of recent MTL developments in NLP, emphasizing unresolved challenges and potential research directions for the next decade.

Findings

01

MTL has gained renewed interest, driven by successes in transfer learning and pre-training such as BERT and by new challenge benchmarks such as GLUE and decaNLP.

02

Shared weights and component re-usability are key focus areas in recent MTL research.

03

Solving the persistent challenges in MTL may unlock better language understanding and natural language interfaces.

Abstract

Multi-task learning (MTL) significantly pre-dates the deep learning era, and it has seen a resurgence in the past few years as researchers have been applying MTL to deep learning solutions for natural language tasks. While steady MTL research has always been present, there is a growing interest driven by the impressive successes published in the related fields of transfer learning and pre-training, such as BERT, and the release of new challenge problems, such as GLUE and the NLP Decathlon (decaNLP). These efforts place more focus on how weights are shared across networks, evaluate the re-usability of network components and identify use cases where MTL can significantly outperform single-task solutions. This paper strives to provide a comprehensive survey of the numerous recent MTL contributions to the field of natural language processing and provide a forum to focus efforts on the…

Taxonomy

Methods

Linear Layer · Dense Connections · WordPiece · Residual Connection · Linear Warmup With Linear Decay · Layer Normalization · Attention Is All You Need · Adam · Dropout