Meta-learning for downstream aware and agnostic pretraining

Hongyin Luo; Shuyan Dong; Yung-Sung Chuang; Shang-Wen Li

arXiv:2106.03270 · cs.CL · June 8, 2021

TL;DR

This paper proposes a meta-learning approach to task selection during neural network pretraining, aiming to improve computation and memory efficiency while maintaining performance in natural language processing applications.

Contribution

It introduces a novel meta-learning framework for task selection in pretraining, including downstream-aware and downstream-agnostic variants, to enhance learning efficiency.

Findings

01 Framework for task selection using meta-learning

02 Two variants: downstream-aware and downstream-agnostic pretraining

03 Preliminary discussion of the algorithm and an experiment plan; empirical results are deferred to future work

Abstract

Neural network pretraining is gaining attention due to its outstanding performance in natural language processing applications. However, pretraining usually leverages predefined task sequences to learn general linguistic clues. The lack of mechanisms in choosing proper tasks during pretraining makes the learning and knowledge encoding inefficient. We thus propose using meta-learning to select tasks that provide the most informative learning signals in each episode of pretraining. With the proposed method, we aim to achieve better efficiency in computation and memory usage for the pretraining process and resulting networks while maintaining the performance. In this preliminary work, we discuss the algorithm of the method and its two variants, downstream-aware and downstream-agnostic pretraining. Our experiment plan is also summarized, while empirical results will be shared in our future…
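
The abstract describes selecting, in each pretraining episode, the task that provides the most informative learning signal. The paper's actual algorithm is not given on this page, so the following is only a toy sketch of one possible reading, not the authors' method: at each episode, try a one-step gradient update on every candidate task and keep the task whose update most lowers the loss on a small probe set. Everything here is an assumption made for illustration, including the 1-D linear model, the lookahead criterion, and the names `select_task`, `pretrain`, `make_task`, and `probe`. Under this reading, a probe set drawn from a downstream task would loosely correspond to the downstream-aware variant, and a probe set drawn from held-out pretraining data to the downstream-agnostic one.

```python
import random


def mse_and_grad(w, data):
    """Mean squared error of the 1-D model y = w * x, and its gradient in w."""
    n = len(data)
    loss = sum((w * x - y) ** 2 for x, y in data) / n
    grad = sum(2 * (w * x - y) * x for x, y in data) / n
    return loss, grad


def select_task(w, tasks, probe, lr=0.1):
    """Return the task whose one-step update gives the lowest probe loss.

    This lookahead criterion is an illustrative assumption, not the
    selection rule from the paper.
    """
    best_name, best_loss = None, float("inf")
    for name, data in tasks.items():
        _, grad = mse_and_grad(w, data)
        lookahead_loss, _ = mse_and_grad(w - lr * grad, probe)
        if lookahead_loss < best_loss:
            best_name, best_loss = name, lookahead_loss
    return best_name


def pretrain(tasks, probe, episodes=20, lr=0.1):
    """Run episodes of select-then-update and record the chosen tasks."""
    w = 0.0
    chosen = []
    for _ in range(episodes):
        name = select_task(w, tasks, probe, lr)
        _, grad = mse_and_grad(w, tasks[name])
        w -= lr * grad
        chosen.append(name)
    return w, chosen


def make_task(slope, n=20, seed=0):
    """Hypothetical toy task: noiseless samples of y = slope * x."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, slope * x) for x in xs]


# Two candidate pretraining tasks; only one shares the probe's slope.
tasks = {"related": make_task(2.0, seed=1), "unrelated": make_task(-3.0, seed=2)}
probe = make_task(2.0, seed=3)
w, chosen = pretrain(tasks, probe)
```

In this toy setting the selector keeps picking the `related` task, because a step on the `unrelated` task moves `w` away from the probe's slope. A real pretraining setup would replace the exhaustive lookahead with something cheaper, since evaluating every candidate task each episode scales poorly.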

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Machine Learning in Healthcare