Boosting Natural Language Generation from Instructions with Meta-Learning
Budhaditya Deb, Guoqing Zheng, Ahmed Hassan Awadallah

TL;DR
This paper explores how meta-learning techniques can enhance instruction-based training of language models, significantly improving zero-shot generalization, especially on out-of-distribution and difficult tasks.
Contribution
It adapts three meta-learning approaches to multi-task instructional learning (MAML, hyper-network (HNet) based adaptation, and a combination of the two) and reports substantial improvements on the Natural Instructions V2 dataset.
Findings
Meta-learning enhances instruction effectiveness in zero-shot tasks.
Significant performance gains on out-of-distribution tasks.
Meta-learning is most beneficial for hard, unseen tasks.
Abstract
Recent work has shown that language models (LMs) trained with multi-task instructional learning (MTIL) can solve diverse NLP tasks in zero- and few-shot settings with improved performance compared to prompt tuning. MTIL illustrates that LMs can extract and use information about the task from instructions beyond the surface patterns of the inputs and outputs. This suggests that meta-learning may further enhance the utilization of instructions for effective task transfer. In this paper we investigate whether meta-learning applied to MTIL can further improve generalization to unseen tasks in a zero-shot setting. Specifically, we propose to adapt meta-learning to MTIL in three directions: 1) Model Agnostic Meta Learning (MAML), 2) Hyper-Network (HNet) based adaptation to generate task specific parameters conditioned on instructions, and 3) an approach combining HNet and MAML.…
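The abstract names MAML as the first of its three directions. As a rough illustration of the inner/outer loop that MAML refers to, here is a minimal sketch on toy linear-regression tasks. It is not the paper's implementation: the paper applies meta-learning to language models trained on instructions, whereas this example uses a linear model so that the second-order term can be written analytically. All function names (`adapt`, `maml_step`, `make_tasks`, `post_adaptation_loss`), the learning rates, and the synthetic tasks are assumptions made for illustration.

```python
import numpy as np

def mse(w, X, y):
    """Mean squared error of the linear model X @ w."""
    return float(np.mean((X @ w - y) ** 2))

def mse_grad(w, X, y):
    """Gradient of the mean squared error with respect to w."""
    return 2.0 / len(y) * X.T @ (X @ w - y)

def mse_hessian(X):
    """Hessian of the mean squared error (constant for a linear model)."""
    return 2.0 / X.shape[0] * X.T @ X

def adapt(w, X_support, y_support, inner_lr):
    """Inner loop: one gradient step on a task's support set."""
    return w - inner_lr * mse_grad(w, X_support, y_support)

def maml_step(w, tasks, inner_lr=0.05, outer_lr=0.05):
    """Outer loop: update the shared initialization w using the query loss
    measured after adaptation, differentiating through the inner step."""
    meta_grad = np.zeros_like(w)
    for X_s, y_s, X_q, y_q in tasks:
        w_task = adapt(w, X_s, y_s, inner_lr)
        # Jacobian of the adapted weights with respect to w: I - inner_lr * H.
        jac = np.eye(w.size) - inner_lr * mse_hessian(X_s)
        meta_grad += jac @ mse_grad(w_task, X_q, y_q)
    return w - outer_lr * meta_grad / len(tasks)

def make_tasks(n_tasks=8, n=20, seed=0):
    """Hypothetical task family: related linear regressions whose true
    weights are small perturbations of a shared base vector."""
    rng = np.random.default_rng(seed)
    base = np.array([1.0, -2.0, 0.5])
    tasks = []
    for _ in range(n_tasks):
        w_true = base + 0.1 * rng.normal(size=base.size)
        X_s = rng.normal(size=(n, base.size))
        X_q = rng.normal(size=(n, base.size))
        tasks.append((X_s, X_s @ w_true, X_q, X_q @ w_true))
    return tasks

def post_adaptation_loss(w, tasks, inner_lr=0.05):
    """Average query loss after one inner-loop step from w."""
    return float(np.mean([mse(adapt(w, X_s, y_s, inner_lr), X_q, y_q)
                          for X_s, y_s, X_q, y_q in tasks]))
```

Calling `maml_step` repeatedly on the output of `make_tasks()`, starting from `np.zeros(3)`, should lower `post_adaptation_loss`, which is the quantity MAML optimizes: how well the shared initialization performs after a single task-specific update.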
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
Methods: Test · Model-Agnostic Meta-Learning
