Meta-Learning a Dynamical Language Model

Thomas Wolf; Julien Chaumond; Clement Delangue

arXiv:1803.10631·cs.CL·March 29, 2018

TL;DR

This paper introduces a meta-learning approach for language models that combines short-term hidden states with medium-term dynamical weights, enabling continuous online adaptation through gradient-based meta-learning.

Contribution

It presents a novel online learning-to-learn framework that trains a meta-learner to dynamically update language model weights for improved adaptability.

Findings

01

Enhanced language modeling through dynamic weight updates

02

Effective integration of hidden states and dynamical weights

03

Demonstrated continuous online learning capability

Abstract

We consider the task of word-level language modeling and study the possibility of combining hidden-state-based short-term representations with medium-term representations encoded in the dynamical weights of a language model. Our work extends recent experiments on language models with dynamically evolving weights by casting the language modeling problem into an online learning-to-learn framework in which a meta-learner is trained by gradient descent to continuously update a language model's weights.
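
The idea of weights that evolve while the model reads text can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the model is a toy bigram softmax table rather than a recurrent network, and the update rule (decay the active row, then take one gradient step with hand-set `decay` and `step_size`) stands in for the trained meta-learner, which in the paper's framework would produce the update itself. The function name `online_adapt` is hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def online_adapt(tokens, vocab_size, step_size=0.5, decay=0.9):
    """Return per-token losses of a bigram model whose weights change online.

    Assumption: `step_size` and `decay` are fixed by hand here; a trained
    meta-learner would replace this hand-written update rule.
    """
    # Dynamical weights: one row of logits per previous token, all zero at start.
    weights = [[0.0] * vocab_size for _ in range(vocab_size)]
    losses = []
    for prev, nxt in zip(tokens, tokens[1:]):
        probs = softmax(weights[prev])
        # Loss is measured before the update, so it reflects what was learned so far.
        losses.append(-math.log(probs[nxt]))
        for j in range(vocab_size):
            # Gradient of cross-entropy w.r.t. logit j is probs[j] - onehot[j].
            grad = probs[j] - (1.0 if j == nxt else 0.0)
            # Only the row for `prev` is decayed and updated at this step.
            weights[prev][j] = decay * weights[prev][j] - step_size * grad
    return losses


if __name__ == "__main__":
    stream = [0, 1, 2] * 10  # a repeating pattern the weights can pick up
    losses = online_adapt(stream, vocab_size=3)
    print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

On a repeating token stream the per-token loss falls as the weights absorb the recent pattern, which is the medium-term memory role the abstract assigns to dynamical weights. The decay term lets older evidence fade, so the weights track recent context rather than accumulating indefinitely.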

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis