Meta-Learning a Dynamical Language Model
Thomas Wolf, Julien Chaumond, Clement Delangue

TL;DR
This paper introduces a meta-learning approach for language models that combines short-term hidden states with medium-term dynamical weights, enabling continuous online adaptation through gradient-based meta-learning.
Contribution
It presents a novel online learning-to-learn framework that trains a meta-learner to dynamically update language model weights for improved adaptability.
Findings
Enhanced language modeling through dynamic weight updates
Effective integration of hidden states and dynamical weights
Demonstrated continuous online learning capability
Abstract
We consider the task of word-level language modeling and study the possibility of combining hidden-states-based short-term representations with medium-term representations encoded in the dynamical weights of a language model. Our work extends recent experiments on language models with dynamically evolving weights by casting the language modeling problem into an online learning-to-learn framework in which a meta-learner is trained by gradient descent to continuously update a language model's weights.
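The idea of weights that keep evolving while the model reads text can be illustrated with a toy sketch. This is not the authors' implementation: the bigram model, the function name `online_dynamic_weights`, and the fixed `lr` and `decay` values are all assumptions made for illustration. In the paper the update is produced by a meta-learner trained by gradient descent; here a hand-written rule with fixed hyperparameters stands in for it, and the meta-training step is omitted entirely.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def online_dynamic_weights(tokens, vocab_size, lr=0.5, decay=0.05):
    """Score a token stream while continuously updating model weights.

    W_static: fixed "long-term" weights (zeros here, i.e. an untrained model).
    W_dyn:    "medium-term" dynamical weights, updated after every token.
    lr, decay: hypothetical stand-ins for what a trained meta-learner
               would decide; they are fixed constants in this sketch.
    """
    W_static = np.zeros((vocab_size, vocab_size))
    W_dyn = np.zeros((vocab_size, vocab_size))
    losses = []
    for prev, nxt in zip(tokens[:-1], tokens[1:]):
        # Predict the next token from the previous one (toy bigram model).
        probs = softmax((W_static + W_dyn)[prev])
        losses.append(-np.log(probs[nxt]))
        # Gradient of the cross-entropy loss w.r.t. the logits of row `prev`.
        grad = probs.copy()
        grad[nxt] -= 1.0
        # Update rule: let the dynamical weights decay back toward zero
        # (i.e. toward the static model), then take one gradient step.
        W_dyn *= (1.0 - decay)
        W_dyn[prev] -= lr * grad
    return np.array(losses)

# A repetitive stream: the dynamical weights should pick up the pattern.
tokens = [0, 1, 2, 3] * 10
losses = online_dynamic_weights(tokens, vocab_size=4)
print("mean loss, first 4 steps:", losses[:4].mean())
print("mean loss, last 4 steps: ", losses[-4:].mean())
```

On this repetitive stream the per-token loss falls as the dynamical weights absorb the recent pattern, while the decay term keeps them from drifting permanently away from the static model. A learning-to-learn setup along the lines of the abstract would replace the fixed `lr` and `decay` with quantities produced by a trained meta-learner.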
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis
