Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models
Amal Rannen-Triki, Jorg Bornschein, Razvan Pascanu, Marcus Hutter, Andras György, Alexandre Galashov, Yee Whye Teh, Michalis K. Titsias

TL;DR
This paper explores online fine-tuning of large language models at test time, highlighting its role as a form of memory extension that adapts to distributional shifts and blurs the line between in-context learning and fine-tuning.
Contribution
It provides a detailed analysis of online adaptation, emphasizing its connection to memory, and investigates its efficiency, sensitivity, and computational costs in real-world scenarios.
Findings
Online adaptation enhances model performance under distributional shifts.
Speed of adaptation varies with data and model complexity.
Online adaptation can act as a form of memory stored in the weights, closer to the concept of memory in neuroscience.
Abstract
We consider the problem of online fine-tuning the parameters of a language model at test time, also known as dynamic evaluation. While it is generally known that this approach improves the overall predictive performance, especially when considering distributional shift between training and evaluation data, we here emphasize the perspective that online adaptation turns parameters into temporally changing states and provides a form of context-length extension with memory in weights, more in line with the concept of memory in neuroscience. We pay particular attention to the speed of adaptation (in terms of sample efficiency), sensitivity to the overall distributional drift, and the computational overhead for performing gradient computations and parameter updates. Our empirical study provides insights on when online adaptation is particularly interesting. We highlight that with online…
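The predict-then-update loop that dynamic evaluation refers to can be illustrated with a minimal sketch. This is a toy, not the paper's setup: it assumes a three-token categorical model with one logit per token in place of a language model, plain SGD, and a hand-made stream whose token frequencies differ from what the "pretrained" logits expect. The names `dynamic_eval`, `pretrained`, and `stream` are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_eval(stream, logits, lr=0.0):
    """Return the average log-loss (nats per token) over the stream.

    Each token is scored BEFORE the parameters see it, then one SGD step
    is taken on that token's loss. lr=0.0 recovers static evaluation.
    """
    logits = list(logits)  # copy so the caller's parameters are untouched
    total_loss = 0.0
    for token in stream:
        probs = softmax(logits)
        total_loss += -math.log(probs[token])  # predict first
        # Gradient of -log p[token] w.r.t. the logits is (probs - one_hot).
        for k in range(len(logits)):
            grad = probs[k] - (1.0 if k == token else 0.0)
            logits[k] -= lr * grad             # then adapt
    return total_loss / len(stream)

# Assumed toy data: parameters favour token 0, the test stream is mostly token 2.
pretrained = [2.0, 0.0, 0.0]
stream = [2, 2, 1, 2, 2, 2, 0, 2, 2, 2] * 5

static_loss = dynamic_eval(stream, pretrained, lr=0.0)
adaptive_loss = dynamic_eval(stream, pretrained, lr=0.5)
print(f"static: {static_loss:.3f}  adaptive: {adaptive_loss:.3f}")
```

In this sketch the adapted logits play the role of the "temporally changing state": they carry information about every token seen so far, regardless of any context window. The extra cost is one gradient computation and one parameter update per token, which is the overhead the abstract refers to.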
Taxonomy
Topics: Educational Tools and Methods · Topic Modeling · Recommender Systems and Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
