Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models

Amal Rannen-Triki; Jorg Bornschein; Razvan Pascanu; Marcus Hutter; András György; Alexandre Galashov; Yee Whye Teh; Michalis K. Titsias

arXiv:2403.01518 · cs.CL · March 5, 2024 · 1 citation

Open Access

TL;DR

This paper explores online fine-tuning of large language models during testing, highlighting its role as a form of memory extension that adapts to distributional shifts and blurs the line between in-context learning and fine-tuning.

Contribution

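It analyzes online adaptation with an emphasis on its connection to memory, and investigates its speed of adaptation (sample efficiency), its sensitivity to overall distributional drift, and the computational overhead of gradient computations and parameter updates in real-world scenarios.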

Findings

01 Online adaptation improves predictive performance under distributional shift between training and evaluation data.

02 The speed of adaptation varies with the data and with model complexity.

03 Online adaptation can serve as a form of memory held in the weights, closer to the concept of memory in neuroscience.

Abstract

We consider the problem of online fine-tuning the parameters of a language model at test time, also known as dynamic evaluation. While it is generally known that this approach improves the overall predictive performance, especially when considering distributional shift between training and evaluation data, we here emphasize the perspective that online adaptation turns parameters into temporally changing states and provides a form of context-length extension with memory in weights, more in line with the concept of memory in neuroscience. We pay particular attention to the speed of adaptation (in terms of sample efficiency), sensitivity to the overall distributional drift, and the computational overhead for performing gradient computations and parameter updates. Our empirical study provides insights on when online adaptation is particularly interesting. We highlight that with online…
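
The score-then-update protocol behind dynamic evaluation can be illustrated with a toy sketch. This is not the paper's model or training setup: it assumes a context-free categorical predictor over a four-token vocabulary, plain SGD, and a synthetic stream whose distribution shifts halfway through. The names `softmax`, `dynamic_eval`, `pretrained`, and `shifted_stream` are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def dynamic_eval(logits, tokens, lr=0.0):
    """Average log-loss (in nats) over a token stream.

    Each token is scored with the current parameters first; only then
    (if lr > 0) is one SGD step taken on that token's loss. With lr = 0
    this reduces to ordinary static evaluation.
    """
    logits = list(logits)  # copy so the caller's parameters stay untouched
    total = 0.0
    for t in tokens:
        probs = softmax(logits)
        total += -math.log(probs[t])  # score before updating
        if lr > 0.0:
            # Gradient of -log p[t] w.r.t. the logits is probs - onehot(t).
            for i in range(len(logits)):
                grad = probs[i] - (1.0 if i == t else 0.0)
                logits[i] -= lr * grad
    return total / len(tokens)


# Assumed "pretrained" parameters: uniform over 4 tokens.
pretrained = [0.0, 0.0, 0.0, 0.0]
# Assumed test stream with a shift: token 2 for 50 steps, then token 3.
shifted_stream = [2] * 50 + [3] * 50

static_loss = dynamic_eval(pretrained, shifted_stream, lr=0.0)
adapted_loss = dynamic_eval(pretrained, shifted_stream, lr=0.5)
print(f"static: {static_loss:.3f} nats, adapted: {adapted_loss:.3f} nats")
```

In this toy setting the updated logits act as a state that carries information from earlier tokens forward, which is the "memory in weights" view described in the abstract. The loss rises briefly when the stream switches tokens and then falls again as the parameters adapt; the extra cost is one gradient computation and parameter update per scored token.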

Taxonomy

Topics: Educational Tools and Methods · Topic Modeling · Recommender Systems and Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings