Towards Practical Tool Usage for Continually Learning LLMs

Jerry Huang; Prasanna Parthasarathi; Mehdi Rezagholizadeh; Sarath Chandar

arXiv:2404.09339 · cs.CL · April 16, 2024 · 2 cites

TL;DR

This paper investigates how large language models can better adapt to changing environments by leveraging tools and continual learning techniques, demonstrating that scaling models alone is insufficient for adaptation.

Contribution

It introduces a synthetic benchmark and combines NLP tasks to evaluate continual learning in tool-using LLMs, showing that continual learning improves adaptation and reduces forgetting.

Findings

01. Scaling model size does not improve continual learning.

02. Tool use combined with continual learning enhances adaptation.

03. Continual learning reduces forgetting in LLMs.

Abstract

Large language models (LLMs) show an innate skill for solving language-based tasks. However, evidence suggests that they cannot adjust when information or task-solving skills become outdated, because their knowledge, stored directly in their parameters, remains static over time. Tool use helps by offloading work to systems that the LLM can access through an interface, but tool-using LLMs must still adapt to nonstationary environments for prolonged use, as new tools can emerge and existing tools can change. Nevertheless, tools require less specialized knowledge, so we hypothesize that they are better suited for continual learning (CL): they rely less on parametric memory for solving tasks and instead focus on learning when to apply pre-defined tools. To verify this, we develop a synthetic benchmark and follow this by aggregating existing NLP tasks to form a more realistic testing…

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Text Readability and Simplification

Methods: Focus