Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning

Yeongbin Seo; Dongha Lee; Jinyoung Yeo

arXiv:2407.16920·cs.CL·February 6, 2025

TL;DR

This paper introduces TAALM (Train-Attention-Augmented Language Model), a meta-learning-based method that dynamically predicts token importance to improve continual knowledge learning in language models, reducing forgetting and improving learning efficiency.

Contribution

The paper proposes a meta-learning framework for token weighting in continual knowledge learning (CKL) and introduces a new benchmark, LAMA-ckl, for evaluating the trade-off between learning new knowledge and retaining existing knowledge.

Findings

01. TAALM achieves state-of-the-art results on CKL benchmarks.

02. TAALM is compatible with existing CKL methods, enhancing their performance.

03. The new LAMA-ckl benchmark reveals insights into learning-retention trade-offs.

Abstract

Previous studies on continual knowledge learning (CKL) in large language models (LLMs) have predominantly focused on approaches such as regularization, architectural modifications, and rehearsal techniques to mitigate catastrophic forgetting. However, these methods naively inherit the inefficiencies of standard training procedures, indiscriminately applying uniform weight across all tokens, which can lead to unnecessary parameter updates and increased forgetting. To address these shortcomings, we propose a novel CKL approach termed Train-Attention-Augmented Language Model (TAALM), which enhances learning efficiency by dynamically predicting and applying weights to tokens based on their usefulness. This method employs a meta-learning framework that optimizes token importance predictions, facilitating targeted knowledge updates and minimizing forgetting. Also, we observe that existing…
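The contrast the abstract draws between uniform and per-token weighting can be illustrated with a small sketch. This is not the authors' implementation: the function names, the example sentence, the probabilities, and the weights below are all hypothetical. In TAALM the weights are predicted by a meta-learned model, whereas here they are set by hand purely to show how weighting changes which tokens drive the loss.

```python
import math

def token_nll(probs):
    """Per-token negative log-likelihood, given the model's probability for each target token."""
    return [-math.log(p) for p in probs]

def weighted_loss(probs, weights):
    """Weighted sum of per-token NLL; all-ones weights recover the standard uniform objective."""
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    return sum(w * l for w, l in zip(weights, token_nll(probs)))

# Hypothetical sentence and model probabilities (illustrative values only).
tokens = ["The", "capital", "of", "France", "is", "Paris"]
target_probs = [0.9, 0.6, 0.95, 0.5, 0.9, 0.2]

# Standard training: every token contributes with weight 1.
uniform = [1.0] * len(tokens)

# Hand-set stand-in for predicted token importance: most weight on the
# token that carries the new fact, little or none on function words.
focused = [0.0, 0.1, 0.0, 0.3, 0.0, 1.0]

uniform_loss = weighted_loss(target_probs, uniform)
focused_loss = weighted_loss(target_probs, focused)

# Fraction of the focused loss that comes from the final token ("Paris").
paris_share = focused[-1] * token_nll(target_probs)[-1] / focused_loss

print(f"uniform loss: {uniform_loss:.3f}")
print(f"focused loss: {focused_loss:.3f}")
print(f"share of focused loss from 'Paris': {paris_share:.2f}")
```

Under the hand-set weights, most of the training signal comes from the token carrying the fact, so gradient updates driven by function words shrink. That is the intuition behind the claim that uniform weighting causes unnecessary parameter updates.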

Code & Models

Repositories

ybseo-academy/TAALM (PyTorch, official)

Videos

Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning (SlidesLive)