CoLLEGe: Concept Embedding Generation for Large Language Models

Ryan Teehan; Brenden Lake; Mengye Ren

arXiv:2403.15362 · cs.CL · October 18, 2024 · 1 citation

Open Access

TL;DR

CoLLEGe is a meta-learning framework that lets large language models generate embeddings for new concepts from a small number of example sentences or definitions. It requires no task-specific training and outperforms baseline methods for few-shot concept learning.

Contribution

The paper presents CoLLEGe, a novel meta-learning approach for few-shot concept embedding generation tailored for large language models, enhancing their ability to learn new concepts efficiently.

Findings

01 Successfully learns new word concepts without task-specific training

02 Effective in definition inference and verbal reasoning tasks

03 Outperforms baseline methods in real-world scenarios

Abstract

Current language models are unable to quickly learn new concepts on the fly, often requiring a more involved finetuning process to learn robustly. Prompting in-context is not robust to context distractions, and often fails to confer much information about the new concepts. Classic methods for few-shot word learning in NLP, relying on global word vectors, are less applicable to large language models. In this paper, we introduce a novel approach named CoLLEGe (Concept Learning with Language Embedding Generation) to modernize few-shot concept learning. CoLLEGe is a meta-learning framework capable of generating flexible embeddings for new concepts using a small number of example sentences or definitions. Our primary meta-learning objective is simply to facilitate a language model to make next word predictions in forthcoming sentences, making it compatible with language model pretraining. We…
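
The idea described in the abstract, generating an embedding for a new token from a few support sentences and then scoring it by next-word prediction on a later sentence, can be illustrated with a toy sketch. This is not the authors' architecture: mean pooling stands in for the learned embedding generator, a tied-embedding dot-product model stands in for the language model, and the vocabulary, vectors, the token "dax", and the function names `generate_embedding` and `next_word_loss` are all hypothetical.

```python
import math

def mean_pool(vectors):
    """Average a list of equal-length vectors element by element."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def generate_embedding(support_sentences, embed, new_token):
    """Build an embedding for new_token from its context words.

    Assumption: mean pooling replaces the learned generator that a
    meta-learning framework would train.
    """
    pooled = []
    for sentence in support_sentences:
        context = [embed[w] for w in sentence if w != new_token and w in embed]
        if context:
            pooled.append(mean_pool(context))
    if not pooled:
        raise ValueError("no usable context words in the support sentences")
    return mean_pool(pooled)

def next_word_loss(sentence, embed):
    """Average next-word cross-entropy under a toy model.

    Assumption: the logit for each candidate word is the dot product of the
    previous word's embedding with the candidate's embedding.
    """
    vocab = sorted(embed)
    total = 0.0
    for prev, target in zip(sentence, sentence[1:]):
        logits = [sum(a * b for a, b in zip(embed[prev], embed[w])) for w in vocab]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[vocab.index(target)]
    return total / (len(sentence) - 1)

# Hypothetical two-dimensional embeddings for a tiny vocabulary.
embed = {
    "the": [0.1, 0.0],
    "a": [0.0, 0.1],
    "red": [1.0, 0.0],
    "fruit": [0.8, 0.6],
    "tastes": [0.2, 0.9],
    "sweet": [0.3, 1.0],
}
support = [["a", "dax", "tastes", "sweet"], ["the", "red", "dax"]]
query = ["the", "dax", "tastes", "sweet"]

# Generate the new concept's embedding, add it to the vocabulary,
# then score a held-out sentence that uses the new token.
embed["dax"] = generate_embedding(support, embed, "dax")
loss = next_word_loss(query, embed)
print(embed["dax"], loss)
```

In a meta-learning setup of the kind the abstract describes, a loss like this would be the training signal for the embedding generator; the sketch only computes it and performs no gradient update.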

Taxonomy

Topics: Advanced Text Analysis Techniques · Topic Modeling · Web Data Mining and Analysis