Get the gist? Using large language models for few-shot decontextualization

Benjamin Kane; Lenhart Schubert

arXiv:2310.06254·cs.CL·October 11, 2023

Open Access

TL;DR

This paper explores using large language models in a few-shot setting to decontextualize sentences, rewriting them so they can be understood out of context without requiring extensive domain-specific training data.

Contribution

It proposes a few-shot approach to decontextualization that uses a large language model, reducing reliance on costly human annotations and aiming to transfer across domains more readily than Seq2Seq models fine-tuned on a specific dataset.

Findings

01 Achieves viable decontextualization performance with few examples

02 Demonstrates cross-domain applicability of the method

03 Reduces need for extensive dataset annotations

Abstract

In many NLP applications that involve interpreting sentences within a rich context (for instance, information retrieval systems or dialogue systems), it is desirable to be able to preserve the sentence in a form that can be readily understood without context, for later reuse, a process known as "decontextualization". While previous work demonstrated that generative Seq2Seq models could effectively perform decontextualization after being fine-tuned on a specific dataset, this approach requires expensive human annotations and may not transfer to other domains. We propose a few-shot method of decontextualization using a large language model, and present preliminary results showing that this method achieves viable performance on multiple domains using only a small set of examples.
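
To make the few-shot setup concrete, the sketch below assembles a prompt from a handful of (context, sentence, rewrite) demonstrations followed by the new input. It is an illustration under stated assumptions, not the authors' prompt: the instruction wording, the `Example` class, the `build_prompt` helper, and the two demonstrations are all hypothetical and written for this example. The resulting string would be passed to whichever large language model is used.

```python
from dataclasses import dataclass


@dataclass
class Example:
    context: str    # text that precedes the target sentence
    sentence: str   # the sentence as it appears in context
    rewritten: str  # a stand-alone version of the sentence


# Hypothetical demonstrations written for this sketch, not taken from the paper.
FEW_SHOT_EXAMPLES = [
    Example(
        context="Marie Curie was born in Warsaw in 1867.",
        sentence="She moved to Paris in 1891.",
        rewritten="Marie Curie moved to Paris in 1891.",
    ),
    Example(
        context="A: Did you try the new cafe on Main Street?",
        sentence="B: Yes, it was great.",
        rewritten="The new cafe on Main Street was great.",
    ),
]

# Assumed instruction text; the paper's actual wording is not given here.
INSTRUCTION = (
    "Rewrite the sentence so that it can be understood without the context. "
    "Resolve pronouns and other references, and do not add new information."
)


def build_prompt(context: str, sentence: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Assemble a few-shot prompt that ends where the model should continue."""
    blocks = [INSTRUCTION]
    for ex in examples:
        blocks.append(
            f"Context: {ex.context}\n"
            f"Sentence: {ex.sentence}\n"
            f"Rewritten: {ex.rewritten}"
        )
    # The final block leaves "Rewritten:" open for the model to complete.
    blocks.append(f"Context: {context}\nSentence: {sentence}\nRewritten:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(build_prompt("The Eiffel Tower opened in 1889.", "It is 330 metres tall."))
```

Because the demonstrations are ordinary data, adapting the sketch to a new domain means swapping in a few examples from that domain rather than collecting a fine-tuning set.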

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence