With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation

Y. Wang; D. Ma; D. Cai

arXiv:2401.11504·cs.CL·September 12, 2024·2 cites

TL;DR

Temp-Lora is a novel method that embeds long context information into a temporary module during inference, significantly improving long text generation quality while reducing computational costs and preserving model parameters.

Contribution

Introduces Temp-Lora, a new approach that efficiently incorporates long context into language models without permanent parameter changes, enhancing quality and reducing resource usage.

Findings

01  13.2% decrease in perplexity on PG19

02  29.3% decrease in perplexity and 113.2% BLEU increase on GuoFeng

03  Reduces memory and latency by over 50% during inference
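
For reading the percentages above: this page does not say how they were computed, so the following assumes the usual convention of a relative change against a baseline. The symbols m_base and m_new are illustrative names, not notation from the paper.

```latex
\[
\Delta_{\text{rel}} = \frac{m_{\text{new}} - m_{\text{base}}}{m_{\text{base}}} \times 100\%
\]
% Under this assumption:
%   a 13.2% perplexity decrease means  PPL_new  = (1 - 0.132) * PPL_base  = 0.868 * PPL_base
%   a 29.3% perplexity decrease means  PPL_new  = (1 - 0.293) * PPL_base  = 0.707 * PPL_base
%   a 113.2% BLEU increase means       BLEU_new = (1 + 1.132) * BLEU_base = 2.132 * BLEU_base
```

Read this way, the GuoFeng BLEU score is slightly more than double the baseline.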

Abstract

Long text generation, such as novel writing and discourse-level translation with extremely long contexts, presents significant challenges to current language models. Existing methods mainly focus on extending the model's context window through strategies like length extrapolation. However, these approaches demand substantial hardware resources during the training and/or inference phases. Our proposed method, Temp-Lora, introduces an alternative concept. Instead of relying on the KV cache to store all context information, we embed this information directly into a temporary Lora module. In the process of long text generation, this module is progressively trained with text generated previously. This approach not only efficiently preserves contextual knowledge but also prevents any permanent alteration to the model's parameters given that the module is discarded post-generation. Extensive…
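
The idea in the abstract (freeze the base model, train a temporary low-rank module on earlier text, discard it afterwards) can be illustrated with a toy sketch. This is not the authors' implementation: it swaps the language model for a tiny linear map and the next-token loss for squared error, and every name here (`ToyTempLora`, `train_on_chunk`, `run_demo`, the data, the learning rate) is hypothetical. The deterministic initial value for `A` is also an assumption chosen for reproducibility.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


class ToyTempLora:
    """Frozen base weight W plus a temporary low-rank update B @ A (toy sketch)."""

    def __init__(self, W, A_init):
        self.W = W                                   # frozen: never modified below
        self.A = [row[:] for row in A_init]          # rank x d_in
        self.B = [[0.0] * len(A_init) for _ in W]    # d_out x rank, zero => no-op at start

    def forward(self, x, use_adapter=True):
        base = matvec(self.W, x)
        h = matvec(self.A, x)                        # low-rank bottleneck activation
        if not use_adapter:
            return base, h
        delta = matvec(self.B, h)
        return [b + d for b, d in zip(base, delta)], h

    def loss(self, chunk, use_adapter=True):
        """Mean squared error over a chunk of (input, target) pairs."""
        total = 0.0
        for x, y in chunk:
            pred, _ = self.forward(x, use_adapter)
            total += sum((p - t) ** 2 for p, t in zip(pred, y))
        return total / len(chunk)

    def train_on_chunk(self, chunk, lr=0.05, epochs=50):
        """Plain SGD on the adapter only; W receives no updates."""
        for _ in range(epochs):
            for x, y in chunk:
                pred, h = self.forward(x)
                err = [p - t for p, t in zip(pred, y)]
                # Backpropagate the error through B before B is changed.
                back = [sum(err[i] * self.B[i][k] for i in range(len(err)))
                        for k in range(len(h))]
                for i in range(len(err)):
                    for k in range(len(h)):
                        self.B[i][k] -= lr * 2 * err[i] * h[k]
                for k in range(len(h)):
                    for j in range(len(x)):
                        self.A[k][j] -= lr * 2 * back[k] * x[j]


def run_demo():
    W = [[1.0, 0.0], [0.0, 1.0]]
    model = ToyTempLora(W, A_init=[[0.1, 0.1]])

    def target(x):
        # "Context-specific" behaviour the frozen base lacks: a rank-1 shift.
        s = 0.5 * (x[0] + x[1])
        return [x[0] + s, x[1] - s]

    inputs = [
        [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]],  # earlier chunk
        [[0.5, 1.0], [-1.0, 0.5]],                          # later chunk
    ]
    chunks = [[(x, target(x)) for x in xs] for xs in inputs]

    history = []
    for chunk in chunks:
        before = model.loss(chunk)        # quality on the new chunk, given past training
        model.train_on_chunk(chunk)       # then fold this chunk into the adapter
        history.append((before, model.loss(chunk)))

    base_loss_last = model.loss(chunks[-1], use_adapter=False)
    return model, history, base_loss_last
```

In this toy setting, the adapter trained on the earlier chunk already lowers the error on the later chunk before it has seen it, while `W` stays untouched. Dropping `A` and `B` at the end returns exactly the original model, which is the property the abstract describes as discarding the module post-generation.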

Code & Models

Repositories

temporarylora/temp-lora (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Focus