With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation
Y. Wang, D. Ma, D. Cai

TL;DR
Temp-Lora is a novel method that embeds long context information into a temporary module during inference, significantly improving long text generation quality while reducing computational costs and preserving model parameters.
Contribution
Introduces Temp-Lora, a new approach that efficiently incorporates long context into language models without permanent parameter changes, enhancing quality and reducing resource usage.
Findings
13.2% decrease in perplexity on PG19
29.3% decrease in perplexity and 113.2% BLEU increase on GuoFeng
Reduces memory and latency by over 50% during inference
Abstract
Long text generation, such as novel writing and discourse-level translation with extremely long contexts, presents significant challenges to current language models. Existing methods mainly focus on extending the model's context window through strategies like length extrapolation. However, these approaches demand substantial hardware resources during the training and/or inference phases. Our proposed method, Temp-Lora, introduces an alternative concept. Instead of relying on the KV cache to store all context information, we embed this information directly into a temporary Lora module. During long text generation, this module is progressively trained on the previously generated text. This approach not only preserves contextual knowledge efficiently but also prevents any permanent alteration to the model's parameters, since the module is discarded after generation. Extensive…
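The generate-then-train loop described in the abstract can be sketched as follows. This is a minimal illustration of the control flow only, not the authors' implementation: every name here (`generate_long_text`, `new_module`, `generate_chunk`, `train_module`, `window`) is hypothetical, the toy stand-ins replace a real language model and a real Lora adapter updated by gradient steps, and the short retained context window is an assumption that the abstract does not spell out.

```python
from typing import Callable, Dict, List

Tokens = List[int]
Module = Dict[str, object]  # stand-in for a temporary low-rank adapter


def generate_long_text(
    prompt: Tokens,
    num_chunks: int,
    new_module: Callable[[], Module],
    generate_chunk: Callable[[Module, Tokens], Tokens],
    train_module: Callable[[Module, Tokens, Tokens], None],
    window: int = 4,  # assumed size of the short context kept as direct input
) -> Tokens:
    """Generate text chunk by chunk, training a temporary module as we go."""
    module = new_module()  # fresh temporary module for this generation only
    text = list(prompt)
    for _ in range(num_chunks):
        context = text[-window:]                 # short recent context
        chunk = generate_chunk(module, context)  # base model + temporary module
        train_module(module, context, chunk)     # store the new chunk in the module
        text.extend(chunk)
    del module  # module is discarded; the base model was never modified
    return text[len(prompt):]


# Toy stand-ins (hypothetical) so the sketch runs without any ML library.
def toy_new_module() -> Module:
    return {"updates": 0, "memory": []}


def toy_generate_chunk(module: Module, context: Tokens) -> Tokens:
    last = context[-1]
    return [last + 1, last + 2]  # emits the next two integers as "tokens"


def toy_train_module(module: Module, context: Tokens, chunk: Tokens) -> None:
    module["updates"] += 1          # in practice: gradient steps on the adapter
    module["memory"].extend(chunk)  # in practice: knowledge lives in the weights


out = generate_long_text([0], 3, toy_new_module, toy_generate_chunk, toy_train_module)
print(out)
```

In a real setting, `train_module` would update only the adapter weights while the base model stays frozen, which is what allows the module to be thrown away after generation without any lasting change to the model.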
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
