Latent Context Compilation: Distilling Long Context into Compact Portable Memory
Zeju Li, Yizhou Zhou, Qiang Xu

TL;DR
This paper introduces Latent Context Compilation, a novel framework that compresses long contexts into portable, stateless memory tokens using a disposable LoRA module, enabling efficient long-context processing without modifying model weights.
Contribution
The paper presents a context compression method that distills long contexts into compact buffer tokens without synthetic context-relevant QA pairs or changes to the base model's weights, while maintaining reasoning capabilities at high compression ratios.
Findings
Preserves fine-grained details and reasoning at 16x compression.
Decouples memory density from model parameters.
Shown to be effective with the Llama-3.1-8B model.
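To make the 16x figure concrete, the following back-of-the-envelope sketch estimates how many memory tokens a context would compile to and how much key/value cache those tokens would occupy. It is not taken from the paper. The layer count, KV-head count, and head dimension are those of the public Llama-3.1-8B configuration. The bf16 cache, the ceiling rounding, the 32,768-token example context, and the assumption that each memory token occupies one ordinary KV-cache slot are illustrative choices made here, and the function names are hypothetical.

```python
# Llama-3.1-8B attention shape, from the public model configuration.
NUM_LAYERS = 32
NUM_KV_HEADS = 8      # grouped-query attention
HEAD_DIM = 128
BYTES_PER_VALUE = 2   # assumption: cache stored in bf16


def buffer_token_count(context_tokens: int, ratio: int = 16) -> int:
    """Memory tokens needed at a given compression ratio.

    Rounding up is an assumption; the summary does not say how
    non-divisible context lengths are handled.
    """
    if context_tokens < 0 or ratio <= 0:
        raise ValueError("context_tokens must be >= 0 and ratio must be > 0")
    return (context_tokens + ratio - 1) // ratio


def kv_cache_bytes(num_tokens: int) -> int:
    """KV-cache footprint, assuming one cache slot per token.

    The factor 2 accounts for storing both keys and values.
    """
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


if __name__ == "__main__":
    context = 32_768  # hypothetical context length, for illustration only
    compiled = buffer_token_count(context)
    print(f"{context} context tokens -> {compiled} memory tokens")
    print(f"full cache:     {kv_cache_bytes(context) / 2**20:.0f} MiB")
    print(f"compiled cache: {kv_cache_bytes(compiled) / 2**20:.0f} MiB")
```

Under these assumptions the footprint shrinks by the same factor as the token count, since the per-token cache cost is fixed by the model shape.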
Abstract
Efficient long-context LLM deployment is stalled by a dichotomy between amortized compression, which struggles with out-of-distribution generalization, and Test-Time Training, which incurs prohibitive synthetic data costs and requires modifying model weights, creating stateful parameters that complicate concurrent serving. We propose Latent Context Compilation, a framework that fundamentally shifts context processing from adaptation to compilation. By utilizing a disposable LoRA module as a compiler, we distill long contexts into compact buffer tokens -- stateless, portable memory artifacts that are plug-and-play compatible with frozen base models. Crucially, we introduce a self-aligned optimization strategy that eliminates the need for synthetic context-relevant QA pairs. By regularizing the context reconstruction task with context-agnostic random queries, we force compressed tokens to…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Parallel Computing and Optimization Techniques
