Compressing Lengthy Context With UltraGist
Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou

TL;DR
UltraGist is a novel compression method for lengthy contexts that offers high-quality, flexible, fine-grained, sample-efficient, and incrementally updatable compression, outperforming existing methods across various tasks.
Contribution
It introduces UltraGist, a new compression approach with innovative algorithms supporting diverse context lengths, fine-grained processing, and efficient training and updating.
Findings
Achieves near-lossless compression across multiple tasks.
Outperforms existing methods in document QA, summarization, and more.
Supports dynamic, incremental compression updates.
Abstract
Compressing lengthy context is a critical but technically challenging problem. In this paper, we propose a new method called UltraGist, which is distinguished for its high-quality compression of lengthy context due to the innovative design of the compression and learning algorithm. UltraGist brings forth the following important benefits. Firstly, it notably contributes to the flexibility of compression, as it can be effectively learned to support a broad range of context lengths and compression ratios. Secondly, it helps to produce fine-grained compression for the lengthy context, where each small segment of the context is progressively processed on top of a tailored cross-attention mechanism. Thirdly, it makes the training process sample-efficient and thus maximizes the use of training data. Finally, it facilitates the efficient running of compression for dynamic context, as the…
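The segment-by-segment idea in the abstract can be illustrated with a toy sketch. This is not the UltraGist algorithm: the paper's compression is learned through a tailored cross-attention mechanism, whereas the hypothetical `compress_segment` below simply keeps the last token of every `ratio`-sized chunk as a stand-in for a learned summary. The names `split_into_segments`, `compress_segment`, `progressive_compress`, `segment_len`, and `ratio` are all assumptions made for illustration. The sketch only shows the bookkeeping: the context is consumed in small segments, each segment is reduced by a chosen compression ratio, and the compressed memory grows incrementally.

```python
from typing import List


def split_into_segments(tokens: List[int], segment_len: int) -> List[List[int]]:
    """Cut the context into consecutive segments of at most segment_len tokens."""
    return [tokens[i:i + segment_len] for i in range(0, len(tokens), segment_len)]


def compress_segment(segment: List[int], ratio: int) -> List[int]:
    """Toy stand-in for learned compression (assumption, not the paper's method).

    Keeps the last token of every chunk of `ratio` tokens, so a segment of
    length n yields ceil(n / ratio) entries.
    """
    return [
        segment[min(start + ratio, len(segment)) - 1]
        for start in range(0, len(segment), ratio)
    ]


def progressive_compress(tokens: List[int], segment_len: int, ratio: int) -> List[int]:
    """Process the context one segment at a time, appending to a running memory."""
    memory: List[int] = []
    for segment in split_into_segments(tokens, segment_len):
        # A learned model would condition on `memory` here; the toy version
        # only appends, which is enough to show the incremental update.
        memory.extend(compress_segment(segment, ratio))
    return memory


if __name__ == "__main__":
    context = list(range(20))
    memory = progressive_compress(context, segment_len=8, ratio=4)
    print(len(context), "tokens ->", len(memory), "memory entries")
```

Because the memory is only ever appended to, new context arriving later can be compressed without reprocessing earlier segments, which is the sense in which this kind of scheme is incrementally updatable.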
Taxonomy
Topics: Advanced MEMS and NEMS Technologies · Computer Graphics and Visualization Techniques · Parallel Computing and Optimization Techniques
