From Similarity to Structure: Training-free LLM Context Compression with Hybrid Graph Priors
Yitian Zhou, Chaoning Zhang, Jiaquan Zhang, Zhenzhen Huang, Jinyu Guo, Sung-Ho Bae, Lik-Hang Lee, Caiyan Qin, Yang Yang

TL;DR
This paper introduces a training-free, graph-based method for compressing long text inputs for large language models, improving efficiency without sacrificing relevance or coherence.
Contribution
The paper proposes a training-free, model-agnostic compression framework built on hybrid graph priors; it is competitive with existing extractive and abstractive methods and shows its largest gains on long-document benchmarks.
Findings
Competitive with strong extractive and abstractive baselines
Larger gains observed on long-document benchmarks
Effective in preserving task relevance and coherence
Abstract
Long-context large language models remain computationally expensive to run and often fail to reliably process very long inputs, which makes context compression an important component of many systems. Existing compression approaches typically rely on trained compressors, dense retrieval-style selection, or heuristic trimming, and they often struggle to jointly preserve task relevance, topic coverage, and cross-sentence coherence under a strict token budget. To address this, we propose a training-free and model-agnostic compression framework that selects a compact set of sentences guided by structural graph priors. Our method constructs a sparse hybrid sentence graph that combines mutual k-NN semantic edges with short-range sequential edges, extracts a topic skeleton via clustering, and ranks sentences using an interpretable score that integrates task relevance, cluster…
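The graph-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `hybrid_sentence_graph`, the parameters `k` and `window`, the use of cosine similarity, and the unweighted boolean adjacency matrix are all assumptions made for illustration, and the sentence embeddings are assumed to come from any off-the-shelf encoder. The sketch only covers the hybrid graph (mutual k-NN semantic edges plus short-range sequential edges); clustering and scoring are left out because the abstract above is truncated before those details.

```python
import numpy as np


def hybrid_sentence_graph(embeddings, k=2, window=1):
    """Return a symmetric boolean adjacency matrix over sentences.

    Hypothetical helper: `k` (neighbours per sentence) and `window`
    (maximum positional gap for sequential edges) are illustrative
    parameters, not values taken from the paper.
    """
    x = np.asarray(embeddings, dtype=float)
    n = x.shape[0]

    # Cosine similarity between all sentence pairs (assumed metric).
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a sentence is not its own neighbour

    # knn[i, j] is True when j is among the k most similar sentences to i.
    kk = min(k, n - 1)
    knn = np.zeros((n, n), dtype=bool)
    if kk > 0:
        top = np.argsort(-sim, axis=1)[:, :kk]
        knn[np.arange(n)[:, None], top] = True

    # Mutual k-NN: keep an edge only if both endpoints select each other.
    semantic = knn & knn.T

    # Sequential edges: link sentences at most `window` positions apart.
    idx = np.arange(n)
    gap = np.abs(idx[:, None] - idx[None, :])
    sequential = (gap >= 1) & (gap <= window)

    # Hybrid graph: union of semantic and sequential edges.
    return semantic | sequential


# Toy usage with four 2-D "embeddings" in document order.
toy = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
adj = hybrid_sentence_graph(toy, k=1, window=1)
```

Requiring mutual neighbours keeps the semantic part of the graph sparse, while the sequential edges keep each sentence connected to its immediate context even when it has no mutual semantic neighbour.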
