Read As Human: Compressing Context via Parallelizable Close Reading and Skimming

Jiwei Tang; Shilei Liu; Zhicheng Zhang; Qingsong Lv; Runsong Zhao; Tingwei Lu; Langming Liu; Haibin Chen; Yujin Yuan; Hai-Tao Zheng; Wenbo Su; Bo Zheng

arXiv:2602.01840 · cs.CL · March 2, 2026

TL;DR

This paper introduces RAM, a context compression framework inspired by human reading: it reads important segments in full and skims less relevant ones, improving efficiency and performance on long-input tasks.

Contribution

RAM employs an adaptive hybrid reading strategy with parallel encoding and contrastive learning to enhance long-context processing in language models.

Findings

01. RAM outperforms baselines on question answering and summarization tasks.

02. RAM achieves up to 12x speedup on long inputs.

03. RAM maintains high performance with compressed context.

Abstract

Large Language Models (LLMs) demonstrate exceptional capability across diverse tasks. However, their deployment in long-context scenarios is hindered by two challenges: computational inefficiency and redundant information. To address these challenges, we propose RAM (Read As HuMan), a context compression framework that adopts an adaptive hybrid reading strategy. Inspired by human reading behavior (i.e., close reading important content while skimming less relevant content), RAM partitions the context into segments and encodes them with the input query in parallel. High-relevance segments are fully retained (close reading), while low-relevance ones are compressed, under query guidance, into compact summary vectors (skimming). Both explicit textual segments and implicit summary vectors are concatenated and fed into the decoder to achieve both superior performance and natural language format…
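The close-reading/skimming split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the word-overlap relevance score, the `threshold` value, the fixed-size word segmentation, and the bucket-count "summary vector" are all hypothetical stand-ins chosen so the example runs without a model. In RAM itself, the abstract states that segments are encoded with the query in parallel and that skimmed segments become learned summary vectors.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SummaryVector:
    """Toy stand-in for an implicit summary vector (hypothetical representation)."""
    values: List[float]


def split_into_segments(context: str, words_per_segment: int) -> List[str]:
    """Partition the context into fixed-size word segments (assumed segmentation)."""
    words = context.split()
    return [
        " ".join(words[i:i + words_per_segment])
        for i in range(0, len(words), words_per_segment)
    ]


def relevance(segment: str, query: str) -> float:
    """Assumed scorer: fraction of query words that appear in the segment."""
    query_words = set(query.lower().split())
    segment_words = set(segment.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & segment_words) / len(query_words)


def skim(segment: str, dim: int = 4) -> SummaryVector:
    """Assumed compressor: normalised histogram of word lengths modulo `dim`."""
    buckets = [0.0] * dim
    words = segment.split()
    for word in words:
        buckets[len(word) % dim] += 1.0
    count = len(words) or 1
    return SummaryVector([b / count for b in buckets])


def hybrid_read(
    context: str,
    query: str,
    words_per_segment: int = 8,
    threshold: float = 0.5,
) -> List[Union[str, SummaryVector]]:
    """Keep high-relevance segments as text; replace the rest with summary vectors.

    Each segment is handled independently of the others, so this loop could be
    run in parallel; it is sequential here only for simplicity.
    """
    mixed: List[Union[str, SummaryVector]] = []
    for segment in split_into_segments(context, words_per_segment):
        if relevance(segment, query) >= threshold:
            mixed.append(segment)        # close reading: retain the text
        else:
            mixed.append(skim(segment))  # skimming: compress to a vector
    return mixed


context = (
    "the cat sat on the mat today quietly "
    "stock prices rose sharply in the morning session"
)
result = hybrid_read(context, query="cat mat")
```

The returned list preserves segment order and mixes plain text with vectors, mirroring the abstract's description of concatenating explicit textual segments and implicit summary vectors before decoding.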

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Information Retrieval and Search Behavior