PRISM: Efficient Long-Range Reasoning With Short-Context LLMs

Dulhan Jayalath; James Bradley Wendt; Nicholas Monath; Sandeep Tata; Beliz Gunel

arXiv:2412.18914 · cs.AI · August 26, 2025

TL;DR

PRISM is a novel in-context reasoning method that enables long-range reasoning with short-context LLMs, significantly reducing token usage and costs while maintaining high performance across diverse tasks.

Contribution

The paper introduces PRISM, a schema-based, token-efficient in-context approach that outperforms existing methods on long-range tasks while using much shorter contexts.

Findings

01 Outperforms baselines with 4x shorter contexts

02 Reduces costs by up to 54% using KV caches

03 Generalizes to new tasks with minimal schema generation

Abstract

Long-range tasks demand reasoning over long inputs. However, existing solutions are limited, e.g., long-context models require large compute budgets, parameter-efficient fine-tuning (PEFT) needs training data, and retrieval-augmented generation (RAG) entails complex task-specific designs. Though in-context approaches overcome many of these issues, methods with short-context LLMs are inefficient, trading context for processing more tokens. We introduce PRISM, a highly token-efficient in-context method based on structured schemas that outperforms baselines on diverse tasks with 4x shorter contexts. This approach produces concise outputs and efficiently leverages key-value (KV) caches to reduce costs by up to 54%. PRISM scales down to tiny contexts without increasing costs or sacrificing quality, and generalizes to new tasks with minimal effort by generating schemas from task descriptions.
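To make the idea of a short-context model reasoning over a long input more concrete, here is a minimal sketch of processing an input chunk by chunk while carrying a small structured memory forward. The abstract does not spell out the algorithm, so everything here is an assumption made for illustration: the chunking scheme, the flat key-to-count memory layout, and the names `split_into_chunks`, `count_mentions`, and `process_incrementally` are all hypothetical, and `count_mentions` is a deterministic toy that stands in for an LLM call. It is not the authors' implementation.

```python
from typing import Callable, Dict, List

# Hypothetical memory schema: a flat map from an entity name to a count.
Memory = Dict[str, int]


def split_into_chunks(tokens: List[str], chunk_size: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def count_mentions(memory: Memory, chunk: List[str]) -> Memory:
    """Toy stand-in for an LLM call (assumption, not a real model).

    It reads the current memory and one chunk, and returns only the
    entries that change, which keeps the output short.
    """
    revision: Memory = {}
    for token in chunk:
        if token in ("alice", "bob"):
            current = revision.get(token, memory.get(token, 0))
            revision[token] = current + 1
    return revision


def process_incrementally(
    tokens: List[str],
    chunk_size: int,
    update: Callable[[Memory, List[str]], Memory],
) -> Memory:
    """Visit chunks in order, merging each small revision into the memory."""
    memory: Memory = {}
    for chunk in split_into_chunks(tokens, chunk_size):
        memory.update(update(memory, chunk))
    return memory


tokens = "alice met bob and alice left".split()
result = process_incrementally(tokens, chunk_size=2, update=count_mentions)
print(result)
```

In this toy, the final memory is the same whether the input is processed two tokens at a time or all at once, which is the property a chunked approach needs in order to substitute for a long context. A real system would replace `count_mentions` with a prompted model and a task-specific schema.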

Taxonomy

Topics: Semantic Web and Ontologies · Data Mining Algorithms and Applications · Advanced Database Systems and Queries