Capturing P: On the Expressive Power and Efficient Evaluation of Boolean Retrieval
Amir Aavani

TL;DR
This paper introduces a formal retrieval language and an efficient evaluation algorithm that enable complex polynomial-time logical queries to be executed directly over search indexes, enhancing the expressive power and efficiency of information retrieval systems.
Contribution
It defines a new retrieval language that captures the complexity class P and proposes \texttt{ComputePN}, an algorithm for efficient evaluation of complex queries over indexes.
Findings
The retrieval language $\boldsymbol{\text{L}_R}$ precisely captures the complexity class P.
\texttt{ComputePN} enables polynomial-time evaluation of complex queries.
The approach bridges the gap between expressive logical querying and computational efficiency.
Abstract
Modern information retrieval is transitioning from simple document filtering to complex, neuro-symbolic reasoning workflows. However, current retrieval architectures face a fundamental efficiency dilemma when handling the rigorous logical and arithmetic constraints required by this new paradigm. Standard iterator-based engines (Document-at-a-Time) do not natively support complex, nested logic graphs; forcing them to execute such queries typically results in intractable runtime performance. Conversely, naive recursive approaches (Term-at-a-Time), while capable of supporting these structures, suffer from prohibitive memory consumption when enforcing broad logical exclusions. In this paper, we propose that a retrieval engine must be capable of ``Capturing P'' -- evaluating any polynomial-time property directly over its index in a computationally efficient manner. We define a…
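To make the memory concern in the abstract concrete, here is a minimal sketch of a naive recursive, Term-at-a-Time style evaluator over a toy inverted index. It is not the paper's \texttt{ComputePN} algorithm; the names `INDEX`, `NUM_DOCS`, `Term`, `And`, `Or`, `Not`, and `evaluate` are hypothetical and chosen only for illustration. The point it shows is that every subquery is materialized as a full set, and a negation has to build the complement against the whole collection.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical toy index: term -> document ids containing that term.
INDEX = {
    "retrieval": [1, 2, 4, 7],
    "boolean": [2, 4, 5],
    "neural": [4, 6],
}
NUM_DOCS = 8  # assumed collection: document ids 0..7


@dataclass(frozen=True)
class Term:
    name: str


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Not:
    child: "Query"


Query = Union[Term, And, Or, Not]


def evaluate(q: Query) -> set[int]:
    """Recursively evaluate a Boolean query, materializing every intermediate result."""
    if isinstance(q, Term):
        return set(INDEX.get(q.name, []))
    if isinstance(q, And):
        return evaluate(q.left) & evaluate(q.right)
    if isinstance(q, Or):
        return evaluate(q.left) | evaluate(q.right)
    if isinstance(q, Not):
        # The complement is taken against the entire collection, so a rare
        # term under NOT yields a set almost as large as the corpus itself.
        return set(range(NUM_DOCS)) - evaluate(q.child)
    raise TypeError(f"unsupported query node: {q!r}")


excluded = evaluate(Not(Term("neural")))
hits = evaluate(And(Term("retrieval"), Not(Term("neural"))))
print(len(excluded), sorted(hits))
```

In this toy collection the final answer has three documents, but the intermediate result for the exclusion holds six of the eight. On a real corpus that intermediate set scales with the collection size rather than with the answer size, which is the kind of cost the abstract attributes to naive recursive evaluation.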
Taxonomy
Topics: Information Retrieval and Search Behavior · Biomedical Text Mining and Ontologies · Semantic Web and Ontologies
