ComLQ: Benchmarking Complex Logical Queries in Information Retrieval
Ganlin Xu, Zhitao Yin, Linghao Zhang, Jiaqing Liang, Weijia Lu, Xiaodong Zhang, Zhifei Yang, Sihang Jiang, Deqing Yang

TL;DR
This paper introduces ComLQ, a new benchmark dataset for complex logical queries in information retrieval, created using large language models and expert annotation, to evaluate IR models' ability to handle logical structures including negation.
Contribution
The paper presents a novel dataset for complex logical IR queries generated by LLMs with logical structure guidance and introduces a new evaluation metric for negation handling.
Findings
Existing IR models perform poorly on complex logical queries.
Negation significantly challenges current retrieval models.
The dataset enables better evaluation of logical reasoning in IR.
Abstract
Information retrieval (IR) systems play a critical role in navigating information overload across various applications. Existing IR benchmarks primarily focus on simple queries that are semantically analogous to single- and multi-hop relations, overlooking complex logical queries involving first-order logic operations such as conjunction (∧), disjunction (∨), and negation (¬). Thus, these benchmarks cannot sufficiently evaluate the performance of IR models on complex queries in real-world scenarios. To address this problem, we propose a novel method leveraging large language models (LLMs) to construct a new IR dataset, ComLQ, for Complex Logical Queries, which comprises 2,909 queries and 11,251 candidate passages. A key challenge in constructing the dataset lies in capturing the underlying logical structures within…
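To make the three operations named in the abstract concrete, the following is a minimal sketch of how a logical query built from conjunction, disjunction, and negation could be checked against candidate passages. It is an illustration only, not the paper's construction method or its evaluation metric. The class names (`Atom`, `Not`, `And`, `Or`), the helper functions, and the toy passage annotations are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Union


# Hypothetical query representation: each node is an atomic condition
# or one of the three logical operations (negation, conjunction, disjunction).
@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    child: "Query"


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    left: "Query"
    right: "Query"


Query = Union[Atom, Not, And, Or]


def satisfies(query: Query, facts: FrozenSet[str]) -> bool:
    """Return True if a passage, described by its set of facts, satisfies the query."""
    if isinstance(query, Atom):
        return query.name in facts
    if isinstance(query, Not):
        return not satisfies(query.child, facts)
    if isinstance(query, And):
        return satisfies(query.left, facts) and satisfies(query.right, facts)
    if isinstance(query, Or):
        return satisfies(query.left, facts) or satisfies(query.right, facts)
    raise TypeError(f"unsupported query node: {query!r}")


def relevant_passages(query: Query, passages: Dict[str, FrozenSet[str]]) -> List[str]:
    """Return the sorted ids of all passages that satisfy the query."""
    return sorted(pid for pid, facts in passages.items() if satisfies(query, facts))


# Toy data (assumed): each passage id maps to the atomic conditions it meets.
passages = {
    "p1": frozenset({"is_film", "is_comedy"}),
    "p2": frozenset({"is_film"}),
    "p3": frozenset({"is_comedy"}),
}

# "films that are NOT comedies": is_film ∧ ¬is_comedy
negation_query = And(Atom("is_film"), Not(Atom("is_comedy")))
print(relevant_passages(negation_query, passages))  # ['p2']
```

In this toy setup, a retriever that matches on surface terms alone would likely rank `p1` highly for the negation query, because `p1` mentions both conditions even though it violates the negated one. That is the kind of failure the findings above attribute to negation.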
Taxonomy
Topics: Information Retrieval and Search Behavior · Semantic Web and Ontologies · Advanced Graph Neural Networks
