Statistical Parsing for Logical Information Retrieval
Greg Coppola

TL;DR
This paper advances logical information retrieval by integrating a probabilistic graphical model with natural language parsing. The combined system supports probabilistic logical reasoning and semantic interpretation, bridging formal semantics and neural models.
Contribution
It introduces an extended Quantified Boolean Bayesian Network (QBBN) with negation, a typed logical language with modal quantifiers, and a deterministic syntax parser, combining formal semantics with LLM-based disambiguation.
Findings
Engine handles 44/44 test cases spanning 22 reasoning patterns.
Grammar achieves 33/33 correct parses with zero ambiguity.
LLMs achieve 95% accuracy on prepositional-phrase (PP) attachment but only limited accuracy on structured parses.
Abstract
In previous work (Coppola, 2024) we introduced the Quantified Boolean Bayesian Network (QBBN), a logical graphical model that implements the forward fragment of natural deduction (Prawitz, 1965) as a probabilistic factor graph. That work left two gaps: no negation/backward reasoning, and no parser for natural language. This paper addresses both gaps across inference, semantics, and syntax. For inference, we extend the QBBN with NEG factors enforcing P(x) + P(neg x) = 1, enabling contrapositive reasoning (modus tollens) via backward lambda messages, completing Prawitz's simple elimination rules. The engine handles 44/44 test cases spanning 22 reasoning patterns. For semantics, we present a typed logical language with role-labeled predicates, modal quantifiers, and three tiers of expressiveness following Prawitz: first-order quantification, propositions as arguments, and predicate…
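The contrapositive step mentioned in the abstract can be illustrated on a toy example. The sketch below is not the paper's engine: it is a generic Pearl-style backward (lambda) message on a hypothetical two-node network A -> B, with made-up probabilities and helper names (`lambda_message`, `posterior`) chosen only for illustration. It shows how observing that B is false lowers belief in A (modus tollens), and how normalization keeps P(A) + P(neg A) = 1.

```python
# Toy illustration only: a two-node network A -> B with hypothetical numbers.
# Index 0 means "false", index 1 means "true".

def lambda_message(cpt_b_given_a, lambda_b):
    """Backward (lambda) message from child B to parent A.

    cpt_b_given_a[a][b] = P(B=b | A=a)
    lambda_b[b]         = evidence likelihood for B=b
    """
    return [
        sum(cpt_b_given_a[a][b] * lambda_b[b] for b in (0, 1))
        for a in (0, 1)
    ]

def posterior(prior_a, lam_a):
    """Combine the prior on A with the lambda message and normalize."""
    unnorm = [prior_a[a] * lam_a[a] for a in (0, 1)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Assumed parameters: the rule "A implies B" holds with probability 0.99,
# and B is a coin flip when A is false.
prior_a = [0.5, 0.5]
cpt = [
    [0.5, 0.5],    # P(B | A=false)
    [0.01, 0.99],  # P(B | A=true)
]

# Evidence: B is observed to be false.
lambda_b = [1.0, 0.0]

lam_a = lambda_message(cpt, lambda_b)
p_not_a, p_a = posterior(prior_a, lam_a)

print(f"P(A | not B)     = {p_a:.4f}")
print(f"P(neg A | not B) = {p_not_a:.4f}")
```

With these assumed numbers, belief in A falls well below its prior of 0.5 once B is observed false, and the two posterior values sum to one by construction.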
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Natural Language Processing Techniques · Topic Modeling
