Probability Bracket Notation, Term Vector Space, Concept Fock Space and Induced Probabilistic IR Models
Xing M. Wang

TL;DR
This paper applies Probability Bracket Notation (PBN) to probabilistic information retrieval (IR), derives relevance formulas induced by Term Vector Space (TVS) and Concept Fock Space (CFS), and tests and compares them against a textbook example.
Contribution
The paper applies both PBN and Dirac notation to IR, derives expressions for relevance of document to query (RDQ) induced by TVS and by CFS, and compares the resulting formulas across different scenarios.
Findings
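RDQ expressions are derived for several probabilistic models; the inference network model (INM) formula is symmetric and can also evaluate relevance of document to document (RDD).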
The CFS-induced models contain ingredients of all three classical IR models.
The relevance formulas are tested and compared on different scenarios against a well-known textbook example.
Abstract
After a brief introduction to Probability Bracket Notation (PBN) for discrete random variables in time-independent probability spaces, we apply both PBN and Dirac notation to investigate probabilistic modeling for information retrieval (IR). We derive the expressions of relevance of document to query (RDQ) for various probabilistic models, induced by Term Vector Space (TVS) and by Concept Fock Space (CFS). The inference network model (INM) formula is symmetric and can be used to evaluate relevance of document to document (RDD); the CFS-induced models contain ingredients of all three classical IR models. The relevance formulas are tested and compared on different scenarios against a famous textbook example.
Taxonomy
Topics: Information Retrieval and Search Behavior · Advanced Text Analysis Techniques · Text and Document Classification Technologies
