A Markov Random Field Topic Space Model for Document Retrieval
Scott Hand

TL;DR
This paper introduces a Markov Random Field (MRF) model for document retrieval that captures term-document relationships as probabilistic dependencies and learns its parameters through SVD-based rank reduction to a latent topic space.
Contribution
It presents a novel MRF-based framework for document retrieval that extends LSA with probabilistic dependencies and a new parameter learning method.
Findings
Effective retrieval from large datasets
Efficient dimensionality reduction using SVD
Improved modeling of term-document relationships
Abstract
This paper proposes a novel statistical approach to intelligent document retrieval. It seeks to offer a more structured and extensible mathematical approach to the term generalization done in the popular Latent Semantic Analysis (LSA) approach to document indexing. A Markov Random Field (MRF) is presented that captures relationships between terms and documents as probabilistic dependence assumptions between random variables. From there, it uses the MRF-Gibbs equivalence to derive joint probabilities as well as local probabilities for document variables. A parameter learning method is proposed that utilizes rank reduction with singular value decomposition in a manner similar to LSA to reduce the dimensionality of document-term relationships to that of a latent topic space. Experimental results confirm the ability of this approach to effectively and efficiently retrieve documents from…
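The rank-reduction step that the abstract compares to LSA can be illustrated with a small sketch. This is plain LSA-style truncated SVD with query fold-in and cosine ranking, not the paper's MRF parameter learning method, whose details are not given here. The function names, the toy term-document matrix, and the choice of k = 2 are all assumptions made for illustration.

```python
import numpy as np

def rank_reduce(term_doc, k):
    """Truncated SVD: keep the k largest singular triplets of a
    (terms x documents) matrix."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_query(query_vec, U_k, s_k):
    """Standard LSA fold-in: project a term-space query into the
    k-dimensional topic space, q_hat = S_k^{-1} U_k^T q."""
    return (U_k.T @ query_vec) / s_k

def rank_documents(query_vec, U_k, s_k, Vt_k):
    """Cosine similarity between the folded-in query and every document."""
    q_hat = fold_in_query(query_vec, U_k, s_k)
    docs = Vt_k.T  # one row per document, in topic space
    num = docs @ q_hat
    den = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat)
    return num / np.maximum(den, 1e-12)  # guard against zero norms

# Hypothetical toy corpus. Rows: car, auto, engine, flower, petal.
# Columns: four documents.
term_doc = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # auto
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)

U_k, s_k, Vt_k = rank_reduce(term_doc, k=2)
query = np.array([1, 0, 0, 0, 0], dtype=float)  # the single term "car"
scores = rank_documents(query, U_k, s_k, Vt_k)
print(scores)
```

In this toy corpus the second document contains "auto" and "engine" but not "car", yet it should score about as high as the first, because both load on the same latent topic. That is the kind of term generalization the abstract attributes to LSA.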
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Advanced Text Analysis Techniques
