Exploiting the Bipartite Structure of Entity Grids for Document Coherence and Retrieval
Christina Lioma, Fabien Tarissan, Jakob Grue Simonsen, Casper Petersen, Birger Larsen

TL;DR
This paper introduces three novel bipartite graph metrics for assessing document coherence. One of them outperforms the state of the art in coherence accuracy, and all three improve information retrieval effectiveness by capturing aspects of document quality that keyword relevance overlooks.
Contribution
The paper presents new bipartite graph-based coherence metrics that avoid the information loss incurred by one-mode projections and that improve IR performance.
Findings
One of the three metrics outperforms the state of the art in coherence accuracy.
All three metrics improve retrieval effectiveness.
Metrics capture document quality aspects beyond keyword relevance.
Abstract
Document coherence describes how much sense text makes in terms of its logical organisation and discourse flow. Even though coherence is a relatively difficult notion to quantify precisely, it can be approximated automatically. This type of coherence modelling is not only interesting in itself, but also useful for a number of other text processing tasks, including Information Retrieval (IR), where adjusting the ranking of documents according to both their relevance and their coherence has been shown to increase retrieval effectiveness [34,37]. The state of the art in unsupervised coherence modelling represents documents as bipartite graphs of sentences and discourse entities, and then projects these bipartite graphs into one-mode undirected graphs. However, one-mode projections may incur significant loss of the information present in the original bipartite structure. To address this…
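The information loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible projection, an unweighted one in which two sentences are linked whenever they share at least one entity. The function name `project_onto_sentences`, the toy documents `doc_a` and `doc_b`, and the sentence and entity labels are hypothetical and chosen only for illustration; this is not the paper's method or any of its three metrics.

```python
from itertools import combinations


def project_onto_sentences(bipartite):
    """Project a sentence-entity bipartite graph onto its sentence nodes.

    `bipartite` maps each sentence id to the set of entities it mentions.
    Two sentences are joined by an undirected edge if they share at least
    one entity (an assumed, unweighted projection rule).
    """
    edges = set()
    for s, t in combinations(sorted(bipartite), 2):
        if bipartite[s] & bipartite[t]:  # non-empty intersection
            edges.add((s, t))
    return edges


# Toy document A: one entity runs through all three sentences.
doc_a = {
    "s1": {"e1"},
    "s2": {"e1"},
    "s3": {"e1"},
}

# Toy document B: every pair of sentences shares a different entity.
doc_b = {
    "s1": {"e1", "e2"},
    "s2": {"e1", "e3"},
    "s3": {"e2", "e3"},
}

proj_a = project_onto_sentences(doc_a)
proj_b = project_onto_sentences(doc_b)

# Both projections are the same triangle over s1, s2, s3, although the
# underlying bipartite graphs differ.
print(proj_a == proj_b)
print(doc_a == doc_b)
```

In this toy case the two documents have different sentence-entity structure, yet the projected sentence graphs coincide, so any measure computed only on the projection cannot tell them apart. Weighting each edge by the number of shared entities would not help here either, since every pair of sentences shares exactly one entity in both documents.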
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Text and Document Classification Technologies
