Make Any Collection Navigable: Methods for Constructing and Evaluating Hypergraph of Text
Dean E. Alvarez, ChengXiang Zhai

TL;DR
This paper explores methods for constructing Hypergraphs of Text to improve navigation in document collections, introducing a new metric for evaluating their structural quality.
Contribution
The paper proposes several methods for building Hypergraphs of Text and introduces the effort ratio metric for assessing their structural quality.
Findings
Simple TF-IDF baselines match LLM-based methods on effort ratio
Constructed Hypergraphs can support flexible navigation
The effort ratio metric effectively compares the structural quality of constructed Hypergraphs of Text
Abstract
One reason the Web is more useful than a simple collection of documents is that the structure created by hyperlinks enables flexible navigation from one web page to another. However, hyperlinks are typically created manually and cannot fully capture a corpus' implicit semantic structures. Is there a general way to make an arbitrary collection navigable? Recent work has formalized this problem generally as constructing a Hypergraph of Text (HoT), which provides a formal mathematical structure for supporting navigation and browsing. However, how to construct and evaluate a Hypergraph of Text remains a challenge. In this paper, we propose and study several methods for constructing a HoT. We also propose a novel quantitative metric, effort ratio, for evaluating the structural quality of a constructed HoT. Experimental results show that even simple TF-IDF baselines can match LLM-based…
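As a rough illustration of how TF-IDF weights could link documents into a hypergraph, the sketch below groups documents that share a highly weighted term into one hyperedge. This is not the paper's construction method: the abstract excerpt here describes neither its TF-IDF baselines nor the definition of effort ratio. The function name `tfidf_hyperedges`, the `top_k` parameter, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tfidf_hyperedges(docs, top_k=2):
    """Map each salient term to the set of document ids it connects.

    Hypothetical helper for illustration only; not the method from the paper.
    """
    # Naive whitespace tokenization (an assumption; real systems would do more).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    edges = defaultdict(set)
    for doc_id, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        # Smoothed IDF keeps the weight finite and positive for every term.
        weights = {
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Highest weight first; ties are broken alphabetically for determinism.
        top_terms = sorted(weights, key=lambda t: (-weights[t], t))[:top_k]
        for term in top_terms:
            edges[term].add(doc_id)

    # A hyperedge is only useful for navigation if it joins two or more documents.
    return {term: ids for term, ids in edges.items() if len(ids) >= 2}


example_docs = [
    "hypergraph navigation hypergraph",
    "hypergraph browsing hypergraph",
    "cooking recipes pasta",
]
print(tfidf_hyperedges(example_docs, top_k=1))
```

In this toy setup a reader on one document could follow a hyperedge to every other document that shares its term, which is the kind of navigation the abstract motivates. How the paper scores such a structure is left to the paper itself.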
