Taxonomy-guided Semantic Indexing for Academic Paper Search
SeongKu Kang, Yunyi Zhang, Pengcheng Jiang, Dongha Lee, Jiawei Han, Hwanjo Yu

TL;DR
This paper introduces TaxoIndex, a taxonomy-guided semantic indexing framework that improves academic paper search by matching the underlying academic concepts in queries and papers. It raises retrieval accuracy, makes results more interpretable, and remains effective with limited training data.
Contribution
It proposes a novel taxonomy-guided semantic indexing method that enhances existing dense retrieval models for academic paper search.
Findings
Significant improvement in retrieval performance
Effective with limited training data
Enhanced interpretability of search results
Abstract
Academic paper search is an essential task for efficient literature discovery and scientific advancement. While dense retrieval has advanced various ad-hoc searches, it often struggles to match the underlying academic concepts between queries and documents, which is critical for paper search. To enable effective academic concept matching for paper search, we propose the Taxonomy-guided Semantic Indexing (TaxoIndex) framework. TaxoIndex extracts key concepts from papers and organizes them as a semantic index guided by an academic taxonomy, and then leverages this index as foundational knowledge to identify academic concepts and link queries and documents. As a plug-and-play framework, TaxoIndex can be flexibly employed to enhance existing dense retrievers. Extensive experiments show that TaxoIndex brings significant improvements, even with highly limited training data, and greatly enhances…
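As a rough illustration of the idea in the abstract (index papers under taxonomy concepts, then use shared concepts to link queries and documents), here is a minimal sketch. It is not the authors' implementation: the toy taxonomy, the function names, the overlap score, and the `weight` blending parameter are all hypothetical, and the dense retriever scores are assumed to come from some existing model.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child topic -> parent topic.
PARENT = {
    "dense retrieval": "information retrieval",
    "information retrieval": "computer science",
    "topic modeling": "text mining",
    "text mining": "computer science",
}

def with_ancestors(topic):
    """Return the topic followed by all of its ancestors in the toy taxonomy."""
    path = [topic]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def build_index(doc_topics):
    """Map each topic (and its ancestors) to the set of documents tagged with it."""
    index = defaultdict(set)
    for doc_id, topics in doc_topics.items():
        for topic in topics:
            for node in with_ancestors(topic):
                index[node].add(doc_id)
    return index

def concept_overlap(query_topics, doc_id, index):
    """Fraction of the query's expanded topics that the document is indexed under."""
    expanded = {n for t in query_topics for n in with_ancestors(t)}
    if not expanded:
        return 0.0
    hits = sum(1 for n in expanded if doc_id in index.get(n, ()))
    return hits / len(expanded)

def rerank(dense_scores, query_topics, index, weight=0.5):
    """Blend a base retriever's scores with concept overlap; return ids best-first."""
    blended = {
        d: s + weight * concept_overlap(query_topics, d, index)
        for d, s in dense_scores.items()
    }
    return sorted(blended, key=blended.get, reverse=True)

# Usage: two papers, each tagged with one (assumed, already extracted) topic.
doc_topics = {"p1": ["dense retrieval"], "p2": ["topic modeling"]}
index = build_index(doc_topics)
ranking = rerank({"p1": 0.40, "p2": 0.45}, ["dense retrieval"], index)
```

In this toy setup the concept signal can promote a paper that shares the query's topics over one with a slightly higher base score. How the actual framework extracts concepts and combines them with a dense retriever is described in the paper, not here.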
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Text Analysis Techniques · Topic Modeling
