Taxonomy Tree Generation from Citation Graph
Yuntong Hu, Zhuofeng Li, Zheng Zhang, Chen Ling, Raasikh Kanjiani, Boxin Zhao, Liang Zhao

TL;DR
This paper introduces HiGTL, an automated framework for generating hierarchical scientific taxonomies from citation graphs, combining clustering, language models, and joint optimization to produce coherent, meaningful structures.
Contribution
The paper presents a novel end-to-end method that integrates citation graph clustering with language model-based taxonomy verbalization, guided by human instructions.
Findings
HiGTL outperforms existing methods in producing coherent taxonomies.
The framework effectively combines textual and citation data for clustering.
Joint optimization improves the semantic quality of generated taxonomies.
Abstract
Constructing taxonomies from citation graphs is essential for organizing scientific knowledge, facilitating literature reviews, and identifying emerging research trends. However, manual taxonomy construction is labor-intensive, time-consuming, and prone to human biases, often overlooking pivotal but less-cited papers. In this paper, to enable automatic hierarchical taxonomy generation from citation graphs, we propose HiGTL (Hierarchical Graph Taxonomy Learning), a novel end-to-end framework guided by human-provided instructions or preferred topics. Specifically, we propose a hierarchical citation graph clustering method that recursively groups related papers based on both textual content and citation structure, ensuring semantically meaningful and structurally coherent clusters. Additionally, we develop a novel taxonomy node verbalization strategy that iteratively generates central…
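The recursive grouping described in the abstract, which clusters papers using both textual content and citation structure, can be illustrated with a toy sketch. This is not HiGTL's actual algorithm, which the excerpt above does not specify. The blended score (token Jaccard overlap plus a 0/1 citation indicator), the threshold-and-connected-components clustering, and all names and parameters (`similarity`, `build_tree`, `alpha`, `threshold`, `step`) are assumptions chosen only for illustration.

```python
from itertools import combinations

def jaccard(a, b):
    """Token-set overlap between two texts (a stand-in for a real text encoder)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def similarity(p, q, papers, cites, alpha=0.5):
    """Hypothetical score: blend text overlap with a 0/1 citation-link indicator."""
    linked = 1.0 if (p, q) in cites or (q, p) in cites else 0.0
    return alpha * jaccard(papers[p], papers[q]) + (1 - alpha) * linked

def components(ids, papers, cites, threshold):
    """Connected components over pairs whose similarity reaches the threshold."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for p, q in combinations(ids, 2):
        if similarity(p, q, papers, cites) >= threshold:
            parent[find(p)] = find(q)
    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def build_tree(ids, papers, cites, threshold=0.3, step=0.3, max_depth=3):
    """Recursively split a cluster, tightening the threshold at each level."""
    ids = sorted(ids)
    node = {"papers": ids, "children": []}
    if max_depth == 0 or len(ids) <= 1:
        return node
    parts = components(ids, papers, cites, threshold)
    if len(parts) == 1:
        # No split at this threshold: retry the same cluster with a tighter one.
        return build_tree(ids, papers, cites, threshold + step, step, max_depth - 1)
    node["children"] = [
        build_tree(part, papers, cites, threshold + step, step, max_depth - 1)
        for part in parts
    ]
    return node

# Toy data (invented for this sketch): paper id -> title, plus (citing, cited) pairs.
papers = {
    "A": "graph neural networks for node classification",
    "B": "graph neural networks for link prediction",
    "C": "language models for text summarization",
    "D": "language models for machine translation",
}
cites = {("B", "A"), ("D", "C")}

tree = build_tree(list(papers), papers, cites)
print([child["papers"] for child in tree["children"]])
```

On this toy input the root splits into one cluster for the two graph papers and one for the two language-model papers. A real system would replace the Jaccard score with learned text embeddings and would also need a step that names each cluster, which this sketch omits.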
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Semantic Web and Ontologies
