BoostTaxo: Zero-Shot Taxonomy Induction via Boosting-Style Agentic Reasoning and Constraint-Aware Calibration
Yancheng Ling, Zhenlin Qin, Leizhen Wang, Zhenliang Ma

TL;DR
BoostTaxo introduces a novel boosting-style LLM framework for zero-shot taxonomy induction, enhancing accuracy and reliability in large-scale, domain-specific scenarios through a coarse-to-fine, structure-aware approach.
Contribution
The paper presents a hybrid LLM-based method that combines retrieval, candidate ranking, and structure-aware calibration to improve zero-shot taxonomy induction.
Findings
Achieves superior or comparable performance on the WordNet, DBLP, and SemEval-Sci datasets.
Validates the effectiveness of hybrid candidate selection and score calibration.
Analyzes the impact of candidate set size and provides case studies.
Abstract
Taxonomy induction is crucial for organizing concepts into explicit and interpretable semantic hierarchies. While existing methods have achieved promising results, their generalization, structural reliability, and efficiency remain limited, hindering their performance in zero-shot and large-scale scenarios. To overcome these limitations, we introduce BoostTaxo, a boosting-style LLM framework for zero-shot taxonomy induction. It takes a set of domain terms as inputs and performs parent identification in a coarse-to-fine manner, employing retrieval-augmented definition refinement, hybrid parent candidate selection, candidate rating, and structure-aware score calibration to improve taxonomy construction. Specifically, a lightweight LLM is used to efficiently filter candidate parents, while a large-scale LLM is employed to rank and score candidate parents for fine-grained parent selection.…
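The coarse-to-fine parent identification described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The two scorers are plain Python callables standing in for the lightweight and large-scale LLMs, and the calibration rule (a fixed penalty for a candidate that would create a cycle) is an assumption chosen only to show what a structure-aware adjustment might look like. All names here (`select_parent`, `creates_cycle`, `cycle_penalty`, the toy score table) are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# (child, candidate_parent) -> score; a stand-in for an LLM call (assumption).
Scorer = Callable[[str, str], float]


def creates_cycle(parents: Dict[str, str], child: str, candidate: str) -> bool:
    """Return True if attaching `child` under `candidate` would close a loop."""
    node: Optional[str] = candidate
    seen = set()
    while node is not None and node not in seen:
        if node == child:
            return True
        seen.add(node)
        node = parents.get(node)  # walk upward through already-assigned parents
    return False


def select_parent(
    child: str,
    terms: List[str],
    parents: Dict[str, str],
    coarse_score: Scorer,
    fine_score: Scorer,
    k: int = 3,
    cycle_penalty: float = 1.0,
) -> Optional[str]:
    """Pick a parent for `child`: cheap filter first, then fine-grained scoring."""
    candidates = [t for t in terms if t != child]
    if not candidates:
        return None
    # Coarse stage: keep the top-k candidates under the cheap scorer.
    shortlist = sorted(candidates, key=lambda t: coarse_score(child, t), reverse=True)[:k]
    # Fine stage: score the shortlist, then adjust scores using the structure.
    calibrated: Dict[str, float] = {}
    for cand in shortlist:
        score = fine_score(child, cand)
        if creates_cycle(parents, child, cand):
            score -= cycle_penalty  # hypothetical calibration rule
        calibrated[cand] = score
    return max(calibrated, key=lambda c: calibrated[c])


# Toy data: fixed scores replace real model outputs (illustrative only).
TERMS = ["animal", "dog", "puppy"]
FINE_TABLE = {("dog", "animal"): 0.80, ("dog", "puppy"): 0.85}


def toy_coarse(child: str, cand: str) -> float:
    return 1.0  # the cheap filter keeps everything in this tiny example


def toy_fine(child: str, cand: str) -> float:
    return FINE_TABLE.get((child, cand), 0.0)


existing = {"puppy": "dog"}  # "puppy" is already attached under "dog"
chosen = select_parent("dog", TERMS, existing, toy_coarse, toy_fine, k=2)
```

In this toy run the raw fine score slightly prefers "puppy" as the parent of "dog", but that edge would form a loop with the existing "puppy" under "dog" relation, so the penalty shifts the choice to "animal". Setting `cycle_penalty=0.0` disables the adjustment and makes the effect of the calibration step visible.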
