TL;DR
This paper introduces GenTaxo, a method that enhances taxonomy completion by generating new concept names from relational information. It overcomes the limitations of extraction-based approaches, especially for multi-word concepts with low corpus frequency.
Contribution
GenTaxo combines graph-based and language-based relational embeddings with a pre-trained concept name generator to improve taxonomy expansion beyond extraction limitations.
Findings
GenTaxo outperforms existing methods in taxonomy completeness.
Relational embeddings improve concept identification accuracy.
Generated concepts significantly enhance taxonomy coverage.
Abstract
Automatic construction of a taxonomy supports many applications in e-commerce, web search, and question answering. Existing taxonomy expansion or completion methods assume that new concepts have been accurately extracted and their embedding vectors learned from the text corpus. However, one critical and fundamental challenge in fixing the incompleteness of taxonomies is the incompleteness of the extracted concepts, especially for those whose names have multiple words and consequently low frequency in the corpus. To resolve the limitations of extraction-based methods, we propose GenTaxo to enhance taxonomy completion by identifying positions in existing taxonomies that need new concepts and then generating appropriate concept names. Instead of relying on the corpus for concept embeddings, GenTaxo learns the contextual embeddings from their surrounding graph-based and language-based…
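The abstract says GenTaxo first identifies positions in an existing taxonomy that need new concepts. The sketch below covers only the step before that: enumerating candidate positions in a toy taxonomy. It assumes the taxonomy is stored as a parent-to-children mapping, and that a position is either a new leaf under a node or a slot on an existing parent-child edge. The function name, data layout, and example concepts are hypothetical, not the authors' implementation. Scoring positions and generating names are left out.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation (an assumption, not from the paper):
# a taxonomy maps each parent concept to its list of child concepts.
Taxonomy = Dict[str, List[str]]
# A candidate position is (parent, child); child=None means "new leaf under parent".
Position = Tuple[str, Optional[str]]


def candidate_positions(taxonomy: Taxonomy) -> List[Position]:
    """Enumerate places where a new concept could be attached."""
    # Collect every concept, including leaves that never appear as a key.
    nodes = set(taxonomy)
    for children in taxonomy.values():
        nodes.update(children)

    positions: List[Position] = []
    for parent in sorted(nodes):
        # Option 1: add the new concept as a leaf under this node.
        positions.append((parent, None))
        # Option 2: insert the new concept between this node and an existing child.
        for child in sorted(taxonomy.get(parent, [])):
            positions.append((parent, child))
    return positions


toy = {"food": ["fruit", "vegetable"], "fruit": ["apple"]}
positions = candidate_positions(toy)
for parent, child in positions:
    print(parent, "->", child if child is not None else "<new leaf>")
```

In a full system, each candidate position would then be scored to decide whether it needs a new concept, and a generator would propose a name for the positions that do.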
