Octet: Online Catalog Taxonomy Enrichment with Self-Supervision
Yuning Mao, Tong Zhao, Andrey Kan, Chenwei Zhang, Xin Luna Dong, Christos Faloutsos, Jiawei Han

TL;DR
Octet is a self-supervised framework that effectively enriches online catalog taxonomies by leveraging heterogeneous online data and advanced neural models, doubling taxonomy size in real-world applications.
Contribution
The paper introduces a novel self-supervised end-to-end approach for taxonomy enrichment that requires no additional supervision and utilizes GNNs and sequence labeling for term extraction and attachment.
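As a rough illustration of what "no additional supervision" can mean for term attachment, the sketch below derives labeled training pairs from the edges an existing taxonomy already contains. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `self_supervised_pairs`, the toy taxonomy, and the uniform random negative sampling are all hypothetical choices made for this example.

```python
import random


def self_supervised_pairs(parent_of, num_negatives=2, seed=0):
    """Build (candidate_parent, term, label) examples from an existing taxonomy.

    parent_of maps each child term to its parent term. Every existing edge
    yields one positive example (label 1). Negatives (label 0) pair the child
    with randomly sampled nodes that are neither its parent nor itself.
    The sampling scheme is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    nodes = sorted(set(parent_of) | set(parent_of.values()))
    examples = []
    for child, parent in sorted(parent_of.items()):
        examples.append((parent, child, 1))
        candidates = [n for n in nodes if n != parent and n != child]
        k = min(num_negatives, len(candidates))
        for negative in rng.sample(candidates, k):
            examples.append((negative, child, 0))
    return examples


# Hypothetical toy taxonomy: child term -> parent term.
toy_taxonomy = {
    "tea": "beverages",
    "coffee": "beverages",
    "green tea": "tea",
    "black tea": "tea",
}
pairs = self_supervised_pairs(toy_taxonomy)
```

A model trained on such pairs would then score candidate parents for terms that are not yet in the taxonomy; how Octet encodes terms and structure (for example with GNNs) is not reproduced here.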
Findings
Octet outperforms state-of-the-art methods in automatic and human evaluations.
It doubles the size of an online catalog taxonomy in production.
The approach is effective across multiple online domains.
Abstract
Taxonomies have found wide applications in various domains, especially online for item categorization, browsing, and search. Despite the prevalent use of online catalog taxonomies, most of them in practice are maintained by humans, which is labor-intensive and difficult to scale. While taxonomy construction from scratch is considerably studied in the literature, how to effectively enrich existing incomplete taxonomies remains an open yet important research question. Taxonomy enrichment not only requires the robustness to deal with emerging terms but also the consistency between existing taxonomy structure and new term attachment. In this paper, we present a self-supervised end-to-end framework, Octet, for Online Catalog Taxonomy EnrichmenT. Octet leverages heterogeneous information unique to online catalog taxonomies such as user queries, items, and their relations to the taxonomy nodes…
