A BERT-based Hierarchical Classification Model with Applications in Chinese Commodity Classification
Kun Liu, Tuozhen Liu, Feifei Wang, and Rui Pan

TL;DR
This paper introduces a large-scale hierarchical product dataset from JD.com and proposes HFT-BERT, a BERT-based model that effectively classifies products within a hierarchical structure, especially excelling with longer texts.
Contribution
The paper provides a large, openly accessible hierarchical product dataset and develops HFT-BERT, a novel hierarchical classification model leveraging BERT for improved accuracy.
Findings
HFT-BERT achieves performance comparable to existing methods on short texts.
HFT-BERT significantly outperforms existing methods on longer texts, such as books.
The dataset facilitates future research in product categorization.
Abstract
Existing e-commerce platforms heavily rely on manual annotation for product categorization, which is inefficient and inconsistent. These platforms often employ a hierarchical structure for categorizing products; however, few studies have leveraged this hierarchical information for classification. Furthermore, studies that consider hierarchical information fail to account for similarities and differences across various hierarchical categories. Herein, we introduce a large-scale hierarchical dataset collected from the JD e-commerce platform (www.JD.com), comprising 1,011,450 products with titles and a three-level category structure. By making this dataset openly accessible, we provide a valuable resource for researchers and practitioners to advance research and applications associated with product categorization. Moreover, we propose a novel hierarchical text classification approach based…
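As a rough illustration of the three-level category structure the abstract describes, the following minimal sketch shows one way to hold title-plus-category records and check that a predicted category path respects the hierarchy. It is not the authors' HFT-BERT model, and the records, category names, and function names (build_hierarchy, is_valid_path) are hypothetical and not taken from the dataset.

```python
# Hypothetical records: (title, level-1, level-2, level-3).
# These examples are invented for illustration and are not from the JD dataset.
records = [
    ("Wireless optical mouse", "Computers", "Peripherals", "Mice"),
    ("Mechanical keyboard 87 keys", "Computers", "Peripherals", "Keyboards"),
    ("Introduction to statistics, 3rd edition", "Books", "Science", "Mathematics"),
]


def build_hierarchy(rows):
    """Map each parent category path (a tuple) to the set of child labels seen under it."""
    children = {}
    for _title, l1, l2, l3 in rows:
        children.setdefault((), set()).add(l1)        # root -> level 1
        children.setdefault((l1,), set()).add(l2)     # level 1 -> level 2
        children.setdefault((l1, l2), set()).add(l3)  # level 2 -> level 3
    return children


def is_valid_path(children, path):
    """Return True if each label in `path` is a known child of the labels before it."""
    return all(
        label in children.get(tuple(path[:i]), set())
        for i, label in enumerate(path)
    )


hierarchy = build_hierarchy(records)
print(is_valid_path(hierarchy, ("Computers", "Peripherals", "Mice")))
print(is_valid_path(hierarchy, ("Books", "Peripherals", "Mice")))
```

A check like this could be used to reject predictions whose lower-level category does not belong to the predicted higher-level one; whether HFT-BERT enforces such a constraint is not stated in the text above.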
