An Ensemble-based approach for assigning text to correct Harmonized system code
Shubham, Avinash Arya, Subarna Roy, Sridhar Jonnala

TL;DR
This paper presents an ensemble-based method that combines BERT, NER, distance metrics, and knowledge graphs to classify product descriptions into Harmonized System (HS) codes, with the aim of improving scalability and accuracy.
Contribution
The paper introduces a novel hierarchical ensemble model that enhances HS code classification by integrating multiple AI techniques for better coverage and nuance detection.
Findings
Improved classification accuracy over existing methods
Enhanced scalability and automation capabilities
Effective handling of unknown product descriptions
Abstract
Industries must follow government rules and regulations around the world to classify products when assessing duties and taxes for international shipments. The Harmonized System (HS) is the most standardized numerical method of classifying traded products among industry classification systems. A hierarchical ensemble model comprising a BERT transformer, NER, distance-based approaches, and knowledge graphs has been developed to address scalability, coverage, the ability to capture nuances, automation, and auditing requirements when classifying unknown text descriptions under the HS method.
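The abstract does not say how the ensemble's components are combined, so the following is only a rough sketch of one possible arrangement, not the authors' method. It assumes a six-digit HS code reads hierarchically (two-digit chapter, four-digit heading, six-digit subheading). It also assumes a simple cascade in which a primary classifier (a stand-in for the BERT model) is trusted above a confidence threshold and a distance-based lookup is used otherwise. The names `split_hs_code`, `nearest_code`, `classify`, the `REFERENCE` table, the Jaccard token distance, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
from typing import Callable

def split_hs_code(code: str) -> dict[str, str]:
    """Split a six-digit HS code into its hierarchical levels."""
    digits = code.replace(".", "")
    if len(digits) != 6 or not digits.isdigit():
        raise ValueError(f"expected a 6-digit HS code, got {code!r}")
    return {"chapter": digits[:2], "heading": digits[:4], "subheading": digits}

# Toy reference table (assumption): a real system would load the official
# nomenclature; these entries should be checked against it before use.
REFERENCE = {
    "090121": "coffee roasted not decaffeinated",
    "610910": "t-shirts knitted cotton",
    "847130": "portable data processing machines laptops",
}

def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two descriptions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def nearest_code(description: str, reference: dict[str, str] = REFERENCE) -> tuple[str, float]:
    """Distance-based component: return the most similar reference code and its score."""
    best = max(reference, key=lambda c: jaccard(description, reference[c]))
    return best, jaccard(description, reference[best])

def classify(
    description: str,
    primary: Callable[[str], tuple[str, float]],
    threshold: float = 0.8,  # hypothetical cut-off
) -> tuple[str, str]:
    """Cascade: use the primary model if confident, else fall back to distance."""
    code, confidence = primary(description)
    if confidence >= threshold:
        return code, "primary"
    code, _ = nearest_code(description)
    return code, "distance"
```

Returning the name of the component that produced the code alongside the code itself is one simple way to keep a record for auditing, which the abstract lists among the requirements.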
Taxonomy
Topics: Software Engineering Research
