Multimodal Approach for Harmonized System Code Prediction
Otmane Amel, Sedrick Stassin, Sidi Ahmed Mahmoudi, Xavier Siebert

TL;DR
This paper presents a novel multimodal deep learning approach combining image and text features to improve the accuracy of Harmonized System code prediction for customs declarations, addressing challenges in e-commerce logistics.
Contribution
The study introduces a new multimodal fusion method, MultConcat, for HS code prediction, and demonstrates its superior performance over existing methods.
Findings
Achieved top-3 accuracy of 93.5%
Achieved top-5 accuracy of 98.2%
Among the few studies to analyze feature-level combination of text and image for HS code prediction
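For readers unfamiliar with the metric, top-k accuracy counts a prediction as correct when the true HS code is among the k highest-scoring classes. The sketch below is a minimal illustration, not the paper's evaluation code; the function name `top_k_accuracy` and the toy scores and labels are assumptions made up for the example.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy data (hypothetical): 3 samples scored over 4 candidate HS code classes.
scores = [
    [0.10, 0.60, 0.20, 0.10],  # true class 1 is ranked first
    [0.40, 0.30, 0.20, 0.10],  # true class 2 is ranked third
    [0.05, 0.15, 0.30, 0.50],  # true class 0 is ranked last
]
labels = [1, 2, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top3 = top_k_accuracy(scores, labels, k=3)
```

In this toy case only the first sample counts at k=1, while the first two count at k=3, which is why top-3 and top-5 figures are always at least as high as top-1.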
Abstract
The rapid growth of e-commerce has placed considerable pressure on customs representatives, prompting the need for advanced methods. To address this, artificial intelligence (AI) systems have emerged as a promising approach to minimize the risks faced. Given that the Harmonized System (HS) code is a crucial element of an accurate customs declaration, we propose a novel multimodal HS code prediction approach using deep learning models that exploit both image and text features obtained from the customs declaration combined with e-commerce platform information. We evaluated two early fusion methods and introduced our MultConcat fusion method. To the best of our knowledge, few studies in the state of the art analyze the feature-level combination of text and image for HS code prediction, which heightens interest in our paper and its findings. The experimental results prove the effectiveness of our…
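As a rough illustration of what feature-level (early) fusion means, the sketch below concatenates a text feature vector and an image feature vector and passes the result to a single linear scoring layer. This is a generic concatenation baseline under assumed toy values, not the MultConcat method, whose details are not given in this summary; the names `concat_fusion` and `linear_scores`, the feature sizes, and the weights are all hypothetical.

```python
from typing import List


def concat_fusion(text_feat: List[float], image_feat: List[float]) -> List[float]:
    """Early fusion by concatenation: join both modality vectors into one."""
    return list(text_feat) + list(image_feat)


def linear_scores(
    fused: List[float], weights: List[List[float]], bias: List[float]
) -> List[float]:
    """One score per class: dot product of the fused vector with each weight row, plus bias."""
    if any(len(row) != len(fused) for row in weights) or len(weights) != len(bias):
        raise ValueError("weight and bias shapes must match the fused vector and class count")
    return [sum(w * x for w, x in zip(row, fused)) + b for row, b in zip(weights, bias)]


# Hypothetical toy features: 2-dim text embedding, 3-dim image embedding.
text_feat = [1.0, 2.0]
image_feat = [0.5, -1.0, 3.0]
fused = concat_fusion(text_feat, image_feat)

# Hypothetical weights for 2 candidate HS code classes over the 5-dim fused vector.
weights = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
bias = [0.0, 0.5]
class_scores = linear_scores(fused, weights, bias)
```

The point of fusing at the feature level is that the classifier sees both modalities at once, so it can weigh image evidence against text evidence per class rather than merging two separate predictions afterwards.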
