Multimodal E-Commerce Product Classification Using Hierarchical Fusion

Tsegaye Misikir Tashu; Sara Fattouh; Peter Kiss; Tomas Horvath

arXiv:2207.03305 · cs.AI · July 12, 2022

TL;DR

This paper introduces a multi-modal neural network model for e-commerce product classification that combines textual and visual features through hierarchical fusion, significantly improving performance over unimodal approaches.

Contribution

The work presents a novel hierarchical fusion technique that effectively combines textual and visual features for improved product classification accuracy.

Findings

01. Multi-modal fusion outperforms unimodal models.

02. Combining concatenation and averaging of the feature vectors yields the best results among the fusion techniques tested.

03. Adding modalities enhances classification performance.

Abstract

In this work, we present a multi-modal model for commercial product classification that combines features extracted by multiple neural network models from textual data (CamemBERT and FlauBERT) and visual data (SE-ResNeXt-50) using simple fusion techniques. The proposed method significantly outperformed the unimodal models and the reported performance of similar models on our specific task. We experimented with multiple fusion techniques and found that the best-performing technique for combining the individual embeddings of the unimodal networks is based on combining concatenation and averaging of the feature vectors. Each modality compensated for the shortcomings of the others, demonstrating that increasing the number of modalities can be an effective way to improve performance on multi-label and multimodal classification problems.
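The abstract names concatenation and averaging as the fusion building blocks but does not spell out how they are arranged. The sketch below shows one plausible arrangement, assumed here purely for illustration: the two text embeddings are averaged first, and the result is then concatenated with the image embedding. The function names (`average_fusion`, `concat_fusion`, `hierarchical_fusion`) and the toy vectors are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def average_fusion(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of feature vectors that share one length."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("averaging requires vectors of equal length")
    return [sum(values) / len(vectors) for values in zip(*vectors)]


def concat_fusion(vectors: Sequence[Sequence[float]]) -> Vector:
    """Join feature vectors end to end; their lengths may differ."""
    return [x for v in vectors for x in v]


def hierarchical_fusion(
    text_a: Sequence[float],
    text_b: Sequence[float],
    image: Sequence[float],
) -> Vector:
    # Stage 1 (assumed ordering): merge the two text embeddings by averaging.
    text = average_fusion([text_a, text_b])
    # Stage 2 (assumed ordering): concatenate the text and image features.
    return concat_fusion([text, image])


# Toy stand-ins for the CamemBERT, FlauBERT, and SE-ResNeXt-50 embeddings.
camembert_vec = [1.0, 3.0]
flaubert_vec = [3.0, 5.0]
image_vec = [0.5, 0.25, 0.125]

fused = hierarchical_fusion(camembert_vec, flaubert_vec, image_vec)
print(fused)  # [2.0, 4.0, 0.5, 0.25, 0.125]
```

In a real pipeline the fused vector would feed a classification head. Averaging requires the two text embeddings to share a dimension, whereas concatenation places no such constraint on the image features.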

Taxonomy

Topics: Text and Document Classification Technologies