Knowledge Perceived Multi-modal Pretraining in E-commerce

Yushan Zhu; Huaixiao Tou; Wen Zhang; Ganqiang Ye; Hui Chen; Ningyu Zhang; Huajun Chen

arXiv:2109.00895·cs.CV·September 3, 2021

TL;DR

This paper introduces K3M, a multi-modal pretraining method for E-commerce product data that incorporates knowledge modality to improve robustness against missing or noisy modalities, leading to better performance.

Contribution

K3M is a novel multi-modal pretraining approach that integrates knowledge modality and models modality interactions to enhance robustness in E-commerce scenarios.

Findings

01. K3M outperforms baseline methods under modality-noise conditions.
02. K3M achieves significant improvements on real-world E-commerce datasets.
03. The method effectively handles missing and noisy modalities in product data.

Abstract

In this paper, we address multi-modal pretraining of product data in the field of E-commerce. Current multi-modal pretraining methods proposed for image and text modalities lack robustness in the face of modality-missing and modality-noise, two pervasive problems of multi-modal product data in real E-commerce scenarios. To this end, we propose a novel method, K3M, which introduces the knowledge modality into multi-modal pretraining to correct noise in the image and text modalities and to supplement them when they are missing. The modal-encoding layer extracts the features of each modality. The modal-interaction layer is capable of effectively modeling the interaction of multiple modalities, where an initial-interactive feature fusion model is designed to maintain the independence of image modality and text modality, and a structure aggregation module is designed to fuse the information of image,…
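To make the idea of keeping each modality's own features alongside its cross-modal features more concrete, here is a toy sketch in plain Python. It is not the K3M implementation: the helper names (`cross_attend`, `fuse_initial_interactive`, `mean_pool`), the mixing weight `alpha`, and the use of unparameterized dot-product attention are all assumptions made for illustration. It covers only the initial/interactive fusion of image and text features and leaves the structure aggregation module out.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries: List[Vector], others: List[Vector]) -> List[Vector]:
    """Hypothetical interaction step: each query vector attends over the
    other modality's vectors (scaled dot-product, no learned weights)."""
    dim = len(queries[0])
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(dim) for k in others])
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, others)) for i in range(dim)]
        )
    return attended


def fuse_initial_interactive(
    initial: List[Vector], interactive: List[Vector], alpha: float = 0.5
) -> List[Vector]:
    """Blend a modality's own (initial) features with its cross-modal
    (interactive) features. `alpha` is an assumed mixing weight:
    alpha=1.0 keeps only the initial features."""
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(u, v)]
        for u, v in zip(initial, interactive)
    ]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a list of equally sized vectors into one vector."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


# Toy 2-dimensional features standing in for encoder outputs.
image_feats = [[1.0, 0.0], [0.0, 1.0]]
text_feats = [[1.0, 1.0], [0.5, -0.5], [0.0, 2.0]]

image_interactive = cross_attend(image_feats, text_feats)
text_interactive = cross_attend(text_feats, image_feats)

image_fused = fuse_initial_interactive(image_feats, image_interactive)
text_fused = fuse_initial_interactive(text_feats, text_interactive)

# One pooled vector per product, built from both fused modalities.
product_vec = mean_pool(image_fused + text_fused)
```

In this sketch, if the text of a product is noisy, the image's initial features still pass through unchanged in the `alpha` share of `image_fused`, which is one simple way to read "maintaining the independence" of a modality.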

Code & Models

Repositories

yushanzhu/k3m (PyTorch, official)

Taxonomy

Methods: K3M