Adaptive Contrastive Learning on Multimodal Transformer for Review Helpfulness Predictions

Thong Nguyen; Xiaobao Wu; Anh-Tuan Luu; Cong-Duy Nguyen; Zhen Hai; Lidong Bing

arXiv:2211.03524·cs.CL·May 13, 2026·6 cites

TL;DR

This paper introduces a novel multimodal contrastive learning framework with adaptive weighting and interaction modules to improve review helpfulness prediction accuracy by better modeling cross-modal relations.

Contribution

It proposes a new contrastive learning approach with adaptive weighting and interaction modules specifically designed for multimodal review helpfulness prediction.

Findings

01 Outperforms prior baselines on benchmark datasets.

02 Achieves state-of-the-art results in Multimodal Review Helpfulness Prediction (MRHP).

03 Enhances cross-modal relation modeling.

Abstract

Modern Review Helpfulness Prediction systems depend on multiple modalities, typically texts and images. Unfortunately, contemporary approaches pay scarce attention to polishing representations of cross-modal relations and tend to suffer from inferior optimization. This might harm the model's predictions in numerous cases. To overcome the aforementioned issues, we propose Multimodal Contrastive Learning for the Multimodal Review Helpfulness Prediction (MRHP) problem, concentrating on mutual information between input modalities to explicitly elaborate cross-modal relations. In addition, we introduce an Adaptive Weighting scheme for our contrastive learning approach in order to increase flexibility in optimization. Lastly, we propose a Multimodal Interaction module to address the unaligned nature of multimodal data, thereby assisting the model in producing more reasonable…
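To make the idea of a weighted cross-modal contrastive objective concrete, here is a minimal NumPy sketch. It is not the paper's formulation, which the abstract does not spell out. It assumes a standard InfoNCE loss between paired text and image embeddings (a common lower-bound proxy for mutual information) and a hypothetical focal-style weight `(1 - p) ** gamma` standing in for the Adaptive Weighting scheme. The function name, `temperature`, and `gamma` are all illustrative choices.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def weighted_cross_modal_infonce(text_emb, image_emb, temperature=0.1, gamma=2.0):
    """Text-to-image InfoNCE with an illustrative per-pair weight.

    Row i of text_emb and row i of image_emb are treated as a positive pair;
    every other image in the batch acts as a negative for text i.
    ASSUMPTION: the weight (1 - p) ** gamma is a stand-in, not the paper's scheme.
    """
    t = l2_normalize(text_emb)
    v = l2_normalize(image_emb)
    logits = t @ v.T / temperature            # (N, N) scaled cosine similarities
    log_p = np.diag(log_softmax(logits))      # log-probability of each matched pair
    p = np.exp(log_p)
    weights = (1.0 - p) ** gamma              # harder pairs (low p) get larger weight
    per_pair_loss = -log_p
    loss = (weights * per_pair_loss).sum() / (weights.sum() + 1e-12)
    return float(loss), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    text = rng.normal(size=(8, 16))
    aligned_loss, _ = weighted_cross_modal_infonce(text, text)
    shuffled_loss, _ = weighted_cross_modal_infonce(text, np.roll(text, 1, axis=0))
    print("aligned:", aligned_loss, "shuffled:", shuffled_loss)
```

In this sketch, well-aligned pairs receive weights near zero, so the loss is dominated by pairs the model still matches poorly. A learned or schedule-based weight could be substituted without changing the rest of the function.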
