An Efficient Multimodal Learning Framework to Comprehend Consumer Preferences Using BERT and Cross-Attention
Junichiro Niimi

TL;DR
This paper introduces a context-aware multimodal deep learning framework combining BERT and cross-attention mechanisms to better understand consumer preferences by dynamically adjusting attention based on background information, outperforming traditional fusion methods.
Contribution
The study presents a novel multimodal model that integrates BERT with cross-attention to enhance analysis of consumer data, offering improved flexibility and accuracy over existing feature fusion approaches.
Findings
The proposed model outperforms six reference models across three evaluation categories.
Dynamically adjusting attention based on consumer background data improves prediction accuracy.
Training efficiency depends on the choice of optimizer and on the text token length.
Abstract
Today, the acquisition of various behavioral log data has enabled a deeper understanding of customer preferences and future behaviors in the marketing field. In particular, multimodal deep learning has achieved highly accurate predictions by combining multiple types of data. Many of these studies construct multimodal models with feature fusion, which combines the representations extracted from each modality. However, because feature fusion treats information from each modality equally, it is difficult to perform flexible analysis of the kind offered by the attention mechanism, which has been used extensively in recent years. Therefore, this study proposes a context-aware multimodal deep learning model that combines Bidirectional Encoder Representations from Transformers (BERT) and a cross-attention Transformer, which dynamically changes the attention of deep-contextualized word representations based on…
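The contrast between feature fusion and cross-attention can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes one plausible arrangement in which a consumer-background vector acts as the query and the per-token word representations (standing in for BERT outputs) act as keys and values, and it omits the learned projection matrices, multiple heads, and training. The names `cross_attention`, `concat_fusion`, `tokens`, `ctx_a`, and `ctx_b` are hypothetical, and the vectors are toy values rather than real embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, tokens):
    """Single-head scaled dot-product cross-attention (no learned projections).

    query:  context vector, assumed here to encode consumer background
    tokens: list of token vectors, standing in for BERT word representations
    Returns the attention-weighted token summary and the attention weights.
    """
    d = len(query)
    scores = [
        sum(q * t for q, t in zip(query, tok)) / math.sqrt(d) for tok in tokens
    ]
    weights = softmax(scores)
    pooled = [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(d)]
    return pooled, weights


def concat_fusion(context, tokens):
    """Feature-fusion baseline: mean-pool the tokens, then concatenate.

    Every token contributes equally, regardless of the context vector.
    """
    d = len(tokens[0])
    mean_tok = [sum(tok[i] for tok in tokens) / len(tokens) for i in range(d)]
    return context + mean_tok


# Toy data: three token vectors and two different consumer contexts.
tokens = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ctx_a = [4.0, 0.0, 0.0]  # hypothetical consumer A
ctx_b = [0.0, 0.0, 4.0]  # hypothetical consumer B

pooled_a, w_a = cross_attention(ctx_a, tokens)
pooled_b, w_b = cross_attention(ctx_b, tokens)
fused_a = concat_fusion(ctx_a, tokens)
```

With the same text, the two contexts place their largest attention weight on different tokens, whereas the concatenation baseline pools the tokens identically for every consumer. A real model would add trainable query, key, and value projections and feed the pooled vector to a prediction head.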
Taxonomy
Topics: Advanced Text Analysis Techniques · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
