Multimodal Sentiment Analysis Based on BERT and ResNet
JiaLe Ren

TL;DR
This paper presents a multimodal sentiment analysis framework combining BERT for text and ResNet for images, using attention-based feature fusion to improve accuracy over single-modal models.
Contribution
It explores several feature fusion strategies and selects an attention-based fusion model to combine text and image features for sentiment analysis.
Findings
Achieved 74.5% accuracy on the MAVA-single dataset.
Outperformed single-modal models using BERT or ResNet.
Demonstrated the effectiveness of attention-based feature fusion.
Abstract
With the rapid development of the Internet and social media, multimodal data (text and images) is increasingly important in sentiment analysis tasks. However, existing methods struggle to fuse text and image features effectively, which limits the accuracy of the analysis. To solve this problem, a multimodal sentiment analysis framework combining BERT and ResNet is proposed. BERT has shown strong text representation ability in natural language processing, and ResNet has excellent image feature extraction performance in computer vision. First, BERT is used to extract the text feature vector, and ResNet is used to extract the image feature representation. Then, a variety of feature fusion strategies are explored, and the fusion model based on an attention mechanism is finally selected to make full use of the complementary information between text and images. Experimental…
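The abstract describes the pipeline only at a high level, so the following is a minimal sketch of what attention-based fusion of two modality vectors can look like, not the paper's actual model. It assumes the text and image features have already been projected to the same dimension, and it uses a hypothetical `query` vector (normally a learned parameter) with scaled dot-product scoring. The toy 4-dimensional inputs stand in for real BERT and ResNet outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(text_vec, image_vec, query):
    """Fuse two same-sized modality vectors with attention weights.

    Each modality is scored against `query` (an assumed, normally
    learned vector) using a scaled dot product. A softmax turns the
    two scores into weights that sum to 1, and the fused vector is
    the weighted sum of the two inputs.
    """
    if not (len(text_vec) == len(image_vec) == len(query)):
        raise ValueError("all vectors must share one dimension")
    scale = math.sqrt(len(query))
    scores = [dot(text_vec, query) / scale, dot(image_vec, query) / scale]
    weights = softmax(scores)
    fused = [weights[0] * t + weights[1] * i
             for t, i in zip(text_vec, image_vec)]
    return fused, weights


# Toy stand-ins for projected BERT (text) and ResNet (image) features.
text_features = [0.9, 0.1, 0.4, 0.3]
image_features = [0.2, 0.8, 0.1, 0.5]
query = [1.0, 0.0, 1.0, 0.0]  # hypothetical; learned in a real model

fused, weights = attention_fuse(text_features, image_features, query)
print("attention weights (text, image):", weights)
print("fused vector:", fused)
```

In a full model the fused vector would feed a linear layer with softmax over the sentiment classes. The weights show how much each modality contributes to a given sample, which is one way to read the "complementary information" the abstract refers to.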
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Technology and Security Systems · Computational and Text Analysis Methods
Methods: Convolution · Max Pooling · Average Pooling · Linear Layer · Softmax · Linear Warmup With Linear Decay · Multi-Head Attention · WordPiece · Dropout
