SentiFormer: Metadata Enhanced Transformer for Image Sentiment Analysis

Bin Feng; Shulan Ruan; Mingzheng Yang; Dongxuan Han; Huijie Liu; Kai Zhang; Qi Liu

arXiv:2502.15322 · cs.CV · February 24, 2025

TL;DR

SentiFormer introduces a novel transformer-based framework that effectively integrates multiple metadata types with images for improved sentiment analysis, outperforming existing methods on public datasets.

Contribution

The paper proposes a new metadata-enhanced transformer model that adaptively learns and fuses diverse metadata with images for sentiment analysis.

Findings

01 Outperforms existing methods on three datasets

02 Effectively integrates multiple metadata types

03 Demonstrates superior accuracy in sentiment prediction

Abstract

As more and more internet users post images online to express their daily emotions, image sentiment analysis has attracted increasing attention. Recent work generally designs different neural networks to extract visual features from images for sentiment analysis. Despite significant progress, metadata, i.e., the data that describes an image (e.g., text descriptions and keyword tags), has not been sufficiently explored in this task. In this paper, we propose a novel Metadata Enhanced Transformer for sentiment analysis (SentiFormer) that fuses multiple types of metadata and the corresponding image into a unified framework. Specifically, we first obtain multiple types of metadata for the image and unify the representations of the diverse data. To adaptively learn appropriate weights for each type of metadata, we then design an adaptive relevance learning module to highlight more effective information…
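
The abstract is truncated before it describes how the adaptive relevance learning module works, so the following is only a minimal sketch of one plausible reading: each metadata embedding is scored against the image embedding and the scores are normalized into weights. The cosine-similarity scoring, the softmax normalization, the function name `relevance_weighted_metadata`, and the toy vectors are all assumptions made for illustration; none of them is taken from the paper or from the MET4ISA/SentiFormer repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relevance_weighted_metadata(image_vec, metadata_vecs):
    """Weight each metadata embedding by its similarity to the image embedding.

    Assumption: relevance is cosine similarity followed by softmax. The paper's
    module may score and normalize relevance differently.
    """
    names = list(metadata_vecs)
    weights = softmax([cosine(image_vec, metadata_vecs[n]) for n in names])
    # Fuse the metadata embeddings into one vector using the relevance weights.
    fused = [0.0] * len(image_vec)
    for name, weight in zip(names, weights):
        for i, value in enumerate(metadata_vecs[name]):
            fused[i] += weight * value
    return dict(zip(names, weights)), fused


# Hypothetical 3-dimensional embeddings, already projected into a shared space.
image = [1.0, 0.0, 0.0]
metadata = {
    "description": [0.9, 0.1, 0.0],  # closely aligned with the image
    "tags": [0.0, 1.0, 0.0],         # orthogonal to the image
}
weights, fused = relevance_weighted_metadata(image, metadata)
```

In this toy setting the description embedding receives the larger weight because it points in nearly the same direction as the image embedding, which is the kind of "highlighting" the abstract alludes to.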

Code & Models

Repositories

MET4ISA/SentiFormer (PyTorch, official)

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Emotion and Mood Recognition · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Absolute Position Encodings · Linear Layer · Layer Normalization · Byte Pair Encoding · Dense Connections · Residual Connection · Label Smoothing · Multi-Head Attention · Position-Wise Feed-Forward Layer