Multimodal Classification for Analysing Social Media
Chi Thang Duong, Remi Lebret, Karl Aberer

TL;DR
This paper introduces simple multimodal models for social media classification that effectively combine different data types, handle missing modalities, and outperform traditional fusion methods in accuracy.
Contribution
The paper proposes straightforward multimodal fusion models with an auxiliary task, improving robustness and accuracy in social media content classification.
Findings
Models outperform traditional fusion approaches in accuracy.
Robustness to missing modalities demonstrated in emotion classification.
Achieve comparable results when only a single modality is available.
Abstract
Classification of social media data is an important approach in understanding user behavior on the Web. Although information on social media can be of different modalities such as texts, images, audio or videos, traditional approaches in classification usually leverage only one prominent modality. Techniques that are able to leverage multiple modalities are often complex and susceptible to the absence of some modalities. In this paper, we present simple models that combine information from different modalities to classify social media content and are able to handle the above problems with existing techniques. Our models combine information from different modalities using a pooling layer and an auxiliary learning task is used to learn a common feature space. We demonstrate the performance of our models and their robustness to the missing of some modalities in the emotion classification…
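As a rough illustration of the pooling idea mentioned in the abstract, the sketch below fuses per-modality feature vectors by element-wise pooling and skips any modality that is absent. It is not the authors' implementation. The function name `pool_modalities`, the choice of max or mean pooling, and the toy vectors are all assumptions made for illustration, since the abstract does not state which pooling operation is used. The sketch also assumes the vectors already live in a common feature space of equal dimension, which the abstract says is learned through the auxiliary task.

```python
from typing import Dict, List, Optional, Sequence


def pool_modalities(
    features: Dict[str, Optional[Sequence[float]]],
    mode: str = "max",
) -> List[float]:
    """Fuse per-modality feature vectors by element-wise pooling.

    Hypothetical helper: a modality whose value is None is treated as
    missing and skipped, so the fused vector keeps the same size no
    matter how many modalities are present.
    """
    present = [list(v) for v in features.values() if v is not None]
    if not present:
        raise ValueError("at least one modality is required")

    dim = len(present[0])
    if any(len(v) != dim for v in present):
        raise ValueError("all modality vectors must share one dimension")

    # zip(*present) walks the vectors column by column (one feature at a time).
    if mode == "max":
        return [max(column) for column in zip(*present)]
    if mode == "mean":
        return [sum(column) / len(present) for column in zip(*present)]
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy vectors, assumed to be already projected into a shared space.
text_vec = [0.2, 0.9, -0.1]
image_vec = [0.5, 0.1, 0.3]

fused_both = pool_modalities({"text": text_vec, "image": image_vec})
fused_text_only = pool_modalities({"text": text_vec, "image": None})
```

Because pooling produces a fixed-size output for any number of inputs, a downstream classifier could in principle consume `fused_both` and `fused_text_only` through the same input layer, which is one plausible reason a pooling-based design tolerates missing modalities.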
Taxonomy
Topics: Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining · Topic Modeling
