Contrastive Learning-based Multi Modal Architecture for Emoticon Prediction by Employing Image-Text Pairs

Ananya Pandey; Dinesh Kumar Vishwakarma

arXiv:2408.02571 · cs.CV · August 6, 2024

TL;DR

This paper introduces a contrastive learning-based multimodal architecture that combines image and text data to improve emoticon prediction accuracy on social media content.

Contribution

It proposes a novel dual-branch encoder with contrastive learning for joint text-image feature mapping, outperforming existing methods in emoticon prediction.
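
As a rough illustration of how a dual-branch encoder can be trained to map text and images into a shared space, the sketch below computes a symmetric InfoNCE-style contrastive loss over a batch of paired embeddings. The summary above does not state which loss the paper uses, so the loss form, the `temperature` default of 0.07, and all function names here are assumptions made for illustration, not the authors' method. The embeddings are plain lists of floats standing in for the outputs of the two encoder branches.

```python
import math


def l2_normalize(vec):
    # Scale a vector to unit length (assumes the vector is not all zeros).
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_row(logits, target):
    # Numerically stable -log softmax(logits)[target].
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Hypothetical helper: image_embs[k] and text_embs[k] form a matched pair;
    # every other combination in the batch is treated as a negative.
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)

    # Cosine similarity matrix, scaled by the temperature.
    sim = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]

    # Image-to-text direction: each row should peak on its diagonal entry.
    i2t = sum(cross_entropy_row(sim[k], k) for k in range(n)) / n
    # Text-to-image direction: each column should peak on its diagonal entry.
    t2i = sum(
        cross_entropy_row([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (i2t + t2i)


# Toy batch: matched pairs point the same way, mismatched pairs do not.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
shuffled = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                      [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each image embedding lines up with its own text embedding and much larger when the pairs are swapped, which is the signal that pulls matched image-text pairs together during training.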

Findings

01 Achieved 91% accuracy and a 90% MCC score on a Twitter dataset.

02 Demonstrated superior robustness and generalization over previous multimodal approaches.

03 Deep features from contrastive learning enhance emoticon recognition across modalities.
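
The first finding reports both accuracy and the Matthews correlation coefficient (MCC). The sketch below shows how the two metrics can be computed for a multi-class task using the standard multi-class generalization of MCC. The label values are made-up emoticon class IDs, not data from the paper, and the function names are hypothetical. Reading the reported "90%" as 0.90 on MCC's usual scale of -1 to 1 is an assumption.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    # Fraction of samples whose predicted class matches the true class.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def multiclass_mcc(y_true, y_pred):
    # Multi-class MCC computed from per-class true and predicted counts.
    s = len(y_true)                                   # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correct predictions
    t_k = Counter(y_true)                             # times each class occurs
    p_k = Counter(y_pred)                             # times each class is predicted
    labels = set(t_k) | set(p_k)

    cov_tp = c * s - sum(t_k[k] * p_k[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_k.values())
    cov_pp = s * s - sum(v * v for v in p_k.values())
    denom = math.sqrt(cov_tt * cov_pp)
    # By convention, return 0 when the score is undefined
    # (for example, when every prediction is the same class).
    return cov_tp / denom if denom else 0.0


# Made-up labels: four samples, two hypothetical emoticon classes.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
acc = accuracy(y_true, y_pred)
mcc = multiclass_mcc(y_true, y_pred)
```

Unlike accuracy, MCC accounts for how predictions are distributed across classes, so it is less flattering to a model that favors frequent emoticons. That is a common reason to report it alongside accuracy on imbalanced label sets.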

Abstract

The emoticons are symbolic representations that generally accompany the textual content to visually enhance or summarize the true intention of a written message. Although widely utilized in the realm of social media, the core semantics of these emoticons have not been extensively explored based on multiple modalities. Incorporating textual and visual information within a single message develops an advanced way of conveying information. Hence, this research aims to analyze the relationship among sentences, visuals, and emoticons. For an orderly exposition, this paper initially provides a detailed examination of the various techniques for extracting multimodal features, emphasizing the pros and cons of each method. Through conducting a comprehensive examination of several multimodal algorithms, with specific emphasis on the fusion approaches, we have proposed a novel contrastive learning…

Taxonomy

Topics: Digital Communication and Language · Sentiment Analysis and Opinion Mining · Hate Speech and Cyberbullying Detection

Methods: Contrastive Learning