3MASSIV: Multilingual, Multimodal and Multi-Aspect dataset of Social Media Short Videos
Vikram Gupta, Trisha Mittal, Puneet Mathur, Vaibhav Mishra, Mayank Maheshwari, Aniket Bera, Debdoot Mukherjee, Dinesh Manocha

TL;DR
3MASSIV is a comprehensive multilingual, multimodal dataset of 50,000 short social media videos from Moj, annotated for concepts, affect, media types, and audio language, enabling advanced semantic and cross-lingual analysis.
Contribution
This paper introduces 3MASSIV, a novel large-scale dataset capturing diverse social media short videos with rich annotations for multimodal and multilingual research.
Findings
Dataset covers 11 languages and various video trends.
Strong baselines demonstrate potential for semantic understanding.
Highlights the dynamic and temporal nature of social media videos.
Abstract
We present 3MASSIV, a multilingual, multimodal and multi-aspect, expertly-annotated dataset of diverse short videos extracted from the short-video social media platform Moj. 3MASSIV comprises 50K short videos (20 seconds average duration) and 100K unlabeled videos in 11 different languages and captures popular short video trends like pranks, fails, romance, and comedy, expressed via unique audio-visual formats like self-shot videos, reaction videos, lip-synching, self-sung songs, etc. 3MASSIV presents an opportunity for multimodal and multilingual semantic understanding on these unique videos by annotating them for concepts, affective states, media types, and audio language. We present a thorough analysis of 3MASSIV and highlight the variety and unique aspects of our dataset compared to other contemporary popular datasets with strong baselines. We also show how the social media content in…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Sentiment Analysis and Opinion Mining
