TL;DR
The paper introduces GAViD, a large-scale multimodal dataset of 5091 video clips annotated for group affect, to support context-aware emotion recognition in social videos. It also presents CAGNet, a neural network model that reaches 63.20% accuracy on the dataset.
Contribution
The paper contributes the GAViD dataset of annotated, multimodal social videos and proposes CAGNet, a model for context-aware group affect recognition, advancing research in social affect analysis.
Findings
CAGNet achieves 63.20% accuracy on GAViD.
GAViD dataset contains 5091 annotated video clips.
Code and dataset are publicly available at github.com/deepakkumar-iitr/GAViD.
Abstract
Understanding affective dynamics in real-world social systems is fundamental to modeling and analyzing human-human interactions in complex environments. Group affect emerges from intertwined human-human interactions, contextual influences, and behavioral cues, making its quantitative modeling a challenging computational social systems problem. However, computational modeling of group affect in in-the-wild scenarios remains challenging due to limited large-scale annotated datasets and the inherent complexity of multimodal social interactions shaped by contextual and behavioral variability. The lack of comprehensive datasets annotated with multimodal and contextual information further limits advances in the field. To address this, we introduce the Group Affect from ViDeos (GAViD) dataset, comprising 5091 video clips with multimodal data (video, audio and context), annotated with ternary…
