VICSOM: VIsual Clues from SOcial Media for psychological assessment
Mohammad Mahdi Dehshibi, Gerard Pons, Bita Baiani, David Masip

TL;DR
This paper introduces VICSOM, a multimodal dataset from Instagram and a baseline method to predict human psychological needs based on social media content, leveraging visual and textual features.
Contribution
It provides a large annotated multimodal Instagram dataset and proposes a multimodal fusion approach for automatic psychological need recognition.
Findings
Promising results in multi-label classification of psychological needs.
Effective multimodal fusion of visual and textual features.
Baseline performance established for future research.
Abstract
Sharing multimodal information (typically images, videos or text) on Social Network Sites (SNS) takes up a significant part of our time. The particular way users present themselves on SNS can provide useful information for inferring human behaviors. This paper proposes using multimodal data gathered from Instagram accounts to predict the perceived prototypical needs described in Glasser's choice theory. The contribution is two-fold: (i) we provide a large multimodal database from public Instagram profiles (more than 30,000 images and text captions), annotated by expert psychologists for each perceived behavior according to Glasser's theory, and (ii) we propose to automate the recognition of the needs (unconsciously) perceived by the users. In particular, we propose a baseline using three different feature sets: visual descriptors based on pixel images (SURF and Visual Bag of Words), a…
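To make the feature sets named in the abstract more concrete, here is a minimal sketch of how a Visual Bag of Words histogram and a caption term histogram could be built and concatenated into one feature vector. It is an illustration under stated assumptions, not the authors' pipeline: the 2-D descriptors stand in for real SURF descriptors, the codebook and vocabulary are toy values, and simple concatenation (early fusion) is assumed because the abstract is truncated before it describes the actual fusion step. All function names are hypothetical.

```python
from math import dist

def visual_bow(descriptors, codebook):
    """L1-normalised histogram of nearest-codeword assignments."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        # Assign each local descriptor to its closest codeword (Euclidean distance).
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def text_bow(caption, vocabulary):
    """L1-normalised term counts of the caption over a fixed vocabulary."""
    tokens = caption.lower().split()
    counts = [float(tokens.count(word)) for word in vocabulary]
    total = sum(counts)
    return [c / total for c in counts] if total else counts

def fuse(visual, textual):
    """Early fusion (an assumption here): concatenate the two vectors."""
    return visual + textual

# Toy values; real SURF descriptors are 64- or 128-dimensional.
codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.2, 0.1)]
vocabulary = ["friends", "travel", "gym"]

features = fuse(visual_bow(descriptors, codebook),
                text_bow("Travel with friends", vocabulary))
print(features)
```

In a multi-label setting, a vector like `features` would be passed to one binary classifier per psychological need, so that a single post can be tagged with several needs at once.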
Taxonomy
Topics: Mental Health via Writing · Complex Network Analysis Techniques · Digital Mental Health Interventions
