Mental-Perceiver: Audio-Textual Multi-Modal Learning for Estimating Mental Disorders
Jinghui Qin, Changsong Liu, Tianchi Tang, Dahuang Liu, Minghao Wang, Qianying Huang, Rumin Zhang

TL;DR
This paper introduces Mental-Perceiver, a deep learning model that utilizes a new large-scale multi-modal dataset to improve early detection of mental disorders like anxiety and depression from audio and text data.
Contribution
The paper presents a novel multi-modal dataset (MMPsy) and a deep learning model (Mental-Perceiver) for more accurate mental disorder estimation from audio and textual inputs.
Findings
Mental-Perceiver outperforms existing models on the MMPsy and DAIC-WOZ datasets.
Multi-modal data improves detection accuracy for anxiety and depression.
The dataset enables large-scale training for mental health assessment models.
Abstract
Mental disorders, such as anxiety and depression, have become a global concern that affects people of all ages. Early detection and treatment are crucial to mitigate the negative effects these disorders can have on daily life. Although AI-based detection methods show promise, progress is hindered by the lack of publicly available large-scale datasets. To address this, we introduce the Multi-Modal Psychological assessment corpus (MMPsy), a large-scale dataset containing audio recordings and transcripts from Mandarin-speaking adolescents undergoing automated anxiety/depression assessment interviews. MMPsy also includes self-reported anxiety/depression evaluations using standardized psychological questionnaires. Leveraging this dataset, we propose Mental-Perceiver, a deep learning model for estimating mental disorders from audio and textual data. Extensive experiments on MMPsy and the…
Taxonomy
Topics: Communication in Education and Healthcare
