A Multimodal Framework for the Assessment of the Schizophrenia Spectrum
Gowtham Premananth, Yashish M. Siriwardena, Philip Resnik, Sonia Bansal, Deanna L. Kelly, Carol Espy-Wilson

TL;DR
This paper introduces a multimodal framework combining audio, video, and text data with novel fusion techniques to improve classification of schizophrenia spectrum symptoms versus healthy controls.
Contribution
The paper proposes a new multimodal fusion approach that uses minimal gated multimodal units (mGMUs) to improve symptom classification in schizophrenia spectrum disorders.
Findings
Improved weighted F1-score with mGMU fusion
Enhanced weighted AUC-ROC scores using the proposed framework
Effective integration of audio, video, and text modalities
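For readers unfamiliar with the metric, the weighted F1-score named above is the per-class F1 averaged with each class weighted by its share of the true labels. The sketch below is a minimal, standard-library illustration of that definition. The function name `weighted_f1` and the class labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores.

    Each class's F1 is weighted by the fraction of true labels that
    belong to that class, so frequent classes count for more.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is positive because n > 0.
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (n / total) * f1
    return score


# Hypothetical labels, for illustration only.
y_true = ["control", "control", "class_a", "class_b"]
y_pred = ["control", "class_a", "class_a", "class_b"]
print(weighted_f1(y_true, y_pred))
```

Because the weights follow class frequency, this average can hide poor performance on a rare class, which is one reason it is usually reported alongside a second metric such as weighted AUC-ROC.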
Abstract
This paper presents a novel multimodal framework to distinguish between different symptom classes of subjects in the schizophrenia spectrum and healthy controls using audio, video, and text modalities. We implemented Convolutional Neural Network and Long Short-Term Memory based unimodal models and experimented with various multimodal fusion approaches to arrive at the proposed framework. We utilized a minimal gated multimodal unit (mGMU) to obtain a bimodal intermediate fusion of the features extracted from the input modalities before finally fusing the outputs of the bimodal fusions to perform subject-wise classifications. The use of mGMUs in the multimodal framework improved performance in both weighted F1-score and weighted AUC-ROC.
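The abstract does not spell out how the mGMU computes its output, so the sketch below shows only the commonly described bimodal gated multimodal unit (GMU), not the authors' minimal variant. Each modality is projected through a tanh layer, and a sigmoid gate computed from both inputs decides, per hidden dimension, how much each projection contributes. All names (`gmu_fuse`, `W_a`, `W_b`, `W_z`), the feature sizes, and the random weights are assumptions made for illustration. A real model would learn the weights and would normally use a tensor library.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def gmu_fuse(x_a, x_b, W_a, W_b, W_z):
    """Standard bimodal GMU (illustrative; not the paper's mGMU).

    h_a = tanh(W_a x_a), h_b = tanh(W_b x_b)
    z   = sigmoid(W_z [x_a; x_b])
    h   = z * h_a + (1 - z) * h_b
    """
    h_a = [math.tanh(v) for v in matvec(W_a, x_a)]
    h_b = [math.tanh(v) for v in matvec(W_b, x_b)]
    z = [sigmoid(v) for v in matvec(W_z, x_a + x_b)]  # gate sees both inputs
    return [g * ha + (1.0 - g) * hb for g, ha, hb in zip(z, h_a, h_b)]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
x_audio = [0.2, -0.1, 0.4]  # hypothetical 3-dim audio features
x_text = [0.5, 0.3]         # hypothetical 2-dim text features
hidden = 4

fused = gmu_fuse(
    x_audio,
    x_text,
    random_matrix(hidden, len(x_audio), rng),
    random_matrix(hidden, len(x_text), rng),
    random_matrix(hidden, len(x_audio) + len(x_text), rng),
)
print(len(fused))
```

In a framework like the one described, such a unit would be applied to pairs of modalities (for example audio with text), and the resulting bimodal representations would then be fused again before the subject-wise classifier.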
Taxonomy
Topics: Educational and Psychological Assessments · Mental Health and Psychiatry
Methods: Convolution
