A Frequency-aware Augmentation Network for Mental Disorders Assessment from Audio

Shuanglin Li, Siyang Song, Rajesh Nair, and Syed Mohsen Naqvi

arXiv:2501.02516·eess.AS·March 5, 2025

Open Access

TL;DR

This paper introduces a frequency-aware augmentation network with dynamic convolution for improved assessment of depression and ADHD from speech spectrograms, outperforming existing methods in accuracy and robustness.

Contribution

It presents a novel frequency-aware network with multi-scale and dynamic convolution modules, enhancing feature extraction for mental disorder assessment from audio signals.

Findings

01 Achieved an RMSE of 9.23 for depression severity estimation

02 Attained 89.8% accuracy in ADHD detection

03 Demonstrated robustness on AVEC 2014 and ADHD datasets
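
For readers unfamiliar with the two metrics reported in the findings, the following is a minimal sketch of how RMSE (for severity regression) and accuracy (for detection) are commonly computed. The function names and the toy values are illustrative assumptions, not the paper's evaluation code or data.

```python
import math


def rmse(predicted, target):
    """Root mean squared error between predicted and reference scores."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def accuracy(predicted_labels, true_labels):
    """Fraction of labels predicted correctly."""
    if not predicted_labels or len(predicted_labels) != len(true_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Toy values for illustration only (hypothetical, not from the paper).
toy_severity_pred = [12.0, 20.0, 7.0]
toy_severity_true = [10.0, 25.0, 7.0]
toy_error = rmse(toy_severity_pred, toy_severity_true)

toy_acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])  # 3 of 4 correct -> 0.75
```

A lower RMSE means predicted severity scores sit closer to the reference scores, while accuracy is reported as a percentage of correctly classified samples.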

Abstract

Depression and Attention Deficit Hyperactivity Disorder (ADHD) are among the most common mental health challenges today. In affective computing, speech signals serve as effective biomarkers for mental disorder assessment. Current research relies on labor-intensive hand-crafted features or simplistic time-frequency representations, and often overlooks critical details because it does not account for the differing impacts of individual frequency bands and temporal fluctuations. Therefore, we propose a frequency-aware augmentation network with dynamic convolution for depression and ADHD assessment. The proposed method takes the spectrogram as the input feature and applies a multi-scale convolution to help the network focus on discriminative frequency bands related to mental disorders. A dynamic convolution is also designed to aggregate multiple convolution kernels dynamically based upon their…
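
The abstract is cut off before it states what the kernel aggregation depends on, so the following is only a rough sketch of the general dynamic-convolution idea, not the authors' design. It assumes input-dependent softmax attention over a small bank of kernels, and it works on a 1-D signal (for example, one spectrogram frame along the frequency axis) to stay short. All names, the gating scheme, and the toy numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d_same(signal, kernel):
    """Cross-correlation with zero padding; assumes an odd kernel length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def dynamic_conv1d(signal, kernels, gate_weights, gate_biases):
    """Mix several kernels with input-dependent attention, then convolve.

    Assumption: attention scores come from a global average of the input
    passed through one scalar weight and bias per kernel.
    """
    pooled = sum(signal) / len(signal)  # global average pooling
    scores = [w * pooled + b for w, b in zip(gate_weights, gate_biases)]
    attention = softmax(scores)
    size = len(kernels[0])
    # Weighted sum of the kernel bank gives one input-specific kernel.
    mixed = [
        sum(a * k[j] for a, k in zip(attention, kernels)) for j in range(size)
    ]
    return conv1d_same(signal, mixed), attention


# Toy input and kernel bank (hypothetical values).
frame = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
bank = [
    [0.0, 1.0, 0.0],           # identity
    [1 / 3, 1 / 3, 1 / 3],     # smoothing
    [-1.0, 0.0, 1.0],          # edge detector
]
output, attention = dynamic_conv1d(frame, bank, [1.0, 0.0, -1.0], [0.0, 0.0, 0.0])
```

Because the attention weights are recomputed for every input, two different frames can end up filtered by different effective kernels even though the kernel bank itself is shared. A multi-scale variant would run kernels of several lengths in parallel and combine their outputs.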

Taxonomy

Topics: Emotion and Mood Recognition · EEG and Brain-Computer Interfaces · ECG Monitoring and Analysis

Methods: Softmax · Attention Is All You Need · Convolution · Focus