Learning to Attend to Depression-Related Patterns: An Adaptive Cross-Modal Gating Network for Depression Detection
Hangbin Yu, Yudong Yang, Rongfeng Su, Nan Yan, Lan Wang

TL;DR
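This paper introduces an Adaptive Cross-Modal Gating (ACMG) network that reassigns frame-level weights across the acoustic and textual modalities so the model attends to depression-related speech segments. Systems with ACMG outperform baselines without it, and visualizations indicate that the gates highlight meaningful segments.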
Contribution
The ACMG network adaptively reassigns frame-level weights to acoustic and textual features so that depression-related segments receive more attention, addressing the sparsity of diagnostically relevant signals in speech.
Findings
The depression detection system with ACMG outperforms baselines without it.
Visualization shows ACMG attends to meaningful acoustic and textual segments.
The authors present speech-based detection of this kind as a promising approach for early diagnosis.
Abstract
Automatic depression detection using speech signals with acoustic and textual modalities is a promising approach for early diagnosis. Depression-related patterns exhibit sparsity in speech: diagnostically relevant features occur in specific segments rather than being uniformly distributed. However, most existing methods treat all frames equally, assuming depression-related information is uniformly distributed and thus overlooking this sparsity. To address this issue, we propose a depression detection network based on Adaptive Cross-Modal Gating (ACMG) that adaptively reassigns frame-level weights across both modalities, enabling selective attention to depression-related segments. Experimental results show that the depression detection system with ACMG outperforms baselines without it. Visualization analyses further confirm that ACMG automatically attends to clinically meaningful…
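
To make the idea of frame-level reweighting concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the abstract does not specify the form of the gate, so the sigmoid gate over concatenated features, the assumption that acoustic and text frames are already aligned one-to-one, and the names `cross_modal_gate`, `gated_pool`, `w_a`, and `w_t` are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_modal_gate(acoustic, text, w_a, w_t, b_a=0.0, b_t=0.0):
    """Compute one gate in (0, 1) per frame for each modality.

    Assumption (not from the paper): each gate is a sigmoid of a linear
    function of the concatenated acoustic and text features of that frame,
    so each modality's weight depends on both modalities.
    """
    if len(acoustic) != len(text):
        raise ValueError("acoustic and text frames must be aligned")
    gates_a, gates_t = [], []
    for a, x in zip(acoustic, text):
        joint = a + x  # list concatenation: [acoustic features; text features]
        gates_a.append(sigmoid(dot(w_a, joint) + b_a))
        gates_t.append(sigmoid(dot(w_t, joint) + b_t))
    return gates_a, gates_t


def gated_pool(frames, gates):
    """Gate-weighted average over frames; uniform gates give the plain mean."""
    total = sum(gates)
    dim = len(frames[0])
    return [sum(g * f[i] for g, f in zip(gates, frames)) / total
            for i in range(dim)]


# Toy data: 5 aligned frames, 3 features per modality (random, for illustration).
random.seed(0)
T, d = 5, 3
acoustic = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]
text = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]
w_a = [random.gauss(0, 0.5) for _ in range(2 * d)]  # hypothetical gate weights
w_t = [random.gauss(0, 0.5) for _ in range(2 * d)]

gates_a, gates_t = cross_modal_gate(acoustic, text, w_a, w_t)
utterance_a = gated_pool(acoustic, gates_a)  # acoustic summary vector
utterance_t = gated_pool(text, gates_t)      # text summary vector
```

With uniform gates, `gated_pool` reduces to the plain frame average, which corresponds to the "treat all frames equally" behavior the abstract criticizes. Non-uniform gates let a few frames dominate the utterance-level summary. In a trained model the gate parameters would be learned rather than sampled at random.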
