When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate   Speech into Large Language Models for Depression Detection

Xiangyu Zhang; Hexin Liu; Kaishuai Xu; Qiquan Zhang; Daijiao Liu; Beena Ahmed; Julien Epps

arXiv:2402.13276·eess.AS·September 25, 2024·1 cites

TL;DR

This paper introduces a novel method that integrates acoustic landmarks into Large Language Models to improve depression detection from speech, achieving state-of-the-art results on the DAIC-WOZ dataset.

Contribution

The paper presents an innovative approach to incorporate acoustic landmarks into LLMs, enhancing multimodal depression detection from speech and text.

Findings

01. Achieved state-of-the-art results on the DAIC-WOZ dataset.

02. Demonstrated the effectiveness of acoustic landmarks in depression detection.

03. Enhanced LLMs' ability to process speech signals for mental health analysis.

Abstract

Depression is a critical concern in global mental health, prompting extensive research into AI-based detection methods. Among various AI technologies, Large Language Models (LLMs) stand out for their versatility in mental healthcare applications. However, their primary limitation arises from their exclusive dependence on textual input, which constrains their overall capabilities. Furthermore, the utilization of LLMs in identifying and analyzing depressive states is still relatively untapped. In this paper, we present an innovative approach to integrating acoustic speech information into the LLMs framework for multimodal depression detection. We investigate an efficient method for depression detection by integrating speech signals into LLMs utilizing Acoustic Landmarks. By incorporating acoustic landmarks, which are specific to the pronunciation of spoken words, our method adds critical…

Taxonomy

Topics: Mental Health via Writing · Topic Modeling