Sparsely Connected and Disjointly Trained Deep Neural Networks for Low Resource Behavioral Annotation: Acoustic Classification in Couples' Therapy
Haoqi Li, Brian Baucom, Panayiotis Georgiou

TL;DR
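This paper introduces a Sparsely-Connected and Disjointly-Trained DNN (SD-DNN) framework for annotating behavior from speech in couples' therapy. Training separate classifiers on subsets of the acoustic features and combining them lets the model cope with limited data and improves classification accuracy.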
Contribution
The paper proposes a sparsely-connected, disjointly-trained DNN approach that enhances behavior classification in low-resource settings by combining multiple classifiers into a unified network.
Findings
Improved behavior classification accuracy in couples' therapy data.
Demonstrated viability for real-time behavior annotation.
Effective handling of limited and noisy data.
Abstract
Observational studies are based on accurate assessment of human state. A behavior recognition system that models interlocutors' state in real-time can significantly aid the mental health domain. However, behavior recognition from speech remains a challenging task since it is difficult to find generalizable and representative features because of noisy and high-dimensional data, especially when data is limited and annotated coarsely and subjectively. Deep Neural Networks (DNN) have shown promise in a wide range of machine learning tasks, but for Behavioral Signal Processing (BSP) tasks their application has been constrained due to limited quantity of data. We propose a Sparsely-Connected and Disjointly-Trained DNN (SD-DNN) framework to deal with limited data. First, we break the acoustic feature set into subsets and train multiple distinct classifiers. Then, the hidden layers of these…
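The two steps named in the abstract (splitting the acoustic feature set into subsets, then training a distinct classifier on each) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract is cut off before it says how the hidden layers are used, so the combination step here (freezing each sub-network's hidden layer, concatenating the activations, and training a new output layer on top) is an assumption. The synthetic data, layer sizes, learning rates, and all function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_subnet(X, y, hidden=4, epochs=300, lr=0.1, seed=0):
    """Train a one-hidden-layer binary classifier on one feature subset.
    Returns only the hidden-layer parameters, which are reused later."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)              # hidden activations
        p = sigmoid(H @ w2 + b2)              # predicted probability
        g = (p - y) / n                       # d(mean cross-entropy)/d(logit)
        gH = np.outer(g, w2) * (1.0 - H**2)   # backprop through tanh
        w2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1

def hidden_features(X, subsets, subnets):
    """Concatenate frozen hidden activations of all sub-networks.
    Each hidden unit sees only its own feature subset, so the combined
    first layer is sparsely connected (block structured).
    ASSUMPTION: concatenation is our choice, not taken from the paper."""
    return np.hstack([np.tanh(X[:, idx] @ W1 + b1)
                      for idx, (W1, b1) in zip(subsets, subnets)])

def train_output_layer(H, y, epochs=500, lr=0.1):
    """Fit a logistic-regression output layer on the combined hidden features."""
    w = np.zeros(H.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        g = (sigmoid(H @ w + b) - y) / n
        w -= lr * (H.T @ g)
        b -= lr * g.sum()
    return w, b

# Hypothetical synthetic data standing in for acoustic features and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 12))
y = (X[:, 0] + X[:, 5] - X[:, 9] > 0).astype(float)

# Step 1: disjoint feature subsets, one classifier trained per subset.
subsets = np.array_split(np.arange(X.shape[1]), 3)
subnets = [train_subnet(X[:, idx], y, seed=i) for i, idx in enumerate(subsets)]

# Step 2 (assumed): combine the frozen hidden layers into one network.
H = hidden_features(X, subsets, subnets)
w, b = train_output_layer(H, y)
accuracy = np.mean((sigmoid(H @ w + b) > 0.5) == y)
print(f"training accuracy of combined network: {accuracy:.2f}")
```

Because each sub-network only sees a few input dimensions, it has far fewer parameters to fit than a fully connected network over all features, which is the intuition behind using this kind of structure when labeled data is scarce.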
Taxonomy
Topics: Music and Audio Processing · Emotion and Mood Recognition · Speech Recognition and Synthesis
