A Novel Fusion of Attention and Sequence to Sequence Autoencoders to Predict Sleepiness From Speech
Shahin Amiriparian, Pawel Winokurow, Vincent Karas, Sandra Ottl, Maurice Gerczuk, Björn W. Schuller

TL;DR
This paper introduces attention-based and recurrent sequence-to-sequence autoencoders for fully unsupervised representation learning from speech audio, and shows that fusing the features learnt by both autoencoders improves sleepiness prediction.
Contribution
The paper presents a new fusion method combining attention and non-attention autoencoders for speech-based sleepiness prediction, advancing unsupervised audio representation learning.
Findings
Fusion of autoencoder representations improves sleepiness prediction accuracy.
Autoencoder activations effectively capture relevant speech features.
Proposed method achieves higher correlation coefficients than individual autoencoders.
Abstract
Motivated by the attention mechanism of the human visual system and recent developments in the field of machine translation, we introduce our attention-based and recurrent sequence to sequence autoencoders for fully unsupervised representation learning from audio files. In particular, we test the efficacy of our novel approach on the task of speech-based sleepiness recognition. We evaluate the learnt representations from both autoencoders, and then conduct an early fusion to ascertain possible complementarity between them. In our frameworks, we first extract Mel-spectrograms from raw audio files. Second, we train recurrent autoencoders on these spectrograms which are considered as time-dependent frequency vectors. Afterwards, we extract the activations of specific fully connected layers of the autoencoders which represent the learnt features of spectrograms for the corresponding audio…
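The early fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that early fusion means concatenating, per audio file, the feature vector taken from the attention-based autoencoder with the one taken from the recurrent autoencoder before any predictor is trained. The function name `early_fusion`, the file names, and the feature values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Maps an audio file name to the feature vector learnt for that file.
FeatureTable = Dict[str, List[float]]


def early_fusion(attention_feats: FeatureTable,
                 recurrent_feats: FeatureTable) -> FeatureTable:
    """Concatenate the two feature vectors of every file present in both tables.

    Assumption: fusion is plain concatenation at the feature level.
    """
    shared = sorted(set(attention_feats) & set(recurrent_feats))
    return {name: attention_feats[name] + recurrent_feats[name]
            for name in shared}


# Toy activations (hypothetical values) for two recordings.
attention_feats = {"rec_001.wav": [0.12, -0.40],
                   "rec_002.wav": [0.88, 0.05]}
recurrent_feats = {"rec_001.wav": [0.31, 0.07, -0.22],
                   "rec_002.wav": [-0.64, 0.19, 0.50]}

fused = early_fusion(attention_feats, recurrent_feats)
print(len(fused["rec_001.wav"]))  # 2 + 3 = 5 fused dimensions
```

The fused vectors would then be passed to a single predictor of sleepiness. Files missing from either table are dropped here, which is one possible design choice rather than a documented one.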
Taxonomy
Topics: Sleep and Work-Related Fatigue · Emotion and Mood Recognition · Human Pose and Action Recognition
