# Voice Pathology Detection Using Deep Learning: a Preliminary Study

**Authors:** Pavol Harar, Jesus B. Alonso-Hernandez, Jiri Mekyska, Zoltan Galaz, Radim Burget, Zdenek Smekal

arXiv: 1907.05905 · 2019-07-16

## TL;DR

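This preliminary study trains a deep neural network that combines convolutional and LSTM layers on raw recordings of the sustained vowel /a/ to detect voice pathology. It reaches 68.08% accuracy on the test set, which the authors consider comparable to a similar previously published experiment that used a different methodology.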

## Contribution

The paper applies convolutional layers in combination with recurrent LSTM layers directly to the raw audio signal for voice pathology detection, and reports first results for this setup on the Saarbruecken Voice Database.

## Key findings

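- Validation: 71.36% accuracy (65.04% sensitivity, 77.67% specificity) on 206 files
- Testing: 68.08% accuracy (66.75% sensitivity, 77.89% specificity) on 874 files
- Results are comparable to a similar previously published experiment that used a different methodology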
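
The abstract reports sensitivity and specificity alongside each accuracy figure. As a rough consistency check, accuracy can be written as the class-size-weighted mean of the two. The sketch below shows that arithmetic; the function name `accuracy_from_rates` and the balanced 103/103 split of the 206 validation files are assumptions made for illustration, since this summary does not state the class balance.

```python
def accuracy_from_rates(sensitivity, specificity, n_pathological, n_healthy):
    """Accuracy as the class-size-weighted mean of sensitivity and specificity.

    sensitivity: fraction of pathological files classified correctly
    specificity: fraction of healthy files classified correctly
    """
    total = n_pathological + n_healthy
    if total <= 0:
        raise ValueError("at least one file is required")
    correct = sensitivity * n_pathological + specificity * n_healthy
    return correct / total


# ASSUMPTION: the 206 validation files are split evenly between classes.
# The summary does not give the class sizes, so this is only illustrative.
val_acc = accuracy_from_rates(0.6504, 0.7767, n_pathological=103, n_healthy=103)
print(f"validation accuracy under a balanced split: {val_acc:.2%}")
```

Under that assumed balanced split the weighted mean lands close to the reported 71.36%. The same check is not shown for the 874 testing files because their class sizes are not given here.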

## Abstract

This paper describes a preliminary investigation of Voice Pathology Detection using Deep Neural Networks (DNN). We used voice recordings of the sustained vowel /a/ produced at normal pitch from the German corpus Saarbruecken Voice Database (SVD). This corpus contains voice recordings and electroglottograph signals of more than 2,000 speakers. The idea behind this experiment is the use of convolutional layers in combination with recurrent Long Short-Term Memory (LSTM) layers on the raw audio signal. Each recording was split into 64 ms Hamming-windowed segments with 30 ms overlap. Our trained model achieved 71.36% accuracy with 65.04% sensitivity and 77.67% specificity on 206 validation files and 68.08% accuracy with 66.75% sensitivity and 77.89% specificity on 874 testing files. This is a promising result in favor of this approach because it is comparable to a similar previously published experiment that used a different methodology. Further investigation is needed to achieve state-of-the-art results.
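
The segmentation step described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `frame_signal`, the 50 kHz sample rate used in the usage lines, the reading of "30 ms overlap" as a 34 ms hop between segment starts, and the choice to drop a trailing partial segment are all assumptions.

```python
import numpy as np


def frame_signal(signal, sample_rate, frame_ms=64.0, overlap_ms=30.0):
    """Split a 1-D signal into Hamming-windowed, overlapping segments.

    Returns an array of shape (n_frames, frame_len). A trailing partial
    segment is dropped (an assumption; the abstract does not say).
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    overlap_len = int(round(sample_rate * overlap_ms / 1000.0))
    hop_len = frame_len - overlap_len  # distance between segment starts
    if hop_len <= 0:
        raise ValueError("overlap must be shorter than the frame")
    if len(signal) < frame_len:
        return np.empty((0, frame_len))

    n_frames = 1 + (len(signal) - frame_len) // hop_len
    starts = hop_len * np.arange(n_frames)
    # Index matrix: row i selects samples starts[i] .. starts[i] + frame_len - 1
    idx = starts[:, None] + np.arange(frame_len)[None, :]
    return signal[idx] * np.hamming(frame_len)


# Usage with a synthetic one-second signal.
# ASSUMPTION: 50 kHz sample rate, chosen only for illustration.
sample_rate = 50_000
recording = np.random.default_rng(0).standard_normal(sample_rate)
segments = frame_signal(recording, sample_rate)
print(segments.shape)
```

Each row of `segments` is one windowed segment of the raw waveform, which is the form a convolutional front end followed by LSTM layers could consume as a sequence.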

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.05905/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1907.05905/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1907.05905/full.md

---
Source: https://tomesphere.com/paper/1907.05905