Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification

Lam Pham

arXiv:2107.09268·cs.SD·July 21, 2021·1 cites

Open Access

TL;DR

This thesis develops a robust deep learning framework for acoustic scene classification using multiple spectrograms and a novel encoder architecture, successfully extending to respiratory disease detection in real-world biomedical data.

Contribution

Introduces a new encoder-decoder architecture with multi-spectrogram input for improved acoustic scene classification and applies it effectively to respiratory sound analysis.

Findings

01

Enhanced classification accuracy with multi-spectrogram features

02

Reduced computational cost through the encoder-decoder framework

03

Effective detection of respiratory anomalies in real-life data

Abstract

This thesis addresses the task of acoustic scene classification (ASC) and then applies the techniques developed for ASC to a real-life application: detecting respiratory disease. To deal with ASC challenges, the thesis examines three main factors that directly affect the performance of an ASC system. First, it explores input features, using multiple spectrograms (log-mel, Gamma, and CQT) for low-level feature extraction to tackle the issue of insufficiently discriminative or descriptive input features. Next, a novel Encoder network architecture is introduced. The Encoder first transforms each low-level spectrogram into high-level intermediate features, or embeddings, and then combines these high-level features to form a very distinct composite feature. The composite or combined feature is then explored in terms of classification performance,…
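
The per-spectrogram embedding and combination step described in the abstract might be sketched as follows. This is an illustration under stated assumptions, not the thesis's architecture: the `encode` function is a hypothetical stand-in (mean pooling over time plus a random linear map), the spectrograms are random placeholder arrays rather than real log-mel, Gamma, or CQT features, and concatenation is only one plausible way to "combine" embeddings, since the abstract does not specify the method.

```python
import numpy as np

def encode(spectrogram, weights):
    """Hypothetical stand-in for a learned encoder.

    Mean-pools the spectrogram over time, then applies a linear map
    followed by ReLU to obtain a fixed-size embedding.
    """
    pooled = spectrogram.mean(axis=1)           # shape: (n_bins,)
    return np.maximum(weights @ pooled, 0.0)    # shape: (embed_dim,)

def combine_embeddings(spectrograms, encoders):
    """Encode each spectrogram separately, then concatenate the embeddings.

    Concatenation is an assumption made for this sketch.
    """
    names = sorted(spectrograms)                # fixed order for reproducibility
    return np.concatenate([encode(spectrograms[n], encoders[n]) for n in names])

rng = np.random.default_rng(0)
EMBED_DIM = 16  # illustrative embedding size, not taken from the thesis

# Placeholder spectrograms of shape (n_bins, n_frames); real ones would be
# computed from audio with a signal-processing library.
spectrograms = {
    "log_mel": rng.random((128, 100)),
    "gamma": rng.random((128, 100)),
    "cqt": rng.random((84, 100)),
}

# One random projection per spectrogram type, standing in for trained weights.
encoders = {
    name: rng.standard_normal((EMBED_DIM, spec.shape[0]))
    for name, spec in spectrograms.items()
}

composite = combine_embeddings(spectrograms, encoders)
print(composite.shape)  # (48,)
```

In this sketch each spectrogram type gets its own encoder, so the bin counts may differ between inputs while the embeddings share one size; the composite feature is what a downstream classifier would consume.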

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis