Automatic Environmental Sound Recognition: Performance versus Computational Cost

Siddharth Sigtia; Adam M. Stark; Sacha Krstulovic; Mark D. Plumbley

arXiv:1607.04589·cs.SD·September 9, 2016

TL;DR

This paper evaluates several Automatic Environmental Sound Recognition (AESR) algorithms to determine which offers the best balance of sound classification accuracy and computational cost on IoT embedded platforms.

Contribution

It compares the performance of Deep Neural Networks, Gaussian Mixture Models, and Support Vector Machines under limited computational resources.

Findings

01

DNNs provide the best accuracy-to-cost ratio.

02

GMMs offer reasonable accuracy with low computational cost.

03

SVMs fall between DNNs and GMMs in the trade-off between accuracy and computational cost.

Abstract

In the context of the Internet of Things (IoT), sound sensing applications are required to run on embedded platforms where notions of product pricing and form factor impose hard constraints on the available computing power. Whereas Automatic Environmental Sound Recognition (AESR) algorithms are most often developed with limited consideration for computational cost, this article seeks which AESR algorithm can make the most of a limited amount of computing power by comparing the sound classification performance as a function of its computational cost. Results suggest that Deep Neural Networks yield the best ratio of sound classification accuracy across a range of computational costs, while Gaussian Mixture Models offer a reasonable accuracy at a consistently small cost, and Support Vector Machines stand between both in terms of compromise between accuracy and computational cost.
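
To make the idea of "performance as a function of computational cost" concrete, here is a minimal sketch of how per-frame cost might be estimated for two of the model families and checked against a compute budget. The abstract does not describe the paper's cost model, so everything here is an assumption for illustration: the function names, the choice to count multiply-accumulate operations, the diagonal-covariance GMM, and the example sizes are all hypothetical.

```python
def dnn_multiply_adds(layer_sizes):
    """Multiply-accumulates for one forward pass of a fully connected network.

    layer_sizes lists the width of each layer, input first, output last.
    Biases and activation functions are ignored (assumption).
    """
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def gmm_multiply_adds(n_components, n_features):
    """Rough count for scoring one frame under a diagonal-covariance GMM.

    Assumes two multiplies per component and feature dimension: one to square
    the mean-subtracted value and one to scale it by the inverse variance.
    """
    return 2 * n_components * n_features


def configs_within_budget(costs, budget):
    """Return the names of the configurations whose cost fits the budget."""
    return [name for name, cost in costs.items() if cost <= budget]


# Hypothetical configurations; the sizes are illustrative, not from the paper.
costs = {
    "dnn_2x100": dnn_multiply_adds([40, 100, 100, 2]),
    "gmm_8": gmm_multiply_adds(n_components=8, n_features=40),
}
affordable = configs_within_budget(costs, budget=1000)
```

A comparison in the spirit of the abstract would then pair each configuration's cost with its measured classification accuracy and compare models at equal cost, rather than comparing accuracy alone.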
