Background-tracking Acoustic Features for Genre Identification of Broadcast Shows
Oscar Saz, Mortaza Doulaty, Thomas Hain

TL;DR
This paper introduces a background-tracking acoustic feature extraction method for genre identification of broadcast shows. The features outperform traditional short-term features, and combining them with HMM and SVM classifiers raises accuracy to over 79%.
Contribution
The paper presents a novel background-tracking feature extraction technique based on alignment with multiple background models, improving genre classification accuracy over existing methods.
Findings
Background-tracking features outperform short-term features in genre classification.
Using HMMs and SVMs with these features increases accuracy to over 79%.
The method shows potential for audiovisual data analysis and for the classification of broadcast archives.
Abstract
This paper presents a novel method for extracting acoustic features that characterise the background environment in audio recordings. These features are based on the output of an alignment that fits multiple parallel background-based Constrained Maximum Likelihood Linear Regression transformations asynchronously to the input audio signal. With this setup, the resulting features can track changes in the audio background like appearance and disappearance of music, applause or laughter, independently of the speakers in the foreground of the audio. The ability to provide this type of acoustic description in audiovisual data has many potential applications, including automatic classification of broadcast archives or improving automatic transcription and subtitling. In this paper, the performance of these features in a genre identification task in a set of 332 BBC shows is explored. The…
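The alignment idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the single diagonal Gaussian standing in for the acoustic model, the `switch_penalty` value, and the occupancy summary in `occupancy_features` are all assumptions made here for illustration. Each background is represented by one CMLLR-style affine transform `(A, b)`, every frame is scored under every transform, and a Viterbi pass picks one background per frame while penalising switches.

```python
import numpy as np

def cmllr_frame_loglik(frames, transforms, mean, var):
    """Log-likelihood of every frame under every affine (CMLLR-style) transform.

    frames: (T, D); transforms: list of (A, b) with A (D, D) and b (D,);
    mean, var: (D,) parameters of one diagonal Gaussian (an assumed stand-in model).
    Returns an array of shape (T, K).
    """
    const = -0.5 * np.sum(np.log(2.0 * np.pi * var))
    scores = []
    for A, b in transforms:
        y = frames @ A.T + b                      # transformed features
        quad = -0.5 * np.sum((y - mean) ** 2 / var, axis=1)
        _, logdet = np.linalg.slogdet(A)          # log|det A|, the Jacobian term
        scores.append(const + quad + logdet)
    return np.stack(scores, axis=1)

def viterbi_track(loglik, switch_penalty=5.0):
    """Best background index per frame, penalising each change of background."""
    T, K = loglik.shape
    trans = -switch_penalty * (1.0 - np.eye(K))   # 0 to stay, -penalty to switch
    delta = loglik[0].copy()
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + trans             # cand[i, j]: move from i to j
        backptr[t] = np.argmax(cand, axis=0)
        delta = cand[backptr[t], np.arange(K)] + loglik[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path

def occupancy_features(path, num_backgrounds):
    """Fraction of frames assigned to each background (a hypothetical summary)."""
    counts = np.bincount(path, minlength=num_backgrounds)
    return counts / counts.sum()
```

In this toy setting, a recording whose frames drift from one region of feature space to another should be split by `viterbi_track` at the point where a different transform starts to fit better, which is the sense in which such features could follow a changing background.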
