Automatic Organisation, Segmentation, and Filtering of User-Generated Audio Content

Gonçalo Mordido; João Magalhães; Sofia Cavaco

arXiv:1708.05302·eess.AS·September 18, 2017

TL;DR

This paper introduces methods for organizing, segmenting, and filtering large datasets of user-generated audio content using audio fingerprinting and supervised learning, validated on concert recordings from YouTube.

Contribution

It presents novel techniques for grouping and analyzing user-generated audio files based solely on fingerprinting data, including error detection with supervised learning.

Findings

01. Effective grouping of audio files from large datasets
02. Supervised learning reduces incorrect fingerprint matches
03. Validated methods on YouTube concert recordings

Abstract

Using solely the information retrieved by audio fingerprinting techniques, we propose methods to process a possibly large dataset of user-generated audio content. These methods (1) enable the grouping of audio files that contain a common audio excerpt (i.e., that relate to the same event), and (2) give information about how those files are correlated in terms of time and quality within each event. Furthermore, we use supervised learning to detect incorrect matches that may arise from the audio fingerprinting algorithm itself, whilst ensuring our model learns from previous predictions. All the presented methods were further validated on user-generated recordings of several different concerts manually crawled from YouTube.
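
To make the grouping and time-correlation ideas concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a hypothetical fingerprinter that reports pairwise matches as `(file_a, file_b, offset_seconds)` tuples, where `file_b` starts `offset_seconds` after `file_a`. Under that assumption, events can be treated as connected components of the match graph, and a shared timeline follows from propagating offsets inside each component. The function names `group_into_events` and `align_event` are illustrative only.

```python
from collections import defaultdict


def group_into_events(files, matches):
    """Group files into events, i.e. connected components of the match graph.

    matches is an assumed fingerprinter output: (file_a, file_b, offset_seconds).
    """
    parent = {f: f for f in files}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _offset in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    events = defaultdict(list)
    for f in parent:
        events[find(f)].append(f)
    return sorted(sorted(group) for group in events.values())


def align_event(event_files, matches):
    """Return {file: start_time} on a shared timeline, earliest file at 0.0."""
    members = set(event_files)
    neighbours = defaultdict(list)
    for a, b, offset in matches:
        if a in members and b in members:
            neighbours[a].append((b, offset))   # b starts `offset` after a
            neighbours[b].append((a, -offset))  # so a starts `-offset` after b

    # Propagate offsets from an arbitrary reference file through the graph.
    reference = event_files[0]
    start = {reference: 0.0}
    stack = [reference]
    while stack:
        current = stack.pop()
        for other, offset in neighbours[current]:
            if other not in start:
                start[other] = start[current] + offset
                stack.append(other)

    earliest = min(start.values())
    return {f: t - earliest for f, t in start.items()}


# Hypothetical example data: two concerts, five recordings.
files = ["a.wav", "b.wav", "c.wav", "d.wav", "e.wav"]
matches = [
    ("a.wav", "b.wav", 12.5),
    ("b.wav", "c.wav", -20.0),
    ("d.wav", "e.wav", 3.0),
]

events = group_into_events(files, matches)
timeline = align_event(events[0], matches)
```

With this example data, `a.wav`, `b.wav`, and `c.wav` end up in one event even though `a.wav` and `c.wav` never matched directly, because both overlap with `b.wav`. The sketch trusts every reported match. A single incorrect match would merge two unrelated events, which is the kind of error the abstract's supervised filtering step targets.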
