Deep Neural Network approaches for Analysing Videos of Music Performances

Foteini Simistira Liwicki; Richa Upadhyay; Prakash Chandra Chhipa; Killian Murphy; Federico Visi; Stefan Östersjö; Marcus Liwicki

arXiv:2205.11232·cs.CV·May 25, 2022·1 cites

Open Access

TL;DR

This paper introduces a novel 3D CNN framework with class imbalance handling and multi-video analysis for gesture recognition in musical performance videos, significantly improving identification accuracy over previous methods.

Contribution

It presents new techniques for class balancing, detailed gesture categorization, and multi-video analysis, advancing automated gesture labeling in musical videos.

Findings

01

Gesture recognition accuracy improved by 12 percentage points (51% versus 39% in previous work)

02

Validated methods on multiple gesture categories and videos

03

Achieved up to 75% accuracy in extended tests

Abstract

This paper presents a framework to automate the labelling process for gestures in musical performance videos with a 3D Convolutional Neural Network (CNN). While this idea was proposed in a previous study, this paper introduces several novelties: (i) it presents a novel method to overcome the class imbalance challenge and make learning possible for co-existent gestures through a batch-balancing approach and spatial-temporal representations of gestures; (ii) it performs a detailed study on 7 and 18 categories of gestures generated during the performance (guitar playing) of musical pieces that have been video-recorded; (iii) it investigates the possibility of using audio features; (iv) it extends the analysis to multiple videos. The novel methods significantly improve the performance of gesture identification by 12 percentage points compared to the previous work (51 % in this study versus 39 % in previous work). We…
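The abstract does not spell out how the batch balancing works, so the following is only a minimal sketch of one common way to balance classes within each training batch. It assumes one label per sample and oversamples rare classes with replacement. The function name `balanced_batches` and its parameters are hypothetical, and the sketch does not handle co-existent (multi-label) gestures, which the paper addresses.

```python
import random
from collections import defaultdict


def balanced_batches(labels, batch_size, num_batches, seed=0):
    """Yield lists of sample indices with near-equal counts per class.

    Assumption: each sample has exactly one class label. Rare classes are
    oversampled with replacement; this is an illustrative scheme, not
    necessarily the one used in the paper.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = sorted(by_class)

    # Split the batch evenly; leftover slots go to randomly chosen classes.
    per_class, remainder = divmod(batch_size, len(classes))

    for _ in range(num_batches):
        extra = set(rng.sample(classes, remainder))
        batch = []
        for c in classes:
            k = per_class + (1 if c in extra else 0)
            # Sampling with replacement lets small classes fill their quota.
            batch.extend(rng.choices(by_class[c], k=k))
        rng.shuffle(batch)
        yield batch


if __name__ == "__main__":
    # Toy, heavily imbalanced label list (made-up data for illustration).
    toy_labels = [0] * 90 + [1] * 8 + [2] * 2
    for batch in balanced_batches(toy_labels, batch_size=6, num_batches=2):
        print([toy_labels[i] for i in batch])
```

With this kind of sampler, every batch contains roughly the same number of examples of each gesture class, so frequent gestures do not dominate the gradient updates.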

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing