Audio Matters Too! Enhancing Markerless Motion Capture with Audio Signals for String Performance Capture
Yitong Jin, Zhiping Qiu, Yi Shi, Shuangpeng Sun, Chongwu Wang, Donghao Pan, Jiachen Zhao, Zhenghao Liang, Yuan Wang, Xiaobing Li, Feng Yu, Tao Yu, Qionghai Dai

TL;DR
This paper introduces a multi-modal, markerless motion capture framework that leverages audio signals to improve the accuracy of capturing subtle string instrument performances, validated on a novel large-scale dataset.
Contribution
It presents the first multi-view, multi-modal dataset for string instrument performance and proposes an audio-guided motion capture method that enhances visual results without external markers.
Findings
Outperforms state-of-the-art vision-based algorithms
Effectively captures subtle hand-string contact movements
Demonstrates the benefit of integrating audio cues into motion capture
Abstract
In this paper, we address the problem of markerless multi-modal human motion capture, in particular string performance capture, which involves inherently subtle hand-string contacts and intricate movements. To this end, we first collect a dataset, named String Performance Dataset (SPD), featuring cello and violin performances. The dataset includes videos captured from up to 23 different views, audio signals, and detailed 3D motion annotations of the body, hands, instrument, and bow. Moreover, to acquire the detailed motion annotations, we propose an audio-guided multi-modal motion capture framework that explicitly incorporates hand-string contacts detected from the audio signals to solve for detailed hand poses. This framework serves as a baseline for string performance capture in a completely markerless manner without imposing any external devices on performers, eliminating the…
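To illustrate why audio can constrain hand pose at all, here is a minimal sketch of the underlying physics, not the authors' method. It assumes an ideal string, whose fundamental frequency is inversely proportional to its vibrating length, so a detected pitch implies where along the string the finger presses. The function name `contact_position` and the numbers in the usage lines (a 220 Hz open string with a 0.69 m vibrating length) are hypothetical and chosen for illustration only; the abstract does not describe how the paper detects contacts from audio.

```python
def contact_position(pitch_hz: float, open_string_hz: float, string_length_m: float) -> float:
    """Distance (in meters) from the nut to the stopping finger on an ideal string.

    Assumption: f is proportional to 1 / L for an ideal string, so the
    vibrating length at pitch f is L_open * f_open / f, and the finger sits
    at L_open minus that length, measured from the nut.
    """
    if pitch_hz <= 0 or open_string_hz <= 0 or string_length_m <= 0:
        raise ValueError("frequencies and string length must be positive")
    if pitch_hz < open_string_hz:
        raise ValueError("a stopped string cannot sound below its open pitch")
    return string_length_m * (1.0 - open_string_hz / pitch_hz)


# Hypothetical values: an octave above the open string halves the vibrating length.
octave_contact = contact_position(440.0, 220.0, 0.69)
open_contact = contact_position(220.0, 220.0, 0.69)
```

Under these assumptions, `octave_contact` lands at the midpoint of the string and `open_contact` is zero (no finger on the string). A real system would also have to decide which string is sounding and handle vibrato, harmonics, and noise, none of which this sketch attempts.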
