SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance

Xinchi Zhou, Dongzhan Zhou, Wanli Ouyang, Hang Zhou, Ziwei Liu, and Di Hu

arXiv:2203.13535 · cs.MM · March 28, 2022 · 1 citation

TL;DR

This paper introduces SeCo, a framework for separating sounds of unknown musical instruments by leveraging consistency constraints and an online matching strategy, addressing limitations of previous methods that only work with known categories.

Contribution

SeCo is the first approach to effectively separate unknown musical instrument sounds by exploiting consistency constraints and an online matching strategy, enhancing versatility in sound separation tasks.

Findings

01. SeCo outperforms baseline methods significantly.

02. The online matching strategy improves separation stability.

03. SeCo demonstrates strong adaptation to new musical categories.

Abstract

Recent years have witnessed the success of deep learning on the visual sound separation task. However, existing works follow similar settings where the training and testing datasets share the same musical instrument categories, which to some extent limits the versatility of this task. In this work, we focus on a more general and challenging scenario, namely the separation of unknown musical instruments, where the categories in training and testing phases have no overlap with each other. To tackle this new setting, we propose the Separation-with-Consistency (SeCo) framework, which can accomplish the separation on unknown categories by exploiting the consistency constraints. Furthermore, to capture richer characteristics of the novel melodies, we devise an online matching strategy, which can bring stable enhancements with no cost of extra parameters. Experiments demonstrate that our SeCo…

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Music and Audio Processing