Fx-Encoder++: Extracting Instrument-Wise Audio Effects Representations from Mixtures
Yen-Tung Yeh, Junghyun Koo, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Yi-Hsuan Yang, and Yuki Mitsufuji

TL;DR
Fx-Encoder++ extracts instrument-wise audio effects representations from music mixtures, targeting intelligent music production tasks such as automatic mixing.
Contribution
The paper introduces Fx-Encoder++, a contrastive learning-based model whose "extractor" mechanism transforms mixture-level audio effects embeddings into instrument-wise audio effects embeddings, given instrument queries (audio or text).
Findings
Outperforms previous methods at the mixture level
Successfully extracts instrument-wise effects representations
Improves automatic mixing and effects parameter matching
Abstract
General-purpose audio representations have proven effective across diverse music information retrieval applications, yet their utility in intelligent music production remains limited by insufficient understanding of audio effects (Fx). Although previous approaches have emphasized audio effects analysis at the mixture level, this focus falls short for tasks demanding instrument-wise audio effects understanding, such as automatic mixing. In this work, we present Fx-Encoder++, a novel model designed to extract instrument-wise audio effects representations from music mixtures. Our approach leverages a contrastive learning framework and introduces an "extractor" mechanism that, when provided with instrument queries (audio or text), transforms mixture-level audio effects embeddings into instrument-wise audio effects embeddings. We evaluated our model across retrieval and audio effects…
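The abstract names two ingredients, a contrastive learning framework and a query-conditioned extractor, without giving their details here. The sketch below is only an illustration of that general idea, not the authors' implementation: it uses a standard InfoNCE loss, and the `extract` function (a single linear map over the concatenated mixture and query embeddings), the embedding size, the temperature, and the random stand-in data are all assumptions made for this example.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def info_nce(anchors, positives, temperature=0.1):
    """Standard InfoNCE: row i of `anchors` should match row i of `positives`."""
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    logits = a @ p.T / temperature                       # (batch, batch) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def extract(mixture_emb, query_emb, weight):
    """Hypothetical extractor: linear map over [mixture ; query], then L2-normalize."""
    joint = np.concatenate([mixture_emb, query_emb], axis=-1)
    return l2_normalize(joint @ weight)

# Random stand-ins for encoder outputs (assumed shapes, not real data).
rng = np.random.default_rng(0)
batch, dim = 4, 8
mixture_emb = rng.normal(size=(batch, dim))   # mixture-level Fx embeddings
query_emb = rng.normal(size=(batch, dim))     # instrument query embeddings (audio or text)
stem_fx_emb = rng.normal(size=(batch, dim))   # Fx embeddings of the matching instrument stems
weight = rng.normal(size=(2 * dim, dim)) * 0.1

instrument_emb = extract(mixture_emb, query_emb, weight)
loss = info_nce(instrument_emb, stem_fx_emb)
print(instrument_emb.shape, float(loss))
```

In this toy setup, minimizing `loss` with respect to `weight` would pull each extracted embedding toward the effects embedding of its own instrument stem and away from the other stems in the batch, which is the general role a contrastive objective plays.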
