Fx-Encoder++: Extracting Instrument-Wise Audio Effects Representations from Mixtures

Yen-Tung Yeh, Junghyun Koo, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Yi-Hsuan Yang, and Yuki Mitsufuji

arXiv:2507.02273 · cs.SD · July 4, 2025

TL;DR

Fx-Encoder++ extracts instrument-specific audio effects representations from music mixtures, targeting intelligent music production tasks such as automatic mixing.

Contribution

The paper introduces Fx-Encoder++, a contrastive learning-based model whose "extractor" mechanism, given an instrument query (audio or text), transforms mixture-level audio effects embeddings into instrument-wise audio effects embeddings.

Findings

01 Outperforms previous methods at the mixture level

02 Successfully extracts instrument-wise effects representations

03 Improves automatic mixing and effects parameter matching

Abstract

General-purpose audio representations have proven effective across diverse music information retrieval applications, yet their utility in intelligent music production remains limited by insufficient understanding of audio effects (Fx). Although previous approaches have emphasized audio effects analysis at the mixture level, this focus falls short for tasks demanding instrument-wise audio effects understanding, such as automatic mixing. In this work, we present Fx-Encoder++, a novel model designed to extract instrument-wise audio effects representations from music mixtures. Our approach leverages a contrastive learning framework and introduces an "extractor" mechanism that, when provided with instrument queries (audio or text), transforms mixture-level audio effects embeddings into instrument-wise audio effects embeddings. We evaluated our model across retrieval and audio effects…
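
To make the two ideas in the abstract concrete, here is a minimal sketch of a contrastive (InfoNCE-style) objective together with a query-conditioned step that maps a mixture-level embedding to an instrument-wise one. Everything in it is an assumption made for illustration: the abstract does not specify the loss, the temperature, or how the extractor is built, so `toy_extractor` (a simple sigmoid gate) and `info_nce` are hypothetical names and not the authors' implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def toy_extractor(mixture_emb, query_emb):
    """Hypothetical stand-in for the "extractor" (NOT the paper's design):
    gate each dimension of the mixture embedding with a sigmoid of the
    instrument-query embedding, then renormalize."""
    gated = [m / (1.0 + math.exp(-q)) for m, q in zip(mixture_emb, query_emb)]
    return l2_normalize(gated)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch: anchors[i] should be similar to
    positives[i] and dissimilar to every other positives[j]."""
    anchors = [l2_normalize(a) for a in anchors]
    positives = [l2_normalize(p) for p in positives]
    total = 0.0
    for i, a in enumerate(anchors):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a, p)) / temperature for p in positives]
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits)
        log_denom = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


# Toy batch: two mixture embeddings and two instrument queries (made-up values).
mixtures = [[1.0, 1.0], [1.0, 1.0]]
queries = [[5.0, -5.0], [-5.0, 5.0]]
instrument_embs = [toy_extractor(m, q) for m, q in zip(mixtures, queries)]

# Embeddings of the matching effected stems (made-up targets).
targets = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = info_nce(instrument_embs, targets)
loss_swapped = info_nce(instrument_embs, targets[::-1])
```

In this toy batch the loss is lower when each extracted embedding is paired with its own target than when the targets are swapped, which is the behavior a contrastive objective rewards. A real system would replace the gate with a learned network and train encoder and extractor jointly.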
