Can Large Language Models Predict Audio Effects Parameters from Natural Language?

Seungheon Doh; Junghyun Koo; Marco A. Martínez-Ramírez; Wei-Hsiang Liao; Juhan Nam; Yuki Mitsufuji

arXiv:2505.20770 · cs.SD · July 18, 2025

TL;DR

This paper introduces LLM2Fx, a framework that uses Large Language Models to predict audio effects parameters from natural language descriptions without task-specific training or fine-tuning, with the aim of lowering technical barriers for non-experts in music production.

Contribution

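The paper presents a zero-shot approach that uses LLMs for text-to-effect parameter prediction (Text2Fx) for equalization and reverberation. To enhance performance, it adds three types of in-context examples: audio DSP features, DSP function code, and few-shot examples.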

Findings

01. LLMs can predict audio effects parameters from natural language effectively.

02. The approach outperforms previous optimization methods in accuracy.

03. In-context examples improve the quality of parameter predictions.

Abstract

In music production, manipulating audio effects (Fx) parameters through natural language has the potential to reduce technical barriers for non-experts. We present LLM2Fx, a framework leveraging Large Language Models (LLMs) to predict Fx parameters directly from textual descriptions without requiring task-specific training or fine-tuning. Our approach addresses the text-to-effect parameter prediction (Text2Fx) task by mapping natural language descriptions to the corresponding Fx parameters for equalization and reverberation. We demonstrate that LLMs can generate Fx parameters in a zero-shot manner that elucidates the relationship between timbre semantics and audio effects in music production. To enhance performance, we introduce three types of in-context examples: audio Digital Signal Processing (DSP) features, DSP function code, and few-shot examples. Our results demonstrate that…
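To make the Text2Fx setup concrete, the following is a minimal sketch of how a description might be turned into a prompt and how an LLM reply might be parsed back into bounded reverb parameters. It is not the authors' implementation: the parameter names and ranges in `REVERB_SPEC`, the prompt wording, and the helpers `build_prompt` and `parse_params` are all hypothetical, and no actual model is called.

```python
import json

# Hypothetical reverb parameters and ranges; NOT taken from the paper.
REVERB_SPEC = {
    "decay_time_s": (0.1, 10.0),
    "pre_delay_ms": (0.0, 200.0),
    "wet_mix": (0.0, 1.0),
}


def build_prompt(description, spec, few_shot=()):
    """Assemble a text prompt; few_shot is a sequence of (text, params) pairs."""
    lines = [
        "Set reverb parameters to match the description.",
        "Reply with a JSON object using exactly these keys and ranges:",
    ]
    for name, (lo, hi) in spec.items():
        lines.append(f"- {name}: {lo} to {hi}")
    # Optional in-context (few-shot) examples, one description/parameters pair each.
    for text, params in few_shot:
        lines.append(f"Description: {text}")
        lines.append(f"Parameters: {json.dumps(params)}")
    lines.append(f"Description: {description}")
    lines.append("Parameters:")
    return "\n".join(lines)


def parse_params(response, spec):
    """Extract the JSON object from a reply and clamp each value to its range."""
    start, end = response.find("{"), response.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in response")
    raw = json.loads(response[start:end + 1])
    params = {}
    for name, (lo, hi) in spec.items():
        value = float(raw[name])  # raises KeyError if the model omitted a key
        params[name] = min(max(value, lo), hi)
    return params
```

In use, the string returned by `build_prompt` would be sent to whichever LLM is available, and the reply passed to `parse_params`. Clamping is one simple way to keep out-of-range predictions usable; rejecting and re-prompting would be an equally reasonable choice.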

Taxonomy

Topics: Music and Audio Processing