TimberAgent: Gram-Guided Retrieval for Executable Music Effect Control
Shihao He, Yihan Xia, Fang Liu, Taotao Wang, and Shengli Zhang

TL;DR
This paper introduces Texture Resonance Retrieval (TRR), an audio representation built from Gram matrices of projected mid-level Wav2Vec2 activations, to improve retrieval-based control of audio effects in digital audio workstations. The method is validated on a guitar-effects benchmark and in a listening study.
Contribution
The paper presents TRR, a texture-aware retrieval method for audio effect control, and shows that it outperforms CLAP and internal retrieval baselines on a guitar-effects benchmark.
Findings
TRR achieves the lowest normalized parameter error among the compared methods.
A listening study confirms the perceptual relevance of TRR.
Ablation studies validate core design choices of TRR.
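The "normalized parameter error" in the findings can be illustrated with a small sketch. It assumes each effect parameter is min-max scaled by its physical range before errors are compared, as the abstract describes. Averaging absolute differences is an assumption made here for illustration, and the parameter names and ranges (`gain_db`, `delay_ms`) are hypothetical, not taken from the paper.

```python
def normalized_param_error(pred, target, ranges):
    """Mean absolute error after min-max scaling each parameter to [0, 1].

    pred, target: dicts mapping parameter name -> value in physical units.
    ranges: dict mapping parameter name -> (low, high) physical bounds.
    The choice of mean absolute error is an illustrative assumption.
    """
    errors = []
    for name, (low, high) in ranges.items():
        span = high - low
        p = (pred[name] - low) / span      # scale prediction to [0, 1]
        t = (target[name] - low) / span    # scale ground truth to [0, 1]
        errors.append(abs(p - t))
    return sum(errors) / len(errors)


# Hypothetical parameters and ranges, for illustration only.
ranges = {"gain_db": (0.0, 40.0), "delay_ms": (0.0, 1000.0)}
pred = {"gain_db": 30.0, "delay_ms": 250.0}
target = {"gain_db": 20.0, "delay_ms": 500.0}
error = normalized_param_error(pred, target, ranges)
```

Scaling by the physical range keeps a 10 dB gain error and a 250 ms delay error on a comparable footing, so that no single parameter dominates the aggregate because of its units.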
Abstract
Digital audio workstations expose rich effect chains, yet a semantic gap remains between perceptual user intent and low-level signal-processing parameters. We study retrieval-grounded audio effect control, where the output is an editable plugin configuration rather than a finalized waveform. Our focus is Texture Resonance Retrieval (TRR), an audio representation built from Gram matrices of projected mid-level Wav2Vec2 activations. This design preserves texture-relevant co-activation structure. We evaluate TRR on a guitar-effects benchmark with 1,063 candidate presets and 204 queries. The evaluation follows Protocol-A, a cross-validation scheme that prevents train-test leakage. We compare TRR against CLAP and internal retrieval baselines (Wav2Vec-RAG, Text-RAG, FeatureNN-RAG), using min-max normalized metrics grounded in physical DSP parameter ranges. Ablation studies validate TRR's core…
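A minimal sketch of the Gram-matrix idea may help make the abstract concrete. It is not the authors' implementation: random arrays stand in for mid-level Wav2Vec2 activations, the projection matrix is random rather than learned, and flattening the upper triangle and ranking by cosine similarity are assumptions made for illustration.

```python
import numpy as np


def gram_descriptor(activations, projection):
    """Build a texture descriptor from frame-level activations.

    activations: (T, D) array, one row per time frame (stand-in for
                 mid-level Wav2Vec2 features).
    projection:  (D, K) array mapping features to a lower dimension
                 (random here; an assumption).
    """
    z = activations @ projection             # (T, K) projected frames
    gram = z.T @ z / z.shape[0]              # (K, K) time-averaged co-activations
    upper = np.triu_indices(gram.shape[0])   # Gram is symmetric: keep one half
    vec = gram[upper]
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec   # unit length for cosine scoring


def retrieve(query_vec, candidate_vecs, top_k=1):
    """Rank candidates by cosine similarity (all vectors are unit-norm)."""
    scores = candidate_vecs @ query_vec
    return np.argsort(-scores)[:top_k]


# Synthetic stand-ins: 3 candidate clips, 50 frames, 16 features each.
rng = np.random.default_rng(0)
projection = rng.normal(size=(16, 4))
clips = [rng.normal(size=(50, 16)) for _ in range(3)]
candidates = np.stack([gram_descriptor(c, projection) for c in clips])
best = retrieve(candidates[1], candidates)
```

Because the Gram matrix sums outer products over frames, the descriptor does not depend on frame order. It records which projected channels activate together rather than when they do, which is one way to read "texture-relevant co-activation structure".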
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception
