Empirical Results for Adjusting Truncated Backpropagation Through Time while Training Neural Audio Effects
Yann Bourdin (ASTRAL), Pierrick Legrand (ASTRAL, ENSC, IMS), Fanny Roche

TL;DR
This paper explores how to tune Truncated Backpropagation Through Time (TBPTT) hyperparameters when training neural networks that model digital audio effects, with the aim of improving accuracy, training stability, and computational efficiency.
Contribution
It provides empirical insights into tuning TBPTT parameters specifically for neural audio effect modeling, especially for dynamic range compression.
Findings
Optimized TBPTT hyperparameters improve model accuracy.
Careful tuning reduces computational costs.
Subjective tests confirm maintained audio quality.
Abstract
This paper investigates the optimization of Truncated Backpropagation Through Time (TBPTT) for training neural networks in digital audio effect modeling, with a focus on dynamic range compression. The study evaluates key TBPTT hyperparameters (sequence number, batch size, and sequence length) and their influence on model performance. Using a convolutional-recurrent architecture, we conduct extensive experiments across datasets with and without conditioning by user controls. Results demonstrate that carefully tuning these parameters enhances model accuracy and training stability, while also reducing computational demands. Objective evaluations confirm improved performance with optimized settings, while subjective listening tests indicate that the revised TBPTT configuration maintains high perceptual quality.
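To make the three hyperparameters concrete, here is a minimal sketch of how a training signal might be sliced for TBPTT. It rests on an assumption the abstract does not confirm: that "sequence number" is the count of consecutive sequences a recurrent state is carried across before being reset, "sequence length" is the number of samples over which gradients are backpropagated, and "batch size" is the number of independent streams processed in parallel. The function name `tbptt_plan` and all numeric values are hypothetical and are not taken from the paper.

```python
def tbptt_plan(num_samples, seq_len, seq_num, batch_size):
    """Return index ranges for a TBPTT schedule (illustrative only).

    Assumed semantics (not confirmed by the source):
      seq_len    -- samples per sequence; gradients are truncated here
      seq_num    -- consecutive sequences sharing a carried-over state
      batch_size -- independent streams processed in parallel

    Output: plan[batch][step][stream] = (start, end) sample indices.
    """
    segment_len = seq_len * seq_num             # samples seen before a state reset
    num_segments = num_samples // segment_len   # incomplete tail is dropped
    num_batches = num_segments // batch_size    # leftover segments are dropped

    plan = []
    for b in range(num_batches):
        steps = []
        for k in range(seq_num):                # one optimizer update per step
            step = []
            for s in range(batch_size):
                seg_start = (b * batch_size + s) * segment_len
                start = seg_start + k * seq_len
                step.append((start, start + seq_len))
            steps.append(step)
        plan.append(steps)
    return plan


# Hypothetical values: 48000 samples, 1000-sample sequences,
# 4 sequences per state reset, 3 parallel streams.
plan = tbptt_plan(num_samples=48000, seq_len=1000, seq_num=4, batch_size=3)
updates = sum(len(steps) for steps in plan)

print(plan[0][0])  # [(0, 1000), (4000, 5000), (8000, 9000)]
print(plan[0][1])  # [(1000, 2000), (5000, 6000), (9000, 10000)]
print(updates)     # 16
```

Under these assumptions, a longer sequence length raises the memory and compute cost of each update, while a larger sequence number lets the recurrent state warm up over more audio without extending the backpropagation window. This is the kind of trade-off the study examines empirically.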
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Music and Audio Processing
