Empirical Results for Adjusting Truncated Backpropagation Through Time while Training Neural Audio Effects
By: Yann Bourdin, Pierrick Legrand, Fanny Roche
Potential Business Impact:
Trains AI to make music sound better, faster.
This paper investigates the optimization of Truncated Backpropagation Through Time (TBPTT) for training neural networks in digital audio effect modeling, with a focus on dynamic range compression. The study evaluates key TBPTT hyperparameters -- sequence number, batch size, and sequence length -- and their influence on model performance. Using a convolutional-recurrent architecture, we conduct extensive experiments across datasets with and without conditioning by user controls. Results demonstrate that carefully tuning these parameters enhances model accuracy and training stability while also reducing computational demands. Objective evaluations confirm improved performance with optimized settings, and subjective listening tests indicate that the revised TBPTT configuration maintains high perceptual quality.
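To make the three hyperparameters concrete, here is a minimal sketch of how they shape the training data in a TBPTT setup. The helper below is hypothetical and not taken from the paper: it splits a long audio signal into truncated sequences of `seq_len` samples (the window within which gradients flow) and groups them into batches.

```python
# Hypothetical illustration of TBPTT data preparation (not the paper's code).
# In TBPTT, gradients are backpropagated only within each seq_len-sample
# window; the sequence number and batch size control how many such windows
# are processed per epoch and per gradient step.

def make_tbptt_batches(signal, seq_len, batch_size):
    """Split `signal` into non-overlapping sequences of `seq_len` samples,
    then group them into batches of `batch_size` sequences each.
    Incomplete trailing sequences and batches are dropped for simplicity."""
    n_seqs = len(signal) // seq_len  # sequence number available per epoch
    seqs = [signal[i * seq_len:(i + 1) * seq_len] for i in range(n_seqs)]
    n_batches = n_seqs // batch_size
    return [seqs[b * batch_size:(b + 1) * batch_size]
            for b in range(n_batches)]

# Example: one second of 48 kHz audio, 2048-sample truncation, batches of 8.
audio = [0.0] * 48000
batches = make_tbptt_batches(audio, seq_len=2048, batch_size=8)
```

Longer sequences let the recurrent part of the model see more temporal context (useful for slow compressor release behavior) at the cost of memory and compute, which is exactly the trade-off the paper's experiments explore.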