Differentiable Signal Processing With Black-Box Audio Effects (2105.04752v1)

Published 11 May 2021 in eess.AS, cs.LG, cs.SD, and eess.SP

Abstract: We present a data-driven approach to automate audio signal processing by incorporating stateful third-party, audio effects as layers within a deep neural network. We then train a deep encoder to analyze input audio and control effect parameters to perform the desired signal manipulation, requiring only input-target paired audio data as supervision. To train our network with non-differentiable black-box effects layers, we use a fast, parallel stochastic gradient approximation scheme within a standard auto differentiation graph, yielding efficient end-to-end backpropagation. We demonstrate the power of our approach with three separate automatic audio production applications: tube amplifier emulation, automatic removal of breaths and pops from voice recordings, and automatic music mastering. We validate our results with a subjective listening test, showing our approach not only can enable new automatic audio effects tasks, but can yield results comparable to a specialized, state-of-the-art commercial solution for music mastering.

Citations (28)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

YouTube

Show All Videos

Differentiable Signal Processing With Black-Box Audio Effects (2105.04752v1)

Summary

Related Papers

YouTube