Learning Interpretable Representation for Controllable Polyphonic Music Generation (2008.07122v1)
Abstract: While deep generative models have become the leading methods for algorithmic composition, controlling the generation process remains a challenging problem because the latent variables of most deep-learning models lack good interpretability. Inspired by the content-style disentanglement idea, we design a novel architecture, under the VAE framework, that effectively learns two interpretable latent factors of polyphonic music: chord and texture. The current model focuses on learning 8-beat-long piano composition segments. We show that such chord-texture disentanglement provides a controllable generation pathway leading to a wide spectrum of applications, including compositional style transfer, texture variation, and accompaniment arrangement. Both objective and subjective evaluations show that our method achieves successful disentanglement and high-quality controllable music generation.
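As a rough illustration of the chord-texture disentanglement idea described in the abstract, the sketch below shows a VAE with two separate encoders, one per latent factor, whose latents are concatenated and decoded jointly. This is a minimal, hypothetical sketch assuming PyTorch; the layer types, dimensions, input representation, and all names (`ChordTextureVAE`, `vae_loss`, etc.) are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of a two-factor (chord / texture) VAE, assuming PyTorch.
# All module names, layer sizes, and the input representation are
# illustrative placeholders, not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChordTextureVAE(nn.Module):
    def __init__(self, input_dim=128, chord_dim=16, texture_dim=16, hidden=256):
        super().__init__()
        # Two separate encoders, one per interpretable latent factor.
        self.chord_enc = nn.GRU(input_dim, hidden, batch_first=True)
        self.texture_enc = nn.GRU(input_dim, hidden, batch_first=True)
        self.chord_mu = nn.Linear(hidden, chord_dim)
        self.chord_logvar = nn.Linear(hidden, chord_dim)
        self.texture_mu = nn.Linear(hidden, texture_dim)
        self.texture_logvar = nn.Linear(hidden, texture_dim)
        # The decoder conditions on the concatenated chord/texture latents.
        self.dec = nn.GRU(chord_dim + texture_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, input_dim)

    @staticmethod
    def reparameterize(mu, logvar):
        # Standard VAE reparameterization trick.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        # x: (batch, time, input_dim) piano-roll-like frames.
        _, h_c = self.chord_enc(x)
        _, h_t = self.texture_enc(x)
        mu_c, lv_c = self.chord_mu(h_c[-1]), self.chord_logvar(h_c[-1])
        mu_t, lv_t = self.texture_mu(h_t[-1]), self.texture_logvar(h_t[-1])
        z_c = self.reparameterize(mu_c, lv_c)
        z_t = self.reparameterize(mu_t, lv_t)
        # Broadcast the latent pair over the output time axis.
        z = torch.cat([z_c, z_t], dim=-1)
        z = z.unsqueeze(1).expand(-1, x.size(1), -1)
        h, _ = self.dec(z)
        return self.out(h), (mu_c, lv_c), (mu_t, lv_t)


def vae_loss(recon, target, chord_stats, texture_stats):
    # Reconstruction term plus one KL term per latent factor.
    rec = F.binary_cross_entropy_with_logits(recon, target)
    kl = sum(
        -0.5 * torch.mean(1 + lv - mu.pow(2) - lv.exp())
        for mu, lv in (chord_stats, texture_stats)
    )
    return rec + kl
```

Under this sketch, the controllable applications the abstract lists amount to recombining latents at inference time: for example, a compositional style transfer would encode two segments and decode the chord latent of one with the texture latent of the other.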