Abstract

Symbolic music generation relies on the contextual representation capabilities of the generative model, and the most prevalent approach is the Transformer-based model. Learning musical context is also tied to the structural elements of music, e.g., the intro, verse, and chorus, which are currently overlooked by the research community. In this paper, we propose a hierarchical Transformer model to learn multi-scale contexts in music. In the encoding phase, we first design a Fragment Scope Localization layer to segment the music into chords and sections. We then use a multi-scale attention mechanism to learn note-, chord-, and section-level contexts. In the decoding phase, we propose a hierarchical Transformer model that uses fine decoders to generate sections in parallel and a coarse decoder to decode the combined music. We also design a Music Style Normalization layer to maintain a consistent musical style across the generated sections. Our model is evaluated on two open MIDI datasets, and experiments show that it outperforms the best contemporary music generation models. More excitingly, visual evaluation shows that our model excels at melody reuse, resulting in more realistic music.
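
The abstract only sketches the multi-scale attention idea, so here is a minimal, hypothetical illustration of what note-, chord-, and section-level context learning could look like. The module and helper names (MultiScaleEncoder, pool_by_segment, chord_ids, section_ids) and all hyperparameters are our own assumptions for illustration, not the authors' released code or the actual Fragment Scope Localization layer.

```python
# Hypothetical sketch: attend over note-, chord-, and section-level views of a piece.
# Names and shapes are assumptions; this is not the paper's implementation.
import torch
import torch.nn as nn


def pool_by_segment(x: torch.Tensor, seg_ids: torch.Tensor, num_segs: int) -> torch.Tensor:
    """Mean-pool note embeddings that share a segment (chord or section) id.

    x:       (seq_len, d_model) note-level embeddings
    seg_ids: (seq_len,) integer segment id per note
    returns: (num_segs, d_model) one embedding per segment
    """
    d_model = x.size(-1)
    sums = torch.zeros(num_segs, d_model).index_add_(0, seg_ids, x)
    counts = torch.zeros(num_segs).index_add_(0, seg_ids, torch.ones(seg_ids.size(0)))
    return sums / counts.clamp(min=1).unsqueeze(-1)


class MultiScaleEncoder(nn.Module):
    """Self-attention over notes plus cross-attention to chord and section summaries."""

    def __init__(self, d_model: int = 256, num_heads: int = 4):
        super().__init__()
        self.note_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.chord_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.section_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.fuse = nn.Linear(3 * d_model, d_model)

    def forward(self, notes, chord_ids, section_ids):
        # notes: (seq_len, d_model); chord_ids / section_ids: (seq_len,) long tensors
        chords = pool_by_segment(notes, chord_ids, int(chord_ids.max()) + 1)
        sections = pool_by_segment(notes, section_ids, int(section_ids.max()) + 1)

        # Note-level self-attention, plus cross-attention to coarser summaries.
        n, _ = self.note_attn(notes[None], notes[None], notes[None])
        c, _ = self.chord_attn(notes[None], chords[None], chords[None])
        s, _ = self.section_attn(notes[None], sections[None], sections[None])

        # Concatenate the three context views per note and project back to d_model.
        return self.fuse(torch.cat([n, c, s], dim=-1)).squeeze(0)


if __name__ == "__main__":
    seq_len, d_model = 32, 256
    notes = torch.randn(seq_len, d_model)
    chord_ids = torch.arange(seq_len) // 4      # toy segmentation: 4 notes per chord
    section_ids = torch.arange(seq_len) // 16   # toy segmentation: 16 notes per section
    out = MultiScaleEncoder(d_model)(notes, chord_ids, section_ids)
    print(out.shape)  # torch.Size([32, 256])
```

In the paper, the chord and section boundaries would come from the proposed Fragment Scope Localization layer rather than the fixed toy segmentation used above, and the fused note representations would feed the hierarchical fine/coarse decoders.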
