Generating Piano Music with Transformers: A Comparative Study of Scale, Data, and Metrics
By: Jonathan Lehmkuhl, Ábel Ilyés-Kun, Nico Bremes, and others
Potential Business Impact:
Makes computer-made music sound like a human wrote it.
Although a variety of transformers have been proposed for symbolic music generation in recent years, there has been little comprehensive study of how specific design choices affect the quality of the generated music. In this work, we systematically compare different datasets, model architectures, model sizes, and training strategies for the task of symbolic piano music generation. To support model development and evaluation, we examine a range of quantitative metrics and analyze how well they correlate with human judgments collected through listening studies. Our best-performing model, a 950M-parameter transformer trained on 80K MIDI files from diverse genres, produces outputs that are often rated as human-composed in a Turing-style listening survey.
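The abstract mentions checking how well automatic metrics correlate with human judgments from listening studies. A standard way to quantify this is a rank correlation such as Spearman's rho between per-piece metric scores and mean listener ratings. The sketch below is illustrative only (the variable names and data are made up, not from the paper), using only the Python standard library:

```python
# Hypothetical sketch: correlating an automatic quality metric with
# human listening-study ratings via Spearman rank correlation.

def ranks(xs):
    # Assign average ranks (1-based), handling ties.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of tied rank positions
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(a, b):
    # Pearson correlation computed on the ranks of a and b.
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)

# Made-up example: four generated pieces, one metric value and one
# mean listener rating each.
metric_scores = [0.2, 0.5, 0.9, 0.4]
human_ratings = [1.0, 2.0, 4.0, 3.0]
print(spearman(metric_scores, human_ratings))  # → 0.8
```

A rank correlation is often preferred over Pearson here because listener ratings are ordinal and metric scales are arbitrary; only the ordering of pieces needs to agree.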
Similar Papers
Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training
Sound
Makes music sound like a real person played it.
Difficulty-Controlled Simplification of Piano Scores with Synthetic Data for Inclusive Music Education
Sound
Makes learning piano easier for everyone.