A Survey on Evaluation Metrics for Music Generation
By: Faria Binte Kader, Santu Karmaker
Potential Business Impact:
Helps judge if computer-made music sounds good.
Despite significant advances in music generation systems, methodologies for evaluating generated music have not kept pace, owing to the complex nature of music, which involves aspects such as structure, coherence, creativity, and emotional expressiveness. In this paper, we shed light on this research gap and introduce a detailed taxonomy of evaluation metrics for both audio and symbolic music representations. We include a critical review that identifies major limitations in current evaluation methodologies, including poor correlation between objective metrics and human perception, cross-cultural bias, and a lack of standardization that hinders cross-model comparisons. To address these gaps, we further propose future research directions toward building a comprehensive framework for evaluating music generation.
Similar Papers
Survey on the Evaluation of Generative Models in Music
Sound
Helps judge if computer-made music sounds good.
Multidimensional Music Aesthetic Evaluation via Semantically Consistent C-Mixup Augmentation
Sound
Makes music sound better by learning what people like.
Factual and Musical Evaluation Metrics for Music Language Models
Sound
Tests if music AI answers questions correctly.