Contrastive timbre representations for musical instrument and synthesizer retrieval
By: Gwendal Le Vaillant, Yannick Molle
Potential Business Impact:
Find specific instrument sounds in music.
Efficiently retrieving specific instrument timbres from audio mixtures remains a challenge in digital music production. This paper introduces a contrastive learning framework for musical instrument retrieval, enabling direct querying of instrument databases using a single model for both single- and multi-instrument sounds. We propose techniques to generate realistic positive/negative pairs of sounds for virtual musical instruments, such as samplers and synthesizers, addressing limitations in common audio data augmentation methods. The first experiment focuses on instrument retrieval from a dataset of 3,884 instruments, using single-instrument audio as input. Contrastive approaches are competitive with previous works based on classification pre-training. The second experiment considers multi-instrument retrieval with a mixture of instruments as audio input. In this case, the proposed contrastive framework outperforms related works, achieving 81.7% top-1 and 95.7% top-5 accuracies for three-instrument mixtures.
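To make the reported top-1 and top-5 figures concrete, the following is a minimal sketch of how retrieval accuracy over an embedding database could be computed, together with a generic contrastive objective. It is not the authors' implementation: the abstract does not name the loss, so the InfoNCE form, the cosine similarity, the temperature value, and all function names here are assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not necessarily the paper's loss).

    Row i of `anchors` and row i of `positives` form a positive pair;
    every other row of `positives` acts as a negative for anchor i.
    """
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    logits = a @ p.T / temperature                     # shape (N, N)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def top_k_accuracy(queries, database, true_ids, k):
    """Fraction of queries whose true instrument is among the k nearest entries."""
    sims = l2_normalize(queries) @ l2_normalize(database).T
    top_k = np.argsort(-sims, axis=1)[:, :k]           # most similar first
    hits = (top_k == np.asarray(true_ids)[:, None]).any(axis=1)
    return float(hits.mean())


# Toy usage with random vectors standing in for learned timbre embeddings.
rng = np.random.default_rng(0)
database = rng.normal(size=(50, 16))                   # 50 hypothetical instruments
true_ids = np.arange(10)
queries = database[true_ids] + 0.01 * rng.normal(size=(10, 16))

top1 = top_k_accuracy(queries, database, true_ids, k=1)
top5 = top_k_accuracy(queries, database, true_ids, k=5)
loss = info_nce_loss(queries, database[true_ids])
```

In this sketch a query counts as correct at top-k when its ground-truth instrument appears anywhere in the k most similar database entries, which is why top-5 accuracy can never be lower than top-1.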
Similar Papers
Learning Separated Representations for Instrument-based Music Similarity
Sound
Find songs by their instruments, not just the whole song.
Melody or Machine: Detecting Synthetic Music with Dual-Stream Contrastive Learning
Sound
Detects fake music made by computers.
Automatic Music Sample Identification with Multi-Track Contrastive Learning
Sound
Finds original songs used in new music.