
Contrastive timbre representations for musical instrument and synthesizer retrieval

Published: September 16, 2025 | arXiv ID: 2509.13285v1

By: Gwendal Le Vaillant, Yannick Molle

Potential Business Impact:

Find specific instrument sounds in music.

Business Areas:
Musical Instruments, Media and Entertainment, Music and Audio

Efficiently retrieving specific instrument timbres from audio mixtures remains a challenge in digital music production. This paper introduces a contrastive learning framework for musical instrument retrieval, enabling direct querying of instrument databases using a single model for both single- and multi-instrument sounds. We propose techniques to generate realistic positive/negative pairs of sounds for virtual musical instruments, such as samplers and synthesizers, addressing limitations in common audio data augmentation methods. The first experiment focuses on instrument retrieval from a dataset of 3,884 instruments, using single-instrument audio as input. Contrastive approaches are competitive with previous works based on classification pre-training. The second experiment considers multi-instrument retrieval with a mixture of instruments as audio input. In this case, the proposed contrastive framework outperforms related works, achieving 81.7% top-1 and 95.7% top-5 accuracies for three-instrument mixtures.
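The top-1 and top-5 figures quoted in the abstract are retrieval accuracies: a query counts as a hit when the correct instrument appears among the k nearest database entries. The sketch below illustrates that metric only. It assumes cosine similarity between embedding vectors and one target instrument per query; the abstract does not state the similarity measure or the exact evaluation protocol, and the function names and toy vectors here are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_accuracy(queries, targets, database, k):
    """Fraction of queries whose target index is among the k most similar entries.

    queries:  list of query embeddings (hypothetical model outputs)
    targets:  index in `database` of the correct instrument for each query
    database: list of instrument embeddings
    """
    hits = 0
    for query, target in zip(queries, targets):
        # Rank database indices from most to least similar to the query.
        ranked = sorted(
            range(len(database)),
            key=lambda i: cosine(query, database[i]),
            reverse=True,
        )
        if target in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Toy 2-D embeddings, purely illustrative (not data from the paper).
database = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
queries = [[0.9, 0.1], [0.6, 0.8], [0.2, 0.9]]
targets = [0, 2, 2]

top1 = top_k_accuracy(queries, targets, database, k=1)
top2 = top_k_accuracy(queries, targets, database, k=2)
```

In this toy setup the third query ranks the wrong instrument first and the correct one second, so it counts as a miss for k=1 and a hit for k=2. For multi-instrument mixtures, the same ranking would need a rule for scoring several target instruments per query, which this sketch does not attempt.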

Page Count
5 pages

Category
Computer Science:
Sound