Learning Separated Representations for Instrument-based Music Similarity

Published: March 21, 2025 | arXiv ID: 2503.17281v2

By: Yuka Hashizume, Li Li, Atsushi Miyashita, and more

Potential Business Impact:

Find songs by their instruments, not just the whole song.

Business Areas:
Musical Instruments, Media and Entertainment, Music and Audio

A flexible music recommendation and retrieval system requires similarity measures over multiple partial elements of musical pieces, so that users can select the element they want to focus on. Learning music similarity with multiple networks, each fed an individual instrumental signal, is effective, but it has two problems: requiring clean instrumental signals as queries is impractical for retrieval systems, and substituting separated instrumental signals reduces accuracy owing to separation artifacts. In this paper, we present instrumental-part-based music similarity learning with a single network that takes mixed signals as input instead of individual instrumental signals. Specifically, we design a single similarity embedding space with separated subspaces for each instrument, extracted by Conditional Similarity Networks trained using the triplet loss with masks. Experimental results show that (1) the proposed method obtains more accurate embedding representations than individual networks fed separated signals, in the evaluation of an instrument that previously had low accuracy; (2) each sub-embedding space captures the characteristics of the corresponding instrument; and (3) similar musical pieces selected by the proposed method while focusing on each instrumental sound are accepted by human listeners, especially when the focus is on timbre.
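To make the core idea concrete, below is a minimal PyTorch sketch of a Conditional Similarity Network with a masked triplet loss, the general technique the abstract names. It is not the paper's architecture: the encoder is a placeholder MLP (the paper uses an audio encoder over mixed signals), and all names, dimensions, and the learnable-mask initialization are illustrative assumptions. Each instrument condition selects a non-negative mask over the shared embedding, and the triplet loss is computed inside that masked subspace.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalSimilarityNet(nn.Module):
    """Single encoder whose embedding is split into per-instrument
    subspaces by learnable condition masks (Conditional Similarity
    Networks style). Encoder is a toy MLP stand-in for an audio model."""

    def __init__(self, input_dim=128, embed_dim=64, num_instruments=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )
        # One mask per instrument; ReLU in forward() keeps masks
        # non-negative so each condition softly selects a subspace.
        self.masks = nn.Embedding(num_instruments, embed_dim)
        nn.init.uniform_(self.masks.weight, 0.9, 1.1)

    def forward(self, x, condition):
        z = self.encoder(x)                # shared embedding of the mix
        m = F.relu(self.masks(condition))  # instrument-specific mask
        return F.normalize(z * m, dim=-1)  # masked sub-embedding


def masked_triplet_loss(model, anchor, positive, negative, condition, margin=0.2):
    """Triplet loss computed inside the sub-embedding chosen by `condition`."""
    za = model(anchor, condition)
    zp = model(positive, condition)
    zn = model(negative, condition)
    d_pos = (za - zp).pow(2).sum(-1)
    d_neg = (za - zn).pow(2).sum(-1)
    return F.relu(d_pos - d_neg + margin).mean()


# Toy usage: batch of 8 feature vectors, condition 2 standing in for one instrument.
model = ConditionalSimilarityNet()
a, p, n = (torch.randn(8, 128) for _ in range(3))
cond = torch.full((8,), 2, dtype=torch.long)
loss = masked_triplet_loss(model, a, p, n, cond)
loss.backward()
```

Because all conditions share one encoder over the mixed signal, retrieval needs no source separation at query time; switching the focused instrument only switches the mask.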

Page Count
26 pages

Category
Computer Science: Sound