Location-Oriented Sound Event Localization and Detection with Spatial Mapping and Regression Localization
By: Xueping Zhang, Yaxiong Chen, Ruilin Yao and more
Potential Business Impact:
Detects and locates many sounds at once, even when they overlap.
Sound Event Localization and Detection (SELD) combines Sound Event Detection (SED) with estimation of the corresponding Direction of Arrival (DOA). Recently adopted event-oriented multi-track methods generalize poorly in polyphonic environments because the number of tracks is limited. To improve generality in polyphonic environments, we propose Spatial Mapping and Regression Localization for SELD (SMRL-SELD). SMRL-SELD segments 3D space and maps it to a 2D plane, and a new regression localization loss is proposed to help the predictions converge toward the location of the corresponding event. Because SMRL-SELD is location-oriented, the model learns event features based on orientation. The method therefore enables the model to process polyphonic sounds regardless of the number of overlapping events. We conducted experiments on the STARSS23 and STARSS22 datasets, and the proposed SMRL-SELD outperforms existing SELD methods both in overall evaluation and in polyphonic environments.
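To make the location-oriented idea concrete, here is a minimal Python sketch of one way 3D directions could be segmented and mapped onto a 2D plane. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the segmentation, so the uniform azimuth/elevation grid, the 36 x 18 resolution, and the helper names `doa_to_cell` and `events_to_grid` are all hypothetical.

```python
def doa_to_cell(azimuth_deg, elevation_deg, n_az=36, n_el=18):
    """Map a direction of arrival to a (row, col) cell on a 2D grid.

    Assumption: a uniform grid over azimuth [-180, 180) and
    elevation [-90, 90]; the paper's actual segmentation may differ.
    """
    # Wrap azimuth into [-180, 180) and clamp elevation into [-90, 90].
    az = (azimuth_deg + 180.0) % 360.0 - 180.0
    el = max(-90.0, min(90.0, elevation_deg))
    # Angular width of one cell along each axis.
    az_step = 360.0 / n_az
    el_step = 180.0 / n_el
    # Floor-divide to get the cell index; clamp so the upper edge
    # (for example elevation == 90) falls into the last cell.
    col = min(int((az + 180.0) // az_step), n_az - 1)
    row = min(int((el + 90.0) // el_step), n_el - 1)
    return row, col


def events_to_grid(events, n_az=36, n_el=18):
    """Group (label, azimuth_deg, elevation_deg) events by grid cell.

    Events are keyed by location rather than assigned to a fixed set of
    tracks, so the number of simultaneous events is not capped by a
    track count in this representation.
    """
    grid = {}
    for label, az, el in events:
        cell = doa_to_cell(az, el, n_az, n_el)
        grid.setdefault(cell, []).append(label)
    return grid


# Three overlapping events in one frame (made-up example values).
frame = [("speech", 35.0, 5.0), ("door", -120.0, -40.0), ("music", 100.0, 20.0)]
for cell, labels in sorted(events_to_grid(frame).items()):
    print(cell, labels)
```

In this sketch each active event occupies the cell that contains its direction, which is why overlapping events at different locations do not compete for a limited number of output tracks. A regression term, as described in the abstract, would then refine the estimate within or around that cell; its exact form is not given here, so it is not sketched.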
Similar Papers
Reverberation-based Features for Sound Event Localization and Detection with Distance Estimation
Audio and Speech Processing
Helps robots hear where sounds are coming from.
Integrating Spatial and Semantic Embeddings for Stereo Sound Event Localization in Videos
Audio and Speech Processing
Helps computers understand sounds and sights together.
Generating Diverse Audio-Visual 360 Soundscapes for Sound Event Localization and Detection
Sound
Makes robots hear and see where sounds come from.