Infant Cry Detection Using Causal Temporal Representation
By: Minghao Fu, Danning Li, Aryan Gadhiya, and more
Potential Business Impact:
Helps machines hear baby cries in noisy places.
This paper addresses a major challenge in acoustic event detection, specifically infant cry detection in the presence of other sounds and background noise: the scarcity of precisely annotated data. We present two contributions, one supervised and one unsupervised, to infant cry detection. The first is an annotated dataset for cry segmentation, which enables supervised models to achieve state-of-the-art performance. The second is a novel unsupervised method, Causal Representation Spare Transition Clustering (CRSTC), based on causal temporal representation, which addresses data scarcity more generally. By integrating the detected cry segments, we significantly improve the performance of downstream infant cry classification, highlighting the potential of this approach for infant care applications.
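The abstract describes integrating detected cry segments into downstream classification. As a rough illustration of that segmentation step, the sketch below turns frame-level cry probabilities into time segments by thresholding and merging consecutive frames. This is a minimal assumption-laden toy, not the paper's CRSTC method: the threshold, minimum length, and frame rate are all hypothetical choices for illustration.

```python
# Hedged sketch: convert frame-level cry probabilities (e.g. from any
# detector) into (start, end) frame segments. Thresholds are illustrative
# assumptions, not values from the paper.

def frames_to_segments(probs, threshold=0.5, min_len=3):
    """Merge consecutive above-threshold frames into (start, end) segments,
    dropping segments shorter than min_len frames."""
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # segment begins at first frame above threshold
        elif p < threshold and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    if start is not None and len(probs) - start >= min_len:
        segments.append((start, len(probs)))  # segment runs to the end
    return segments

# Example: hypothetical frame-level cry probabilities.
probs = [0.1, 0.2, 0.9, 0.8, 0.95, 0.3, 0.7, 0.9, 0.85, 0.9, 0.2]
print(frames_to_segments(probs))  # -> [(2, 5), (6, 10)]
```

A downstream classifier would then be applied only to the audio inside these segments, which is the general sense in which detected segments can improve cry classification.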
Similar Papers
Speech transformer models for extracting information from baby cries
Sound
Helps computers understand baby cries and emotions.
Infant Cry Detection In Noisy Environment Using Blueprint Separable Convolutions and Time-Frequency Recurrent Neural Network
Sound
Helps machines tell if a baby is crying.
Making deep neural networks work for medical audio: representation, compression and domain adaptation
Sound
Helps doctors hear sickness in baby cries.