Real-Time Pitch/F0 Detection Using Spectrogram Images and Convolutional Neural Networks
By: Xufang Zhao, Omer Tsimhoni
Potential Business Impact:
Helps computers hear singing pitch better.
This paper presents a novel approach to F0 detection that uses Convolutional Neural Networks and image processing techniques to estimate pitch directly from spectrogram images. Our approach achieves good detection accuracy: 92% of predicted pitch contours have strong or moderate correlations with the true pitch contours. Furthermore, an experimental comparison with other state-of-the-art CNN methods shows that our approach improves the detection rate by approximately 5% across various Signal-to-Noise Ratio conditions.
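The abstract reports accuracy as the share of predicted pitch contours that correlate strongly or moderately with the true contours. The sketch below shows one way such a figure could be computed. It is not the authors' evaluation code: the use of the Pearson coefficient, the 0.7 and 0.4 cut-offs, the choice to treat negative correlation as weak, and the function names `pearson`, `correlation_label`, and `share_strong_or_moderate` are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant contour")
    return cov / math.sqrt(var_x * var_y)


def correlation_label(r, strong=0.7, moderate=0.4):
    """Bucket a coefficient; the thresholds are assumed, not from the paper."""
    # Assumption: a negatively correlated contour is a poor match, so it is "weak".
    if r >= strong:
        return "strong"
    if r >= moderate:
        return "moderate"
    return "weak"


def share_strong_or_moderate(pairs):
    """Fraction of (true, predicted) contour pairs labeled strong or moderate."""
    labels = [correlation_label(pearson(true, pred)) for true, pred in pairs]
    return sum(label != "weak" for label in labels) / len(labels)


# Made-up F0 contours in Hz, one value per frame.
true_f0 = [220.0, 225.0, 233.0, 240.0, 236.0]
shifted = [f + 3.0 for f in true_f0]      # same shape, constant offset
reversed_f0 = list(reversed(true_f0))     # shape does not follow the truth

print(share_strong_or_moderate([(true_f0, shifted), (true_f0, reversed_f0)]))
```

Because correlation measures only whether two contours move together, a prediction with a constant offset still scores as a perfect match here; an evaluation that also needs absolute pitch accuracy would have to add an error measure in Hz or cents.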
Similar Papers
Pitch Contour Exploration Across Audio Domains: A Vision-Based Transfer Learning Approach
Audio and Speech Processing
Helps computers understand all sounds by their shape.
SwiftF0: Fast and Accurate Monophonic Pitch Detection
Sound
Finds the exact musical note in noisy songs.
Predicting Music Track Popularity by Convolutional Neural Networks on Spotify Features and Spectrogram of Audio Waveform
Sound
Predicts which songs will be hits.