FCPE: A Fast Context-based Pitch Estimation Model
By: Yuxin Luo, Ruoyi Zhang, Lu-Chuan Liu and more
Potential Business Impact:
Helps computers hear singing clearly, even with noise.
Pitch estimation (PE) in monophonic audio is crucial for MIDI transcription and singing voice conversion (SVC), but existing methods degrade significantly under noise. In this paper, we propose FCPE, a fast context-based pitch estimation model that employs a Lynx-Net architecture with depth-wise separable convolutions to capture mel spectrogram features effectively while keeping computational cost low and noise tolerance robust. Experiments show that our method achieves 96.79% Raw Pitch Accuracy (RPA) on the MIR-1K dataset, on par with state-of-the-art methods. The Real-Time Factor (RTF) is 0.0062 on a single RTX 4090 GPU, making it significantly more efficient than existing algorithms. Code is available at https://github.com/CNChTu/FCPE.
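The two metrics quoted in the abstract can be illustrated with a minimal sketch. It is not taken from the FCPE repository: the function names, the toy frame values, and the timing figures are hypothetical, and the 50-cent tolerance is the commonly used RPA convention, assumed here because the abstract does not state the threshold.

```python
import math


def raw_pitch_accuracy(ref_hz, est_hz, tolerance_cents=50.0):
    """Fraction of reference-voiced frames whose estimate is within the tolerance.

    A frequency of 0 marks an unvoiced frame (an assumed convention).
    """
    voiced = 0
    correct = 0
    for ref, est in zip(ref_hz, est_hz):
        if ref <= 0:
            continue  # frame is unvoiced in the reference, so it is not scored
        voiced += 1
        # Pitch distance in cents: 1200 cents per octave.
        if est > 0 and abs(1200.0 * math.log2(est / ref)) <= tolerance_cents:
            correct += 1
    return correct / voiced if voiced else 0.0


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration; below 1 is faster than real time."""
    return processing_seconds / audio_seconds


# Toy data (hypothetical): five frames of reference and estimated pitch in Hz.
ref = [220.0, 220.0, 0.0, 440.0, 440.0]
est = [221.0, 233.0, 100.0, 439.0, 0.0]

rpa = raw_pitch_accuracy(ref, est)     # 2 of the 4 voiced frames are within 50 cents
rtf = real_time_factor(0.062, 10.0)    # 10 s of audio processed in 0.062 s
print(f"RPA = {rpa:.2%}, RTF = {rtf:.4f}")
```

Read this way, an RTF of 0.0062 corresponds to roughly 0.062 seconds of compute for every 10 seconds of audio.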
Similar Papers
SwiftF0: Fast and Accurate Monophonic Pitch Detection
Sound
Finds the exact musical note in noisy songs.
BERT-APC: A Reference-free Framework for Automatic Pitch Correction via Musical Context Inference
Audio and Speech Processing
Makes singing sound better without losing emotion.
Real-Time Pitch/F0 Detection Using Spectrogram Images and Convolutional Neural Networks
Sound
Helps computers hear singing pitch better.