HiFi-Stream: Streaming Speech Enhancement with Generative Adversarial Networks
By: Ekaterina Dmitrieva, Maksim Kaledin
Potential Business Impact:
Makes phone calls clearer on slow phones.
Speech enhancement techniques have become core technologies in mobile devices and voice software. Still, modern deep learning solutions often require large amounts of computational resources, which makes them challenging to use on low-resource devices. We present HiFi-Stream, an optimized version of the recently published HiFi++ model. Our experiments demonstrate that HiFi-Stream preserves most of the quality of the original model while reducing its size and computational complexity relative to HiFi++, making it one of the smallest and fastest models available. The model is evaluated in a streaming setting, where it outperforms modern baselines.
Similar Papers
HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution
Sound
Makes bad audio sound clear and loud.
Real-Time Streamable Generative Speech Restoration with Flow Matching
Signal Processing
Makes computers talk clearly in real-time.
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming
CV and Pattern Recognition
Makes super clear videos much, much faster.