Deep Feed-Forward Neural Network for Bangla Isolated Speech Recognition
By: Dipayan Bhadra, Mehrab Hosain, Fatema Alam
Potential Business Impact:
Lets computers understand spoken Bengali words.
Speech is the most important human-machine interfacing tool, yet very little work has been carried out on Bangla speech recognition compared to English. Motivated by this, this work implements and analyzes a speaker-independent isolated speech recognition system using a newly created dataset containing both Bangla and English isolated spoken words. The proposed approach uses Mel Frequency Cepstral Coefficients (MFCC) as features and a 7-layer Deep Feed-Forward Fully Connected Neural Network (DFFNN) as the classifier to recognize isolated spoken words. It achieves 93.42% recognition accuracy, which is better than most previous work on Bangla speech recognition when the number of classes and the dataset size are taken into account.
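To make the classifier stage concrete, the following is a minimal sketch of a forward pass through a deep feed-forward fully connected network applied to MFCC feature vectors. It is not the authors' implementation. The abstract gives only the layer count, so everything else here is an assumption: "7 layers" is read as seven weight layers, and the layer widths, the input size (13 coefficients over 40 frames, flattened), the class count, the ReLU activation, and the random stand-in features are all hypothetical.

```python
import numpy as np

def init_dffnn(layer_sizes, seed=0):
    """Create (weight, bias) pairs for consecutive fully connected layers."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # He-style initialization, a common choice for ReLU layers (assumption).
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params

def forward(params, x):
    """Return class probabilities for a batch of flattened MFCC vectors."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layer
    w, b = params[-1]
    logits = h @ w + b
    # Numerically stable softmax over the class axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical sizes: 13 MFCCs x 40 frames per word, 10 word classes.
N_FEATURES = 13 * 40
N_CLASSES = 10
# Eight sizes give seven weight layers (six hidden layers plus the output layer).
layer_sizes = [N_FEATURES, 512, 256, 256, 128, 128, 64, N_CLASSES]

params = init_dffnn(layer_sizes)
# Random stand-in for real MFCC features of four utterances.
batch = np.random.default_rng(1).normal(size=(4, N_FEATURES))
probs = forward(params, batch)
predicted = probs.argmax(axis=1)  # predicted class index per utterance
```

In a real pipeline the random `batch` would be replaced by MFCC features extracted from recorded words, and the weights would be learned by training rather than left at their initial values.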
Similar Papers
Zero-Shot to Zero-Lies: Detecting Bengali Deepfake Audio through Transfer Learning
Sound
Finds fake Bengali voices in audio recordings.
Isolated Bangla Handwritten Character Classification using Transfer Learning
CV and Pattern Recognition
Helps computers read handwritten Bangla words.
Enhancing Neural Spoken Language Recognition: An Exploration with Multilingual Datasets
Sound
Lets computers understand many spoken languages.