A Multi-Stage Hybrid CNN-Transformer Network for Automated Pediatric Lung Sound Classification

Published: July 27, 2025 | arXiv ID: 2507.20408v1

By: Samiul Based Shuvo, Taufiq Hasan

Potential Business Impact:

Helps doctors find sick kids' lung problems.

Business Areas:
Speech Recognition Data and Analytics, Software

Automated analysis of lung sound auscultation is essential for monitoring respiratory health, especially in regions facing a shortage of skilled healthcare workers. While respiratory sound classification has been widely studied in adults, its application in pediatric populations, particularly in children aged <6 years, remains an underexplored area. The developmental changes in pediatric lungs considerably alter the acoustic properties of respiratory sounds, necessitating specialized classification approaches tailored to this age group. To address this, we propose a multistage hybrid CNN-Transformer framework that combines CNN-extracted features with an attention-based architecture to classify pediatric respiratory diseases using scalogram images from both full recordings and individual breath events. Our model achieved an overall score of 0.9039 in binary event classification and 0.8448 in multiclass event classification by employing class-wise focal loss to address data imbalance. At the recording level, the model attained scores of 0.720 for ternary and 0.571 for multiclass classification. These scores outperform the previous best models by 3.81% and 5.94%, respectively. This approach offers a promising solution for scalable pediatric respiratory disease diagnosis, especially in resource-limited settings.
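
The abstract credits class-wise focal loss with handling class imbalance but does not give its formulation. The sketch below shows one common form of the idea, not the authors' implementation: a per-class weight alpha scales the loss, and the factor (1 - p_t)^gamma down-weights examples the model already classifies confidently. The function names, the inverse-frequency weighting scheme, and the default gamma = 2.0 are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inverse_frequency_weights(class_counts):
    """Hypothetical weighting scheme: rarer classes receive larger alpha.

    Weights are normalised so that a perfectly balanced dataset gives 1.0
    for every class.
    """
    total = sum(class_counts)
    n_classes = len(class_counts)
    return [total / (n_classes * count) for count in class_counts]


def class_wise_focal_loss(logits, target, alpha, gamma=2.0):
    """Focal loss for one example with a per-class weight alpha[target].

    FL = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted
    probability of the true class. With gamma = 0 and alpha = 1 this reduces
    to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha[target] * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # Illustrative counts only (not from the paper): 90 normal vs 10 abnormal events.
    alpha = inverse_frequency_weights([90, 10])
    easy = class_wise_focal_loss([4.0, 0.0], target=0, alpha=alpha)
    hard = class_wise_focal_loss([4.0, 0.0], target=1, alpha=alpha)
    print(f"alpha={alpha}, easy-example loss={easy:.6f}, hard-example loss={hard:.4f}")
```

In this form, a misclassified example from the rare class dominates the loss, while confidently correct majority-class examples contribute very little. That is the general mechanism by which focal loss counteracts imbalance.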

Country of Origin
🇧🇩 Bangladesh

Page Count
9 pages

Category
Electrical Engineering and Systems Science:
Signal Processing