Accurate Diagnosis of Respiratory Viruses Using an Explainable Machine Learning with Mid-Infrared Biomolecular Fingerprinting of Nasopharyngeal Secretions
By: Wenwen Zhang , Zhouzhuo Tang , Yingmei Feng and more
Potential Business Impact:
Finds viruses in mucus fast and accurately.
Accurate identification of respiratory viruses (RVs) is critical for outbreak control and public health. This study presents a diagnostic system that combines Attenuated Total Reflectance Fourier Transform Infrared Spectroscopy (ATR-FTIR) from nasopharyngeal secretions with an explainable Rotary Position Embedding-Sparse Attention Transformer (RoPE-SAT) model to accurately identify multiple RVs within 10 minutes. Spectral data (4000-00 cm-1) were collected, and the bio-fingerprint region (1800-900 cm-1) was employed for analysis. Standard normal variate (SNV) normalization and second-order derivation were applied to reduce scattering and baseline drift. Gradient-weighted class activation mapping (Grad-CAM) was employed to generate saliency maps, highlighting spectral regions most relevant to classification and enhancing the interpretability of model outputs. Two independent cohorts from Beijing Youan Hospital, processed with different viral transport media (VTMs) and drying methods, were evaluated, with one including influenza B, SARS-CoV-2, and healthy controls, and the other including mycoplasma, SARS-CoV-2, and healthy controls. The model achieved sensitivity and specificity above 94.40% across both cohorts. By correlating model-selected infrared regions with known biomolecular signatures, we verified that the system effectively recognizes virus-specific spectral fingerprints, including lipids, Amide I, Amide II, Amide III, nucleic acids, and carbohydrates, and leverages their weighted contributions for accurate classification.
Similar Papers
Prediction of Respiratory Syncytial Virus-Associated Hospitalizations Using Machine Learning Models Based on Environmental Data
Machine Learning (CS)
Predicts baby sickness outbreaks using trash and weather.
A Robust Support Vector Machine Approach for Raman COVID-19 Data Classification
Quantitative Methods
Helps doctors find sickness faster with light.
Inferring Transmission Dynamics of Respiratory Syncytial Virus from Houston Wastewater
Applications
Tracks sickness spread using sewer water.