DiffNMR: Diffusion Models for Nuclear Magnetic Resonance Spectra Elucidation
By: Qingsong Yang, Binglan Wu, Xuwei Liu and more
Potential Business Impact:
Helps scientists figure out what molecules look like.
Nuclear Magnetic Resonance (NMR) spectroscopy is a central characterization method for molecular structure elucidation, yet interpreting NMR spectra to deduce molecular structures remains challenging due to the complexity of spectral data and the vastness of chemical space. In this work, we introduce DiffNMR, a novel end-to-end framework that leverages a conditional discrete diffusion model for de novo molecular structure elucidation from NMR spectra. DiffNMR refines molecular graphs iteratively through a diffusion-based generative process, ensuring global consistency and mitigating the error accumulation inherent in autoregressive methods. The framework integrates three components: a two-stage pretraining strategy that aligns spectral and molecular representations via a diffusion autoencoder (Diff-AE) and contrastive learning; retrieval initialization and similarity filtering during inference; and a specialized NMR encoder that uses radial basis function (RBF) encoding for chemical shifts, preserving continuity and chemical correlation. Experimental results demonstrate that DiffNMR achieves competitive performance for NMR-based structure elucidation, offering an efficient and robust solution for automated molecular analysis.
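To illustrate what RBF encoding of a chemical shift can look like, here is a minimal sketch of a generic Gaussian RBF expansion. It is not taken from the DiffNMR implementation: the function names, the 0 to 10 ppm range, the number of centers, and the width parameter `gamma` are all assumptions chosen for illustration.

```python
import math


def make_centers(low, high, count):
    """Return `count` evenly spaced RBF centers covering [low, high]."""
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def rbf_encode(shift_ppm, centers, gamma):
    """Expand one chemical shift (in ppm) into Gaussian RBF activations.

    Each output is exp(-gamma * (shift - center)^2), so the vector changes
    smoothly as the shift moves along the ppm axis.
    """
    return [math.exp(-gamma * (shift_ppm - c) ** 2) for c in centers]


# Hypothetical settings (assumptions, not values from the paper).
CENTERS = make_centers(0.0, 10.0, 11)
GAMMA = 1.0

aromatic_a = rbf_encode(7.20, CENTERS, GAMMA)
aromatic_b = rbf_encode(7.30, CENTERS, GAMMA)
aliphatic = rbf_encode(2.00, CENTERS, GAMMA)
```

Under these assumptions, two nearby shifts such as 7.20 and 7.30 ppm map to nearly identical vectors, while a distant shift such as 2.00 ppm maps to a clearly different one. This is the continuity property that a one-hot binning of the ppm axis would not provide.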
Similar Papers
Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra
Machine Learning (CS)
Finds drug structures automatically from simple tests.
DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models
Machine Learning (CS)
Finds molecule shapes from sound and light.
Equivariant Neural Diffusion for Molecule Generation
Machine Learning (CS)
Builds new molecules that fit perfectly.