Active Learning for Machine Learning Driven Molecular Dynamics
By: Kevin Bachelor, Sanya Murdeshwar, Daniel Sabo and more
Potential Business Impact:
Trains computer models to better understand molecules.
Machine-learned coarse-grained (CG) potentials are fast, but they degrade over time when simulations reach undersampled biomolecular conformations, and generating broad all-atom (AA) data to combat this is computationally infeasible. We propose a novel active learning framework for CG neural network potentials in molecular dynamics (MD). Building on the CGSchNet model, our method uses root-mean-square deviation (RMSD) based frame selection from MD simulations to generate data on the fly by querying an oracle during the training of a neural network potential. This framework preserves CG-level efficiency while correcting the model at precise, RMSD-identified coverage gaps. By training CGSchNet, we empirically show that our framework explores previously unseen configurations and trains the model on unexplored regions of conformational space. Our active learning framework enables a CGSchNet model trained on the Chignolin protein to achieve a 33.05% improvement in the Wasserstein-1 (W1) metric in time-lagged independent component analysis (TICA) space on an in-house benchmark suite.
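The abstract does not spell out the selection rule, so the following is only a minimal sketch of what RMSD-based frame selection could look like. It assumes that frames are already superposed on a common reference, that a frame counts as a coverage gap when its RMSD to the nearest training conformation exceeds a threshold, and that the most distant frames are queried first. The names `rmsd`, `select_frames`, `threshold`, and `max_queries` are hypothetical and do not come from the paper.

```python
import math


def rmsd(frame_a, frame_b):
    """RMSD between two conformations given as lists of (x, y, z) bead positions.

    Assumption: both frames are already superposed on a common reference,
    so no rotational or translational fitting is done here.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must contain the same number of beads")
    squared = sum(
        (p - q) ** 2
        for bead_a, bead_b in zip(frame_a, frame_b)
        for p, q in zip(bead_a, bead_b)
    )
    return math.sqrt(squared / len(frame_a))


def select_frames(trajectory, training_set, threshold, max_queries):
    """Return indices of trajectory frames that are far from all training data.

    A frame is a candidate when its RMSD to the nearest training
    conformation exceeds `threshold` (a hypothetical criterion).
    Candidates are ranked from most to least distant and truncated
    to `max_queries` entries.
    """
    candidates = []
    for index, frame in enumerate(trajectory):
        nearest = min(rmsd(frame, ref) for ref in training_set)
        if nearest > threshold:
            candidates.append((nearest, index))
    candidates.sort(reverse=True)
    return [index for _, index in candidates[:max_queries]]


# Toy two-bead example with made-up coordinates.
training_set = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the training frame
    [(0.0, 0.0, 0.0), (1.0, 0.0, 3.0)],  # far from the training frame
    [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)],  # moderately far
]
selected = select_frames(trajectory, training_set, threshold=0.5, max_queries=2)
```

In a loop of the kind the abstract describes, the selected indices would identify the frames sent to the oracle for labeling before the potential is retrained. A production version would normally compute RMSD after optimal superposition rather than on raw coordinates.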
Similar Papers
Enhanced Sampling for Efficient Learning of Coarse-Grained Machine Learning Potentials
Chemical Physics
Makes computer models of tiny things more accurate faster.
Physics- and data-driven Active Learning of neural network representations for free energy functions of materials from statistical mechanics
Computational Physics
Predicts how materials change with less guessing.
Energy-Based Coarse-Graining in Molecular Dynamics: A Flow-Based Framework Without Data
Chemical Physics
Creates accurate molecule pictures without needing real experiments.