Improving Predictions of Molecular Properties with Graph Featurisation and Heterogeneous Ensemble Models
By: Michael L. Parker, Samar Mahmoud, Bailey Montefiore and more
Potential Business Impact:
Predicts more accurately how molecules will behave.
We explore a "best-of-both" approach to modelling molecular properties by combining learned molecular descriptors from a graph neural network (GNN) with general-purpose descriptors and a mixed ensemble of machine learning (ML) models. We introduce a MetaModel framework to aggregate predictions from a diverse set of leading ML models. We present a featurisation scheme for combining task-specific GNN-derived features with conventional molecular descriptors. We demonstrate that our framework outperforms the cutting-edge ChemProp model on all regression datasets tested and 6 of 9 classification datasets. We further show that including the GNN features derived from ChemProp boosts the ensemble model's performance on several datasets where it otherwise would have underperformed. We conclude that to achieve optimal performance across a wide set of problems, it is vital to combine general-purpose descriptors with task-specific learned features and use a diverse set of ML models to make the predictions.
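The two ideas in the abstract, concatenating learned features with general-purpose descriptors and aggregating predictions from dissimilar models, can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in rather than the authors' implementation: `combine_features`, `MeanModel`, `NearestNeighbourModel`, and `AveragingEnsemble` are invented for illustration, the toy numbers are made up, and the unweighted mean is an assumed aggregation rule (the abstract does not say how the MetaModel weights its members).

```python
from statistics import mean


def combine_features(general, learned):
    """Concatenate general-purpose descriptors with task-specific learned features."""
    return list(general) + list(learned)


class MeanModel:
    """Baseline regressor: always predicts the mean training target."""

    def fit(self, X, y):
        self.value = mean(y)
        return self

    def predict(self, X):
        return [self.value for _ in X]


class NearestNeighbourModel:
    """1-nearest-neighbour regressor using squared Euclidean distance."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        preds = []
        for x in X:
            dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in self.X]
            preds.append(self.y[dists.index(min(dists))])
        return preds


class AveragingEnsemble:
    """Hypothetical aggregator: unweighted mean of the base models' predictions."""

    def __init__(self, models):
        self.models = models

    def fit(self, X, y):
        for model in self.models:
            model.fit(X, y)
        return self

    def predict(self, X):
        per_model = [model.predict(X) for model in self.models]
        return [mean(column) for column in zip(*per_model)]


# Toy data (made up): two general descriptors and one learned feature per molecule.
general = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
learned = [[0.5], [0.1], [0.9]]
y = [1.0, 2.0, 6.0]

X = [combine_features(g, l) for g, l in zip(general, learned)]
ensemble = AveragingEnsemble([MeanModel(), NearestNeighbourModel()]).fit(X, y)
preds = ensemble.predict(X)
```

In practice the learned features would come from a trained GNN and the base models would be full ML libraries; the sketch only shows how the pieces fit together.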
Similar Papers
Enhancing Molecular Property Prediction with Knowledge from Large Language Models
Computation and Language
Finds new medicines faster using smart computer knowledge.
Understanding the Capabilities of Molecular Graph Neural Networks in Materials Science Through Multimodal Learning and Physical Context Encoding
Machine Learning (CS)
Helps computers understand chemicals better using words and shapes.
A Survey of Graph Neural Networks for Drug Discovery: Recent Developments and Challenges
Machine Learning (CS)
Helps find new medicines faster using computer models.