Combining Graph Neural Networks and Mixed Integer Linear Programming for Molecular Inference under the Two-Layered Model
By: Jianshen Zhu, Naveed Ahmed Azam, Kazuya Haraguchi and more
Potential Business Impact:
Designs new molecules with useful properties.
A two-phase framework named mol-infer was recently proposed for inferring chemical compounds with prescribed abstract structures and desired property values. The framework relies on mixed integer linear programming (MILP) to simulate the computational process of a machine learning method and to describe the necessary and sufficient conditions for such a chemical graph to exist. Existing approaches first convert chemical compounds into handcrafted feature vectors and then construct prediction functions on them. Because the MILP formulation must remain tractable, only limited kinds of descriptors can be used, and the learning performance on datasets for some properties is not good enough. Poor learning performance can greatly lower the quality of the inferred chemical graphs, so improving it is of great importance. Graph neural networks (GNNs), on the other hand, take chemical graphs directly as input, and many existing GNN-based approaches to molecular property prediction have shown better learning performance than traditional approaches based on feature vectors. In this study, we develop a molecular inference framework based on mol-infer, namely mol-infer-GNN, that uses a GNN as the learning method while keeping the great flexibility that the two-layered model provides for specifying the abstract structure of the chemical graph to be inferred. We conducted computational experiments on the QM9 dataset to show that our proposed GNN model obtains satisfactory learning performance for some properties despite its simple structure, and that small chemical graphs comprising up to 20 non-hydrogen atoms can be inferred within reasonable computational time.
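To make the idea of "simulating a learning method with MILP" concrete, here is a minimal sketch of one standard building block: encoding a ReLU unit y = max(0, a) with linear constraints and one binary variable z (the big-M technique). The abstract does not say which activation function or encoding mol-infer-GNN uses, so the ReLU choice, the bound `big_m`, and the function names are all assumptions for illustration only. No solver is used; the code enumerates z and checks that the constraints admit exactly one value of y.

```python
def relu_bigm_interval(a, z, big_m):
    """Feasible range of y for fixed pre-activation a and binary z.

    Constraints (illustrative big-M encoding, not taken from the paper):
        y >= a,  y >= 0,  y <= a + M * (1 - z),  y <= M * z
    """
    lower = max(a, 0.0)
    upper = min(a + big_m * (1 - z), big_m * z)
    return (lower, upper) if lower <= upper else None


def relu_via_bigm(a, big_m=100.0):
    """Collect every y the constraints admit over z in {0, 1}."""
    # The encoding is only valid when |a| is bounded by M.
    assert abs(a) <= big_m
    solutions = set()
    for z in (0, 1):
        interval = relu_bigm_interval(a, z, big_m)
        if interval is not None:
            lower, upper = interval
            # The constraints pin y to a single point.
            assert lower == upper
            solutions.add(lower)
    return solutions


if __name__ == "__main__":
    for a in (3.5, -2.0, 0.0):
        print(a, relu_via_bigm(a))
```

In a full formulation, a would itself be a linear expression over variables describing the chemical graph, so a solver could search for a graph whose predicted property value falls in a target range.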