MolGA: Molecular Graph Adaptation with Pre-trained 2D Graph Encoder
By: Xingtong Yu, Chang Zhou, Xinming Zhang and more
Potential Business Impact:
Helps computers understand molecules better for science.
Molecular graph representation learning is widely used in chemical and biomedical research. Pre-trained 2D graph encoders have demonstrated strong performance, but they overlook the rich molecular domain knowledge associated with submolecular instances (atoms and bonds). Molecular pre-training approaches do incorporate such knowledge into their pre-training objectives, but they typically employ designs tailored to a specific type of knowledge and lack the flexibility to integrate the diverse knowledge present in molecules. Hence, reusing widely available and well-validated pre-trained 2D encoders, while incorporating molecular domain knowledge during downstream adaptation, offers a more practical alternative. In this work, we propose MolGA, which adapts pre-trained 2D graph encoders to downstream molecular applications by flexibly incorporating diverse molecular domain knowledge. First, we propose a molecular alignment strategy that bridges the gap between pre-trained topological representations and domain-knowledge representations. Second, we introduce a conditional adaptation mechanism that generates instance-specific tokens to enable fine-grained integration of molecular domain knowledge for downstream tasks. Finally, we conduct extensive experiments on eleven public datasets, demonstrating the effectiveness of MolGA.
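The abstract does not give the form of the conditional adaptation mechanism, so the following is only a rough illustration of the general idea of instance-specific tokens, not MolGA's actual design. It assumes a frozen encoder has already produced one embedding per atom, that each atom carries a small domain-knowledge feature vector, and that a tiny two-layer network turns those features into a per-atom token that rescales the frozen embedding. All names (`condition_net`, `adapt_atoms`, `readout`), the feature values, and the multiplicative modulation are hypothetical choices made for this sketch.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Random rows x cols matrix, standing in for learned or pre-trained weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(vec, mat):
    """Multiply a length-r vector by an r x c matrix, giving a length-c vector."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def condition_net(knowledge_vec, w_in, w_out):
    """Hypothetical conditional network: knowledge features -> one token per atom."""
    hidden = [math.tanh(x) for x in matvec(knowledge_vec, w_in)]
    return matvec(hidden, w_out)


def adapt_atoms(frozen_embeddings, knowledge_feats, w_in, w_out):
    """Rescale each frozen atom embedding by (1 + token).

    The encoder output is never modified in place; only w_in and w_out
    would be trained during downstream adaptation in this sketch.
    """
    adapted = []
    for h, k in zip(frozen_embeddings, knowledge_feats):
        token = condition_net(k, w_in, w_out)
        adapted.append([hi * (1.0 + ti) for hi, ti in zip(h, token)])
    return adapted


def readout(atom_embeddings):
    """Sum-pool atom embeddings into a single molecule-level vector."""
    dim = len(atom_embeddings[0])
    return [sum(a[j] for a in atom_embeddings) for j in range(dim)]


rng = random.Random(0)
n_atoms, emb_dim, know_dim, hidden_dim = 3, 4, 2, 5

# Stand-in for the output of a frozen, pre-trained 2D graph encoder.
frozen = make_matrix(n_atoms, emb_dim, rng, scale=1.0)
# Made-up per-atom knowledge features (for example atomic number, formal charge).
knowledge = [[6.0, 0.0], [8.0, -1.0], [1.0, 0.0]]

w_in = make_matrix(know_dim, hidden_dim, rng)
w_out = make_matrix(hidden_dim, emb_dim, rng)

graph_vec = readout(adapt_atoms(frozen, knowledge, w_in, w_out))
```

With `w_out` set to all zeros every token is zero and the adapted embeddings equal the frozen ones, which makes the sketch a strict extension of using the pre-trained encoder unchanged. The alignment step described in the abstract is not covered here.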
Similar Papers
GeoRecon: Graph-Level Representation Learning for 3D Molecules via Reconstruction-Based Pretraining
Machine Learning (CS)
Helps computers understand how molecules fit together.
MetaMolGen: A Neural Graph Motif Generation Model for De Novo Molecular Design
Machine Learning (CS)
Designs new medicines faster with less data.
Bridging Molecular Graphs and Large Language Models
Machine Learning (CS)
Lets computers understand chemical structures like words.