AdaptMol: Adaptive Fusion from Sequence String to Topological Structure for Few-shot Drug Discovery
By: Yifan Dai, Xuanbai Ren, Tengfei Ma and more
Potential Business Impact:
Helps find new medicines faster with smart computer guesses.
Accurate molecular property prediction (MPP) is a critical step in modern drug development. However, the scarcity of experimental validation data poses a significant challenge to AI-driven research paradigms. In few-shot learning scenarios, the quality of molecular representations directly determines the upper limit of model performance. We present AdaptMol, a prototypical network integrating Adaptive multimodal fusion for Molecular representation. The framework employs a dual-level attention mechanism to dynamically integrate global and local molecular features derived from two modalities, SMILES sequences and molecular graphs: (1) at the local level, structural features such as atomic interactions and substructures are extracted from molecular graphs, emphasizing fine-grained topological information; (2) at the global level, the SMILES sequence provides a holistic representation of the molecule. To validate the necessity of multimodal adaptive fusion, we propose an interpretable approach based on identifying molecular active substructures, and use it to demonstrate that multimodal adaptive fusion can efficiently represent molecules. Extensive experiments on three commonly used benchmarks under 5-shot and 10-shot settings demonstrate that AdaptMol achieves state-of-the-art performance in most cases. The rationale-extraction method guides the fusion of the two modalities and highlights the importance of both.
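The two ideas the abstract combines, attention-weighted fusion of a sequence embedding with a graph embedding and prototype-based few-shot classification, can be illustrated with a minimal sketch. This is not AdaptMol's architecture, which the abstract does not specify: the function names, the single gate vector `gate_vec`, and the hand-written 2-dimensional embeddings are all assumptions made for illustration. In a real model the embeddings would come from trained SMILES and graph encoders and the gate would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(seq_emb, graph_emb, gate_vec):
    """Attention-weighted sum of two modality embeddings.

    Each modality is scored by a dot product with gate_vec (a hypothetical
    stand-in for a learned attention parameter); the softmax of the two
    scores gives the mixing weights.
    """
    scores = [sum(g * x for g, x in zip(gate_vec, emb))
              for emb in (seq_emb, graph_emb)]
    w_seq, w_graph = softmax(scores)
    return [w_seq * s + w_graph * g for s, g in zip(seq_emb, graph_emb)]


def prototype(embeddings):
    """Class prototype: the element-wise mean of the support embeddings."""
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]


def classify(query, prototypes):
    """Class probabilities from negative squared Euclidean distances."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return softmax([-sqdist(query, p) for p in prototypes])


# Toy 2-shot episode with made-up (sequence, graph) embedding pairs.
gate = [1.0, -1.0]  # assumed gate parameter, not taken from the paper
active = [fuse([1.0, 0.2], [0.8, 0.0], gate),
          fuse([0.9, 0.1], [1.0, 0.3], gate)]
inactive = [fuse([0.1, 1.0], [0.0, 0.8], gate),
            fuse([0.2, 0.9], [0.1, 1.0], gate)]

prototypes = [prototype(active), prototype(inactive)]
query = fuse([0.8, 0.1], [0.9, 0.2], gate)
probs = classify(query, prototypes)  # [P(active), P(inactive)]
print(probs)
```

In a 5-shot or 10-shot setting, each support list would hold five or ten fused embeddings per class; the prototype and distance steps stay the same.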
Similar Papers
M-GLC: Motif-Driven Global-Local Context Graphs for Few-shot Molecular Property Prediction
Machine Learning (CS)
Finds new medicines with less data.
Adaptive Substructure-Aware Expert Model for Molecular Property Prediction
Machine Learning (CS)
Helps find good medicines by understanding molecule parts.
Enhancing Molecular Property Prediction with Knowledge from Large Language Models
Computation and Language
Finds new medicines faster using smart computer knowledge.