Few-shot Molecular Property Prediction: A Survey
By: Zeyu Wang, Tianyi Jiang, Huanchang Ma and more
Potential Business Impact:
Teaches computers to guess molecule traits with few examples.
AI-assisted molecular property prediction has become a promising technique in early-stage drug discovery and materials design in recent years. However, because wet-lab experiments are costly and complex, real-world molecules often have scarce annotations, leaving too little labeled data for effective supervised learning. In light of this, few-shot molecular property prediction (FSMPP) has emerged as an expressive paradigm that enables learning from only a few labeled examples. Despite rapidly growing attention, existing FSMPP studies remain fragmented, without a coherent framework to capture methodological advances and domain-specific challenges. In this work, we present the first comprehensive and systematic survey of few-shot molecular property prediction. We begin by analyzing the few-shot phenomenon in molecular datasets and highlighting two core challenges: (1) cross-property generalization under distribution shifts, where each task, corresponding to one property, may follow a different data distribution or even be inherently weakly related to the others from a biochemical perspective, requiring the model to transfer knowledge across heterogeneous prediction tasks; and (2) cross-molecule generalization under structural heterogeneity, where molecules involved in the same or different properties may exhibit significant structural diversity, making it difficult for the model to generalize. Then, we introduce a unified taxonomy that organizes existing methods into data, model, and learning paradigm levels, reflecting their strategies for extracting knowledge from scarce supervision. Next, we compare representative methods and summarize benchmark datasets and evaluation protocols. Finally, we identify key trends and future directions for advancing research on FSMPP.
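As a rough illustration of the few-shot setting described above, the sketch below splits the labeled molecules of a single property task into a small support set and a query set. The balanced 2-way K-shot layout, the function name `sample_episode`, and the `(molecule ID, binary label)` record format are assumptions made for illustration; they are not a protocol taken from the survey.

```python
import random
from typing import List, Tuple

# Hypothetical record format: (molecule identifier, binary property label).
Example = Tuple[str, int]


def sample_episode(
    task: List[Example], k_shot: int, n_query: int, rng: random.Random
) -> Tuple[List[Example], List[Example]]:
    """Draw one episode from a single property task.

    The support set holds k_shot positives and k_shot negatives (the only
    labels the model may adapt on); the query set is drawn from the
    remaining molecules and is used to evaluate the adapted model.
    """
    positives = [ex for ex in task if ex[1] == 1]
    negatives = [ex for ex in task if ex[1] == 0]
    if len(positives) < k_shot or len(negatives) < k_shot:
        raise ValueError("not enough labeled molecules per class")

    support = rng.sample(positives, k_shot) + rng.sample(negatives, k_shot)
    support_set = set(support)

    # Query molecules must not overlap with the support set.
    remaining = [ex for ex in task if ex not in support_set]
    query = rng.sample(remaining, min(n_query, len(remaining)))
    return support, query
```

In a meta-learning setup, a function like this would typically be called once per property per training step, so that the model repeatedly practices adapting from a handful of labels before being tested on unseen properties.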
Similar Papers
M-GLC: Motif-Driven Global-Local Context Graphs for Few-shot Molecular Property Prediction
Machine Learning (CS)
Finds new medicines with less data.
Enhancing Molecular Property Prediction with Knowledge from Large Language Models
Computation and Language
Finds new medicines faster using smart computer knowledge.
AdaptMol: Adaptive Fusion from Sequence String to Topological Structure for Few-shot Drug Discovery
Machine Learning (CS)
Helps find new medicines faster with smart computer guesses.