GeoPep: A geometry-aware masked language model for protein-peptide binding site prediction
By: Dian Chen, Yunkai Chen, Tong Lin and more
Potential Business Impact:
Finds where tiny protein pieces attach.
Multimodal approaches that integrate protein structure and sequence have achieved remarkable success in protein-protein interface prediction. However, extending these methods to protein-peptide interactions remains challenging: peptides are conformationally flexible, and structural data are scarce, which hinders direct training of structure-aware models. To address these limitations, we introduce GeoPep, a novel framework for peptide binding site prediction that leverages transfer learning from ESM3, a multimodal protein foundation model. GeoPep fine-tunes ESM3's rich pre-learned representations from protein-protein binding to compensate for the scarcity of protein-peptide binding data. The fine-tuned model is then combined with a parameter-efficient neural network architecture capable of learning complex patterns from sparse data. In addition, the model is trained with distance-based loss functions that exploit 3D structural information to enhance binding site prediction. Comprehensive evaluations demonstrate that GeoPep significantly outperforms existing methods in protein-peptide binding site prediction by effectively capturing sparse and heterogeneous binding patterns.
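The abstract does not specify the form of the distance-based loss, so the following is only a minimal sketch of what such a loss could look like, not GeoPep's actual objective. It assumes one 3D coordinate per residue, an 8 Å contact cutoff, and a hypothetical weighting in which false positives are penalized more the farther a residue lies from the peptide. The function names, the cutoff, and the `scale` parameter are all illustrative assumptions.

```python
import math


def min_distances(protein_coords, peptide_coords):
    """For each protein residue, return the distance to the nearest peptide residue."""
    return [min(math.dist(p, q) for q in peptide_coords) for p in protein_coords]


def distance_weighted_bce(probs, dists, cutoff=8.0, scale=10.0, eps=1e-7):
    """Binary cross-entropy with a distance-dependent penalty (illustrative assumption).

    Residues within `cutoff` of the peptide are treated as binding (label 1).
    For non-binding residues, the penalty on a confident positive prediction
    grows linearly with the distance beyond the cutoff.
    """
    total = 0.0
    for p, d in zip(probs, dists):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        if d <= cutoff:
            total += -math.log(p)
        else:
            weight = 1.0 + (d - cutoff) / scale
            total += -weight * math.log(1.0 - p)
    return total / len(probs)


# Toy example: three protein residues, two peptide residues (coordinates in angstroms).
protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
peptide = [(0.0, 0.0, 3.0), (5.0, 0.0, 4.0)]
dists = min_distances(protein, peptide)
loss_good = distance_weighted_bce([0.9, 0.9, 0.1], dists)
loss_bad = distance_weighted_bce([0.1, 0.1, 0.9], dists)
```

In this toy setup the first two residues fall within the cutoff and the third does not, so predictions that agree with that pattern yield a lower loss than predictions that invert it. A real implementation would operate on batched tensors in a deep learning framework and would use whatever distance definition and weighting the paper actually specifies.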
Similar Papers
Morphology-Specific Peptide Discovery via Masked Conditional Generative Modeling
Biomolecules
Creates new materials that build themselves into shapes.
PepTriX: A Framework for Explainable Peptide Analysis through Protein Language Models
Artificial Intelligence
Helps find new medicines by understanding tiny protein parts.
CreoPep: A Universal Deep Learning Framework for Target-Specific Peptide Design and Optimization
Biomolecules
Designs new medicines from nature's building blocks.