GeoPep: A geometry-aware masked language model for protein-peptide binding site prediction

Published: October 30, 2025 | arXiv ID: 2510.27040v1

By: Dian Chen, Yunkai Chen, Tong Lin and more

Potential Business Impact:

Finds where tiny protein pieces attach.

Business Areas:
Geospatial Data and Analytics, Navigation and Mapping

Multimodal approaches that integrate protein structure and sequence have achieved remarkable success in protein-protein interface prediction. However, extending these methods to protein-peptide interactions remains challenging: peptides are conformationally flexible and structural data are scarce, both of which hinder direct training of structure-aware models. To address these limitations, we introduce GeoPep, a novel framework for peptide binding site prediction that leverages transfer learning from ESM3, a multimodal protein foundation model. GeoPep fine-tunes ESM3's rich representations, pre-learned from protein-protein binding, to compensate for the scarcity of protein-peptide binding data. The fine-tuned model is further integrated with a parameter-efficient neural network architecture capable of learning complex patterns from sparse data. In addition, the model is trained using distance-based loss functions that exploit 3D structural information to enhance binding site prediction. Comprehensive evaluations demonstrate that GeoPep significantly outperforms existing methods in protein-peptide binding site prediction by effectively capturing sparse and heterogeneous binding patterns.
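
The abstract does not specify the form of the distance-based loss, so the following is only an illustrative sketch of one way such a loss could look, not GeoPep's actual objective. It assumes per-residue binding probabilities, one 3D coordinate per residue, and soft targets obtained by passing each protein residue's distance to the nearest peptide residue through a sigmoid. The function names, the 8 angstrom cutoff, and the 2 angstrom scale are all hypothetical choices made for this example.

```python
import math

def min_distances(protein_xyz, peptide_xyz):
    """Distance from each protein residue to its nearest peptide residue."""
    return [min(math.dist(p, q) for q in peptide_xyz) for p in protein_xyz]

def soft_labels(dists, cutoff=8.0, scale=2.0):
    """Map distances to soft targets in (0, 1).

    Residues well inside the (assumed) cutoff get targets near 1,
    residues far outside get targets near 0.
    """
    labels = []
    for d in dists:
        # Clamp the exponent so very large distances cannot overflow exp().
        z = min((d - cutoff) / scale, 50.0)
        labels.append(1.0 / (1.0 + math.exp(z)))
    return labels

def distance_based_loss(probs, dists, cutoff=8.0, scale=2.0, eps=1e-7):
    """Mean binary cross-entropy between predictions and soft distance targets."""
    targets = soft_labels(dists, cutoff, scale)
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)

# Toy example: one protein residue close to the peptide, one far away.
protein = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
peptide = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
dists = min_distances(protein, peptide)
good = distance_based_loss([0.9, 0.05], dists)  # predictions agree with geometry
bad = distance_based_loss([0.05, 0.9], dists)   # predictions contradict geometry
```

In this toy setup the predictions that agree with the geometry receive the lower loss. Compared with hard 0/1 labels, a soft target of this kind penalizes a missed residue just outside the cutoff less than one far from the peptide, which is one plausible way 3D information could enter training.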

Page Count
11 pages

Category
Electrical Engineering and Systems Science:
Signal Processing