Learning the PTM Code through a Coarse-to-Fine, Mechanism-Aware Framework
By: Jingjie Zhang , Hanqun Cao , Zijun Gao and more
Potential Business Impact:
Finds which cell parts control protein jobs.
Post-translational modifications (PTMs) form a combinatorial "code" that regulates protein function, yet deciphering this code - linking modified sites to their catalytic enzymes - remains a central unsolved problem in understanding cellular signaling and disease. We introduce COMPASS-PTM, a mechanism-aware, coarse-to-fine learning framework that unifies residue-level PTM profiling with enzyme-substrate assignment. COMPASS-PTM integrates evolutionary representations from protein language models with physicochemical priors and a crosstalk-aware prompting mechanism that explicitly models inter-PTM dependencies. This design allows the model to learn biologically coherent patterns of cooperative and antagonistic modifications while addressing the dual long-tail distribution of PTM data. Across multiple proteome-scale benchmarks, COMPASS-PTM establishes new state-of-the-art performance, including a 122% relative F1 improvement in multi-label site prediction and a 54% gain in zero-shot enzyme assignment. Beyond accuracy, the model demonstrates interpretable generalization, recovering canonical kinase motifs and predicting disease-associated PTM rewiring caused by missense variants. By bridging statistical learning with biochemical mechanism, COMPASS-PTM unifies site-level and enzyme-level prediction into a single framework that learns the grammar underlying protein regulation and signaling.
Similar Papers
UniPTMs: The First Unified Multi-type PTM Site Prediction Model via Master-Slave Architecture-Based Multi-Stage Fusion Strategy and Hierarchical Contrastive Loss
Machine Learning (CS)
Predicts how proteins change to control life.
Accurate de novo sequencing of the modified proteome with OmniNovo
Quantitative Methods
Finds hidden protein changes in cells.
InstructPLM-mu: 1-Hour Fine-Tuning of ESM2 Beats ESM3 in Protein Mutation Predictions
Quantitative Methods
Teaches computers to predict protein changes faster.