Learning the PTM Code through a Coarse-to-Fine, Mechanism-Aware Framework

Published: October 27, 2025 | arXiv ID: 2510.23492v1

By: Jingjie Zhang, Hanqun Cao, Zijun Gao and more

Potential Business Impact:

Predicts which sites on a protein are chemically modified and which enzymes modify them.

Business Areas:
Bioinformatics, Biotechnology, Data and Analytics, Science and Engineering

Post-translational modifications (PTMs) form a combinatorial "code" that regulates protein function, yet deciphering this code (linking modified sites to their catalytic enzymes) remains a central unsolved problem in understanding cellular signaling and disease. We introduce COMPASS-PTM, a mechanism-aware, coarse-to-fine learning framework that unifies residue-level PTM profiling with enzyme-substrate assignment. COMPASS-PTM integrates evolutionary representations from protein language models with physicochemical priors and a crosstalk-aware prompting mechanism that explicitly models inter-PTM dependencies. This design allows the model to learn biologically coherent patterns of cooperative and antagonistic modifications while addressing the dual long-tail distribution of PTM data. Across multiple proteome-scale benchmarks, COMPASS-PTM establishes new state-of-the-art performance, including a 122% relative F1 improvement in multi-label site prediction and a 54% gain in zero-shot enzyme assignment. Beyond accuracy, the model demonstrates interpretable generalization, recovering canonical kinase motifs and predicting disease-associated PTM rewiring caused by missense variants. By bridging statistical learning with biochemical mechanism, COMPASS-PTM unifies site-level and enzyme-level prediction into a single framework that learns the grammar underlying protein regulation and signaling.
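
To make the headline metric concrete, the following is a minimal sketch of how a multi-label F1 score and a relative improvement over a baseline could be computed. It is not the authors' evaluation code: the abstract does not say which F1 averaging is used, so micro-averaging is an assumption here, and the function names (`micro_f1`, `relative_improvement`), the PTM label strings, and all numbers are hypothetical and chosen only for illustration.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (site, PTM type) pairs.

    y_true and y_pred are lists of sets; each set holds the PTM labels
    assigned to one residue. Micro-averaging is an assumption, not
    something the abstract specifies.
    """
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        tp += len(true_labels & pred_labels)   # labels correctly predicted
        fp += len(pred_labels - true_labels)   # predicted but not annotated
        fn += len(true_labels - pred_labels)   # annotated but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Toy annotations for four residues (hypothetical, not from the paper).
y_true = [{"phospho"}, {"phospho", "acetyl"}, set(), {"ubiq"}]
y_pred = [{"phospho"}, {"acetyl"}, {"methyl"}, {"ubiq"}]
toy_f1 = micro_f1(y_true, y_pred)

# Illustrative F1 values only: a 122% relative gain means the new score
# is 2.22 times the baseline score, not 122 percentage points higher.
gain = relative_improvement(new=0.555, baseline=0.25)
```

Read this way, a relative improvement above 100% is possible whenever the baseline F1 is below half of the new score, which is plausible on long-tailed label sets where baseline scores tend to be low.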

Country of Origin
🇨🇳 China

Page Count
47 pages

Category
Computer Science:
Computational Engineering, Finance, and Science