Guide your favorite protein sequence generative model

Published: May 7, 2025 | arXiv ID: 2505.04823v3

By: Junhao Xiong, Hunter Nisonoff, Maria Lukarska and more

BigTech Affiliations: University of California, Berkeley

Potential Business Impact:

Designs new proteins with user-specified properties, such as enhanced stability or higher enzyme activity.

Business Areas:
Guides Media and Entertainment

Generative machine learning models on sequences are transforming protein engineering. However, no principled framework exists for conditioning these models on auxiliary information, such as experimental data, in a plug-and-play manner. Herein, we present ProteinGuide, a principled and general method for conditioning, which unifies a broad class of protein generative models under a single framework. We demonstrate the applicability of ProteinGuide by guiding two protein generative models, ProteinMPNN and ESM3, to generate amino acid and structure token sequences conditioned on several user-specified properties, such as enhanced stability, enzyme classes, and CATH-labeled folds. We also use ProteinGuide with inverse folding models and our own experimental assay to design adenine base editor sequences for high activity.
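
To make the idea of plug-and-play conditioning concrete, here is a minimal sketch of generic Bayes-rule guidance at a single sequence position. It is not taken from the paper and should not be read as ProteinGuide's algorithm. It only shows the general pattern of reweighting a pretrained model's token distribution by a property predictor, p(x | y) ∝ p(x) · p(y | x)^strength. The function name `guided_distribution`, the `strength` parameter, and the toy predictor scores are all hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def guided_distribution(prior_probs, property_log_likelihood, strength=1.0):
    """Reweight a per-position token distribution by a property predictor.

    prior_probs: shape (20,), the generative model's p(x_i = a | context).
    property_log_likelihood: shape (20,), assumed log p(y | x_i = a, context)
        from some separately trained predictor (hypothetical here).
    strength: guidance weight; 0 recovers the unconditional model.
    """
    prior_probs = np.asarray(prior_probs, dtype=float)
    log_lik = np.asarray(property_log_likelihood, dtype=float)
    # Bayes rule in log space: log p(x | y) = log p(x) + strength * log p(y | x) + const
    log_post = np.log(prior_probs + 1e-12) + strength * log_lik
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()


# Toy inputs (made up for illustration): a uniform prior and a predictor
# that favours leucine ("L") at this position.
prior = np.full(len(AMINO_ACIDS), 1.0 / len(AMINO_ACIDS))
log_lik = np.zeros(len(AMINO_ACIDS))
log_lik[AMINO_ACIDS.index("L")] = 2.0

guided = guided_distribution(prior, log_lik)
unguided = guided_distribution(prior, log_lik, strength=0.0)
```

In this toy setting the guided distribution shifts probability mass toward leucine, while a strength of zero leaves the pretrained model's distribution unchanged. The pretrained model itself is never retrained, which is the sense in which such conditioning is plug-and-play.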

Country of Origin
🇺🇸 United States

Page Count
61 pages

Category
Computer Science:
Machine Learning (CS)