Morphology-Specific Peptide Discovery via Masked Conditional Generative Modeling
By: Nuno Costa, Julija Zavadlav
Potential Business Impact:
Creates new materials that build themselves into shapes.
Peptide self-assembly prediction offers a powerful bottom-up strategy for designing biocompatible, low-toxicity materials for large-scale synthesis in a broad range of biomedical and energy applications. However, screening the vast sequence space to categorize aggregate morphology remains intractable. We introduce PepMorph, an end-to-end peptide discovery pipeline that generates novel sequences that are not only prone to aggregate but also self-assemble into a specified fibrillar or spherical morphology. We compiled a new dataset by leveraging existing aggregation propensity datasets and extracting geometric and physicochemical descriptors of isolated peptides that act as proxies for aggregate morphology. This dataset is then used to train a Transformer-based Conditional Variational Autoencoder with a masking mechanism, which generates novel peptides under arbitrary conditioning. After filtering the generated sequences against the design specifications and validating them through coarse-grained molecular dynamics simulations, PepMorph achieved 83% accuracy in producing the intended morphology, showcasing its promise as a framework for application-driven peptide discovery.
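As a rough illustration of what "arbitrary conditioning" through a masking mechanism can look like, the sketch below builds a conditioning vector in which unspecified descriptors are zeroed and flagged by a companion mask. This is an assumption-laden toy, not the PepMorph implementation: the descriptor names, the zero-fill convention, and the build_condition helper are all hypothetical, since the abstract does not list the actual descriptors or describe how the mask enters the model.

```python
from typing import Dict, List, Optional

# Hypothetical descriptor names; the abstract does not list the real set.
DESCRIPTORS = ["aggregation_propensity", "hydrophobicity", "aspect_ratio"]


def build_condition(targets: Dict[str, Optional[float]]) -> List[float]:
    """Return [values..., mask...] for a conditional generator.

    A mask entry of 1.0 marks a descriptor the user specified; 0.0 marks one
    left free, whose value slot is zero-filled (an assumed convention).
    """
    unknown = set(targets) - set(DESCRIPTORS)
    if unknown:
        raise KeyError(f"unknown descriptors: {sorted(unknown)}")

    values: List[float] = []
    mask: List[float] = []
    for name in DESCRIPTORS:
        target = targets.get(name)
        if target is None:
            # Descriptor not specified: hide it from the model.
            values.append(0.0)
            mask.append(0.0)
        else:
            values.append(float(target))
            mask.append(1.0)
    return values + mask


# Condition on every descriptor, or on a single one.
full = build_condition(
    {"aggregation_propensity": 0.9, "hydrophobicity": 0.4, "aspect_ratio": 2.5}
)
partial = build_condition({"aspect_ratio": 2.5})
```

Passing the mask alongside the values lets a model tell "descriptor set to zero" apart from "descriptor not specified", which is one common way to support any subset of conditions at generation time.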
Similar Papers
CreoPep: A Universal Deep Learning Framework for Target-Specific Peptide Design and Optimization
Biomolecules
Designs new medicines from nature's building blocks.
Curriculum Learning for Biological Sequence Prediction: The Case of De Novo Peptide Sequencing
Biomolecules
Teaches computers to read protein codes better.
GeoPep: A geometry-aware masked language model for protein-peptide binding site prediction
Signal Processing
Finds where tiny protein pieces attach.