Demystify Protein Generation with Hierarchical Conditional Diffusion Models
By: Zinan Ling, Yi Shi, Da Yan and more
Potential Business Impact:
Designs new proteins that work as intended.
Generating novel and functional protein sequences is critical to a wide range of applications in biology. Recent advances in conditional diffusion models have shown impressive empirical performance in protein generation tasks. However, reliable protein generation remains an open research question in de novo protein design, especially for conditional diffusion models. Because the biological function of a protein is determined by its multi-level structures, we propose a novel multi-level conditional diffusion model that integrates both sequence-based and structure-based information for efficient end-to-end protein design guided by specified functions. By generating representations at different levels simultaneously, our framework can effectively model the inherent hierarchical relations between different levels, resulting in an informative and discriminative representation of the generated protein. We also propose Protein-MMD, a new and reliable metric for evaluating the quality of proteins generated by conditional diffusion models. The metric captures both distributional and functional similarities between real and generated protein sequences while ensuring conditional consistency. We experiment on benchmark datasets, and the results on conditional protein generation tasks demonstrate the efficacy of the proposed generation framework and evaluation metric.
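The abstract does not spell out how Protein-MMD is computed. As background only, the name suggests it builds on maximum mean discrepancy (MMD), a standard kernel-based distance between two samples. The sketch below is a plain biased MMD estimate with a Gaussian kernel over fixed-length embedding vectors; it is not the paper's metric. The function names, the `gamma` bandwidth, and the toy 2-D embeddings are all assumptions made for illustration, and the conditional-consistency component described in the abstract is not modeled here.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(a: Sequence[Vector], b: Sequence[Vector], gamma: float) -> float:
    """Average kernel value over all pairs (x in a, y in b)."""
    total = sum(rbf_kernel(x, y, gamma) for x in a for y in b)
    return total / (len(a) * len(b))


def mmd_squared(real: Sequence[Vector], generated: Sequence[Vector],
                gamma: float = 1.0) -> float:
    """Biased (V-statistic) estimate of squared MMD between two samples."""
    return (mean_kernel(real, real, gamma)
            + mean_kernel(generated, generated, gamma)
            - 2.0 * mean_kernel(real, generated, gamma))


# Hypothetical 2-D embeddings standing in for protein representations.
real_emb = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
close_emb = [[0.05, 0.0], [0.0, 0.05], [0.1, 0.1]]
far_emb = [[3.0, 3.0], [3.1, 2.9], [2.9, 3.1]]

mmd_close = mmd_squared(real_emb, close_emb)
mmd_far = mmd_squared(real_emb, far_emb)
print(f"close: {mmd_close:.4f}, far: {mmd_far:.4f}")
```

A generated set whose embeddings lie near the real ones yields a smaller value than a set that lies far away, which is the distributional-similarity behavior the abstract attributes to the metric. How sequences are embedded and how the condition enters the comparison are left to the paper itself.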
Similar Papers
The Dance of Atoms-De Novo Protein Design with Diffusion Model
Biomolecules
Creates new proteins for medicine and science.
Seek and You Shall Fold
Machine Learning (CS)
Creates protein shapes from experimental clues.
Distilled Protein Backbone Generation
Machine Learning (CS)
Designs new proteins much faster for medicine.