Improved Molecular Generation through Attribute-Driven Integrative Embeddings and GAN Selectivity
By: Nandan Joshi, Erhan Guven
Potential Business Impact:
Creates new molecules with special features.
The growing demand for molecules with tailored properties in fields such as drug discovery and chemical engineering has driven advancements in computational methods for molecular design. Machine learning-based approaches for de novo molecular generation have recently garnered significant attention. This paper introduces a transformer-based vector embedding generator combined with a modified Generative Adversarial Network (GAN) to generate molecules with desired properties. The embedding generator utilizes a novel molecular descriptor that integrates Morgan fingerprints with global molecular attributes, enabling the transformer to capture both local functional groups and broader molecular characteristics. Modifying the GAN generator loss function ensures the generation of molecules with specific desired properties. The transformer achieves a reconversion accuracy of 94% when translating molecular descriptors back to SMILES strings, validating the utility of the proposed embeddings for generative tasks. The approach is validated by generating novel odorant molecules using a labeled dataset of odorant and non-odorant compounds. With the modified range-loss function, the GAN exclusively generates odorant molecules. This work underscores the potential of combining novel vector embeddings with transformers and modified GAN architectures to accelerate the discovery of tailored molecules, offering a robust tool for diverse molecular design applications.
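The abstract does not state the form of the modified range-loss, so the following is only a minimal sketch of one way such a term could be attached to a generator loss. It assumes a standard non-saturating adversarial term plus a quadratic penalty on a property score that falls outside a target range. The names range_penalty, generator_loss, low, high, and weight are hypothetical and do not come from the paper.

```python
import math


def range_penalty(score, low, high):
    """Hypothetical range term: zero inside [low, high], quadratic outside."""
    if score < low:
        return (low - score) ** 2
    if score > high:
        return (score - high) ** 2
    return 0.0


def generator_loss(disc_probs, property_scores, low=0.5, high=1.0, weight=1.0):
    """Adversarial loss plus a weighted range penalty (an assumed form).

    disc_probs: discriminator probabilities that each generated sample is real.
    property_scores: predicted property score per generated sample,
        for example an odorant-classifier probability.
    """
    eps = 1e-12  # guards against log(0)
    # Non-saturating generator term: mean of -log D(G(z)).
    adversarial = -sum(math.log(max(p, eps)) for p in disc_probs) / len(disc_probs)
    # Mean penalty for samples whose property score leaves the target range.
    penalty = sum(range_penalty(s, low, high) for s in property_scores) / len(property_scores)
    return adversarial + weight * penalty


if __name__ == "__main__":
    in_range = generator_loss([0.9, 0.8], [0.7, 0.95])
    out_of_range = generator_loss([0.9, 0.8], [0.1, 0.2])
    print(in_range < out_of_range)
```

Under these assumptions, samples whose predicted property lies in the target range contribute only the adversarial term, while out-of-range samples raise the loss, which is the kind of pressure that would steer a generator toward the desired class.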