Deep Generative Models for Drug Design and Response

(2109.06469)

Published Sep 14, 2021 in cs.LG and cs.AI

Abstract

Designing new chemical compounds with desired pharmaceutical properties is a challenging task and takes years of development and testing. Still, a majority of new drugs fail to prove efficient. Recent success of deep generative modeling holds promises of generation and optimization of new molecules. In this review paper, we provide an overview of the current generative models, and describe necessary biological and chemical terminology, including molecular representations needed to understand the field of drug design and drug response. We present commonly used chemical and biological databases, and tools for generative modeling. Finally, we summarize the current state of generative modeling for drug design and drug response prediction, highlighting the state-of-art approaches and limitations the field is currently facing.

Comparison of GAN, Autoencoder, VAE, and AAE generative models.

Overview

The paper provides a comprehensive review of deep generative models (DGMs) and their applications in drug design, highlighting significant advancements and methodologies employed in the field.
It describes various generative model architectures such as GANs, autoencoders, and transformers, detailing their roles and mechanisms in generating viable drug candidates.
The review discusses the challenges faced in drug design and response prediction and outlines prospects for future research aiming to improve data representations and model accuracy.

Overview

The paper "Deep Generative Models for Drug Design and Response" by Karina Zadorozhny and Lada Nuzhna provides a comprehensive review of current advancements in using deep generative models (DGMs) to expedite and optimize the process of drug design and drug response prediction. By synthesizing existing research and outlining current methodologies, the authors highlight significant strides in the field, particularly in leveraging neural networks to model complex biochemical interactions. The review encompasses various types of generative models, practical applications in pharmaceuticals, and prospects for future investigations.

Introduction

The traditional drug discovery process spans roughly 20 years and costs billions, which motivates a robust framework for generating viable drug candidates. The authors frame the design of new chemical compounds as a computational chemistry problem and emphasize the role of high-throughput screening (HTS). They divide drug design methodologies into ligand-based and structure-based approaches, covering techniques such as pharmacophore modeling, QSAR, molecular docking, and molecular dynamics simulations.

However, the complexity of molecular structures and the need for chemical validity pose significant hurdles in drug design. The review underscores the disappointing statistic that 75-97% of new drugs fail in clinical trials, primarily due to inefficacy and safety concerns. Given this backdrop, DGMs offer a promising avenue to address these challenges by generating drug candidates with optimized chemical properties and favorable responses.

Generative Models for Drug Design

The paper provides a detailed exploration of fundamental and advanced DGM architectures:

GANs: Introduced by Goodfellow et al., GANs pit a generator against a discriminator to produce realistic samples. They are implicit models: they can be sampled from but do not define an explicit likelihood.
Autoencoders: These models, including standard AEs, VAEs, and AAEs, transform input into latent representations to reconstruct and generate plausible molecular structures. Each variant incorporates different levels of regularization and adversarial techniques to refine the latent space.
Transformers: Leveraging attention mechanisms, transformers, exemplified by BERT and GPT, excel in capturing dependencies between distant elements in sequences, making them suitable for protein structure prediction.
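
To make the VAE variant above concrete, here is a minimal sketch of the standard VAE training objective (negative ELBO) in plain Python. It is an illustration under stated assumptions, not code from the paper: the function names, the binary cross-entropy reconstruction term, and the `beta` weight (with `beta=1.0` giving the usual VAE) are all choices made for this example.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

def binary_cross_entropy(x, x_hat, eps=1e-12):
    """Reconstruction term for inputs in [0, 1], summed over dimensions."""
    return -sum(xi * math.log(xh + eps) + (1.0 - xi) * math.log(1.0 - xh + eps)
                for xi, xh in zip(x, x_hat))

def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Negative ELBO: reconstruction error plus a beta-weighted KL penalty."""
    # beta is an illustrative knob (assumption); beta=1.0 is the standard VAE.
    return binary_cross_entropy(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)

# Hypothetical encoder outputs for a 2-dimensional latent space.
rng = random.Random(0)
z = reparameterize([0.0, 1.0], [0.0, -2.0], rng)
```

The KL term is what distinguishes a VAE from a plain autoencoder: it pulls the encoder's latent distribution toward a standard normal so that new samples can be drawn from the prior and decoded. An AAE replaces this closed-form penalty with an adversarial discriminator on the latent codes.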

Databases and Tools

The authors list widely used databases for chemical compounds and cell line screenings, emphasizing BindingDB, ZINC, and PubChem for their extensive repositories of bioactivity data and chemical structures. They also describe tools such as QSAR and RDKit for their utility in chemical informatics and molecular operations.

Data Representations

The section on data representations explores various encoding methods crucial for DGM applications in drug design. Notable representations include:

SMILES and Its Variants: Widely used string representations.
Molecular Fingerprints: Bit-vector encodings, such as circular fingerprints, capturing substructural information.
Peptide and Protein Representations: Sequences representing peptide chains and structural maps.
Graph Representations: Direct graphical encodings preserving atomic connectivity and bond types.
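
The representations listed above can be illustrated with a small pure-Python sketch. Everything here is an assumption made for illustration rather than something taken from the paper: the helper names are hypothetical, the tokenizer is character-level (so it would split two-letter atoms such as `Cl` or `Br`, which a real pipeline must handle), fingerprints are modeled as sets of on-bit indices, and the ethanol graph is written by hand instead of being parsed by a toolkit such as RDKit.

```python
def build_vocab(smiles_list):
    """Map each character seen in the corpus to an integer index."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i for i, ch in enumerate(chars)}

def one_hot(smiles, vocab):
    """Encode a SMILES string as a list of one-hot rows, one per character."""
    rows = []
    for ch in smiles:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1
        rows.append(row)
    return rows

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0

# Ethanol as a SMILES string and as a hand-built graph (atoms plus typed bonds).
ethanol_smiles = "CCO"
ethanol_graph = {
    "atoms": ["C", "C", "O"],
    "bonds": [(0, 1, "single"), (1, 2, "single")],
}

vocab = build_vocab([ethanol_smiles, "C=O"])
encoded = one_hot(ethanol_smiles, vocab)
```

The string form is compact and works directly with sequence models, while the graph form keeps connectivity and bond types explicit, which is the trade-off the list above describes.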

Applications in Drug Design

Focusing on small molecule design, gene therapy, and protein and peptide generation, the review showcases how DGMs navigate complexities like syntactic validity and chiral molecule generation. Prominent approaches include:

Conditional VAEs targeting specific growth inhibition properties.
Transformer architectures for sequence-to-structure predictions.
Enhanced graph neural networks for full molecular structure generation, like Junction Tree VAE and GraphVAE.

For gene therapy, DGMs offer novel solutions in designing diversified and stable adeno-associated viral capsids.
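
The syntactic-validity problem mentioned above for string-based small molecule generation can be sketched with a toy filter. This is a hypothetical helper, not the paper's method and not a SMILES parser: it only checks that branch parentheses and bracket atoms are balanced and that each ring-closure digit appears in pairs, and it says nothing about valence or chemical plausibility. In practice a cheminformatics toolkit would be used for this check.

```python
def looks_syntactically_valid(smiles):
    """Toy check: balanced parentheses/brackets and paired ring-closure digits."""
    depth = 0            # current nesting depth of branch parentheses
    in_bracket = False   # inside a [...] atom, where digits are not ring bonds
    open_rings = set()   # ring-closure digits seen an odd number of times
    for ch in smiles:
        if in_bracket:
            if ch == "]":
                in_bracket = False
            elif ch == "[":
                return False  # nested bracket atoms are not allowed
            continue
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            return False      # closing bracket without an opening one
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # branch closed before it was opened
        elif ch.isdigit():
            # A digit opens a ring bond the first time and closes it the second.
            open_rings ^= {ch}
    return depth == 0 and not in_bracket and not open_rings

# Filter a batch of hypothetical generator outputs.
generated = ["c1ccccc1", "CC(=O)O", "c1ccccc", "CC(=O"]
kept = [s for s in generated if looks_syntactically_valid(s)]
```

A filter like this only rejects obviously malformed strings. Graph-based generators such as the Junction Tree VAE sidestep the issue by building molecules from valid substructures rather than emitting characters one at a time.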

Drug Response Prediction

DGMs are also pivotal for drug response prediction, offering methods for gene expression profile analysis and cell line screenings. Seminal works, such as those by Dincer et al. and Rampasek et al., introduce VAEs and Dr.VAE to correlate gene expression changes with drug sensitivity. Moreover, innovations like AD-AE facilitate deconfounding and domain adaptation, enhancing the reliability of predictive models.

Challenges and Future Directions

Despite advancements, the review candidly discusses several ongoing challenges:

Loss of molecular information in traditional representations.
The necessity for better modeling of graph structures.
Issues surrounding data quality, reproducibility, and intellectual property rights in generated molecules.

Future development in the field would benefit from improved data representations, rigorous benchmarking, and addressing reproducibility concerns. Published works should focus on generating molecular structures that accurately reflect chemical and biological properties.

Conclusion

The review underscores the nascent but rapidly evolving synergy between AI and drug discovery. While current DGMs show promise, their potential will be fully realized only with continued advances in data quality, model sophistication, and comprehensive validation processes. Integrating DGMs into drug discovery pipelines promises a transformative impact on the pharmaceutical landscape, with AI-driven approaches poised to improve the efficiency and success rates of new drug development.
