Generating 3D Molecules for Target Protein Binding (2204.09410v2)

Published 19 Apr 2022 in q-bio.BM, cs.AI, and cs.LG

Abstract: A fundamental problem in drug discovery is to design molecules that bind to specific proteins. To tackle this problem using machine learning methods, here we propose a novel and effective framework, known as GraphBP, to generate 3D molecules that bind to given proteins by placing atoms of specific types and locations to the given binding site one by one. In particular, at each step, we first employ a 3D graph neural network to obtain geometry-aware and chemically informative representations from the intermediate contextual information. Such context includes the given binding site and atoms placed in the previous steps. Second, to preserve the desirable equivariance property, we select a local reference atom according to the designed auxiliary classifiers and then construct a local spherical coordinate system. Finally, to place a new atom, we generate its atom type and relative location w.r.t. the constructed local coordinate system via a flow model. We also consider generating the variables of interest sequentially to capture the underlying dependencies among them. Experiments demonstrate that our GraphBP is effective to generate 3D molecules with binding ability to target protein binding sites. Our implementation is available at https://github.com/divelab/GraphBP.

Authors

Meng Liu
Youzhi Luo
Kanji Uchino
Koji Maruhashi
Shuiwang Ji

Summary

The paper introduces GraphBP, a machine learning framework that sequentially generates 3D molecules with enhanced protein-binding affinity.
The approach employs a 3D graph neural network for robust context encoding and uses local spherical coordinates to ensure equivariance during atom placement.
Evaluations on the CrossDocked2020 dataset show GraphBP outperforming baselines such as LiGAN variants, with a 27% success rate in generating molecules whose predicted binding affinity is better than that of the reference molecules.

Generating 3D Molecules for Target Protein Binding: A Machine Learning Approach

The paper presents GraphBP, a machine learning framework for a fundamental challenge in drug discovery: generating 3D molecules that bind specifically to target protein sites. The framework addresses three primary considerations: complex conditional information, the enormous chemical and spatial search spaces, and the requirement that generated molecule structures be equivariant.

Context and Motivation

Designing molecules that bind to specific proteins is a cornerstone of structure-based drug design. Recent advancements in datasets, such as PDBbind and CrossDocked2020, coupled with machine learning breakthroughs, enable novel approaches to this challenge. However, most existing methods focus on generating molecules using 1D or 2D representations, falling short when it comes to capturing intricate 3D geometric and chemical contexts essential for effective molecular interactions with protein targets.

Approach: GraphBP Framework

GraphBP leverages a 3D graph neural network (GNN) to encode the spatial and chemical context of the binding site and the previously placed atoms iteratively. It then utilizes an autoregressive model to sequentially generate atoms, ensuring that each new atom's type and position are informed by both the existing context and inherent molecular dependencies.
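The autoregressive loop described above can be sketched in a few lines. This is a toy illustration only, not the authors' implementation: `encode_context`, `choose_focal`, `sample_atom`, `generate`, the atom vocabulary, and the `stop_prob` parameter are all hypothetical stand-ins, and random sampling replaces the learned 3D GNN, classifiers, and flow model. What it shows is the control flow: the binding site is fixed conditioning, and each newly placed atom joins the context for the next step.

```python
import random

ATOM_TYPES = ["C", "N", "O"]  # toy vocabulary, assumed for illustration


def encode_context(context):
    """Stand-in for the 3D GNN: one toy scalar feature per context atom."""
    return [sum(pos) for _, pos in context]


def choose_focal(features, rng, stop_prob):
    """Stand-in for a learned classifier; None means 'stop generating'."""
    if rng.random() < stop_prob:
        return None
    return rng.randrange(len(features))


def sample_atom(context, focal, rng):
    """Stand-in for the generative step: type first, then a nearby position."""
    atom_type = rng.choice(ATOM_TYPES)
    _, (fx, fy, fz) = context[focal]
    dx, dy, dz = (rng.gauss(0.0, 1.0) for _ in range(3))
    return atom_type, (fx + dx, fy + dy, fz + dz)


def generate(binding_site, max_atoms=10, stop_prob=0.1, seed=0):
    """Place atoms one by one, conditioning on the site and earlier atoms."""
    rng = random.Random(seed)
    context = list(binding_site)  # copy: binding-site atoms are never modified
    ligand = []
    while len(ligand) < max_atoms:
        features = encode_context(context)
        focal = choose_focal(features, rng, stop_prob)
        if focal is None:
            break
        atom = sample_atom(context, focal, rng)
        ligand.append(atom)
        context.append(atom)  # later steps also condition on this atom
    return ligand
```

In this sketch the focal atom is drawn uniformly from all context atoms, which is a simplification; the paper selects it with auxiliary classifiers.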

Context Encoding: A 3D GNN produces invariant representations that capture both the geometric and chemical environments of the binding site and the atoms placed so far, and that are robust to rotations and translations.
Local Reference Selection: At each step of atom placement, GraphBP selects a local reference atom to establish a local spherical coordinate system. This ensures that the generation process remains equivariant—any transformation of the binding site results in a corresponding transformation of the generated molecule.
Sequential Atom Placement: Using a flow model, GraphBP generates the atom type and then its coordinates within the local coordinate system, preserving the equivariance property. The process considers underlying dependencies between atom types and their geometric arrangements, enhancing the generative model's capacity to produce chemically valid and structurally accurate molecules.
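To make the role of the local coordinate system concrete, here is a minimal sketch of one way to express a new atom's position as (distance, polar angle, azimuthal angle) relative to a frame built from three existing atoms. The function names and the specific frame construction (focal atom `f` plus two neighbours `c` and `e`) are assumptions for illustration and may differ from the paper's exact convention. The point is that these local coordinates do not change when the whole system is rotated and translated, so a model that predicts them yields an equivariant placement.

```python
import math


def sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def unit(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]


def local_frame(f, c, e):
    """Right-handed orthonormal axes from focal atom f and neighbours c, e."""
    z = unit(sub(c, f))              # z-axis points from f to c
    y = unit(cross(z, sub(e, f)))    # y-axis is normal to the plane (f, c, e)
    x = cross(y, z)                  # x-axis completes the frame
    return x, y, z


def to_local_spherical(p, f, c, e):
    """Position p -> (d, theta, phi) in the local frame centred at f."""
    x, y, z = local_frame(f, c, e)
    v = sub(p, f)
    d = math.sqrt(dot(v, v))
    theta = math.acos(max(-1.0, min(1.0, dot(v, z) / d)))
    phi = math.atan2(dot(v, y), dot(v, x))
    return d, theta, phi


def from_local_spherical(d, theta, phi, f, c, e):
    """Inverse map: local (d, theta, phi) -> Cartesian position."""
    x, y, z = local_frame(f, c, e)
    cx = d * math.sin(theta) * math.cos(phi)
    cy = d * math.sin(theta) * math.sin(phi)
    cz = d * math.cos(theta)
    return [f[i] + cx * x[i] + cy * y[i] + cz * z[i] for i in range(3)]


def rigid_transform(p, angle, shift):
    """Rotate p about the global z-axis by `angle`, then translate by `shift`."""
    ca, sa = math.cos(angle), math.sin(angle)
    r = [ca * p[0] - sa * p[1], sa * p[0] + ca * p[1], p[2]]
    return [r[i] + shift[i] for i in range(3)]
```

Applying `rigid_transform` to `f`, `c`, `e`, and `p` together leaves the output of `to_local_spherical` unchanged up to floating-point error. The frame is undefined when `f`, `c`, and `e` are collinear, a case this sketch does not handle.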

Results and Implications

Extensive evaluations on the CrossDocked2020 dataset show that GraphBP outperforms comparable baselines, such as LiGAN variants, at generating valid molecules with higher predicted binding affinities. GraphBP achieves a 27% success rate in generating molecules with better predicted binding affinity than the reference molecules, a notable improvement over existing methods.
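A success rate of this kind can be computed as the share of generated molecules that score better than the reference ligand for the same binding site. The sketch below is an assumed reading of the metric, not the paper's evaluation code: the function names are hypothetical, it assumes a higher score means stronger predicted binding, and it pools molecules across sites rather than averaging per site.

```python
def fraction_better_than_reference(generated_scores, reference_score):
    """Share of generated molecules whose score beats the reference (one site)."""
    if not generated_scores:
        raise ValueError("need at least one generated molecule")
    wins = sum(1 for s in generated_scores if s > reference_score)
    return wins / len(generated_scores)


def success_rate(per_site):
    """Pooled share over sites; per_site maps site id -> (reference, [scores])."""
    wins = 0
    total = 0
    for reference_score, scores in per_site.values():
        wins += sum(1 for s in scores if s > reference_score)
        total += len(scores)
    if total == 0:
        raise ValueError("need at least one generated molecule")
    return wins / total
```

If the scoring function instead reports binding energies, where lower is better, the comparison would need to be flipped.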

This framework's ability to model complex dependencies within the molecular structure while maintaining critical geometric properties has significant implications:

Theoretical Impact: GraphBP demonstrates that modeling 3D geometry alongside intricate chemical interactions is feasible and effective in generative models, opening pathways for further exploration in molecular geometry generation.
Practical Applications: By enhancing the capability to generate novel molecules that exhibit strong affinity to target binding sites, GraphBP can accelerate the drug discovery pipeline, improving computational efficiency and potentially yielding unique therapeutic candidates.

Future Directions

Looking ahead, research could explore scaling the model to accommodate larger molecular complexes and extending its application to other biomolecular interactions beyond protein-ligand systems. Furthermore, integrating GraphBP with reinforcement learning techniques may enhance its ability to search the vast chemical space more efficiently, offering exciting prospects for automated drug discovery.

GitHub

divelab/GraphBP: Official implementation of "Generating 3D Molecules for Target Protein Binding" [ICML 2022 Long Presentation]