Equivariant Scalar Fields for Molecular Docking with Fast Fourier Transforms (2312.04323v2)
Abstract: Molecular docking is critical to structure-based virtual screening, yet the throughput of such workflows is limited by the expensive optimization of scoring functions involved in most docking algorithms. We explore how machine learning can accelerate this process by learning a scoring function with a functional form that allows for more rapid optimization. Specifically, we define the scoring function to be the cross-correlation of multi-channel ligand and protein scalar fields parameterized by equivariant graph neural networks, enabling rapid optimization over rigid-body degrees of freedom with fast Fourier transforms. The runtime of our approach can be amortized at several levels of abstraction, and is particularly favorable for virtual screening settings with a common binding pocket. We benchmark our scoring functions on two simplified docking-related tasks: decoy pose scoring and rigid conformer docking. On crystal structures, our method attains performance similar to that of the widely used Vina and Gnina scoring functions at lower runtime, and it is more robust on computationally predicted structures. Code is available at https://github.com/bjing2016/scalar-fields.
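To make the core idea concrete, the following is a minimal sketch of how a cross-correlation score over multi-channel scalar fields can be evaluated for all translations at once with FFTs. It is not the authors' implementation: the fields here are random arrays on a small periodic grid standing in for the learned, network-parameterized fields, only translational degrees of freedom are covered (rotations are not), and the function names `fft_translation_scores` and `brute_force_scores` are hypothetical.

```python
import numpy as np

def fft_translation_scores(protein_fields, ligand_fields):
    """Score every cyclic grid translation t in one pass.

    Inputs have shape (channels, nx, ny, nz). The result has shape
    (nx, ny, nz) with
        score[t] = sum_c sum_x protein[c, x + t] * ligand[c, x].
    """
    spatial_axes = (1, 2, 3)
    P = np.fft.fftn(protein_fields, axes=spatial_axes)
    L = np.fft.fftn(ligand_fields, axes=spatial_axes)
    # Correlation theorem: cross-correlation is P * conj(L) in Fourier space.
    # Summing over channels first means only one inverse FFT is needed.
    spectrum = (P * np.conj(L)).sum(axis=0)
    return np.fft.ifftn(spectrum).real

def brute_force_scores(protein_fields, ligand_fields):
    """Reference implementation: evaluate each translation explicitly."""
    grid_shape = protein_fields.shape[1:]
    out = np.empty(grid_shape)
    for t in np.ndindex(*grid_shape):
        # np.roll with a negative shift gives shifted[c, x] = protein[c, x + t].
        shifted = np.roll(protein_fields, shift=tuple(-s for s in t), axis=(1, 2, 3))
        out[t] = np.sum(shifted * ligand_fields)
    return out

# Toy data (assumption): the "protein" field is the "ligand" field moved by a
# known shift, so the best-scoring translation should recover that shift.
rng = np.random.default_rng(0)
ligand = rng.normal(size=(2, 6, 6, 6))
true_shift = (1, 4, 2)
protein = np.roll(ligand, shift=true_shift, axis=(1, 2, 3))

scores = fft_translation_scores(protein, ligand)
best = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
```

For a grid of N points, the brute-force loop costs O(N^2) per channel while the FFT route costs O(N log N), which is the kind of speedup the abstract refers to. The protein-side transform can also be computed once and reused across ligands, which is one way such a runtime could be amortized when screening against a common pocket.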