PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling (2310.00681v3)
Abstract: As accessible compound libraries grow beyond 10 billion compounds, the need for more efficient structure-based virtual screening methods is growing. Various pre-screening methods have been developed for rapid screening, but there is still a lack of structure-based methods that apply to a wide range of proteins and perform protein-ligand binding conformation prediction and scoring in an extremely short time. Here, we describe for the first time a deep-learning framework for structure-based pharmacophore modeling to address this challenge. We frame pharmacophore modeling as an instance segmentation problem, which determines each protein hotspot and the location of its corresponding pharmacophores, and protein-ligand binding pose prediction as a graph-matching problem. PharmacoNet is significantly faster than state-of-the-art structure-based approaches, yet reasonably accurate with a simple scoring function. Furthermore, we show the promising result that PharmacoNet effectively retains hit candidates even under high pre-screening filtration rates. Overall, our study uncovers the hitherto untapped potential of a pharmacophore modeling approach in deep learning-based drug discovery.
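To make the graph-matching framing concrete, the following is a minimal toy sketch and not PharmacoNet's actual algorithm, which the abstract does not detail. It assumes each pharmacophore point is a `(feature_type, xyz)` tuple, that model points are labelled with the ligand feature type they expect, and that a match must preserve pairwise distances within a tolerance `tol`. The function names, the tolerance value, and the brute-force search are all illustrative assumptions.

```python
from itertools import combinations
from math import dist


def candidate_pairs(ligand, model):
    """Return (ligand index, model index) pairs whose feature types agree."""
    return [
        (i, j)
        for i, (lig_type, _) in enumerate(ligand)
        for j, (mod_type, _) in enumerate(model)
        if lig_type == mod_type
    ]


def compatible(p, q, ligand, model, tol):
    """Two assignments are compatible if they use distinct nodes on both
    sides and the ligand-side distance matches the model-side distance."""
    (i, j), (k, l) = p, q
    if i == k or j == l:
        return False
    lig_d = dist(ligand[i][1], ligand[k][1])
    mod_d = dist(model[j][1], model[l][1])
    return abs(lig_d - mod_d) <= tol


def best_matching(ligand, model, tol=1.0):
    """Brute-force the largest mutually compatible set of assignments.

    Exponential in the number of candidate pairs, so only suitable for
    toy inputs; a real screening tool would need a far faster search.
    """
    pairs = candidate_pairs(ligand, model)
    for size in range(len(pairs), 0, -1):
        for subset in combinations(pairs, size):
            if all(
                compatible(p, q, ligand, model, tol)
                for p, q in combinations(subset, 2)
            ):
                return list(subset)
    return []
```

In this sketch the number of matched points could serve as a crude score: a ligand conformer whose features line up with more model points at consistent distances would rank higher. That scoring choice is likewise an assumption made for illustration.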