Machine-learned molecular mechanics force field for the simulation of protein-ligand systems and beyond (2307.07085v4)
Abstract: The development of reliable and extensible molecular mechanics (MM) force fields, fast empirical models characterizing the potential energy surface of molecular systems, is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, espaloma-0.3, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1M energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
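As a rough illustration of the kind of term an MM force field evaluates, the sketch below computes the energy and analytic forces of a single harmonic bond. The harmonic form is the standard bond-stretch term in Amber-style force fields; the function name `harmonic_bond` and the parameter values are hypothetical and are not taken from espaloma-0.3. In a machine-learned MM force field of the kind the abstract describes, parameters such as `k` and `r0` would presumably be predicted by the graph neural network rather than looked up by rule-based atom typing.

```python
import math


def harmonic_bond(xi, xj, k, r0):
    """Return (energy, force_on_i, force_on_j) for one harmonic bond.

    E = 0.5 * k * (r - r0)**2, where r is the distance between atoms i and j.
    """
    # Displacement vector from atom j to atom i, and its length.
    d = [a - b for a, b in zip(xi, xj)]
    r = math.sqrt(sum(c * c for c in d))
    energy = 0.5 * k * (r - r0) ** 2
    # The force is the negative gradient of the energy with respect to position.
    scale = -k * (r - r0) / r
    f_i = [scale * c for c in d]
    f_j = [-c for c in f_i]  # Newton's third law: equal and opposite.
    return energy, f_i, f_j


# Hypothetical parameters (not from espaloma-0.3): k in kcal/mol/A^2, r0 in A.
energy, f_i, f_j = harmonic_bond([0.0, 0.0, 0.0], [1.2, 0.0, 0.0], k=600.0, r0=1.0)
```

Here the bond is stretched beyond `r0`, so the two forces pull the atoms toward each other. Fitting to quantum chemical energies and forces, as the abstract describes, amounts to adjusting such parameters so that sums of terms like this one match the reference values.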