
Machine-learned molecular mechanics force field for the simulation of protein-ligand systems and beyond (2307.07085v4)

Published 13 Jul 2023 in physics.chem-ph and cs.AI

Abstract: The development of reliable and extensible molecular mechanics (MM) force fields -- fast, empirical models characterizing the potential energy surface of molecular systems -- is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, espaloma-0.3, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1M energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
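
To make "fast, empirical models characterizing the potential energy surface" concrete, here is a minimal sketch of the bonded terms of a generic Class I MM energy function (harmonic bonds, harmonic angles, periodic torsions). It is an illustration under assumptions, not the espaloma-0.3 implementation: the abstract does not state the functional form, all function names and parameter values are hypothetical, units are arbitrary, and nonbonded terms are omitted. In a machine-learned MM force field the parameters (k, r0, theta0, torsion amplitudes) would be predicted by a model from the molecular graph rather than looked up from rule-based atom types.

```python
import math


def harmonic_bond_energy(r, k, r0):
    """Harmonic bond stretch: 0.5 * k * (r - r0)**2."""
    return 0.5 * k * (r - r0) ** 2


def harmonic_angle_energy(theta, k, theta0):
    """Harmonic angle bend: 0.5 * k * (theta - theta0)**2 (radians)."""
    return 0.5 * k * (theta - theta0) ** 2


def periodic_torsion_energy(phi, terms):
    """Periodic torsion: sum of k_n * (1 + cos(n * phi - phase_n)).

    `terms` is an iterable of (n, k_n, phase_n) tuples (hypothetical layout).
    """
    return sum(k * (1.0 + math.cos(n * phi - phase)) for n, k, phase in terms)


def bonded_energy(bonds, angles, torsions):
    """Total bonded energy for pre-computed internal coordinates.

    bonds:    list of (r, k, r0)
    angles:   list of (theta, k, theta0)
    torsions: list of (phi, terms)
    """
    e_bond = sum(harmonic_bond_energy(r, k, r0) for r, k, r0 in bonds)
    e_angle = sum(harmonic_angle_energy(t, k, t0) for t, k, t0 in angles)
    e_torsion = sum(periodic_torsion_energy(phi, terms) for phi, terms in torsions)
    return e_bond + e_angle + e_torsion


# Illustrative values only; these are not fitted parameters.
example_total = bonded_energy(
    bonds=[(1.1, 100.0, 1.0)],
    angles=[(2.0, 50.0, 2.0)],
    torsions=[(0.0, [(1, 2.0, 0.0)])],
)
```

Because every term is a simple closed-form function of its parameters, the total energy is differentiable with respect to them, which is the property an end-to-end differentiable fitting framework relies on.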

