Top-down machine learning of coarse-grained protein force-fields (2306.11375v4)
Abstract: Developing accurate and efficient coarse-grained representations of proteins is crucial for understanding their folding, function, and interactions over extended timescales. Our methodology involves simulating proteins with molecular dynamics and utilizing the resulting trajectories to train a neural network potential through differentiable trajectory reweighting. Remarkably, this method requires only the native conformation of proteins, eliminating the need for labeled data derived from extensive simulations or memory-intensive end-to-end differentiable simulations. Once trained, the model can be employed to run parallel molecular dynamics simulations and sample folding events for proteins both within and beyond the training distribution, showcasing its extrapolation capabilities. By applying Markov State Models, native-like conformations of the simulated proteins can be predicted from the coarse-grained simulations. Owing to its theoretical transferability and ability to use solely experimental static structures as training data, we anticipate that this approach will prove advantageous for developing new protein force fields and further advancing the study of protein dynamics, folding, and interactions.
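The reweighting idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes conformations were sampled under a reference potential and estimates an ensemble average under a perturbed potential through Boltzmann weights. The function name `reweight`, the reduced inverse temperature `beta`, and the Kish effective sample size used as a diagnostic are illustrative assumptions, not details taken from the paper. The sketch computes only the reweighted average; in a differentiable setting the same expression would be written in an automatic differentiation framework so that gradients with respect to the potential parameters can flow through the weights.

```python
import math


def reweight(u_ref, u_new, observable, beta=1.0):
    """Estimate an average under u_new from samples drawn under u_ref.

    u_ref, u_new: per-sample potential energies (hypothetical inputs).
    observable:   per-sample value of the quantity to average.
    beta:         inverse temperature in reduced units (assumption).
    """
    # Log-weights: -beta * (U_new - U_ref) for each sampled conformation.
    log_w = [-beta * (un - ur) for ur, un in zip(u_ref, u_new)]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(log_w)
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]  # normalized weights, sum to 1
    # Reweighted ensemble average of the observable.
    avg = sum(wi * oi for wi, oi in zip(w, observable))
    # Kish effective sample size: equals N for uniform weights and
    # shrinks as the weights concentrate on few samples.
    n_eff = 1.0 / sum(wi * wi for wi in w)
    return avg, n_eff, w


# Identical potentials give uniform weights and the plain sample mean.
avg_same, n_eff_same, _ = reweight([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0])

# Strongly penalizing the third sample removes it from the average.
avg_shift, n_eff_shift, _ = reweight([1.0, 2.0, 3.0], [1.0, 2.0, 1000.0], [1.0, 2.0, 3.0])
```

A small effective sample size signals that the perturbed potential has drifted too far from the one used for sampling, at which point new trajectories would need to be generated before the estimate can be trusted.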
Authors: Carles Navarro, Maciej Majewski, Gianni De Fabritiis