Two for One: Diffusion Models and Force Fields for Coarse-Grained Molecular Dynamics (2302.00600v3)

Published 1 Feb 2023 in cs.LG

Abstract: Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can directly be used to simulate CG molecular dynamics. While having a vastly simplified training setup compared to previous work, we demonstrate that our approach leads to improved performance across several small- to medium-sized protein simulations, reproducing the CG equilibrium distribution, and preserving dynamics of all-atom simulations such as protein folding events.

Summary

The paper presents a method that leverages denoising diffusion models to extract conservative force fields from equilibrium CG configurations without direct force supervision.
It integrates the learned Denoising Force Field (DFF) into Langevin dynamics, achieving lower Jensen–Shannon (JS) divergences and more accurate free energy profiles than prior methods.
Architectural innovations, including graph transformer backbones and symmetry-preserving designs, enable robust force field extraction and improved kinetic modeling in biomolecular systems.

The paper presents a method that leverages denoising diffusion probabilistic models for coarse-grained (CG) molecular dynamics (MD) by learning an approximate CG force field without the need for direct force supervision during training. The core idea is to train a score-based generative model on equilibrium CG configurations—obtained by projecting atomistic MD trajectories—and then to extract a force field from the learned score function. This force field, termed the Denoising Force Field (DFF), is defined via

$\mathbf{F}_z^\mathrm{DFF} = -\frac{k_B T}{\sqrt{1-\bar{\alpha}_i}}\, \epsilon_{\theta^*}(z, i),$

where $k_B T$ is the thermal energy, $\bar{\alpha}_i$ is the cumulative product of the noise-schedule parameters up to level $i$, and $\epsilon_{\theta^*}$ is the trained noise prediction network, whose output is conservative by construction.

The approach builds on several technical insights and methodological contributions:

Diffusion Model Training and Score Extraction:

The method trains a denoising diffusion model using a standard noise prediction loss equivalent to a weighted denoising score matching objective. By establishing the connection between the score function $s_\theta(z_i, i)$ and the noise prediction network via

$s_\theta(z_i,i) = -\frac{\epsilon_\theta(z_i,i)}{\sqrt{1-\bar{\alpha}_i}},$

the paper shows that at sufficiently low noise levels ($i \approx 1$), the optimal score approximates the gradient of the log-density of the CG Boltzmann distribution, $\nabla_z \log p(z) = -\nabla_z V(z) / (k_B T)$, which is the effective CG force divided by $k_B T$. Multiplying the learned score by $k_B T$ therefore yields a force field that is then used to simulate CG dynamics.
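The relation between noise prediction, score, and force can be checked on a toy case where everything is known in closed form. The sketch below is not the paper's code: it assumes a 1D harmonic potential (so the Boltzmann distribution is Gaussian), for which the optimal noise prediction has an analytic form, and the names `optimal_eps_gaussian`, `score_from_eps`, and `denoising_force` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def optimal_eps_gaussian(z, alpha_bar, sigma2):
    """Closed-form optimal noise prediction when clean data ~ N(0, sigma2).

    The noised marginal is N(0, alpha_bar * sigma2 + 1 - alpha_bar).
    """
    var_i = alpha_bar * sigma2 + (1.0 - alpha_bar)
    return np.sqrt(1.0 - alpha_bar) * z / var_i

def score_from_eps(eps, alpha_bar):
    """s(z_i, i) = -eps / sqrt(1 - alpha_bar_i)."""
    return -eps / np.sqrt(1.0 - alpha_bar)

def denoising_force(eps, alpha_bar, kBT):
    """F_DFF = kBT * score = -kBT / sqrt(1 - alpha_bar_i) * eps."""
    return kBT * score_from_eps(eps, alpha_bar)

# Toy system (assumed values): V(z) = 0.5 * k * z^2, so p(z) ~ N(0, kBT / k).
kBT = 2.5
k_spring = 10.0
sigma2 = kBT / k_spring
z = np.linspace(-1.0, 1.0, 5)
true_force = -k_spring * z

# As alpha_bar -> 1 (low noise), the extracted force approaches the true one.
alpha_bars = (0.9, 0.99, 0.9999)
errors = []
for alpha_bar in alpha_bars:
    eps = optimal_eps_gaussian(z, alpha_bar, sigma2)
    f = denoising_force(eps, alpha_bar, kBT)
    errors.append(float(np.max(np.abs(f - true_force))))

print(dict(zip(alpha_bars, errors)))
```

In this toy the force error shrinks monotonically as the noise level decreases. With a learned network the picture is less clean, because very low noise levels are also harder to fit from finite data.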

Integration with Molecular Dynamics:

The extracted DFF is directly plugged into a Langevin dynamics simulator:

$M\frac{\mathrm{d}^2z}{\mathrm{d}t^2} = -\nabla_zV(z) - \gamma M\frac{\mathrm{d}z}{\mathrm{d}t} + \sqrt{2M\gamma k_B T}\,w(t),$

where $M$ is the mass, $\gamma$ the friction coefficient, $w(t)$ a white-noise process, and the force $-\nabla_zV(z)$ is substituted by $\mathbf{F}_z^\mathrm{DFF}$. In the paper, a derivation shows that the iterative diffusion–denoising process at a low noise level approximates overdamped (Brownian) dynamics with an implicit timestep defined via the noise parameter, thereby linking stochastic generative modeling to time-integrated MD simulations.
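To make the plug-in step concrete, here is a minimal sketch of a Langevin integrator that accepts any force callable. It uses the BAOAB splitting, which is a choice made for this example and not necessarily the integrator used in the paper. The harmonic `force_fn` stands in for a trained DFF, and all parameter values are assumptions.

```python
import numpy as np

def langevin_baoab(force_fn, z0, mass, gamma, kBT, dt, n_steps, rng):
    """Integrate M z'' = F(z) - gamma M z' + sqrt(2 M gamma kBT) w(t) with BAOAB."""
    z = np.array(z0, dtype=float)
    v = np.zeros_like(z)
    c1 = np.exp(-gamma * dt)                      # velocity damping over one step
    c2 = np.sqrt(kBT / mass * (1.0 - c1 ** 2))    # matching thermal noise amplitude
    traj = np.empty((n_steps,) + z.shape)
    f = force_fn(z)
    for t in range(n_steps):
        v += 0.5 * dt * f / mass                          # B: half kick
        z += 0.5 * dt * v                                 # A: half drift
        v = c1 * v + c2 * rng.standard_normal(z.shape)    # O: friction and noise
        z += 0.5 * dt * v                                 # A: half drift
        f = force_fn(z)
        v += 0.5 * dt * f / mass                          # B: half kick
        traj[t] = z
    return traj

# Stand-in for the learned force field: a harmonic force with stiffness k_spring.
kBT, k_spring = 0.5, 2.0
force_fn = lambda z: -k_spring * z

rng = np.random.default_rng(0)
traj = langevin_baoab(force_fn, z0=np.zeros(200), mass=1.0, gamma=1.0,
                      kBT=kBT, dt=0.05, n_steps=4000, rng=rng)

# After burn-in, the position variance should be close to kBT / k_spring.
var_est = float(traj[400:].var())
print(var_est, kBT / k_spring)
```

Replacing `force_fn` with a function that evaluates the noise prediction network at a fixed low noise level and rescales it as in the DFF definition gives the simulation setup described above.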

Architectural Considerations:

The neural network architecture for $\epsilon_\theta(z, i)$ is designed to respect essential physical symmetries. Key aspects include:

Conservativeness: The network is parameterized as the gradient of an energy network, ensuring that the derived force field is conservative.
Translation Invariance and SO(3) Equivariance: Input features are constructed from pairwise difference vectors, which makes the model translation invariant, and data augmentation is employed so that rotation equivariance is learned approximately, without resorting to heavier spherical harmonic formulations. Evaluations demonstrate that the relative squared error due to rotations is less than $10^{-6}$.
Graph Transformer Backbone: The scalar energy function is modeled with an adapted graph transformer that incorporates these symmetry constraints efficiently.
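The three properties above can be tested numerically for any candidate force model. The sketch below uses a hand-written toy energy built from pairwise distances in place of the graph transformer, so its rotation equivariance is exact rather than learned through augmentation as in the paper. The functions `energy`, `force`, and `random_rotation` are hypothetical and exist only to show how such checks might be set up.

```python
import numpy as np

def energy(z):
    """Toy scalar energy: harmonic springs of rest length 1 between all pairs."""
    diff = z[:, None, :] - z[None, :, :]          # pairwise difference vectors
    d = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(z), k=1)
    return 0.5 * np.sum((d[iu] - 1.0) ** 2)

def force(z):
    """Analytic negative gradient of `energy`, hence conservative by construction."""
    diff = z[:, None, :] - z[None, :, :]
    d = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(d, 1.0)                      # avoids 0/0; diagonal terms vanish
    coef = (d - 1.0) / d
    return -(coef[:, :, None] * diff).sum(axis=1)

def random_rotation(rng):
    """Random proper rotation matrix from a QR decomposition."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0
    return q

rng = np.random.default_rng(0)
z = rng.standard_normal((5, 3))
f = force(z)

# Conservativeness: the force matches a finite-difference gradient of the energy.
h = 1e-5
num_grad = np.zeros_like(z)
for idx in np.ndindex(*z.shape):
    dz = np.zeros_like(z)
    dz[idx] = h
    num_grad[idx] = (energy(z + dz) - energy(z - dz)) / (2.0 * h)
grad_err = float(np.max(np.abs(f + num_grad)))

# Translation invariance: shifting every bead leaves the force unchanged.
trans_err = float(np.max(np.abs(force(z + np.array([1.0, -2.0, 0.5])) - f)))

# Rotation equivariance: F(R z) = R F(z), reported as a relative squared error.
R = random_rotation(rng)
rot_err = float(np.sum((force(z @ R.T) - f @ R.T) ** 2) / np.sum(f ** 2))

print(grad_err, trans_err, rot_err)
```

For a model trained with rotation augmentation, `rot_err` would be small but nonzero, which is the kind of quantity the reported $10^{-6}$ bound refers to.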
Experimental Evaluation:

The paper provides comprehensive validation on both small systems (alanine dipeptide) and fast-folding proteins ranging from 10 to 56 amino acids:

Alanine Dipeptide:

DFF simulation (DFF sim.) achieves a significantly lower Jensen–Shannon (JS) divergence between the dihedral angle distributions of the generated samples and those from atomistic MD projections compared to flow-based methods. In the low-data regime (training with as few as 10K samples), DFF sim. outperforms prior methods such as Flow-CGNet sim., and the DFF i.i.d. sample generator approaches the performance of the reference distribution.
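The JS divergence between dihedral angle distributions can be estimated from 2D histograms over the two backbone angles. The sketch below is illustrative only: the bin count, the natural-log base, and the synthetic von Mises samples standing in for real trajectories are all assumptions, not values from the paper.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two histograms."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0          # b = m is positive wherever a is positive
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def dihedral_histogram(phi, psi, bins=32):
    """Flattened 2D histogram over (phi, psi) on [-pi, pi] x [-pi, pi]."""
    hist, _, _ = np.histogram2d(
        phi, psi, bins=bins, range=[[-np.pi, np.pi], [-np.pi, np.pi]]
    )
    return hist.ravel()

rng = np.random.default_rng(0)
n = 20000

# Synthetic "reference" angles and two synthetic "models" (placeholders for MD data).
ref = dihedral_histogram(rng.vonmises(-1.5, 4.0, n), rng.vonmises(2.5, 4.0, n))
good = dihedral_histogram(rng.vonmises(-1.5, 4.0, n), rng.vonmises(2.5, 4.0, n))
bad = dihedral_histogram(rng.vonmises(-0.5, 4.0, n), rng.vonmises(1.5, 4.0, n))

js_good = js_divergence(ref, good)
js_bad = js_divergence(ref, bad)
print(js_good, js_bad)
```

With the natural log, the divergence lies between 0 and $\ln 2$. Finite sampling alone produces a small positive value even when both sample sets come from the same distribution, so reported numbers are best compared at equal sample size and binning.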
Fast-Folding Proteins:

For proteins such as Chignolin, Trp-cage, Bba, Villin, and Protein G, the DFF models show superior performance on multiple metrics:
- Equilibrium Metrics: The paper reports lower TIC JS and pairwise distance (PWD) JS divergence values. For instance, in Chignolin, DFF sim. achieves TIC JS values around 0.0335 compared to Flow-CGNet sim. values nearing 0.1875, a roughly 5.6-fold reduction.
- Global Structural Fidelity: Free energy profiles as a function of the root mean squared deviation (RMSD) and contact probability maps demonstrate that the DFF more accurately captures long-range interactions and the overall equilibrium distribution.
- Kinetic Modeling: Evaluations based on time-lagged independent component analysis (TICA) and Markov state models show that the DFF sim. model preserves dynamical information more faithfully. The average weighted JS divergence for transition probabilities across fast-folders is reported on the order of $5.1 \times 10^{-4}$ for DFF compared to $5.7 \times 10^{-3}$ for Flow-CGNet, roughly an eleven-fold reduction in the kinetic error.
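A comparison of transition probabilities of this kind can be sketched as follows. This is one plausible reading, not the paper's exact procedure: the transition matrix is estimated by simple counting at a fixed lag, and row-wise JS divergences are weighted by the reference state occupancy. The weighting scheme, the lag, the two-state chain, and all function names are assumptions made for illustration.

```python
import numpy as np

def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts at the given lag, plus state occupancy."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    T = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    return T, counts.sum(axis=1) / counts.sum()

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two probability vectors."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def weighted_row_js(T_ref, T_model, weights):
    """Occupancy-weighted average of row-wise JS divergences (assumed weighting)."""
    total = 0.0
    for s, w in enumerate(weights):
        if T_ref[s].sum() > 0 and T_model[s].sum() > 0:   # skip unvisited states
            total += w * js_divergence(T_ref[s], T_model[s])
    return total

def sample_chain(P, n_steps, rng):
    """Draw a discrete trajectory from a known transition matrix P."""
    states = np.empty(n_steps, dtype=int)
    states[0] = 0
    for t in range(1, n_steps):
        states[t] = rng.choice(len(P), p=P[states[t - 1]])
    return states

P_true = np.array([[0.9, 0.1], [0.2, 0.8]])    # toy folded/unfolded kinetics
P_wrong = np.array([[0.5, 0.5], [0.5, 0.5]])   # model with distorted kinetics

T_ref, w_ref = transition_matrix(sample_chain(P_true, 10000, np.random.default_rng(0)), 2)
T_good, _ = transition_matrix(sample_chain(P_true, 10000, np.random.default_rng(1)), 2)
T_bad, _ = transition_matrix(sample_chain(P_wrong, 10000, np.random.default_rng(2)), 2)

js_same = weighted_row_js(T_ref, T_good, w_ref)
js_diff = weighted_row_js(T_ref, T_bad, w_ref)
print(js_same, js_diff)
```

In practice the discrete states would come from clustering TICA coordinates of the reference simulation, and both trajectories would be assigned to the same set of clusters before counting.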
Sensitivity Analyses and Ablations:

The paper further explores:

The impact of hyperparameters such as the noise level $i$, where a trade-off is observed between approximating the true CG force field and ensuring robust learning given the available data.
The effect of network capacity, showing that while increasing the number of hidden features can improve the i.i.d. sample quality, an overly complex network may deteriorate the performance of dynamics simulations due to bias–variance trade-offs.
The necessity of conservative network architectures versus non-conservative variants; ablation studies reveal that explicitly enforcing a gradient structure leads to more stable MD simulations and greatly reduced errors in structural distributions.

Overall, the paper provides a unified framework where a single diffusion model facilitates both equilibrium sampling and force field extraction for CG MD simulations. The approach simplifies the training setup compared to multi-stage teacher–student schemes and offers improved scalability to larger, more complex biomolecular systems while demonstrating quantitative advantages over existing ML-based coarse-graining techniques. This work is of considerable technical interest due to its rigorous exploitation of score-based generative modeling and its implications for efficient, data-driven coarse-grained simulations.
