Simulation-free Schrödinger bridges via score and flow matching (2307.03672v3)

Published 7 Jul 2023 in cs.LG

Abstract: We present simulation-free score and flow matching ([SF]$^2$M), a simulation-free objective for inferring stochastic dynamics given unpaired samples drawn from arbitrary source and target distributions. Our method generalizes both the score-matching loss used in the training of diffusion models and the recently proposed flow matching loss used in the training of continuous normalizing flows. [SF]$^2$M interprets continuous-time stochastic generative modeling as a Schrödinger bridge problem. It relies on static entropy-regularized optimal transport, or a minibatch approximation, to efficiently learn the SB without simulating the learned stochastic process. We find that [SF]$^2$M is more efficient and gives more accurate solutions to the SB problem than simulation-based methods from prior work. Finally, we apply [SF]$^2$M to the problem of learning cell dynamics from snapshot data. Notably, [SF]$^2$M is the first method to accurately model cell dynamics in high dimensions and can recover known gene regulatory networks from simulated data. Our code is available in the TorchCFM package at https://github.com/atong01/conditional-flow-matching.

Summary

The paper generalizes the score-matching and flow-matching losses into a single simulation-free objective for the Schrödinger bridge problem, built on entropy-regularized optimal transport.
It reports better efficiency and accuracy than simulation-based methods, interpolates high-dimensional cell dynamics, and recovers known gene regulatory networks from simulated data.
The approach rests on a theoretical connection between Brownian bridges and entropic optimal transport.

Simulation-Free Schrödinger Bridges via Score and Flow Matching

This paper introduces Simulation-Free Score and Flow Matching ([SF]$^2$M), a simulation-free approach to learning continuous-time stochastic generative models. It infers stochastic dynamics from unpaired samples of a source and a target distribution without simulating the learned process during training.
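
For orientation, the setting can be written out as follows. This is a sketch in notation assumed here rather than taken from the summary: $q_0$ and $q_1$ denote the source and target distributions, $\mathbb{Q}$ a reference process (Brownian motion scaled by a constant $\sigma$), and $u_t$ and $g$ the drift and diffusion coefficient of the learned process.

$$
\mathbb{P}^\star = \arg\min_{\mathbb{P}\,:\,p_0 = q_0,\; p_1 = q_1} \mathrm{KL}\left(\mathbb{P}\,\|\,\mathbb{Q}\right),
\qquad
dx_t = u_t(x_t)\,dt + g(t)\,dw_t .
$$

The first expression is the Schrödinger bridge problem: among all path measures whose marginals at times 0 and 1 match $q_0$ and $q_1$, pick the one closest in KL divergence to the reference. The second is the SDE form of the dynamics to be learned. A simulation-free objective fits the drift and score of this SDE by regression, without integrating the SDE during training.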

Key Contributions

Generalization of Existing Methods: [SF]$^2$M generalizes both the score-matching loss used to train diffusion models and the flow-matching loss used to train continuous normalizing flows. It interprets continuous-time stochastic generative modeling as a Schrödinger bridge problem and relies on static entropy-regularized optimal transport, or a minibatch approximation of it, to learn the bridge.
Efficiency and Accuracy: Because training does not simulate the learned stochastic process, [SF]$^2$M is reported to be more efficient and to give more accurate Schrödinger bridge solutions than simulation-based methods from prior work.
Application to Cell Dynamics: Applied to learning cell dynamics from snapshot data, the method operates in high-dimensional spaces and can recover known gene regulatory networks from simulated data.
Theoretical Underpinning: The authors connect the Schrödinger bridge problem to entropic optimal transport: the bridge is expressed as a mixture of Brownian bridges between source and target points, weighted by the entropic transport plan.
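
The contributions above suggest a simple recipe for building training targets: couple a source and a target minibatch with entropic optimal transport, sample a point on the Brownian bridge between each coupled pair, and compute regression targets for a flow network and a score network. The NumPy sketch below illustrates this under stated assumptions: a constant diffusion $\sigma$, squared Euclidean cost, and entropic regularization $2\sigma^2$, which follows our reading of the paper's constant-diffusion case. The function names `sinkhorn_plan` and `sample_training_targets` are hypothetical and are not the TorchCFM API, and the plain (non-log-domain) Sinkhorn loop is chosen for brevity, so it may underflow for small $\sigma$ or widely separated data.

```python
import numpy as np


def sinkhorn_plan(x0, x1, sigma, n_iters=200):
    """Entropic OT plan between two uniformly weighted minibatches.

    Assumption: squared Euclidean cost and regularization eps = 2 * sigma**2.
    """
    eps = 2.0 * sigma**2
    # Pairwise squared distances, shape (n0, n1).
    cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-cost / eps)
    a = np.full(len(x0), 1.0 / len(x0))  # uniform source weights
    b = np.full(len(x1), 1.0 / len(x1))  # uniform target weights
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (kernel @ v)    # rescale rows toward marginal a
        v = b / (kernel.T @ u)  # rescale columns to marginal b
    return u[:, None] * kernel * v[None, :]


def sample_training_targets(x0, x1, sigma, rng):
    """Sample bridge points and the flow/score regression targets."""
    plan = sinkhorn_plan(x0, x1, sigma)
    n = len(x0)
    # Draw index pairs (i, j) with probability proportional to the plan.
    flat = rng.choice(plan.size, size=n, p=plan.ravel() / plan.sum())
    i, j = np.unravel_index(flat, plan.shape)
    start, end = x0[i], x1[j]

    # Keep t away from 0 and 1, where the bridge variance vanishes.
    t = rng.uniform(0.01, 0.99, size=(n, 1))
    mean = (1.0 - t) * start + t * end
    std = sigma * np.sqrt(t * (1.0 - t))
    xt = mean + std * rng.standard_normal(start.shape)

    # Conditional probability-flow velocity of the Brownian bridge.
    flow_target = (1.0 - 2.0 * t) / (2.0 * t * (1.0 - t)) * (xt - mean) + (end - start)
    # Conditional score: gradient of log N(xt; mean, std**2 I).
    score_target = -(xt - mean) / std**2
    return t, xt, flow_target, score_target


rng = np.random.default_rng(0)
source = rng.standard_normal((16, 2))
target = rng.standard_normal((16, 2)) + 2.0
t, xt, flow_target, score_target = sample_training_targets(source, target, 1.0, rng)
```

In a training loop, a flow network evaluated at `(t, xt)` would be regressed onto `flow_target` and a score network onto `score_target`, typically with a time-dependent weight on the score term because `score_target` grows large near the endpoints. No SDE is integrated at any point, which is what makes the objective simulation-free.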

Numerical Results

The authors compare [SF]$^2$M with prior methods on several datasets:

Synthetic Distributions: Experiments on Gaussian and non-Gaussian synthetic data indicate that [SF]$^2$M approximates the Schrödinger bridge more accurately than prior methods in both low- and high-dimensional settings.
Single-Cell Dynamics: On single-cell datasets, [SF]$^2$M interpolates between time-resolved snapshots and scales to thousands of dimensions. The authors describe it as the first method to accurately model cell dynamics in high dimensions.

Implications and Future Work

[SF]$^2$M opens several avenues for further research. Practically, applying it to high-dimensional biological datasets could support new approaches to modeling cellular processes. Theoretically, the work invites exploration of other settings where simulation-free training can replace simulation-based training of stochastic generative models.

The authors suggest that future efforts might refine training techniques further or apply [SF]$^2$M to other domains where accurate modeling of dynamics is critical, such as physics or finance.

In conclusion, [SF]$^2$M learns Schrödinger bridges without the computational overhead of simulating the process during training, which makes it a practical alternative to simulation-based methods for both theoretical work and applications.