GFlowNet Foundations (2111.09266v4)

Published 17 Nov 2021 in cs.LG, cs.AI, and stat.ML

Abstract: Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context, with a training objective that makes them approximately sample in proportion to a given reward function. In this paper, we show a number of additional theoretical properties of GFlowNets. They can be used to estimate joint probability distributions and the corresponding marginal distributions where some variables are unspecified and, of particular interest, can represent distributions over composite objects like sets and graphs. GFlowNets amortize the work typically done by computationally expensive MCMC methods in a single but trained generative pass. They could also be used to estimate partition functions and free energies, conditional probabilities of supersets (supergraphs) given a subset (subgraph), as well as marginal distributions over all supersets (supergraphs) of a given set (graph). We introduce variations enabling the estimation of entropy and mutual information, sampling from a Pareto frontier, connections to reward-maximizing policies, and extensions to stochastic environments, continuous actions and modular energy functions.

References (69)
  1. An introduction to MCMC for machine learning. Machine Learning, 50(1):5–43, 2003.
  2. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174–188, 2002. doi: 10.1109/78.978374.
  3. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32(1):48–77, 2002.
  4. Neural machine translation by jointly learning to align and translate. ICLR'2015, arXiv:1409.0473, 2014.
  5. Deep equilibrium models. CoRR, abs/1909.01377, 2019. URL http://arxiv.org/abs/1909.01377.
  6. A distributional perspective on reinforcement learning. In International Conference on Machine Learning, 2017.
  7. Flow network based generative models for non-iterative diverse candidate generation. NeurIPS'2021, arXiv:2106.04399, 2021.
  8. Better mixing via deep representations. In International Conference on Machine Learning, pages 552–560. PMLR, 2013.
  9. A graph-based genetic algorithm and its application to the multiobjective evolution of median molecules. Journal of Chemical Information and Computer Sciences, 44(3):1079–1087, 2004.
  10. Approximate inference in discrete distributions with Monte Carlo tree search and value functions, 2019.
  11. L. Cayton. Algorithms for manifold learning. Univ. of California at San Diego Tech. Rep., 12(1–17):1, 2005.
  12. Learning discrete energy-based models via auxiliary-variable local exploration. In Neural Information Processing Systems (NeurIPS), 2020.
  13. Bayesian structure learning with generative flow networks. In Uncertainty in Artificial Intelligence, pages 518–528. PMLR, 2022.
  14. NICE: Non-linear independent components estimation. ICLR'2015 Workshop, arXiv:1410.8516, 2014.
  15. Density estimation using Real NVP. ICLR'2017, arXiv:1605.08803, 2016.
  16. A. Dosovitskiy and J. Djolonga. You only train once: Loss-conditional training of deep networks. In International Conference on Learning Representations, 2019.
  17. Tree-based batch mode reinforcement learning. Journal of Machine Learning Research, 6:503–556, 2005.
  18. Learning actionable representations with goal-conditioned policies. arXiv preprint arXiv:1811.07819, 2018.
  19. Generative adversarial nets. Advances in Neural Information Processing Systems, 27, 2014.
  20. A. Goyal and Y. Bengio. Inductive biases for deep learning of higher-level cognition. arXiv, abs/2011.15091, 2020. https://arxiv.org/abs/2011.15091.
  21. Recurrent independent mechanisms. ICLR'2021, arXiv:1909.10893, 2019.
  22. Oops I took a gradient: Scalable sampling for discrete distributions, 2021.
  23. Constrained Bayesian optimization for automatic chemical design. arXiv preprint arXiv:1709.05501, 2017.
  24. Reinforcement learning with deep energy-based policies. In International Conference on Machine Learning, pages 1352–1361. PMLR, 2017.
  25. W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 1970.
  26. GFlowNet-EM for learning compositional latent variable models. arXiv, 2023.
  27. Biological sequence design with GFlowNets. International Conference on Machine Learning (ICML), 2022.
  28. Multi-objective GFlowNets. arXiv preprint arXiv:2210.12765, 2023.
  29. Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Statistical Science, pages 50–67, 2005.
  30. J. H. Jensen. A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space. Chemical Science, 10(12):3567–3572, 2019.
  31. D. P. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
  32. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47(2):498–519, 2001.
  33. Maximum entropy generators for energy-based models, 2019.
  34. Grammar variational autoencoder. In International Conference on Machine Learning, pages 1945–1954. PMLR, 2017.
  35. A theory of continuous generative flow networks. International Conference on Machine Learning (ICML), 2023.
  36. Batch reinforcement learning. In Reinforcement Learning, pages 45–73. Springer, 2012.
  37. S. Levine. Reinforcement learning and control as probabilistic inference: Tutorial and review. arXiv preprint arXiv:1805.00909, 2018.
  38. BatchGFN: Generative flow networks for batch active learning. arXiv preprint arXiv:2306.15058, 2023.
  39. Trajectory balance: Improved credit assignment in GFlowNets. arXiv preprint arXiv:2201.13259, 2022.
  40. GFlowNets and variational inference. International Conference on Learning Representations (ICLR), 2023.
  41. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087–1092, 1953.
  42. J. Močkus. On Bayesian methods for seeking the extremum. In Optimization Techniques IFIP Technical Conference, pages 400–404. Springer, 1975.
  43. J.-B. Mouret and S. Doncieux. Encouraging behavioral diversity in evolutionary robotics: An empirical study. Evolutionary Computation, 20(1):91–133, 2012. ISSN 1063-6560. doi: 10.1162/EVCO_a_00048. URL https://doi.org/10.1162/EVCO_a_00048.
  44. Bridging the gap between value and policy based reinforcement learning. arXiv preprint arXiv:1702.08892, 2017.
  45. DualDICE: Behavior-agnostic estimation of discounted stationary distribution corrections. arXiv preprint arXiv:1906.04733, 2019.
  46. Elements of sequential Monte Carlo. Foundations and Trends® in Machine Learning, 12(3):307–392, 2019.
  47. H. Narayanan and S. Mitter. Sample complexity of testing the manifold hypothesis. In NIPS'2010, pages 1786–1794, 2010.
  48. C. Nash and C. Durkan. Autoregressive energy machines. In International Conference on Machine Learning, pages 1735–1744. PMLR, 2019.
  49. Better training of GFlowNets with local credit and incomplete trajectories. arXiv preprint arXiv:2302.01687, 2023.
  50. A framework for adaptive MCMC targeting multimodal distributions. The Annals of Statistics, 48(5):2930–2952, 2020.
  51. D. Rezende and S. Mohamed. Variational inference with normalizing flows. In International Conference on Machine Learning, pages 1530–1538. PMLR, 2015.
  52. M. Riedmiller. Neural fitted Q iteration – first experiences with a data efficient neural reinforcement learning method. In European Conference on Machine Learning, pages 317–328. Springer, 2005.
  53. The manifold tangent classifier. Advances in Neural Information Processing Systems, 24:2294–2302, 2011.
  54. Evolution strategies as a scalable alternative to reinforcement learning, 2017.
  55. J. Schmidhuber. Reinforcement learning upside down: Don't predict rewards – just map them to actions. arXiv preprint arXiv:1912.02875, 2019.
  56. On causal and anticausal learning. In ICML'2012, pages 1255–1262, 2012.
  57. Discrete object generation with reversible inductive construction. arXiv preprint arXiv:1907.08268, 2019.
  58. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning, pages 2256–2265. PMLR, 2015.
  59. Gaussian process optimization in the bandit setting: No regret and experimental design. In International Conference on Machine Learning (ICML), 2010.
  60. Reinforcement Learning: An Introduction. MIT Press, 2018.
  61. Amortized Bayesian optimization over discrete spaces. In Conference on Uncertainty in Artificial Intelligence, pages 769–778. PMLR, 2020.
  62. M. Toussaint and A. Storkey. Probabilistic inference for solving discrete and continuous state Markov decision processes. In Proceedings of the 23rd International Conference on Machine Learning, pages 945–952, 2006.
  63. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008, 2017.
  64. Batch stationary distribution estimation. arXiv preprint arXiv:2003.00722, 2020.
  65. MARS: Markov molecular sampling for multi-objective drug discovery. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=kHSu4ebxFXY.
  66. Generative flow networks for discrete probabilistic modeling. International Conference on Machine Learning (ICML), 2022.
  67. Robust scheduling with GFlowNets. International Conference on Learning Representations (ICLR), 2023.
  68. Maximum entropy inverse reinforcement learning. In AAAI, volume 8, pages 1433–1438, 2008.
  69. A variational perspective on generative flow networks. arXiv preprint arXiv:2210.07992, 2022.

Summary

  • The paper introduces a mathematical formulation for GFlowNets leveraging flow-matching conditions to sample objects proportionally to their rewards.
  • It details innovative training objectives, including flow matching, detailed balance, and trajectory balance losses, enabling efficient inference.
  • The work extends GFlowNets to conditional, modular, and continuous settings, paving the way for scalable probabilistic modeling and optimization.

Foundations and Theoretical Advances in Generative Flow Networks (GFlowNets)

The "GFlowNet Foundations" paper provides a comprehensive mathematical and algorithmic framework for Generative Flow Networks (GFlowNets), establishing their theoretical underpinnings, generalizations, and connections to related probabilistic inference and reinforcement learning (RL) paradigms. The work formalizes the notion of flows over trajectories in directed acyclic graphs (DAGs), introduces new training objectives such as detailed balance, and extends GFlowNets to conditional, modular, and continuous domains. This essay synthesizes the core contributions, theoretical results, and practical implications of the paper, with an emphasis on implementation and future research directions.

Mathematical Framework of GFlowNets

GFlowNets are defined as generative models that sample composite objects (e.g., sets, graphs, sequences) via a sequence of stochastic actions, with the objective that the probability of generating a particular object s is proportional to a given non-negative reward function R(s). The construction is formalized on a pointed DAG, where each node represents a partial object and edges correspond to constructive actions.

The central mathematical object is the flow F, a non-negative measure over complete trajectories from the initial state s_0 to the sink state s_f. The flow through a state or edge is defined as the sum of the flows of all trajectories passing through that state or edge. The flow-matching condition ensures that, for each state, the sum of incoming flows equals the sum of outgoing flows, and that at terminal states the outgoing flow matches the reward.

The paper rigorously proves that Markovian flows—flows that factorize over the DAG according to local transition probabilities—are sufficient to represent the relevant distributions over terminal states. This reduces the complexity of learning from exponential in the number of trajectories to polynomial in the number of edges, enabling practical parameterization via neural networks.
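
To make the flow-matching condition concrete, here is a minimal numeric sketch on a hand-built toy DAG; the graph, edge flows, and rewards are invented for illustration and do not appear in the paper.

```python
from collections import defaultdict

# Toy pointed DAG: s0 -> {A, B}, A -> {X, Y}, B -> {Y}; X and Y are terminal.
# Edge flows are chosen by hand so that flow matching holds and the terminal
# edges into the sink sf carry the (invented) rewards R(X) = 1, R(Y) = 2.
edge_flow = {
    ("s0", "A"): 1.5, ("s0", "B"): 1.5,
    ("A", "X"): 1.0, ("A", "Y"): 0.5,
    ("B", "Y"): 1.5,
    ("X", "sf"): 1.0, ("Y", "sf"): 2.0,  # F(x -> sf) = R(x)
}

inflow, outflow = defaultdict(float), defaultdict(float)
for (s, t), f in edge_flow.items():
    outflow[s] += f
    inflow[t] += f

# Flow matching: incoming flow equals outgoing flow at every interior state.
for s in ("A", "B", "X", "Y"):
    assert abs(inflow[s] - outflow[s]) < 1e-9, s

# Forward policy P_F(s' | s) = F(s -> s') / F(s); the root flow is Z.
Z = outflow["s0"]
p_X = (edge_flow[("s0", "A")] / Z) * (edge_flow[("A", "X")] / outflow["A"])
p_Y = 1.0 - p_X
print(Z, p_X, p_Y)  # 3.0, 1/3, 2/3: terminal probabilities equal R(x) / Z
```

The assertions confirm conservation at every interior state, and the terminal probabilities come out to R(x)/Z, which is exactly the sampling property the flow-matching condition is designed to guarantee.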

Parametrizations and Training Objectives

Several equivalent parameterizations of Markovian flows are established:

  • Edge flows: Non-negative values assigned to each edge, subject to flow-matching constraints.
  • Forward/backward transition probabilities: Local transition kernels P_F(s' | s) and P_B(s | s') consistent with the DAG (a minimal sketch of this parameterization follows the list).
  • State flows: Flows through each state, recursively defined via incoming/outgoing edge flows.
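
As a minimal sketch of the forward-policy parameterization, assuming a fixed-size state encoding and a fixed action vocabulary with a validity mask; both are simplifications for illustration, since the paper's state spaces (sets, graphs) generally call for structured encoders.

```python
import torch
import torch.nn as nn

class ForwardPolicy(nn.Module):
    """Parameterizes P_F(s' | s) as logits over a fixed action vocabulary."""

    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state: torch.Tensor, action_mask: torch.Tensor) -> torch.Tensor:
        # Actions that are not edges of the DAG at this state get -inf logits.
        logits = self.net(state).masked_fill(~action_mask, float("-inf"))
        return torch.log_softmax(logits, dim=-1)  # log P_F(s' | s)
```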

The paper introduces and analyzes multiple training objectives:

  • Flow Matching Loss: Enforces the flow-matching condition at each state, typically via a squared log-ratio loss.
  • Detailed Balance Loss: Inspired by MCMC, enforces local detailed balance between forward and backward transitions, avoiding explicit summation over successors.
  • Trajectory Balance Loss: Enforces global consistency over entire trajectories, following the trajectory balance objective of [39] (sketched below).

These losses are shown to be decomposable (over edges, states, or trajectories), enabling stochastic gradient estimation and scalable training.
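
A hedged sketch of the trajectory balance loss in the spirit of [39]; treating log Z as a learned free parameter and gathering per-step log-probabilities along a sampled trajectory are standard practice in follow-up work, stated here as assumptions.

```python
import torch

def trajectory_balance_loss(log_Z: torch.Tensor,
                            log_pf: torch.Tensor,
                            log_pb: torch.Tensor,
                            log_reward: torch.Tensor) -> torch.Tensor:
    """Squared trajectory balance residual for one complete trajectory.

    log_Z:      scalar learned parameter, estimate of log F(s0) = log Z
    log_pf:     (T,) log P_F(s_{t+1} | s_t) for each step of the trajectory
    log_pb:     (T,) log P_B(s_t | s_{t+1}) for the same steps
    log_reward: scalar log R(x) of the terminal object x
    """
    return (log_Z + log_pf.sum() - log_reward - log_pb.sum()) ** 2
```

When this residual is zero on every trajectory, the forward policy samples x with probability R(x)/Z, which is the global consistency the bullet above refers to.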

Conditional, Modular, and Continuous GFlowNets

The framework is extended to conditional GFlowNets, where the reward function and/or the DAG structure depend on external or internal conditioning variables. This enables amortized inference over families of distributions, such as conditional energy-based models or marginalization over subsets of variables.

A key theoretical advance is the introduction of state-conditional flow networks, which allow the estimation of free energies (log-partition functions) and marginal probabilities over descendants of arbitrary states. This is critical for applications such as marginalizing over missing variables, estimating entropies, and computing mutual information.
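
In this notation, the initial state flow plays the role of the partition function. The following is a sketch of the identities involved, stated up to the usual sign and temperature conventions for free energies:

```latex
% Terminal flow matching ties flows to rewards:
%   F(x \to s_f) = R(x) for every terminal state x.
% Summing over terminal states, the root flow is the partition function:
Z = F(s_0) = \sum_{x} R(x), \qquad -\log Z = \text{free energy (when } R = e^{-E}\text{)}.
% For a flow network rooted at (conditioned on) an interior state s, the root
% flow sums the rewards of the terminal descendants of s:
F(s) = \sum_{x \succeq s} R(x), \qquad P(x \mid s) = \frac{R(x)}{F(s)}.
```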

The paper also discusses modular energy function decomposition, where the energy of a composite object (e.g., a factor graph) is expressed as a sum of reusable factors, and the GFlowNet itself can be modularized accordingly.

For continuous or hybrid state/action spaces, the authors propose replacing sums with integrals and parameterizing transition densities via tractable families (e.g., Gaussians, normalizing flows), or by nesting GFlowNets to represent complex conditional distributions.
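
For the Gaussian option just mentioned, here is a minimal PyTorch sketch of a continuous forward transition density; the diagonal covariance and the flat state/action encodings are assumptions for illustration, not the paper's prescription.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class GaussianForwardPolicy(nn.Module):
    """Continuous forward transition density: a diagonal Gaussian over the
    action increment, one of the tractable families mentioned above."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.mean = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def forward(self, state: torch.Tensor) -> Normal:
        h = self.trunk(state)
        return Normal(self.mean(h), self.log_std(h).exp())

# Usage: dist = policy(state); a = dist.rsample(); log_pf = dist.log_prob(a).sum(-1)
```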

GFlowNets as Amortized Inference and Alternatives to MCMC

A central motivation for GFlowNets is to provide amortized probabilistic inference: once trained, a GFlowNet can generate i.i.d. samples from the target distribution in a single generative pass, in contrast to the iterative and often slow mixing of MCMC methods. The paper formalizes the conditions under which GFlowNets can be used to estimate partition functions, marginal and conditional probabilities, and to sample from complex, multimodal distributions over combinatorial objects.

The trade-off is that GFlowNets require upfront training, which is only tractable if the reward function exhibits sufficient structure for generalization. In unstructured or high-dimensional spaces with randomly placed modes, GFlowNets offer no advantage over MCMC.

Practical Implementation Considerations

Parametrization and Losses

  • Neural Network Parameterization: Edge flows, transition probabilities, or state flows can be parameterized by neural networks, with inputs representing the current state (and possibly conditioning variables).
  • Loss Computation: For large or continuous spaces, losses are estimated via stochastic sampling of edges, states, or trajectories, using a training distribution with full support.
  • Offline Training: GFlowNets can be trained from arbitrary datasets of trajectories, not necessarily generated by the current policy, enabling offline and off-policy learning.

Sampling and Inference

  • Sampling: Once trained, sampling is performed by sequentially sampling actions according to the learned forward policy until a terminal state is reached (a minimal loop is sketched after this list).
  • Marginalization: State-conditional flows enable efficient estimation of marginal probabilities and partition functions for subsets of variables or substructures.
  • Entropy and Mutual Information: By training GFlowNets on entropic reward functions, one can estimate entropy, conditional entropy, and mutual information via initial state flows.
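
A minimal ancestral-sampling loop corresponding to the Sampling bullet above. The `env` helper and its methods (`reset`, `encode`, `action_mask`, `step`, `is_terminal`) are hypothetical names introduced for this sketch, and `policy` is assumed to return masked log-probabilities as in the earlier parameterization sketch.

```python
import torch

@torch.no_grad()
def sample_object(policy, env):
    """Ancestral sampling with a trained forward policy."""
    state = env.reset()                      # the initial state s0
    while not env.is_terminal(state):
        log_probs = policy(env.encode(state), env.action_mask(state))
        action = torch.distributions.Categorical(logits=log_probs).sample()
        state = env.step(state, action.item())
    return state  # a complete object x, with P(x) ~= R(x) / Z after training
```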

Extensions

  • Active Learning: GFlowNets can be used to propose diverse, high-reward candidates in expensive evaluation settings, leveraging proxy models and acquisition functions.
  • Pareto and Multi-Objective Sampling: Conditional GFlowNets can sample from Pareto frontiers by conditioning on preference vectors (see the sketch after this list).
  • Joint Learning with Energy Functions: GFlowNets can be trained jointly with energy-based models, using samples from the GFlowNet to estimate gradients of the log-likelihood.
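
For the Pareto bullet above, one common approach scalarizes the objectives with a preference vector on which the GFlowNet is conditioned; the weighted-product form below is an assumption chosen for illustration, not necessarily the paper's construction.

```python
import torch

def preference_conditioned_log_reward(log_rewards: torch.Tensor,
                                      w: torch.Tensor) -> torch.Tensor:
    """Scalarize per-objective log-rewards with a preference vector w
    (w >= 0, w.sum() == 1). The conditional GFlowNet also takes w as an
    input, so one network covers the whole preference simplex."""
    return (w * log_rewards).sum(-1)  # log of prod_i R_i(x)^{w_i}
```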

Limitations

  • Reward Specification: The reward function does not uniquely specify the flow; backward transition probabilities must also be chosen, affecting the distribution over trajectories.
  • Stochastic Environments: In stochastic or partially controllable environments, it may not be possible to match arbitrary terminal reward functions, as backward transitions are constrained by the environment dynamics.

    Figure 2: A counterexample illustrating that, in stochastic environments, backward transitions cannot always be chosen freely while matching flows and terminal rewards.

Relation to Existing Paradigms

The paper situates GFlowNets in relation to:

  • Deep Generative Models: Unlike VAEs or GANs, GFlowNets are trained from reward/energy functions rather than data samples, enabling sampling from unnormalized or structured distributions.
  • Reinforcement Learning: GFlowNets differ from RL in that they aim to sample in proportion to reward, not to maximize expected return; they are more closely related to entropy-regularized or maximum entropy RL, but avoid the path-counting bias of MaxEnt RL.
  • MCMC and SMC: GFlowNets amortize the cost of sampling, potentially overcoming mode-mixing issues in high-dimensional, multimodal distributions, but require sufficient structure for generalization.
  • Evolutionary and Bayesian Optimization Methods: GFlowNets provide a generative, diversity-promoting alternative for candidate proposal in large combinatorial spaces.

Implications and Future Directions

The theoretical foundations established in this work open several avenues for future research and application:

  • Hierarchical and Modular GFlowNets: Leveraging compositionality for scalable inference in structured domains.
  • Continuous and Hybrid Spaces: Extending practical algorithms and architectures for continuous action/state spaces.
  • Integration with Energy-Based Models: Joint training and inference in unsupervised and semi-supervised settings.
  • Active and Multi-Objective Learning: Efficient exploration and sampling in scientific discovery, design, and optimization tasks.
  • Empirical Validation: Systematic benchmarking against MCMC, RL, and generative models in domains such as molecular design, causal structure learning, and probabilistic programming.

Conclusion

"GFlowNet Foundations" provides a rigorous and extensible mathematical framework for generative flow networks, establishing their equivalence to Markovian flows, introducing efficient training objectives, and generalizing to conditional, modular, and continuous settings. The work clarifies the relationship of GFlowNets to existing inference and learning paradigms, and lays the groundwork for their application as scalable, amortized samplers in complex probabilistic modeling tasks. The theoretical results highlight both the power and the limitations of GFlowNets, and suggest a rich landscape for future algorithmic and empirical developments.
