Order-Preserving GFlowNets (2310.00386v2)

Published 30 Sep 2023 in cs.LG, cs.AI, and stat.ML

Abstract: Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates with probabilities proportional to a given reward. However, GFlowNets can only be used with a predefined scalar reward, which can be either computationally expensive or not directly accessible, in the case of multi-objective optimization (MOO) tasks for example. Moreover, to prioritize identifying high-reward candidates, the conventional practice is to raise the reward to a higher exponent, the optimal choice of which may vary across different environments. To address these issues, we propose Order-Preserving GFlowNets (OP-GFNs), which sample with probabilities in proportion to a learned reward function that is consistent with a provided (partial) order on the candidates, thus eliminating the need for an explicit formulation of the reward function. We theoretically prove that the training process of OP-GFNs gradually sparsifies the learned reward landscape in single-objective maximization tasks. The sparsification concentrates on candidates of a higher hierarchy in the ordering, ensuring exploration at the beginning and exploitation towards the end of the training. We demonstrate OP-GFN's state-of-the-art performance in single-objective maximization (totally ordered) and multi-objective Pareto front approximation (partially ordered) tasks, including synthetic datasets, molecule generation, and neural architecture search.
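
The two ideas in the abstract can be illustrated with a small sketch. The first function shows the conventional practice of raising a known reward to an exponent before sampling in proportion to it. The second shows one plausible way to score whether a learned reward agrees with a provided ordering. All names here (`sampling_probs`, `pairwise_order_loss`, `beta`, the toy candidates) are hypothetical, and the pairwise cross-entropy used is a generic Bradley-Terry style choice assumed for illustration, not necessarily the objective used in the paper.

```python
import math


def sampling_probs(rewards, beta=1.0):
    """Target distribution with p(x) proportional to R(x) ** beta.

    A larger beta concentrates mass on high-reward candidates.
    """
    powered = [r ** beta for r in rewards]
    total = sum(powered)
    return [p / total for p in powered]


def pairwise_order_loss(log_r_hat, prefers):
    """Average cross-entropy between order labels and a learned reward.

    log_r_hat: dict mapping candidate -> learned log-reward (assumed form).
    prefers:   list of (better, worse) pairs taken from the given order.

    For each pair, the learned preference is
    R_hat(better) / (R_hat(better) + R_hat(worse)) = sigmoid(diff),
    so the loss term is -log(sigmoid(diff)) = log(1 + exp(-diff)).
    """
    total = 0.0
    for better, worse in prefers:
        diff = log_r_hat[better] - log_r_hat[worse]
        total += math.log1p(math.exp(-diff))
    return total / len(prefers)


# Toy usage: three candidates with a known scalar reward.
flat = sampling_probs([1.0, 2.0, 4.0], beta=1.0)
sharp = sampling_probs([1.0, 2.0, 4.0], beta=2.0)

# Toy usage: only an ordering a > b > c is given, no scalar reward.
order = [("a", "b"), ("b", "c"), ("a", "c")]
consistent = pairwise_order_loss({"a": 2.0, "b": 1.0, "c": 0.0}, order)
inverted = pairwise_order_loss({"a": 0.0, "b": 1.0, "c": 2.0}, order)
```

In this sketch, a learned reward that agrees with the ordering gets a lower loss than one that inverts it, and only comparisons between candidates are needed, never the reward values themselves. A partial order would be handled by listing only the comparable pairs.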
