Order-Preserving GFlowNets (2310.00386v2)
Abstract: Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates with probabilities proportional to a given reward. However, GFlowNets can only be used with a predefined scalar reward, which can be either computationally expensive or not directly accessible, in the case of multi-objective optimization (MOO) tasks for example. Moreover, to prioritize identifying high-reward candidates, the conventional practice is to raise the reward to a higher exponent, the optimal choice of which may vary across different environments. To address these issues, we propose Order-Preserving GFlowNets (OP-GFNs), which sample with probabilities in proportion to a learned reward function that is consistent with a provided (partial) order on the candidates, thus eliminating the need for an explicit formulation of the reward function. We theoretically prove that the training process of OP-GFNs gradually sparsifies the learned reward landscape in single-objective maximization tasks. The sparsification concentrates on candidates of a higher hierarchy in the ordering, ensuring exploration at the beginning and exploitation towards the end of the training. We demonstrate OP-GFN's state-of-the-art performance in single-objective maximization (totally ordered) and multi-objective Pareto front approximation (partially ordered) tasks, including synthetic datasets, molecule generation, and neural architecture search.
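To make the point about reward exponents concrete, here is a minimal sketch of how raising a fixed scalar reward to a power reshapes the target sampling distribution. It is an illustration only, not the OP-GFN method: the function name `target_distribution`, the parameter name `beta`, and the toy reward values are assumptions introduced here, and the distribution is computed by direct normalization rather than learned by a GFlowNet.

```python
def target_distribution(rewards, beta=1.0):
    """Return probabilities proportional to R(x) ** beta (illustrative helper).

    `rewards` holds positive scalar rewards for a small, enumerable set of
    candidates. `beta` is the reward exponent: larger values concentrate
    probability mass on the highest-reward candidates.
    """
    if any(r <= 0 for r in rewards):
        raise ValueError("rewards must be strictly positive")
    powered = [r ** beta for r in rewards]
    total = sum(powered)
    return [p / total for p in powered]


if __name__ == "__main__":
    toy_rewards = [1.0, 2.0, 4.0]  # hypothetical rewards for three candidates
    for beta in (1.0, 2.0, 8.0):
        probs = target_distribution(toy_rewards, beta)
        print(f"beta={beta}: {[round(p, 3) for p in probs]}")
```

With a small exponent the three candidates keep comparable probability, which favors exploration. With a large exponent nearly all mass moves to the best candidate, which favors exploitation. The suitable exponent depends on the scale of the rewards, which is the environment-dependent tuning the abstract refers to.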