Understanding Sample Generation Strategies for Learning Heuristic Functions in Classical Planning (2211.13316v3)
Abstract: We study the problem of learning good heuristic functions for classical planning tasks with neural networks, based on samples represented by states with their cost-to-goal estimates. The heuristic function is learned for a state space and goal condition, with the number of samples limited to a fraction of the size of the state space, and it must generalize well to all states of the state space with the same goal condition. Our main goal is to better understand the influence of sample generation strategies on the performance of a greedy best-first heuristic search (GBFS) guided by a learned heuristic function. In a set of controlled experiments, we find that two main factors determine the quality of the learned heuristic: the algorithm used to generate the sample set and how close the sample estimates are to the perfect cost-to-goal. These two factors are interdependent: having perfect cost-to-goal estimates is insufficient if the samples are not well distributed across the state space. We also study other effects, such as adding samples with high-value estimates. Based on our findings, we propose practical strategies to improve the quality of learned heuristics: three strategies that aim to generate more representative states and two strategies that improve the cost-to-goal estimates. When guiding a GBFS algorithm, our learned heuristic increases the mean coverage by more than 30% compared to a baseline learned heuristic.
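As a concrete illustration of the setting described in the abstract, the sketch below trains a small feed-forward regressor on (state, cost-to-goal) samples and then uses it as the heuristic in a greedy best-first search. This is a minimal sketch only: the toy bit-flip domain, the state encoding, the network size, and the use of perfect cost-to-goal labels are illustrative assumptions, not the sampling strategies or benchmark domains studied in the paper.

```python
# Minimal sketch (not the paper's implementation): learn h(s) from
# (state, cost-to-goal) samples and use it to guide greedy best-first search.
import heapq
import random
import torch
import torch.nn as nn

# Toy domain (assumed for illustration): states are binary vectors, the goal
# is the all-ones vector, and each action flips one bit with unit cost.
N = 8
GOAL = tuple([1] * N)

def successors(state):
    for i in range(N):
        child = list(state)
        child[i] = 1 - child[i]
        yield tuple(child)

def true_cost_to_goal(state):
    # Hamming distance to the goal; stands in for a perfect estimate.
    return sum(1 for v in state if v == 0)

# Sample set: random states paired with their cost-to-goal estimates.
samples = [tuple(random.randint(0, 1) for _ in range(N)) for _ in range(200)]
X = torch.tensor(samples, dtype=torch.float32)
y = torch.tensor([true_cost_to_goal(s) for s in samples],
                 dtype=torch.float32).unsqueeze(1)

# Small feed-forward regressor used as the learned heuristic.
model = nn.Sequential(nn.Linear(N, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
for _ in range(300):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()

def h_learned(state):
    with torch.no_grad():
        return model(torch.tensor([state], dtype=torch.float32)).item()

def gbfs(start):
    """Greedy best-first search ordered by the learned heuristic only."""
    frontier = [(h_learned(start), start)]
    parents = {start: None}
    while frontier:
        _, s = heapq.heappop(frontier)
        if s == GOAL:
            plan = []
            while parents[s] is not None:
                plan.append(s)
                s = parents[s]
            return list(reversed(plan))
        for t in successors(s):
            if t not in parents:
                parents[t] = s
                heapq.heappush(frontier, (h_learned(t), t))
    return None

print("plan length:", len(gbfs(tuple([0] * N))))
```

In this toy setup the samples are drawn uniformly and labeled with perfect costs; the paper's point is precisely that, in realistic tasks, both the sample distribution and the accuracy of the cost-to-goal labels must be controlled for the learned heuristic to guide GBFS well.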