HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent Pathfinding (2402.15546v1)
Abstract: Large-scale multi-agent pathfinding (MAPF) presents significant challenges in several areas. As systems grow in complexity, with many autonomous agents operating simultaneously, efficient and collision-free coordination becomes paramount. Traditional algorithms often scale poorly, especially in intricate scenarios. Reinforcement learning (RL) has shown potential for MAPF; however, it also struggles with scalability: it demands intricate implementation and lengthy training, and it often converges unstably, which limits its practical application. In this paper, we introduce Heuristics-Informed Multi-Agent Pathfinding (HiMAP), a novel scalable approach that employs imitation learning with heuristic guidance in a decentralized manner. We train on small-scale instances using a heuristic policy as a teacher that maps each agent's observation to an action probability distribution. During pathfinding, we adopt several inference techniques to improve performance. With a simple training scheme and implementation, HiMAP achieves competitive success rates and scalability among imitation-learning-only MAPF methods, showing the potential of imitation-learning-only MAPF equipped with inference techniques.
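The core training idea in the abstract, imitating a teacher's action probability distribution from a single agent's observation, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the five-action move set, the three-element observation vector, the linear softmax policy, the example teacher distribution, and the names `policy_probs`, `imitation_loss`, and `imitation_step`. HiMAP's actual network, observation encoding, and inference techniques are not described in this text.

```python
import math

# Assumed grid move set; the abstract does not specify the action space.
ACTIONS = ["stay", "up", "down", "left", "right"]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_probs(weights, obs):
    """Hypothetical linear policy: one weight row per action."""
    logits = [sum(w * o for w, o in zip(row, obs)) for row in weights]
    return softmax(logits)

def imitation_loss(weights, obs, teacher_probs):
    """Cross-entropy between the teacher distribution and the policy."""
    probs = policy_probs(weights, obs)
    return -sum(t * math.log(p) for t, p in zip(teacher_probs, probs) if t > 0)

def imitation_step(weights, obs, teacher_probs, lr=0.5):
    """One gradient step; d(loss)/d(logit_a) = policy_a - teacher_a."""
    probs = policy_probs(weights, obs)
    return [
        [w - lr * (p - t) * o for w, o in zip(row, obs)]
        for row, p, t in zip(weights, probs, teacher_probs)
    ]

# Toy data: one made-up observation and a made-up teacher distribution.
obs = [1.0, 0.0, 0.5]
teacher_probs = [0.0, 0.0, 0.0, 0.1, 0.9]
weights = [[0.0] * len(obs) for _ in ACTIONS]

loss_before = imitation_loss(weights, obs, teacher_probs)
for _ in range(20):
    weights = imitation_step(weights, obs, teacher_probs)
loss_after = imitation_loss(weights, obs, teacher_probs)
final_probs = policy_probs(weights, obs)
```

Because each agent's policy depends only on its own observation, the same trained weights could in principle be run independently per agent at inference time, which is one plausible reading of "decentralized" in the abstract.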