
Multi-Agent Generative Adversarial Interactive Self-Imitation Learning for AUV Formation Control and Obstacle Avoidance (2401.11378v1)

Published 21 Jan 2024 in cs.RO and cs.LG

Abstract: Multiple autonomous underwater vehicles (multi-AUV) can cooperatively accomplish tasks that a single AUV cannot complete. Recently, multi-agent reinforcement learning has been introduced for the control of multi-AUV systems. However, designing efficient reward functions for the various tasks of multi-AUV control is difficult or even impractical. Multi-agent generative adversarial imitation learning (MAGAIL) allows multi-AUV to learn from expert demonstrations instead of pre-defined reward functions, but it requires optimal demonstrations and cannot surpass them. This paper builds upon MAGAIL by proposing multi-agent generative adversarial interactive self-imitation learning (MAGAISIL), which enables AUVs to learn policies by gradually replacing the provided sub-optimal demonstrations with self-generated good trajectories selected by a human trainer. Our experimental results on a multi-AUV formation control and obstacle avoidance task, run on the Gazebo platform with our lab's AUV simulator, show that AUVs trained via MAGAISIL can surpass the provided sub-optimal expert demonstrations and reach performance close to, or even better than, MAGAIL trained with optimal demonstrations. Further results indicate that policies trained via MAGAISIL can adapt to complex and varied tasks as well as MAGAIL trained on optimal demonstrations.
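The abstract outlines the core MAGAISIL loop: a GAIL-style discriminator supplies an imitation reward in place of a hand-designed one, and a human trainer gradually swaps sub-optimal demonstrations for good self-generated trajectories. The sketch below illustrates that loop under stated assumptions; it is not the paper's implementation. All names (DemoBuffer, Discriminator, magaisil_update, select_fn) are hypothetical stand-ins, the multi-agent structure is collapsed to a single agent for brevity, and the policy-gradient step (e.g. PPO/TRPO) is omitted.

```python
from collections import deque

import torch
import torch.nn as nn


class DemoBuffer:
    """Fixed-capacity buffer of demonstration trajectories.

    Each trajectory is a dict with "obs" and "act" tensors. Appending to a
    full buffer evicts the oldest entry, so human-approved rollouts gradually
    displace the original sub-optimal demonstrations (hypothetical scheme)."""

    def __init__(self, trajectories, capacity: int = 20):
        self.trajs = deque(trajectories, maxlen=capacity)

    def all_pairs(self):
        obs = torch.cat([t["obs"] for t in self.trajs])
        act = torch.cat([t["act"] for t in self.trajs])
        return obs, act

    def replace_oldest(self, traj):
        self.trajs.append(traj)


class Discriminator(nn.Module):
    """Logit score for (state, action) pairs: high for expert-like data."""

    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, 1)
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))


def magaisil_update(disc, disc_opt, demos, rollouts, select_fn):
    """One iteration: GAIL-style discriminator step, imitation rewards for the
    policy optimizer, then interactive self-imitation on the demo buffer."""
    bce = nn.BCEWithLogitsLoss()

    # 1) Discriminator update: demonstrations labeled 1, policy rollouts 0.
    demo_obs, demo_act = demos.all_pairs()
    roll_obs = torch.cat([t["obs"] for t in rollouts])
    roll_act = torch.cat([t["act"] for t in rollouts])
    loss = bce(disc(demo_obs, demo_act), torch.ones(len(demo_obs), 1)) + bce(
        disc(roll_obs, roll_act), torch.zeros(len(roll_obs), 1)
    )
    disc_opt.zero_grad()
    loss.backward()
    disc_opt.step()

    # 2) Imitation reward r(s, a) = -log(1 - D(s, a)), one common GAIL form;
    #    the policy maximizes this instead of a hand-designed reward.
    with torch.no_grad():
        d = torch.sigmoid(disc(roll_obs, roll_act))
        rewards = -torch.log(1.0 - d + 1e-8)

    # 3) Interactive self-imitation: rollouts the human trainer marks as good
    #    replace the oldest (sub-optimal) demonstrations.
    for traj in rollouts:
        if select_fn(traj):
            demos.replace_oldest(traj)

    return rewards  # fed to the policy update, which is not shown here


# Toy usage with random tensors standing in for real AUV trajectories.
obs_dim, act_dim = 8, 2
rand_traj = lambda: {"obs": torch.randn(50, obs_dim), "act": torch.randn(50, act_dim)}
demos = DemoBuffer([rand_traj() for _ in range(5)])
disc = Discriminator(obs_dim, act_dim)
opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
rewards = magaisil_update(disc, opt, demos, [rand_traj() for _ in range(3)],
                          select_fn=lambda t: True)
```

In this sketch `select_fn` stands in for the human trainer's judgment; in the paper that role is played by a person inspecting self-generated trajectories, which is what lets the learned policies exceed the sub-optimal demonstrations they started from.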
