Designing a skilled soccer team for RoboCup: exploring skill-set-primitives through reinforcement learning (2312.14360v2)

Published 22 Dec 2023 in cs.RO

Abstract: The RoboCup 3D Soccer Simulation League serves as a competitive platform for showcasing innovation in autonomous humanoid robot agents through simulated soccer matches. Our team, FC Portugal, developed a new codebase from scratch in Python after RoboCup 2021. The team's performance relies on a set of skills centered around novel unifying primitives and a custom, symmetry-extended version of the Proximal Policy Optimization algorithm. Our methods have been thoroughly tested in official RoboCup matches, where FC Portugal has won the last two main competitions, in 2022 and 2023. This paper presents our training framework, as well as a timeline of skills developed using our skill-set-primitives, which considerably improve the sample efficiency and stability of skills, and motivate seamless transitions. We start with a significantly fast sprint-kick developed in 2021 and progress to the most recent skill set, including a multi-purpose omnidirectional walk, a dribble with unprecedented ball control, a solid kick, and a push skill. The push addresses low-level collision scenarios and high-level strategies to increase ball possession. We address the resource-intensive nature of this task through an innovative multi-agent learning approach. Finally, we release the team's codebase to the RoboCup community, providing other teams with a robust and modern foundation upon which they can build new features.


Summary

  • The paper introduces a novel training framework with skill-set-primitives that significantly enhances robotic soccer performance.
  • It applies an extended Proximal Policy Optimization algorithm leveraging symmetry properties for efficient, multi-stage skill learning.
  • The study publicly releases its codebase, enabling further research and innovation in humanoid soccer robotics and AI applications.

Background on RoboCup and Reinforcement Learning

RoboCup is an influential platform for advancing research in robotics and artificial intelligence. Its 3D Soccer Simulation League stages simulated matches between two teams of autonomous humanoid robots, providing a challenging, dynamic environment for developing and testing AI techniques. Reinforcement learning (RL) has become an increasingly popular way to train such robots, with the goal of producing highly skilled agents that display tactical acumen and collaborate with teammates. A primary challenge is forming a cohesive set of robot skills that work in harmony across levels, from basic motor control to overall team strategy.

FC Portugal's Novel Approach

The paper describes the training framework the FC Portugal team introduced to enhance the performance of its soccer-playing robots. The framework is rooted in a concept called skill-set-primitives: recurring base actions shared across skills that enable seamless transitions between complex behaviors such as walking, running, and ball control. With this approach, the team won the last two RoboCup 3D Soccer Simulation League main competitions, in 2022 and 2023, demonstrating robust motor control and strong tactical play.
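
The mechanics of a skill-set-primitive can be pictured as a shared anchor state that every skill enters and leaves. The sketch below is a hypothetical Python illustration of that idea, not the released FC Portugal API; all class and method names (Primitive, Skill, SkillManager) are assumptions made for clarity.

```python
# Hypothetical sketch of the skill-set-primitive idea: every skill begins and
# ends near a shared primitive pose, so any skill can hand over to any other.
from dataclasses import dataclass


@dataclass
class Primitive:
    """A compact joint-space pose that all skills share as entry/exit state."""
    joint_targets: tuple  # one target angle (rad) per controlled joint


class Skill:
    """Base class for a learned behavior (walk, dribble, kick, ...)."""

    def __init__(self, name: str, primitive: Primitive):
        self.name = name
        self.primitive = primitive  # common anchor for seamless transitions

    def step(self, observation):
        """Return joint commands; a trained policy would be queried here."""
        raise NotImplementedError


class SkillManager:
    """Switches skills only when the robot is near the shared primitive."""

    def __init__(self, skills: dict):
        self.skills = skills
        self.active = None

    def request(self, name: str, near_primitive: bool):
        # Hand over immediately when close to the shared primitive pose;
        # otherwise keep the current skill until it returns there.
        if near_primitive or self.active is None:
            self.active = self.skills[name]
        return self.active
```

The design point is that transitions are only attempted near the shared primitive, which is what makes handovers between walking, dribbling, and kicking seamless.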

Methodology and Results

The paper outlines a custom algorithm that extends Proximal Policy Optimization (PPO) to exploit symmetry properties for more efficient learning. Training proceeded in stages, beginning with a fast sprint-kick behavior and progressing to advanced skills including a multi-purpose omnidirectional walk, a solid kick, and a close-control dribble. The team also developed a push skill that addresses both low-level collision scenarios and high-level strategies for increasing ball possession, trained with a multi-agent learning approach to manage the task's resource demands.
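
The summary does not reproduce the paper's symmetry extension, so the following is only a minimal sketch of one common way to fold symmetry into PPO: an auxiliary loss that ties the policy's output on mirrored observations to the mirrored action. The mirror functions, the `sym_coef` weight, and the `policy` interface (`log_prob`, `mean_action`) are all assumptions, and the paper's actual formulation may differ.

```python
import torch
import torch.nn.functional as F


def mirror_obs(obs: torch.Tensor) -> torch.Tensor:
    # Placeholder: swap left/right joint readings and negate lateral components.
    return obs  # replace with the robot-specific observation mirroring


def mirror_act(act: torch.Tensor) -> torch.Tensor:
    # Placeholder: the matching transformation in action space.
    return act  # replace with the robot-specific action mirroring


def ppo_loss_with_symmetry(policy, obs, act, old_logp, adv,
                           clip_eps=0.2, sym_coef=0.1):
    """Clipped PPO surrogate plus a mirror-consistency symmetry term.

    `policy` is assumed to expose log_prob(obs, act) and mean_action(obs).
    """
    logp = policy.log_prob(obs, act)
    ratio = torch.exp(logp - old_logp)
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    surrogate = torch.min(ratio * adv, clipped * adv).mean()

    # Symmetry term: the policy mean for a mirrored observation should be
    # the mirror of the policy mean for the original observation.
    mean = policy.mean_action(obs)
    mean_mirrored = policy.mean_action(mirror_obs(obs))
    sym_loss = F.mse_loss(mean_mirrored, mirror_act(mean))

    return -surrogate + sym_coef * sym_loss
```

Intuitively, the auxiliary term lets every sampled state also teach the policy about its mirror image, which is one way symmetry can improve sample efficiency.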

FC Portugal's robots, whose skill set primarily targets offense, demonstrate speed, precision, and control in their movements. Notable results include a sprinting speed of 3.69 m/s and highly maneuverable ball control and kicking, with a dribble that remains compliant with new RoboCup regulations.

Codebase Release and Implications

In an effort to contribute to the RoboCup community and to AI research at large, the team has publicly released the codebase for its robot soccer team, including the reinforcement learning gym integrated into the system. This enables other teams and researchers to build upon the foundation that FC Portugal has proven in competition. The shared resources and detailed methodology pave the way for further innovation in robotic soccer, with potential implications beyond the game in other areas of robotics and AI.
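
As a rough picture of how such a gym is typically consumed, here is an illustrative Gym-style episode loop; the environment id `SoccerSkill-v0` and the classic four-tuple `step` API are placeholders, and the released codebase should be consulted for the real entry points.

```python
import gym  # assumes a classic Gym-compatible wrapper around the simulator

env = gym.make("SoccerSkill-v0")  # hypothetical environment id
obs = env.reset()
done, episode_return = False, 0.0
while not done:
    action = env.action_space.sample()  # stand-in for a trained PPO policy
    obs, reward, done, info = env.step(action)  # classic 4-tuple Gym API
    episode_return += reward
print(f"episode return: {episode_return:.2f}")
```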
