- The paper presents a novel multi-agent deep RL framework with action supervision to better assess the contributions of soccer players.
- It leverages GRUs and a temporal difference loss combined with supervised learning for more accurate Q-value estimation in dynamic game settings.
- Comparative analysis shows the model delivers robust valuation of both on- and off-ball actions, enhancing scouting and tactical decisions.
Action Valuation for Soccer Players Using Deep Reinforcement Learning
Introduction
This paper introduces a novel method for valuing the actions of on- and off-ball soccer players using a holistic framework based on multi-agent deep reinforcement learning (RL). The approach uses a discrete action space within a continuous state space, modeled after Google Research Football, and incorporates supervised learning to improve the effectiveness of RL. The valuation method addresses the challenges posed by the dynamic nature of soccer, where players continuously make decisions in a spatiotemporally continuous environment. The proposed framework assesses both the overt and subtle contributions of players, providing valuable insights into teamwork effectiveness, player scouting, and fan engagement.
Methodology
The RL framework comprises three core components: state, action, and reward. The state includes the positions and velocities of players and the ball, while the actions encompass movement directions, passes, and shots. Rewards are assigned based on game outcomes, such as goals, expected scoring probabilities, and opponent attacks.
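To make this setup concrete, the sketch below shows one way the state, action, and reward components could be represented in Python. The field names, the 19-action set borrowed from Google Research Football, and the reward weights are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np
from dataclasses import dataclass

# Discrete action set similar to Google Research Football (19 actions).
ACTIONS = [
    "idle", "left", "top_left", "top", "top_right", "right",
    "bottom_right", "bottom", "bottom_left",
    "long_pass", "high_pass", "short_pass", "shot",
    "sprint", "release_direction", "release_sprint",
    "sliding", "dribble", "release_dribble",
]

@dataclass
class GameState:
    """Continuous state: positions and velocities of all 22 players and the ball."""
    player_pos: np.ndarray   # shape (22, 2), pitch coordinates
    player_vel: np.ndarray   # shape (22, 2)
    ball_pos: np.ndarray     # shape (3,), includes height
    ball_vel: np.ndarray     # shape (3,)

    def to_vector(self) -> np.ndarray:
        """Flatten into a single feature vector for the network input."""
        return np.concatenate([
            self.player_pos.ravel(), self.player_vel.ravel(),
            self.ball_pos, self.ball_vel,
        ])

def reward(goal_scored: bool, expected_goal: float, conceded_attack: bool) -> float:
    """Outcome-based reward built from goals, expected scoring probability,
    and opponent attacks; the weights here are placeholders."""
    r = 1.0 if goal_scored else 0.0
    r += 0.1 * expected_goal        # shaped term from expected scoring probability
    if conceded_attack:
        r -= 0.1                    # penalty when the opponent launches an attack
    return r
```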
The RL model's architecture uses gated recurrent units (GRUs) to process sequences of game states and outputs Q-values for each player's candidate actions.
Figure 1: Our deep RL model with action supervision.
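A minimal PyTorch sketch of such an architecture is shown below; the layer sizes, the single-player output head, and the example dimensions are assumptions rather than the paper's published configuration.

```python
import torch
import torch.nn as nn

class GRUQNetwork(nn.Module):
    """GRU-based Q-network: encodes the sequence of game states and outputs
    per-action Q-values for one player at every time step."""

    def __init__(self, state_dim: int, n_actions: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden_dim), nn.ReLU())
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, states: torch.Tensor) -> torch.Tensor:
        # states: (batch, time, state_dim)
        h = self.encoder(states)     # (batch, time, hidden_dim)
        h, _ = self.gru(h)           # (batch, time, hidden_dim)
        return self.q_head(h)        # (batch, time, n_actions) Q-values

# Example: 94-dimensional flattened state, 19 discrete actions, 50-step sequence.
model = GRUQNetwork(state_dim=94, n_actions=19)
q_values = model(torch.randn(1, 50, 94))   # -> shape (1, 50, 19)
```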
To enhance learning accuracy, the model employs a temporal difference (TD) loss function combined with action supervision. The action supervision constrains learning trajectories by emphasizing the actions actually taken in the game, introducing an inductive bias that allows Q-values for potential actions to be estimated efficiently.
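The sketch below illustrates one way to combine a one-step TD loss with an action-supervision term; the cross-entropy formulation and the weighting coefficient `lam` are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def combined_loss(q_values, next_q_values, actions, rewards, gamma=0.99, lam=1.0):
    """One-step TD loss plus an action-supervision term.

    q_values:      (batch, n_actions) Q(s_t, .) from the model
    next_q_values: (batch, n_actions) Q(s_{t+1}, .) from a target network
    actions:       (batch,) indices of the actions actually taken in the match
    rewards:       (batch,) rewards r_t
    """
    # TD error on the Q-value of the taken action; the bootstrap target uses
    # the greedy next-state value and is not backpropagated through.
    q_taken = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)
    td_target = rewards + gamma * next_q_values.max(dim=1).values.detach()
    td_loss = F.mse_loss(q_taken, td_target)

    # Action supervision: treat Q-values as logits and push probability mass
    # toward the action the player actually chose, keeping learning close to
    # realistic trajectories.
    supervision_loss = F.cross_entropy(q_values, actions)

    return td_loss + lam * supervision_loss
```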
Experiments and Results
Experiments were conducted using data from 54 games in the 2019 Meiji J1 League. The effectiveness of the model was demonstrated through the analysis of attack sequences from Yokohama F. Marinos. Quantitative evaluation showed that models with action supervision exhibited superior performance, evidenced by lower TD loss and higher Q-values for realistic playing strategies.
Figure 2: Example of estimated Q-values. (a) shows an example attack. Blue, red, and black indicate the attacker, defender, and the ball, respectively (note that the ball is over a defender). (b) and (c) show the Q-value of player A for each action using the RL model with and without action supervision, respectively. Note that i,p,sh,sp,rd,ss correspond to idle, pass, shot, sprint, decelerate, and sprint end, respectively, and arrows correspond to the direction of movement.
The proposed method proved robust in valuing defensive players' contributions, and its valuations correlated significantly with players known for their effective passing. Comparing player valuations against conventional indicators such as goals, expert ratings, and existing scoring models highlighted the unique insights provided by the RL framework.
Figure 3: The relationship between (a) average Q-values computed by the proposed method and season goals and (b) average Q-values of on- and off-ball states.
Figure 4: The relationship between average Q-values by the proposed method and (a) the average rating by the experts, (b) OBSO (Spearman, 2018).
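As an illustration of this kind of comparison, the snippet below computes Pearson correlations between per-player average Q-values and conventional indicators; the data arrays are random placeholders, not the paper's results.

```python
import numpy as np
from scipy.stats import pearsonr

# Placeholder per-player summaries; in practice these would be the trained
# model's season-average Q-values and the corresponding indicators.
rng = np.random.default_rng(0)
avg_q_values = rng.normal(size=30)           # average Q-value per player
season_goals = rng.poisson(5, size=30)       # goals scored over the season
expert_ratings = rng.normal(6.5, 0.5, 30)    # average expert match rating

for name, indicator in [("season goals", season_goals),
                        ("expert rating", expert_ratings)]:
    r, p = pearsonr(avg_q_values, indicator)
    print(f"Q-value vs {name}: r = {r:.2f}, p = {p:.3f}")
```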
Comparative Analysis
The proposed method was evaluated against other indicators, such as OBSO and C-OBSO. These comparisons underscored the proposed approach's capability to comprehensively value players' impacts, including their indirect contributions through strategic movements and passes. The RL model's valuations effectively identified players' contributions beyond mere goal-scoring, reflecting their overall engagement in playmaking and defense.
Conclusion
The paper presents a deep RL framework augmented with action supervision that systematically values soccer players' contributions in a continuous and dynamic environment. This approach provides meaningful insights into player valuation, offering a tool for coaches, scouts, and teams seeking to improve team collaboration and assess individual players. Future work may involve refining the RL model with more sophisticated reward functions and advancing simulations to improve valuation accuracy. The results deepen the understanding of tactical behavior modeling in team sports, paving the way for intelligent sports analytics.