Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games (2102.08903v2)

Published 17 Feb 2021 in cs.LG, cs.GT, math.OC, and stat.ML

Abstract: Policy-based methods with function approximation are widely used for solving two-player zero-sum games with large state and/or action spaces. However, it remains elusive how to obtain optimization and statistical guarantees for such algorithms. We present a new policy optimization algorithm with function approximation and prove that under standard regularity conditions on the Markov game and the function approximation class, our algorithm finds a near-optimal policy within a polynomial number of samples and iterations. To our knowledge, this is the first provably efficient policy optimization algorithm with function approximation that solves two-player zero-sum Markov games.

Citations (16)
