Emergent Mind

Abstract

Monte Carlo Tree Search (MCTS) has recently been used successfully to create strategies for playing imperfect-information games. Despite its popularity, there are no theoretical results that guarantee its convergence to a well-defined solution, such as a Nash equilibrium, in these games. We partially fill this gap by analysing MCTS in the class of zero-sum extensive-form games with simultaneous moves but otherwise perfect information. Even this lack of information about the opponent's concurrent moves means that optimal strategies may require randomization. We present a theoretical as well as an empirical investigation of the speed and quality of convergence of these algorithms to Nash equilibria. Primarily, we show that, after minor technical modifications, MCTS based on any (approximately) Hannan consistent selection function always converges to an (approximate) subgame perfect Nash equilibrium. Without these modifications, Hannan consistency is not sufficient to ensure such convergence, and the selection function must satisfy additional properties, which empirically hold for the most common Hannan consistent algorithms.
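To make the role of a Hannan consistent selection function more concrete, here is a minimal sketch of one such rule, regret matching, run in self-play on a single zero-sum matrix game. This is an illustration only and not the paper's algorithm: the choice of regret matching, the function names `regret_matching_strategy` and `solve_matrix_game`, the example payoff matrix, and the use of expected (full-information) regret updates instead of sampled playouts are all assumptions made here. It covers one simultaneous-move stage, whereas the abstract concerns a whole tree of such stages.

```python
def regret_matching_strategy(regrets):
    """Mix over actions in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret yet: fall back to uniform play.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def solve_matrix_game(payoff, iterations=20000):
    """Self-play with regret matching on a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player gets the
    negative. Returns the time-averaged strategies of both players.
    (Hypothetical helper for illustration, not taken from the paper.)
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        row = regret_matching_strategy(row_regret)
        col = regret_matching_strategy(col_regret)

        # Expected payoff of each pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]

        # Expected payoff of the mixed strategy actually played.
        row_ev = sum(row[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(col[j] * col_values[j] for j in range(n_cols))

        # Regret of an action: how much better it would have done than the mix.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += col[j]

    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


if __name__ == "__main__":
    # Example game chosen here; no pure strategy is optimal for either player.
    game = [[2.0, -1.0],
            [-1.0, 1.0]]
    row_avg, col_avg = solve_matrix_game(game)
    print("row average strategy:", row_avg)
    print("column average strategy:", col_avg)
```

In this example game the unique equilibrium has each player choosing their first action with probability 2/5, so any deterministic choice can be exploited, which is the need for randomization the abstract refers to. The averaged strategies returned by the sketch approach that mixture as the number of iterations grows.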
