On the Convergence of Reinforcement Learning in Nonlinear Continuous State Space Problems (2011.10829v2)

Published 21 Nov 2020 in cs.LG, cs.SY, and eess.SY

Abstract: We consider the problem of Reinforcement Learning (RL) for nonlinear stochastic dynamical systems. We show that in the RL setting, there is an inherent "Curse of Variance" in addition to Bellman's infamous "Curse of Dimensionality": in particular, the variance in the solution grows factorial-exponentially in the order of the approximation. A fundamental consequence is that this precludes the search for anything other than "local" feedback solutions in RL, in order to control the explosive variance growth and thus ensure accuracy. We further show that the deterministic optimal control has a perturbation structure, in that the higher-order terms do not affect the calculation of lower-order terms, which can be utilized in RL to get accurate local solutions.
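
As a toy illustration only (it is not the paper's derivation or its RL setting), the sketch below shows how quickly variance can grow with polynomial order even in the simplest case. It assumes a scalar standard Gaussian variable z and uses the known identity E[z^p] = (p-1)!! for even p (zero for odd p) to compute Var(z^k) exactly. The function names are hypothetical and chosen for this example.

```python
from math import prod


def double_factorial(n: int) -> int:
    """Return n!! for n >= -1, using the convention (-1)!! = 0!! = 1."""
    return prod(range(n, 0, -2)) if n > 0 else 1


def gaussian_moment(p: int) -> int:
    """E[z**p] for z ~ N(0, 1): (p - 1)!! when p is even, 0 when p is odd."""
    return double_factorial(p - 1) if p % 2 == 0 else 0


def monomial_variance(k: int) -> int:
    """Exact Var(z**k) = E[z**(2k)] - (E[z**k])**2 for z ~ N(0, 1)."""
    return gaussian_moment(2 * k) - gaussian_moment(k) ** 2


# Print the variance of the k-th order monomial for the first few orders.
for k in range(1, 7):
    print(k, monomial_variance(k))
```

In this toy case the variance of the order-k term is dominated by (2k-1)!!, so any sample-based estimate of higher-order terms needs rapidly more data. This is meant only to give intuition for why restricting attention to low-order, local approximations keeps variance manageable.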
