Emergent Mind

Approximate Dynamic Programming based on Projection onto the (min,+) subsemimodule

(1403.4175)
Published Mar 17, 2014 in cs.SY and math.OC

Abstract

We develop a new Approximate Dynamic Programming (ADP) method for infinite-horizon discounted-reward Markov Decision Processes (MDPs) based on projection onto a subsemimodule. We approximate the value function by a $(\min,+)$ linear combination of a set of basis functions whose $(\min,+)$ linear span constitutes a subsemimodule. The projection operator is closely related to the Fenchel transform. Our approximate solution obeys the $(\min,+)$ Projected Bellman Equation (MPPBE), which differs from the conventional Projected Bellman Equation (PBE). We show that the approximation error is bounded in the $L_\infty$-norm. We develop a Min-Plus Approximate Dynamic Programming (MPADP) algorithm to compute the solution to the MPPBE. We also prove convergence of the MPADP algorithm and apply it to two problems: a grid-world problem in the discrete domain and the mountain car problem in the continuous domain.
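To make the idea of a $(\min,+)$ linear combination and a projection onto its span concrete, here is a minimal Python sketch on a finite state space. It is an illustration only, not the paper's implementation: the function names, the toy basis `PHI`, and the target `U` are hypothetical, and the projection shown is the standard residuation-based one (the smallest element of the span that dominates the target), which we assume matches the abstract's projection operator.

```python
# Illustrative sketch only; names and data are hypothetical, not from the paper.
# PHI[s][i] is the value of basis function i at state s.

def minplus_combine(phi, r):
    """(min,+) linear combination: v(s) = min_i (phi[s][i] + r[i])."""
    return [min(p + ri for p, ri in zip(row, r)) for row in phi]

def minplus_coefficients(phi, u):
    """Smallest coefficients r such that minplus_combine(phi, r) >= u pointwise.

    Assumed form: r[i] = max_s (u[s] - phi[s][i]).
    """
    n_basis = len(phi[0])
    return [max(u[s] - phi[s][i] for s in range(len(phi))) for i in range(n_basis)]

def minplus_project(phi, u):
    """Project u onto the (min,+) span of the basis, from above."""
    return minplus_combine(phi, minplus_coefficients(phi, u))

# Toy example: 4 states, 2 basis functions.
PHI = [[0.0, 3.0],
       [1.0, 2.0],
       [2.0, 1.0],
       [3.0, 0.0]]
U = [1.0, 0.5, 2.0, 1.5]

R = minplus_coefficients(PHI, U)   # coefficients of the projection
V = minplus_project(PHI, U)        # projected function, dominates U
```

Under these assumptions the projection dominates the target pointwise and is idempotent: projecting `V` again returns `V`. The coefficient formula is a max of differences, which is one way to see the connection to the Fenchel transform mentioned in the abstract.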
