MPC-based Reinforcement Learning for Economic Problems with Application to Battery Storage (2104.02411v1)

Published 6 Apr 2021 in cs.LG, cs.SY, and eess.SY

Abstract: In this paper, we are interested in optimal control problems with purely economic costs, which often yield optimal policies having a (nearly) bang-bang structure. We focus on policy approximations based on Model Predictive Control (MPC) and the use of the deterministic policy gradient method to optimize the MPC closed-loop performance in the presence of unmodelled stochasticity or model error. When the policy has a (nearly) bang-bang structure, we observe that the policy gradient method can struggle to produce meaningful steps in the policy parameters. To tackle this issue, we propose a homotopy strategy based on the interior-point method, providing a relaxation of the policy during learning. We investigate a specific well-known battery storage problem, and show that the proposed method delivers homogeneous and faster learning compared with a classical policy gradient approach.
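
The difficulty described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's MPC scheme or its battery model; it assumes a hypothetical one-step problem, minimizing theta*u over -1 <= u <= 1, whose exact solution u = -sign(theta) is bang-bang and therefore has zero sensitivity to theta almost everywhere. Replacing the bounds with a log-barrier weighted by tau > 0 (the names theta, tau, relaxed_policy and relaxed_policy_grad are all illustrative assumptions) gives a smooth policy with a closed-form, nonzero gradient, and letting tau shrink recovers the bang-bang policy.

```python
import math

# Toy, assumed problem (not from the paper):
#   minimize over u in (-1, 1):  theta*u - tau*(log(1 - u) + log(1 + u))
# Setting the derivative to zero gives theta*(1 - u**2) + 2*tau*u = 0,
# whose root inside (-1, 1) is u = -theta / (tau + sqrt(theta**2 + tau**2)).

def relaxed_policy(theta: float, tau: float) -> float:
    """Log-barrier relaxed policy; requires tau > 0."""
    s = math.hypot(theta, tau)
    return -theta / (tau + s)

def relaxed_policy_grad(theta: float, tau: float) -> float:
    """Sensitivity d u / d theta of the relaxed policy; requires tau > 0."""
    s = math.hypot(theta, tau)
    return -tau / (s * (tau + s))

if __name__ == "__main__":
    theta = 0.5  # hypothetical policy parameter
    # Decreasing tau mimics a homotopy from a soft policy to the bang-bang one.
    for tau in (1.0, 0.1, 1e-3, 1e-6):
        u = relaxed_policy(theta, tau)
        g = relaxed_policy_grad(theta, tau)
        print(f"tau={tau:g}  u={u:.6f}  du/dtheta={g:.3e}")
```

With a large tau the policy sits well inside the bounds and its gradient with respect to theta is informative; as tau decreases the action approaches -1 and the gradient shrinks toward zero, which is the regime where a policy gradient step carries little information.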

Citations (23)
