Should we use model-free or model-based control? A case study of battery management systems (2407.15313v1)

Published 22 Jul 2024 in eess.SY and cs.SY

Abstract: Reinforcement learning (RL) and model predictive control (MPC) each offer distinct advantages and limitations when applied to control problems in power and energy systems. Despite various studies on these methods, benchmarks remain lacking and the preference for RL over traditional controls is not well understood. In this work, we put forth a comparative analysis using RL- and MPC-based controllers for optimizing a battery management system (BMS). The BMS problem aims to minimize costs while adhering to operational limits by adjusting the battery (dis)charging in response to fluctuating electricity prices over a time horizon. The MPC controller uses a learning-based forecast of future demand and price changes to formulate a multi-period linear program that can be solved using off-the-shelf solvers. Meanwhile, the RL controller requires no time-series modeling but instead is trained from sample trajectories using the proximal policy optimization (PPO) algorithm. Numerical tests compare these controllers across optimality, training time, testing time, and robustness, providing a comprehensive evaluation of their efficacy. RL not only yields optimal solutions quickly but also ensures robustness to shifts in customer behavior, such as changes in demand distribution. However, as expected, training the RL agent is more time-consuming than MPC.
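To make the MPC side concrete, the following is a minimal sketch of a multi-period linear program of the kind described in the abstract, written in illustrative notation (the symbols, efficiency terms, and constraint set are assumptions, not the paper's exact model). Given forecast prices \hat{\lambda}_t and demands \hat{d}_t over a horizon of T steps, the controller chooses charging and discharging power p_t^{ch}, p_t^{dis} to minimize electricity cost subject to state-of-charge and power limits:

\begin{aligned}
\min_{\{p_t^{\mathrm{ch}},\, p_t^{\mathrm{dis}}\}_{t=1}^{T}} \quad & \sum_{t=1}^{T} \hat{\lambda}_t \bigl( \hat{d}_t + p_t^{\mathrm{ch}} - p_t^{\mathrm{dis}} \bigr)\, \Delta t \\
\text{s.t.} \quad & s_{t+1} = s_t + \bigl( \eta^{\mathrm{ch}} p_t^{\mathrm{ch}} - p_t^{\mathrm{dis}} / \eta^{\mathrm{dis}} \bigr)\, \Delta t, \\
& \underline{s} \le s_t \le \overline{s}, \qquad 0 \le p_t^{\mathrm{ch}} \le \overline{p}^{\mathrm{ch}}, \qquad 0 \le p_t^{\mathrm{dis}} \le \overline{p}^{\mathrm{dis}},
\end{aligned}

where s_t is the state of charge and \eta^{ch}, \eta^{dis} are charge/discharge efficiencies. Because the objective and constraints are linear in the decision variables, any off-the-shelf LP solver applies, consistent with the abstract's description.

For the RL side, a minimal training sketch using the Stable-Baselines3 PPO implementation is given below; the BatteryEnv environment name, its observation layout, and the hyperparameters are hypothetical placeholders rather than the paper's actual setup.

from stable_baselines3 import PPO

# Hypothetical Gymnasium-style BMS environment (not provided by the paper):
# observations = (state of charge, current demand, current price),
# action = continuous (dis)charging power.
env = BatteryEnv()

# Train the policy from sampled trajectories with PPO; hyperparameters are illustrative.
model = PPO("MlpPolicy", env, learning_rate=3e-4, n_steps=2048, verbose=1)
model.learn(total_timesteps=500_000)

# Deploy the trained controller: one (dis)charging decision per time step.
obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated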

