CoVO-MPC: Theoretical Analysis of Sampling-based MPC and Optimal Covariance Design (2401.07369v1)
Abstract: Sampling-based Model Predictive Control (MPC) has been a practical and effective approach in many domains, notably model-based reinforcement learning, thanks to its flexibility and parallelizability. Despite its appealing empirical performance, a theoretical understanding of the approach, particularly of its convergence and hyperparameter tuning, is still lacking. In this paper, we characterize the convergence properties of a widely used sampling-based MPC method, Model Predictive Path Integral Control (MPPI). We show that MPPI enjoys at least linear convergence rates when the optimization is quadratic, which covers time-varying LQR systems. We then extend the analysis to more general nonlinear systems. Our theoretical analysis directly leads to a novel sampling-based MPC algorithm, CoVariance-Optimal MPC (CoVO-MPC), that optimally schedules the sampling covariance to optimize the convergence rate. Empirically, CoVO-MPC significantly outperforms standard MPPI by 43-54% in both simulations and real-world quadrotor agile control tasks. Videos and appendices are available at https://lecar-lab.github.io/CoVO-MPC/.
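To make the quadratic-case claim concrete, the following is a minimal sketch of an MPPI-style update on a toy quadratic cost, not the paper's implementation. Everything specific here is an assumption chosen for illustration: the function names (`quadratic_cost`, `mppi_step`, `run_mppi`), the two-dimensional problem, the diagonal Hessian `d`, the fixed diagonal sampling standard deviations `sigma`, and the temperature. In particular, the covariance is held fixed; the CoVO-MPC covariance schedule is not reproduced.

```python
import math
import random


def quadratic_cost(u, u_star, d):
    """J(u) = 0.5 * sum_i d_i * (u_i - u_star_i)^2 (diagonal Hessian, assumed)."""
    return 0.5 * sum(di * (ui - si) ** 2 for ui, si, di in zip(u, u_star, d))


def mppi_step(mean, sigma, cost_fn, num_samples, temperature, rng):
    """One MPPI-style update: sample controls around the current mean, then
    average them with softmax weights exp(-cost / temperature)."""
    samples = [
        [rng.gauss(m, s) for m, s in zip(mean, sigma)]
        for _ in range(num_samples)
    ]
    costs = [cost_fn(u) for u in samples]
    min_cost = min(costs)  # shift by the minimum for numerical stability
    weights = [math.exp(-(c - min_cost) / temperature) for c in costs]
    total = sum(weights)
    return [
        sum(w * u[i] for w, u in zip(weights, samples)) / total
        for i in range(len(mean))
    ]


def run_mppi(num_iters=15, num_samples=2000, temperature=1.0, seed=0):
    """Iterate the update on a toy problem and record the distance to the optimum."""
    rng = random.Random(seed)
    u_star = [1.0, -2.0]   # hypothetical optimum
    d = [4.0, 0.5]         # hypothetical diagonal Hessian entries
    sigma = [1.0, 1.0]     # fixed sampling std devs (not the CoVO-MPC schedule)
    mean = [0.0, 0.0]

    def cost_fn(u):
        return quadratic_cost(u, u_star, d)

    errors = []
    for _ in range(num_iters):
        mean = mppi_step(mean, sigma, cost_fn, num_samples, temperature, rng)
        err = math.sqrt(sum((m - s) ** 2 for m, s in zip(mean, u_star)))
        errors.append(err)
    return mean, errors


if __name__ == "__main__":
    final_mean, errors = run_mppi()
    print("final mean:", final_mean)
    print("error per iteration:", [round(e, 4) for e in errors])
```

For this toy setup, in the infinite-sample limit the weighted mean is the mean of a product of two Gaussians, so the error in coordinate i shrinks by the factor 1 / (1 + sigma_i^2 d_i / temperature) at every iteration. That is a linear rate which depends on the sampling covariance, and it is why the choice of covariance matters. With finitely many samples, the error levels off at a noise floor set by `num_samples`.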