Federated Control in Markov Decision Processes (2405.04026v1)

Published 7 May 2024 in stat.ML and cs.LG

Abstract: We study problems of federated control in Markov Decision Processes. To solve an MDP with a large state space, multiple learning agents are introduced to collaboratively learn its optimal policy without communicating locally collected experience. In our setting, these agents have limited capabilities: each is restricted to a different region of the overall state space during training. To handle the differences among restricted regions, we first introduce the concept of leakage probabilities to understand how such heterogeneity affects the learning process, and then propose a novel communication protocol, the Federated-Q protocol (FedQ), which periodically aggregates agents' knowledge of their restricted regions and accordingly modifies their learning problems for further training. On the theoretical side, we justify the correctness of FedQ as a communication protocol, give a general sample-complexity result for the derived algorithms FedQ-X with an RL oracle X, and conduct a thorough study of the sample complexity of FedQ-SynQ. In particular, FedQ-X is shown to enjoy linear speedup in sample complexity when the workload is uniformly distributed among agents. Finally, we carry out experiments in various environments to demonstrate the efficiency of our methods.
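The abstract only describes the protocol at a high level. The snippet below is a minimal, illustrative sketch of region-restricted federated Q-learning in the same spirit, not the paper's FedQ or FedQ-SynQ algorithm: the random MDP, the three-region partition, and all hyperparameters (learning rate, step counts, number of rounds) are assumptions invented for the example. What it mirrors from the abstract is that each agent only updates values for states inside its own region and that a central server periodically aggregates the region-wise estimates before the next local phase.

```python
# Illustrative sketch (not the paper's exact FedQ protocol): tabular federated
# Q-learning where each agent updates Q-values only for states in its assigned
# region and bootstraps transitions that leave the region with the latest
# aggregated estimates broadcast by a central server.
import numpy as np

rng = np.random.default_rng(0)

# A small random MDP with known transition/reward tables, so each agent can
# draw generative-model samples from states in its own region.
S, A, gamma = 12, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a distribution over next states
R = rng.uniform(0.0, 1.0, size=(S, A))       # deterministic rewards in [0, 1]

# Partition the state space into disjoint regions, one per agent (an assumption
# made for this example).
regions = [range(0, 4), range(4, 8), range(8, 12)]

def local_q_learning(Q_global, region, steps=200, lr=0.1):
    """One local training phase: update Q only for states in `region`,
    bootstrapping next-state values (including states outside the region)
    from the latest aggregated table Q_global."""
    Q = Q_global.copy()
    for _ in range(steps):
        s = rng.choice(list(region))                 # sample a state inside the region
        a = rng.integers(A)
        s_next = rng.choice(S, p=P[s, a])            # generative-model transition
        target = R[s, a] + gamma * Q[s_next].max()   # out-of-region states use aggregated values
        Q[s, a] += lr * (target - Q[s, a])
    return Q

# Federated loop: local phases followed by server-side aggregation. Because the
# regions are disjoint, the server keeps each agent's estimates for its own region.
Q_global = np.zeros((S, A))
for _ in range(50):
    local_Qs = [local_q_learning(Q_global, reg) for reg in regions]
    for Q_local, reg in zip(local_Qs, regions):
        Q_global[list(reg)] = Q_local[list(reg)]

print("Greedy policy after federated training:", Q_global.argmax(axis=1))
```

The paper's FedQ additionally modifies each agent's local learning problem after every aggregation round, in a way the abstract does not spell out; the sketch above simply reuses the aggregated values as bootstrap targets for out-of-region transitions, which is the simplest stand-in for that mechanism.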
