Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent Reinforcement Learning (2302.00521v2)
Abstract: Being able to harness the power of large datasets for developing cooperative multi-agent controllers promises to unlock enormous value for real-world applications. Many important industrial systems are multi-agent in nature and are difficult to model using bespoke simulators. However, in industry, distributed processes can often be recorded during operation, and large quantities of demonstrative data stored. Offline multi-agent reinforcement learning (MARL) provides a promising paradigm for building effective decentralised controllers from such datasets. However, offline MARL is still in its infancy and therefore lacks standardised benchmark datasets and baselines typically found in more mature subfields of reinforcement learning (RL). These deficiencies make it difficult for the community to sensibly measure progress. In this work, we aim to fill this gap by releasing off-the-grid MARL (OG-MARL): a growing repository of high-quality datasets with baselines for cooperative offline MARL research. Our datasets provide settings that are characteristic of real-world systems, including complex environment dynamics, heterogeneous agents, non-stationarity, many agents, partial observability, suboptimality, sparse rewards and demonstrated coordination. For each setting, we provide a range of different dataset types (e.g. Good, Medium, Poor, and Replay) and profile the composition of experiences for each dataset. We hope that OG-MARL will serve the community as a reliable source of datasets and help drive progress, while also providing an accessible entry point for researchers new to the field.