Communication-Efficient Cooperative Multi-Agent PPO via Regulated Segment Mixture in Internet of Vehicles (2308.04198v1)
Abstract: Multi-Agent Reinforcement Learning (MARL) has become a classic paradigm for solving diverse intelligent control tasks, such as autonomous driving in the Internet of Vehicles (IoV). However, the widely assumed existence of a central node for implementing centralized federated learning-assisted MARL may be impractical in highly dynamic scenarios, and excessive communication overhead may overwhelm the IoV system. Therefore, in this paper, we design a communication-efficient cooperative MARL algorithm, named RSM-MAPPO, to reduce the communication overhead in a fully distributed architecture. In particular, RSM-MAPPO enhances multi-agent Proximal Policy Optimization (PPO) by incorporating the idea of segment mixture and by augmenting multiple model replicas from received neighboring policy segments. Afterwards, RSM-MAPPO adopts a theory-guided metric to regulate the selection of contributive replicas so as to guarantee policy improvement. Finally, extensive simulations in a mixed-autonomy traffic control scenario verify the effectiveness of the RSM-MAPPO algorithm.
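To make the idea of segment mixture and regulated replica selection more concrete, the following is a minimal sketch and not the paper's implementation. It assumes each policy is a flat parameter vector, that each neighbor contributes one segment, and that replicas are filtered by a simple squared parameter distance. The function names, `mix_weight`, `threshold`, and the distance score are hypothetical stand-ins; the abstract does not specify the actual theory-guided metric.

```python
import numpy as np

def build_replicas(local_theta, neighbor_thetas, num_segments, mix_weight=0.5):
    """Build one replica per neighbor by mixing a single parameter segment.

    Replica k blends segment (k mod num_segments) of the local policy with
    the same segment received from neighbor k. Only that segment would need
    to be transmitted, which is where the communication saving comes from.
    """
    segments = np.array_split(np.arange(local_theta.size), num_segments)
    replicas = []
    for k, neighbor in enumerate(neighbor_thetas):
        replica = local_theta.copy()
        idx = segments[k % num_segments]
        replica[idx] = (1 - mix_weight) * local_theta[idx] + mix_weight * neighbor[idx]
        replicas.append(replica)
    return replicas

def param_distance(replica, local_theta):
    """Hypothetical stand-in score: squared distance in parameter space."""
    return float(np.sum((replica - local_theta) ** 2))

def select_and_merge(replicas, local_theta, score_fn, threshold):
    """Keep replicas whose score is within the threshold and average them.

    Falls back to the unchanged local policy when no replica qualifies.
    """
    kept = [r for r in replicas if score_fn(r, local_theta) <= threshold]
    if not kept:
        return local_theta.copy()
    return np.mean(kept, axis=0)

# Usage: one nearby neighbor and one distant neighbor.
local = np.zeros(6)
neighbors = [np.ones(6), 10.0 * np.ones(6)]
replicas = build_replicas(local, neighbors, num_segments=3)
merged = select_and_merge(replicas, local, param_distance, threshold=1.0)
```

In this toy run the replica built from the distant neighbor exceeds the threshold and is discarded, so only the first replica contributes to the merged parameters. A real selection rule would score replicas by their expected effect on policy improvement rather than by raw parameter distance.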