Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling (2312.15195v2)
Abstract: On-demand ride pooling services allow each vehicle to serve multiple passengers at a time, which increases drivers' income and lets passengers travel at lower prices than single-passenger on-demand services such as UberX and Lyft, where only one passenger can be assigned to a car at a time. To deliver these benefits to all parties (passengers, drivers, aggregation companies, and the environment), ride pooling services need a well-defined matching strategy, and the regional dispatching of vehicles has a significant impact on both matching and revenue. Existing algorithms often consider only revenue maximization, which makes it difficult for requests with unusual distributions to get a ride. Increasing revenue while ensuring a reasonable assignment of requests is therefore a challenge for ride pooling service companies (aggregation companies). In this paper, we propose a vehicle dispatching framework for ride pooling that splits the city into discrete dispatching regions and uses a reinforcement learning (RL) algorithm to dispatch vehicles among these regions. We also use the mutual information (MI) between the vehicle and order distributions as an intrinsic reward for the RL algorithm, which improves the correlation between the two distributions and thus keeps it possible for unusually distributed requests to get a ride. In experiments on a real-world taxi dataset, we demonstrate that our framework can significantly increase revenue, by up to an average of 3%, over the existing best on-demand ride pooling method.
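To make the idea of an MI-based intrinsic reward concrete, the following is a minimal sketch rather than the paper's estimator, which the abstract does not specify. It assumes a small empirical count table whose rows are discretized vehicle-supply levels and whose columns are discretized order-demand levels, with each cell counting how many (region, time step) observations fell into that pair. The function names, the table layout, and the weight `beta` are hypothetical choices made for illustration.

```python
import math
from typing import Sequence


def mutual_information(joint_counts: Sequence[Sequence[float]]) -> float:
    """Plug-in estimate of I(X; Y) in nats from a joint count table.

    Assumed layout (hypothetical): rows index vehicle-supply levels,
    columns index order-demand levels.
    """
    total = sum(sum(row) for row in joint_counts)
    if total <= 0:
        return 0.0
    row_sums = [sum(row) for row in joint_counts]          # marginal counts of X
    col_sums = [sum(col) for col in zip(*joint_counts)]    # marginal counts of Y
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing (0 * log 0 := 0)
                p_xy = count / total
                p_x = row_sums[i] / total
                p_y = col_sums[j] / total
                mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi


def shaped_reward(revenue: float,
                  joint_counts: Sequence[Sequence[float]],
                  beta: float = 0.1) -> float:
    """Extrinsic revenue plus a weighted MI bonus (beta is an assumed weight)."""
    return revenue + beta * mutual_information(joint_counts)


if __name__ == "__main__":
    aligned = [[5, 0], [0, 5]]      # supply level always matches demand level
    unrelated = [[1, 1], [1, 1]]    # supply level tells nothing about demand
    print(mutual_information(aligned))
    print(mutual_information(unrelated))
    print(shaped_reward(10.0, aligned, beta=0.5))
```

In this sketch the bonus is largest when vehicle supply tracks order demand across regions and vanishes when the two are statistically independent, which is the qualitative behavior the abstract describes.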
- Xianjie Zhang
- Jiahao Sun
- Chen Gong
- Kai Wang
- Yifei Cao
- Hao Chen
- Yu Liu