Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 134 tok/s
Gemini 2.5 Pro 44 tok/s Pro
GPT-5 Medium 20 tok/s Pro
GPT-5 High 31 tok/s Pro
GPT-4o 100 tok/s Pro
Kimi K2 177 tok/s Pro
GPT OSS 120B 434 tok/s Pro
Claude Sonnet 4.5 36 tok/s Pro
2000 character limit reached

DROP: Deep relocating option policy for optimal ride-hailing vehicle repositioning (2109.04149v1)

Published 9 Sep 2021 in cs.MA, cs.AI, and cs.LG

Abstract: In a ride-hailing system, an optimal relocation of vacant vehicles can significantly reduce fleet idling time and balance the supply-demand distribution, enhancing system efficiency and promoting driver satisfaction and retention. Model-free deep reinforcement learning (DRL) has been shown to dynamically learn the relocating policy by actively interacting with the intrinsic dynamics in large-scale ride-hailing systems. However, the issues of sparse reward signals and unbalanced demand and supply distribution place critical barriers in developing effective DRL models. Conventional exploration strategy (e.g., the $\epsilon$-greedy) may barely work under such an environment because of dithering in low-demand regions distant from high-revenue regions. This study proposes the deep relocating option policy (DROP) that supervises vehicle agents to escape from oversupply areas and effectively relocate to potentially underserved areas. We propose to learn the Laplacian embedding of a time-expanded relocation graph, as an approximation representation of the system relocation policy. The embedding generates task-agnostic signals, which in combination with task-dependent signals, constitute the pseudo-reward function for generating DROPs. We present a hierarchical learning framework that trains a high-level relocation policy and a set of low-level DROPs. The effectiveness of our approach is demonstrated using a custom-built high-fidelity simulator with real-world trip record data. We report that DROP significantly improves baseline models with 15.7% more hourly revenue and can effectively resolve the dithering issue in low-demand areas.

Citations (17)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.