Quantile Markov Decision Process (1711.05788v4)

Published 15 Nov 2017 in cs.AI

Abstract: The goal of a traditional Markov decision process (MDP) is to maximize expected cumulative reward over a defined horizon (possibly infinite). In many applications, however, a decision maker may be interested in optimizing a specific quantile of the cumulative reward instead of its expectation. In this paper, we consider the problem of optimizing the quantiles of the cumulative rewards of a Markov decision process (MDP), which we refer to as a quantile Markov decision process (QMDP). We provide analytical results characterizing the optimal QMDP value function and present a dynamic programming-based algorithm to solve for the optimal policy. The algorithm also extends to the MDP problem with a conditional value-at-risk (CVaR) objective. We illustrate the practical relevance of our model by evaluating it on an HIV treatment initiation problem, where patients aim to balance the potential benefits and risks of the treatment.
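
To make the contrast between the two objectives concrete, the sketch below compares the expectation and a lower quantile of the cumulative reward for two fixed behaviours over a short horizon. It is not the paper's dynamic programming algorithm. The reward distributions `SAFE` and `RISKY`, the horizon, the quantile level, and all function names are hypothetical choices made for illustration, and the quantile is assumed to be the standard lower quantile, the smallest x with P(X <= x) >= tau.

```python
from collections import defaultdict


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent discrete rewards."""
    out = defaultdict(float)
    for va, pa in dist_a.items():
        for vb, pb in dist_b.items():
            out[va + vb] += pa * pb
    return dict(out)


def cumulative_reward_dist(step_dist, horizon):
    """Distribution of the total reward over `horizon` i.i.d. steps."""
    total = {0: 1.0}  # zero reward with certainty before any step
    for _ in range(horizon):
        total = convolve(total, step_dist)
    return total


def expectation(dist):
    return sum(v * p for v, p in dist.items())


def quantile(dist, tau):
    """Lower tau-quantile: smallest x with P(X <= x) >= tau (assumed definition)."""
    cdf = 0.0
    for v in sorted(dist):
        cdf += dist[v]
        if cdf >= tau - 1e-12:  # small tolerance for floating-point sums
            return v
    return max(dist)


# Hypothetical per-step reward distributions (value -> probability).
SAFE = {1: 1.0}
RISKY = {0: 0.6, 3: 0.4}
HORIZON = 2
TAU = 0.25

safe_total = cumulative_reward_dist(SAFE, HORIZON)
risky_total = cumulative_reward_dist(RISKY, HORIZON)

if __name__ == "__main__":
    for name, dist in [("safe", safe_total), ("risky", risky_total)]:
        print(name, "mean:", expectation(dist), "quantile:", quantile(dist, TAU))
```

In this toy setting the risky behaviour has the higher expected cumulative reward, while the safe behaviour has the higher 0.25-quantile, so the two objectives rank the behaviours differently.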

Citations (5)
