The Termination Critic (1902.09996v1)

Published 26 Feb 2019 in cs.AI, cs.LG, and stat.ML

Abstract: In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents. We propose an algorithm that focuses on the termination condition, as opposed to -- as is common -- the policy. The termination condition is usually trained to optimize a control objective: an option ought to terminate if another has better value. We offer a different, information-theoretic perspective, and propose that terminations should focus instead on the compressibility of the option's encoding -- arguably a key reason for using abstractions. To achieve this algorithmically, we leverage the classical options framework, and learn the option transition model as a "critic" for the termination condition. Using this model, we derive gradients that optimize the desired criteria. We show that the resulting options are non-trivial, intuitively meaningful, and useful for learning and planning.
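To make the abstract's central idea concrete, here is a minimal sketch, not the paper's algorithm. It assumes a toy 5-state chain, a fixed option policy, and Shannon entropy of the option's final-state distribution as a stand-in for "compressibility". The names `chain`, `final_state_distribution`, and `entropy` are hypothetical, chosen for illustration only. The sketch computes where an option ends up (its transition model) for two hand-picked termination conditions and compares how concentrated those outcomes are. It does not learn the model or derive any gradients.

```python
import math

def chain(n, p_right=0.9):
    """Toy option policy dynamics (an assumption): move right w.p. p_right,
    otherwise stay; the last state is absorbing."""
    rows = []
    for x in range(n):
        row = [0.0] * n
        if x == n - 1:
            row[x] = 1.0
        else:
            row[x] = 1.0 - p_right
            row[x + 1] = p_right
        rows.append(row)
    return rows

def final_state_distribution(p_pi, beta, start, tol=1e-12, max_steps=10_000):
    """Option transition model for one start state: the probability that the
    option terminates in each state, given per-state termination
    probabilities beta (applied on arrival in a state)."""
    n = len(p_pi)
    alive = [0.0] * n          # probability mass that has not terminated yet
    alive[start] = 1.0
    final = [0.0] * n          # accumulated termination mass per state
    for _ in range(max_steps):
        if sum(alive) < tol:
            break
        # One step of the option policy.
        arrived = [sum(alive[x] * p_pi[x][y] for x in range(n)) for y in range(n)]
        # Split the arriving mass into "terminates here" and "continues".
        for y in range(n):
            final[y] += arrived[y] * beta[y]
        alive = [arrived[y] * (1.0 - beta[y]) for y in range(n)]
    return final

def entropy(dist):
    """Shannon entropy in nats; lower means a more predictable outcome."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)

P_PI = chain(5)
# Terminate only in the last state: the outcome is concentrated on one state.
peaked = final_state_distribution(P_PI, [0.0, 0.0, 0.0, 0.0, 1.0], start=0)
# Terminate with probability 0.5 everywhere: outcomes spread over the chain.
diffuse = final_state_distribution(P_PI, [0.5] * 5, start=0)

print("peaked :", [round(p, 3) for p in peaked], "entropy", entropy(peaked))
print("diffuse:", [round(p, 3) for p in diffuse], "entropy", entropy(diffuse))
```

In this toy setting, the termination condition that fires only at the end of the chain yields a final-state distribution with near-zero entropy, while the uniform one spreads termination over several states. This is only meant to show how a transition model could score a termination condition; the paper's actual objective and gradient estimates may differ in detail.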

Citations (43)

Related Papers

The Option-Critic Architecture (2016)
Discovery of Options via Meta-Learned Subgoals (2021)
Diversity-Enriched Option-Critic (2020)
Learning Diverse Options via InfoMax Termination Critic (2020)
Learning with Options that Terminate Off-Policy (2017)