Papers
Topics
Authors
Recent
2000 character limit reached

Variable-Agnostic Causal Exploration for Reinforcement Learning (2407.12437v1)

Published 17 Jul 2024 in cs.LG and cs.AI

Abstract: Modern reinforcement learning (RL) struggles to capture real-world cause-and-effect dynamics, leading to inefficient exploration due to extensive trial-and-error actions. While recent efforts to improve agent exploration have leveraged causal discovery, they often make unrealistic assumptions of causal variables in the environments. In this paper, we introduce a novel framework, Variable-Agnostic Causal Exploration for Reinforcement Learning (VACERL), incorporating causal relationships to drive exploration in RL without specifying environmental causal variables. Our approach automatically identifies crucial observation-action steps associated with key variables using attention mechanisms. Subsequently, it constructs the causal graph connecting these steps, which guides the agent towards observation-action pairs with greater causal influence on task completion. This can be leveraged to generate intrinsic rewards or establish a hierarchy of subgoals to enhance exploration efficiency. Experimental results showcase a significant improvement in agent performance in grid-world, 2d games and robotic domains, particularly in scenarios with sparse rewards and noisy actions, such as the notorious Noisy-TV environments.

Summary

We haven't generated a summary for this paper yet.

Slide Deck Streamline Icon: https://streamlinehq.com

Whiteboard

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets

Sign up for free to view the 1 tweet with 0 likes about this paper.