Emergent Mind

Abstract

We analyze the hidden activations of neural network policies of deep reinforcement learning (RL) agents and show, empirically, that it's possible to know a priori if a state representation will lend itself to fast learning. RL agents in high-dimensional states have two main learning burdens: (1) to learn an action-selection policy and (2) to learn to discern between useful and non-useful information in a given state. By learning a latent representation of these high-dimensional states with an auxiliary model, the latter burden is effectively removed, thereby leading to accelerated training progress. We examine this phenomenon across tasks in the PyBullet Kuka environment, where an agent must learn to control a robotic gripper to pick up an object. Our analysis reveals how neural network policies learn to organize their internal representation of the state space throughout training. The results from this analysis provide three main insights into how deep RL agents learn. First, a well-organized internal representation within the policy network is a prerequisite to learning good action-selection. Second, a poor initial representation can cause an unrecoverable collapse within a policy network. Third, a good initial representation allows an agent's policy network to organize its internal representation even before any training begins.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.