A Role of Environmental Complexity on Representation Learning in Deep Reinforcement Learning Agents (2407.03436v2)
Abstract: We developed a simulated environment to train deep reinforcement learning agents on a shortcut-usage navigation task, motivated by the Dual Solutions Paradigm test used for human navigators. We manipulated the frequency with which agents were exposed to a shortcut and to a navigation cue, to investigate how these factors influence the development of shortcut usage. We found that all agents rapidly achieve optimal performance in closed-shortcut trials once initial learning starts. However, navigation speed and use of the shortcut when it is open develop faster in agents with higher shortcut exposure. Analysis of activity in the agents' artificial neural networks revealed that frequent presentation of a cue initially resulted in better encoding of the cue in the activity of individual nodes, compared to agents that encountered the cue less often. However, stronger cue representations were ultimately formed through the use of the cue in the context of navigation planning, rather than simply through exposure. We found that in all agents, spatial representations develop early in training and subsequently stabilize before navigation strategies fully develop, suggesting that spatially consistent activations are necessary for basic navigation but insufficient for advanced strategies. Further, using new analysis techniques, we found that the agent's networks encode the planned trajectory rather than the agent's immediate location. Moreover, this encoding is represented at the population level rather than at the level of individual nodes. These techniques could have broader applications in studying activity across populations of neurons or network nodes, beyond individual activity patterns.