- The paper introduces a novel self-supervised navigation approach in which a robot learns via a Dynamic Graph Memory (DGMem), without labeled data or human intervention.
- The method combines reinforcement learning and imitation learning while iteratively building a graph-based representation of the environment.
- Empirically, the learned policy generalizes to photorealistic 3D scenes, achieving robust image-goal navigation within 250,000 environment interactions.
Introduction to Self-Supervised Navigation
Robots have become markedly better at navigating their environments, and central to this progress is enabling them to learn navigation skills autonomously and adapt to their surroundings. Learning-based methods, however, typically depend on large amounts of labeled data or human intervention, which confines them to simulated scenarios or demands substantial human effort for data collection. To overcome these limitations, this paper introduces an approach that lets robots learn navigation policies from on-board observations alone, without any labeled data or human intervention.
Dynamic Graph Memory (DGMem)
The key innovation in this work is the Dynamic Graph Memory (DGMem). DGMem enables a robot to learn a navigation policy autonomously by actively exploring its environment and using the gathered data to iteratively update its memory. The memory is a graph representing the robot's surroundings: nodes correspond to key locations, and edges mark traversable paths between them. As the robot encounters new areas, the graph grows, deepening the robot's understanding of the environment, and the navigation policy is refined on the accumulated data through reinforcement learning (RL) and imitation learning (IL).
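To make the memory mechanics concrete, below is a minimal Python sketch of how such a graph could be maintained online. The embedding-distance novelty test, the threshold value, and the `DynamicGraphMemory` class itself are illustrative assumptions for exposition, not the paper's exact design.

```python
# Minimal sketch of a dynamic graph memory. Assumptions (not from the
# paper): nodes store observation embeddings, edges mark traversable
# transitions, and a new node is created when the current observation
# is far from every stored one.
import numpy as np


class DynamicGraphMemory:
    def __init__(self, novelty_threshold: float = 0.5):
        self.embeddings: list[np.ndarray] = []    # one feature vector per key location
        self.edges: set[tuple[int, int]] = set()  # undirected traversability links
        self.novelty_threshold = novelty_threshold

    def _nearest(self, emb: np.ndarray) -> tuple[int, float]:
        """Return (index, distance) of the closest stored node, or (-1, inf)."""
        if not self.embeddings:
            return -1, float("inf")
        dists = [float(np.linalg.norm(emb - e)) for e in self.embeddings]
        idx = int(np.argmin(dists))
        return idx, dists[idx]

    def update(self, emb: np.ndarray, prev_node: int | None) -> int:
        """Localize the new observation; add a node if it is novel enough,
        and link it to the node the robot just came from."""
        idx, dist = self._nearest(emb)
        if dist > self.novelty_threshold:  # novel place -> new node
            self.embeddings.append(emb)
            idx = len(self.embeddings) - 1
        if prev_node is not None and prev_node != idx:
            self.edges.add((min(prev_node, idx), max(prev_node, idx)))
        return idx
```

Feeding the memory a stream of embeddings from consecutive observations then yields both the node set (key locations) and the edge set (paths actually traversed), which is the representation the policy is trained against.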
Training Without Labels
The challenge of training without labels is addressed through a self-supervised learning setup. The authors design a hierarchical navigation policy that combines RL and IL objectives to improve data efficiency and encourage active exploration. The algorithm self-assigns navigation goals toward novel areas of the environment, incrementally expanding the robot's knowledge of the space. DGMem also serves as a planner: it decomposes long-horizon navigation trajectories into subgoals (see the sketch below), which reduces the learning burden on the policy network.
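The following sketch illustrates how a graph memory of this kind could double as a planner: an exploration goal is chosen as a rarely visited node, and a subgoal sequence is found by breadth-first search over the memory graph. The visit counts, the least-visited goal rule, and BFS planning are assumptions made for illustration; the paper's actual goal-selection and planning rules may differ.

```python
# Sketch of self-assigned exploration goals and graph-based planning.
# Assumptions (not from the paper): goals are least-visited nodes, and
# plans are shortest node sequences found by BFS over the memory graph.
from collections import deque


def select_exploration_goal(visit_counts: dict[int, int]) -> int:
    """Pick the least-visited node as the next self-assigned goal."""
    return min(visit_counts, key=visit_counts.get)


def plan_subgoals(edges: set[tuple[int, int]], start: int, goal: int) -> list[int]:
    """BFS over the memory graph; the returned node sequence would be
    handed to the low-level policy one subgoal at a time."""
    adj: dict[int, list[int]] = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    parent = {start: start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:  # reconstruct the path back to start
            path = [node]
            while node != start:
                node = parent[node]
                path.append(node)
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return []  # goal not reachable in the current graph
```

Because the plan is a sequence of nearby subgoals rather than one distant target, the policy network only ever has to learn short-horizon control, which is what makes the self-supervised training data-efficient.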
Empirical Results and Contributions
The model was evaluated in photorealistic 3D indoor scenes, where DGMem achieved significant gains in navigation performance over existing exploration methods. Empirical studies showed that, starting from randomly initialized weights, the robot could generalize its learned skills to navigate toward arbitrary goal images in a novel scene within 250,000 interactions.
The contributions of this research are threefold. First, it introduces the self-supervised navigation task as a path toward training robots directly in real-world environments. Second, it proposes DGMem, a novel memorization scheme that acts both as a planner and as a trainer for the navigation policy. Third, it provides empirical evidence that a policy network trained via DGMem acquires generalizable navigation skills in new environments.
Conclusion
This research makes a strong case for self-supervised learning as a route to training robots for navigation. DGMem offers a data-efficient and adaptable way for robots to learn and improve their navigation capabilities, moving closer to the practical deployment of autonomous robots in real-world settings.