Real-time Vision-based Navigation for a Robot in an Indoor Environment (2307.00666v1)

Published 2 Jul 2023 in cs.CV

Abstract: This paper presents a study on the development of an obstacle-avoidance navigation system for autonomous navigation in home environments. The system utilizes vision-based techniques and advanced path-planning algorithms to enable the robot to navigate toward the destination while avoiding obstacles. The performance of the system is evaluated through qualitative and quantitative metrics, highlighting its strengths and limitations. The findings contribute to the advancement of indoor robot navigation, showcasing the potential of vision-based techniques for real-time, autonomous navigation.

Authors (1)
  1. Sagar Manglani (4 papers)
Citations (3)

Summary

  • The paper introduces a vision-based method using semantic segmentation and bird's-eye-view (BEV) transformation to enable real-time indoor navigation.
  • It employs cost mapping and grid-based A* path planning to achieve efficient obstacle avoidance on a low-cost, 3D-printed quadruped robot.
  • Experimental results validate the system's accuracy in dynamic indoor environments while highlighting challenges for future improvements.

An Examination of Vision-based Robot Navigation in Indoor Environments

This paper, authored by Sagar Manglani, presents the development of an autonomous robot navigation system designed for indoor environments. The research focuses on vision-based techniques for obstacle avoidance and efficient path planning on a low-cost, 3D-printed, four-legged robot. The paper contributes to the robot navigation field by relying on RGB images from an onboard camera for real-time navigation, diverging from the more common reliance on LiDAR measurements or odometry.

Methodological Overview

The navigation system transforms camera images into actionable navigation paths through a sequence of steps: semantic segmentation, cost mapping, and Bird's-Eye View (BEV) transformation, followed by grid-based path planning for obstacle avoidance.

  1. Semantic Segmentation: The robot uses a semantic segmentation network, reported as Meta's Segment Anything model, to identify obstacles and recognize traversable floor areas. This forms the basis for constructing a cost map that assigns traversal costs based on perceived environmental characteristics.
  2. Cost Mapping and BEV Transformation: Following segmentation, a transformation matrix is computed to convert perspective images into a BEV format, which is better suited to path planning. The BEV grid then serves as the input to the grid-based A* algorithm used to plan the path toward the destination (a sketch of this pipeline follows the list).
  3. Optimization for Real-time Performance: The research addresses real-time efficiency through model quantization and resolution reduction, allowing the system to run on the Nvidia Jetson Xavier NX platform (an illustrative sketch of these optimizations also appears below).
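
The paper does not include reference code, so the sketch below is only an illustration of the pipeline described in the list above: a binary floor mask from the segmentation network is converted into a cost map, warped into a bird's-eye-view grid with a precomputed homography, and searched with grid-based A*. The function names, the 4-connected neighborhood, the Manhattan heuristic, and all default values are assumptions, not the author's implementation.

```python
import heapq

import cv2
import numpy as np


def floor_mask_to_cost_map(floor_mask, floor_cost=1, obstacle_cost=255):
    """Turn a binary floor mask (nonzero = traversable floor) into a cost map."""
    return np.where(floor_mask > 0, floor_cost, obstacle_cost).astype(np.uint8)


def warp_to_bev(cost_map, src_pts, dst_pts, bev_size=(200, 200)):
    """Warp the perspective-view cost map into a bird's-eye-view grid.

    src_pts/dst_pts are four corresponding points (e.g. the corners of a
    calibration rectangle on the floor); cells outside the camera's view
    are filled with the obstacle cost so the planner avoids them.
    """
    H = cv2.getPerspectiveTransform(np.float32(src_pts), np.float32(dst_pts))
    return cv2.warpPerspective(cost_map, H, bev_size,
                               flags=cv2.INTER_NEAREST, borderValue=255)


def astar(cost_map, start, goal, obstacle_cost=255):
    """Grid-based A* over the BEV cost map (4-connected, Manhattan heuristic)."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), 0, start, None)]   # (f, g, cell, parent)
    parents, best_g = {}, {start: 0}
    while open_set:
        _, g, cell, parent = heapq.heappop(open_set)
        if cell in parents:                   # already expanded
            continue
        parents[cell] = parent
        if cell == goal:                      # reconstruct path goal -> start
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < cost_map.shape[0] and 0 <= nc < cost_map.shape[1]):
                continue
            if cost_map[nr, nc] >= obstacle_cost:
                continue
            ng = g + int(cost_map[nr, nc])
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                heapq.heappush(open_set, (ng + h((nr, nc)), ng, (nr, nc), cell))
    return None                               # no traversable path found
```

In a running system, start would be the robot's position projected into the BEV grid and goal the destination cell; the returned cell sequence would then be translated into motion commands for the quadruped.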

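Step 3's optimizations are likewise described only at a high level. One plausible illustration, assuming PyTorch half-precision inference and OpenCV downscaling (the paper does not specify its exact quantization scheme or input resolution), is:

```python
import cv2
import torch


def prepare_model(model, device="cuda"):
    """Run the segmentation network in half precision on the Jetson's GPU.
    FP16 is a stand-in here; the paper's exact quantization scheme is not stated."""
    return model.half().to(device).eval()


def prepare_frame(frame_bgr, size=(480, 270), device="cuda"):
    """Downscale the camera frame before segmentation to cut per-frame latency.
    The 480x270 target resolution is an assumed value, not one from the paper."""
    small = cv2.resize(frame_bgr, size, interpolation=cv2.INTER_AREA)
    tensor = torch.from_numpy(small).permute(2, 0, 1).unsqueeze(0)  # HWC -> NCHW
    return tensor.half().div(255.0).to(device)
```
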
Empirical Results

The empirical evaluation used a dedicated dataset of manually annotated images and sequences with dynamic scenarios. The navigation system's performance was quantitatively assessed by benchmarking the planned paths against manually annotated ground-truth paths. Numerical results showed highly accurate navigation with few deviations in well-structured environments, although the system exhibited limitations when obstructed by tall obstacles, leaving room for future improvement.
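
The exact deviation metric is not reproduced in this summary; one plausible formulation, sketched below under the assumption that both the planned and annotated paths are sequences of (row, col) grid cells, is the mean distance from each planned waypoint to its nearest ground-truth waypoint. The function name and array conventions are illustrative.

```python
import numpy as np


def mean_path_deviation(planned, ground_truth):
    """Mean Euclidean distance from each planned waypoint to its nearest
    ground-truth waypoint; both inputs are (N, 2) arrays of grid cells."""
    planned = np.asarray(planned, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    # Pairwise distances, shape (num_planned, num_ground_truth)
    dists = np.linalg.norm(planned[:, None, :] - ground_truth[None, :, :], axis=-1)
    return float(dists.min(axis=1).mean())
```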

Qualitative results demonstrated the efficacy of real-time path overlays on the visual data, underscoring the robot's obstacle-avoidance capabilities. Over 1,200 sequential test images, the system demonstrated navigation and adaptability in dynamic environments, although it occasionally followed sub-optimal paths when the segmentation was ambiguous.

Implications and Future Directions

The results underscore the potential of vision-based systems for autonomous navigation in indoor environments, highlighting the importance of robust image segmentation for obstacle detection. The integration of vision-only data processing reduces reliance on more complex hardware such as LiDAR, offering a cost-effective solution for domestic robotic applications.

Challenges identified in this paper suggest promising directions for subsequent research. Improved segmentation accuracy, especially at extended distances, and the incorporation of partially observable search methods could significantly refine navigation outcomes. Moreover, extending the navigation framework to a wider array of household challenges, including staircases and varied surfaces, would further broaden the system's applicability to real-world settings.

In conclusion, Manglani's work provides a solid foundation for future investigations into vision-based robotic navigation, proposing a viable alternative to traditional sensor-based approaches with its emphasis on cost-effectiveness and adaptability. As autonomous systems become increasingly integral to human environments, the refined methodologies developed here are poised to play a crucial role in the broader landscape of indoor robotics.