BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects (2403.09799v2)

Published 14 Mar 2024 in cs.CV and cs.RO

Abstract: We present the evaluation methodology, datasets and results of the BOP Challenge 2023, the fifth in a series of public competitions organized to capture the state of the art in model-based 6D object pose estimation from an RGB/RGB-D image and related tasks. Besides the three tasks from 2022 (model-based 2D detection, 2D segmentation, and 6D localization of objects seen during training), the 2023 challenge introduced new variants of these tasks focused on objects unseen during training. In the new tasks, methods were required to learn new objects during a short onboarding stage (max 5 minutes, 1 GPU) from provided 3D object models. The best 2023 method for 6D localization of unseen objects (GenFlow) notably reached the accuracy of the best 2020 method for seen objects (CosyPose), although being noticeably slower. The best 2023 method for seen objects (GPose) achieved a moderate accuracy improvement but a significant 43% run-time improvement compared to the best 2022 counterpart (GDRNPP). Since 2017, the accuracy of 6D localization of seen objects has improved by more than 50% (from 56.9 to 85.6 AR_C). The online evaluation system stays open and is available at: http://bop.felk.cvut.cz/.

Authors (10)
  1. Tomas Hodan
  2. Martin Sundermeyer
  3. Van Nguyen Nguyen
  4. Gu Wang
  5. Eric Brachmann
  6. Bertram Drost
  7. Vincent Lepetit
  8. Carsten Rother
  9. Jiri Matas
  10. Yann Labbe

Summary

  • The paper expands evaluation to include both seen and unseen objects, advancing model-based 6D pose estimation techniques.
  • State-of-the-art methods like GPose and GenFlow achieved notable improvements in localization accuracy and runtime efficiency.
  • Results highlight that narrowing performance gaps for unseen objects and occluded scenarios remains a key challenge for future research.

Overview of the BOP Challenge 2023: Advancements in 6D Object Pose Estimation

Introduction

The BOP Challenge 2023 has extended the evaluation of model-based 6D object pose estimation tasks to include a focus on unseen objects, alongside the traditional tasks involving objects seen during training. This new addition responds to the need for systems capable of adapting to novel objects in practical settings without extensive retraining. The challenge benchmarks three core tasks for both seen and unseen objects: model-based 2D object detection, 2D object segmentation, and 6D object localization.

Challenge Tasks

The challenge tasks are differentiated primarily based on whether the target objects are seen or unseen during training:

  • Tasks for Seen Objects:
  1. 6D Localization: Methods are evaluated on their ability to estimate the pose of objects visible in the training dataset.
  2. 2D Detection: The task involves detecting objects by predicting amodal 2D bounding boxes.
  3. 2D Segmentation: The challenge here is to segment objects by predicting modal 2D binary masks.
  • Tasks for Unseen Objects: These tasks mirror the ones for seen objects, but methods must onboard novel objects from their provided 3D models during a short onboarding stage (at most 5 minutes on a single GPU) rather than seeing them during training.
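In the 6D localization tasks above, accuracy is measured with pose-error functions; one of those used by BOP is the Maximum Symmetry-Aware Surface Distance (MSSD). The sketch below is a minimal, simplified implementation for illustration (the function name and argument layout are this sketch's own, not the official BOP toolkit API):

```python
import numpy as np

def mssd(R_est, t_est, R_gt, t_gt, pts, sym_transforms=None):
    """Simplified Maximum Symmetry-Aware Surface Distance (MSSD).

    R_est, t_est / R_gt, t_gt: estimated and ground-truth rotation (3x3)
    and translation (3,). pts: (N, 3) model vertices in the model frame.
    sym_transforms: list of (R, t) symmetry transformations of the model;
    defaults to the identity only (i.e., no symmetries).
    """
    if sym_transforms is None:
        sym_transforms = [(np.eye(3), np.zeros(3))]
    # Vertices under the estimated pose.
    pts_est = (R_est @ pts.T + t_est[:, None]).T
    errs = []
    for R_sym, t_sym in sym_transforms:
        # Vertices under the ground-truth pose composed with a symmetry.
        pts_gt = (R_gt @ (R_sym @ pts.T + t_sym[:, None]) + t_gt[:, None]).T
        # Maximum vertex-to-vertex distance for this symmetry.
        errs.append(np.linalg.norm(pts_est - pts_gt, axis=1).max())
    # Take the minimum over symmetries: symmetric poses are not penalized.
    return min(errs)
```

Taking the maximum over vertices makes the error sensitive to the worst-aligned part of the surface, while minimizing over symmetry transformations avoids penalizing poses that are indistinguishable for symmetric objects.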

Evaluation

The evaluation covered numerous entries across all tasks, spanning approaches from pure deep learning models to hybrid learning-and-geometry pipelines. Methods were assessed on accuracy, runtime, and applicability to objects of varying complexity.
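Accuracy scores such as the AR_C quoted in the abstract are average-recall style metrics: for each error threshold, one computes the fraction of ground-truth instances whose pose error falls below it, then averages across thresholds. A simplified sketch of this scoring scheme (the exact BOP protocol additionally averages over several pose-error functions and visibility settings):

```python
import numpy as np

def average_recall(errors, thresholds):
    """Average Recall over a set of error thresholds.

    errors: pose errors, one per ground-truth object instance.
    thresholds: error thresholds at which recall is evaluated.
    Returns the mean, over thresholds, of the fraction of instances
    whose error is below the threshold (simplified BOP-style scoring).
    """
    errors = np.asarray(errors, dtype=float)
    recalls = [(errors < th).mean() for th in thresholds]
    return float(np.mean(recalls))
```

Averaging over a range of thresholds rather than picking a single one rewards methods that are accurate at both strict and loose tolerances.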

Key Results:

  • Seen Objects: The GPose method emerged as the top performer in 6D localization, delivering a 43% runtime improvement and a moderate accuracy gain over the best 2022 method (GDRNPP). This advancement underlines the rapid progress in making 6D pose estimation practical for real-time applications.
  • Unseen Objects: GenFlow stood out in 6D localization of unseen objects, reaching accuracy comparable to CosyPose, the best method for seen objects in 2020, though at a noticeably higher runtime. This milestone shows that the performance gap between seen and unseen objects is narrowing.

Implications

The BOP Challenge 2023 results reflect a significant shift toward developing more versatile and efficient 6D pose estimation methods that can quickly adapt to new objects. However, the efficiency of adapting to unseen objects and the performance in occluded object detection are identified as critical areas needing further improvement. Specifically, the detection and segmentation of unseen objects in occluded scenarios remain a challenge, suggesting a potential direction for future research.

Future Directions

The challenge sets the stage for the next steps in 6D pose estimation. Bridging the performance gap on tasks involving unseen objects is particularly crucial. The 2023 results underscore the importance of developing efficient onboarding protocols for new objects and improving robustness to occluded instances. Looking ahead, introducing more challenging variants, such as onboarding based only on reference images, could push current methodologies further.

In sum, the BOP Challenge 2023 has not only showcased the current state of the art in 6D pose estimation but also highlighted the evolving challenges and opportunities in the field, setting a clear agenda for future research endeavors.