BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects (2403.09799v2)

Published 14 Mar 2024 in cs.CV and cs.RO

Abstract: We present the evaluation methodology, datasets and results of the BOP Challenge 2023, the fifth in a series of public competitions organized to capture the state of the art in model-based 6D object pose estimation from an RGB/RGB-D image and related tasks. Besides the three tasks from 2022 (model-based 2D detection, 2D segmentation, and 6D localization of objects seen during training), the 2023 challenge introduced new variants of these tasks focused on objects unseen during training. In the new tasks, methods were required to learn new objects during a short onboarding stage (max 5 minutes, 1 GPU) from provided 3D object models. The best 2023 method for 6D localization of unseen objects (GenFlow) notably reached the accuracy of the best 2020 method for seen objects (CosyPose), although being noticeably slower. The best 2023 method for seen objects (GPose) achieved a moderate accuracy improvement but a significant 43% run-time improvement compared to the best 2022 counterpart (GDRNPP). Since 2017, the accuracy of 6D localization of seen objects has improved by more than 50% (from 56.9 to 85.6 AR_C). The online evaluation system stays open and is available at: http://bop.felk.cvut.cz/.

Authors (10)
  1. Tomas Hodan
  2. Martin Sundermeyer
  3. Van Nguyen Nguyen
  4. Gu Wang
  5. Eric Brachmann
  6. Bertram Drost
  7. Vincent Lepetit
  8. Carsten Rother
  9. Jiri Matas
  10. Yann Labbe

Summary

  • The paper expands evaluation to include both seen and unseen objects, advancing model-based 6D pose estimation techniques.
  • State-of-the-art methods like GPose and GenFlow achieved notable improvements in localization accuracy and runtime efficiency.
  • Results highlight that narrowing performance gaps for unseen objects and occluded scenarios remains a key challenge for future research.

Overview of the BOP Challenge 2023: Advancements in 6D Object Pose Estimation

Introduction

The BOP Challenge 2023 has extended the evaluation of model-based 6D object pose estimation tasks to include a focus on unseen objects, alongside the traditional tasks involving objects seen during training. This new addition responds to the need for systems capable of adapting to novel objects in practical settings without extensive retraining. The challenge benchmarks three core tasks for both seen and unseen objects: model-based 2D object detection, 2D object segmentation, and 6D object localization.

Challenge Tasks

The challenge tasks are differentiated primarily based on whether the target objects are seen or unseen during training:

  • Tasks for Seen Objects:
  1. 6D Localization: Methods are evaluated on their ability to estimate the pose of objects visible in the training dataset.
  2. 2D Detection: The task involves detecting objects by predicting amodal 2D bounding boxes.
  3. 2D Segmentation: The challenge here is to segment objects by predicting modal 2D binary masks.
  • Tasks for Unseen Objects: These tasks mirror the ones for seen objects, but methods must onboard novel objects from their provided 3D models during a short onboarding stage (at most 5 minutes on a single GPU) rather than seeing them during training.
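In the 6D localization tasks above, accuracy is measured with pose-error functions; one of those used by BOP is the Maximum Symmetry-Aware Surface Distance (MSSD). The sketch below is a minimal, simplified implementation for illustration (the function name and argument layout are this sketch's own, not the official BOP toolkit API):

```python
import numpy as np

def mssd(R_est, t_est, R_gt, t_gt, pts, sym_transforms=None):
    """Simplified Maximum Symmetry-Aware Surface Distance (MSSD).

    R_est, t_est / R_gt, t_gt: estimated and ground-truth rotation (3x3)
    and translation (3,). pts: (N, 3) model vertices in the model frame.
    sym_transforms: list of (R, t) symmetry transformations of the model;
    defaults to the identity only (i.e., no symmetries).
    """
    if sym_transforms is None:
        sym_transforms = [(np.eye(3), np.zeros(3))]
    # Vertices under the estimated pose.
    pts_est = (R_est @ pts.T + t_est[:, None]).T
    errs = []
    for R_sym, t_sym in sym_transforms:
        # Vertices under the ground-truth pose composed with a symmetry.
        pts_gt = (R_gt @ (R_sym @ pts.T + t_sym[:, None]) + t_gt[:, None]).T
        # Maximum vertex-to-vertex distance for this symmetry.
        errs.append(np.linalg.norm(pts_est - pts_gt, axis=1).max())
    # Take the minimum over symmetries: symmetric poses are not penalized.
    return min(errs)
```

Taking the maximum over vertices makes the error sensitive to the worst-aligned part of the surface, while minimizing over symmetry transformations avoids penalizing poses that are indistinguishable for symmetric objects.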

Evaluation

The evaluation covered numerous entries across all tasks, spanning approaches from pure deep learning models to hybrid learning-and-geometry pipelines. Methods were assessed on accuracy, runtime, and applicability to objects of varying complexity.
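Accuracy scores such as the AR_C quoted in the abstract are average-recall style metrics: for each error threshold, one computes the fraction of ground-truth instances whose pose error falls below it, then averages across thresholds. A simplified sketch of this scoring scheme (the exact BOP protocol additionally averages over several pose-error functions and visibility settings):

```python
import numpy as np

def average_recall(errors, thresholds):
    """Average Recall over a set of error thresholds.

    errors: pose errors, one per ground-truth object instance.
    thresholds: error thresholds at which recall is evaluated.
    Returns the mean, over thresholds, of the fraction of instances
    whose error is below the threshold (simplified BOP-style scoring).
    """
    errors = np.asarray(errors, dtype=float)
    recalls = [(errors < th).mean() for th in thresholds]
    return float(np.mean(recalls))
```

Averaging over a range of thresholds rather than picking a single one rewards methods that are accurate at both strict and loose tolerances.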

Key Results:

  • Seen Objects: The GPose method emerged as the top performer in 6D localization, delivering a 43% runtime improvement and a moderate accuracy gain over the best 2022 method (GDRNPP). This advancement underlines the rapid progress in making 6D pose estimation practical for real-time applications.
  • Unseen Objects: GenFlow stood out in 6D localization of unseen objects, reaching accuracy comparable to CosyPose, the best method for seen objects in 2020, though at a noticeably higher runtime. This milestone shows that the performance gap between seen and unseen objects is narrowing.

Implications

The BOP Challenge 2023 results reflect a significant shift toward developing more versatile and efficient 6D pose estimation methods that can quickly adapt to new objects. However, the efficiency of adapting to unseen objects and the performance in occluded object detection are identified as critical areas needing further improvement. Specifically, the detection and segmentation of unseen objects in occluded scenarios remain a challenge, suggesting a potential direction for future research.

Future Directions

The challenge sets the stage for the next steps in 6D pose estimation. Bridging the performance gap on tasks involving unseen objects is particularly crucial. The 2023 results underscore the importance of developing efficient onboarding protocols for new objects and improving robustness to occluded instances. Looking ahead, introducing more challenging variants, such as onboarding based only on reference images, could push current methodologies further.

In sum, the BOP Challenge 2023 has not only showcased the current state of the art in 6D pose estimation but also highlighted the evolving challenges and opportunities in the field, setting a clear agenda for future research endeavors.