A comparison between single-stage and two-stage 3D tracking algorithms for greenhouse robotics (2404.12963v1)
Abstract: With the current demand for automation in the agro-food industry, accurately detecting and localizing relevant objects in 3D is essential for successful robotic operations. However, this is a challenge due to the presence of occlusions. Multi-view perception approaches allow robots to overcome occlusions, but a tracking component is needed to associate the objects detected by the robot over multiple viewpoints. Multi-object tracking (MOT) algorithms can be categorized into two-stage and single-stage methods. Two-stage methods tend to be simpler to adapt and implement for custom applications, while single-stage methods present a more complex end-to-end tracking approach that can yield better results in occluded situations at the cost of more training data. The potential advantages of single-stage methods over two-stage methods depend on the complexity of the sequence of viewpoints that a robot needs to process. In this work, we compare a 3D two-stage MOT algorithm, 3D-SORT, against a 3D single-stage MOT algorithm, MOT-DETR, on three types of sequences with varying levels of complexity. The sequences represent simpler and more complex motions that a robot arm can perform in a tomato greenhouse. Our experiments in a tomato greenhouse show that the single-stage algorithm consistently yields better tracking accuracy, especially in the more challenging sequences where objects are fully occluded or non-visible over several viewpoints.
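The two-stage (tracking-by-detection) family described in the abstract hinges on a data-association step: detections from a new viewpoint are matched to existing tracks, typically by distance in 3D. As a minimal illustrative sketch only (not the paper's 3D-SORT implementation, and with the `gate` threshold chosen arbitrarily), a greedy nearest-neighbour association over 3D centroids could look like:

```python
import math

def associate(tracks, detections, gate=0.05):
    """Greedily match existing track centroids to new 3D detections.

    tracks, detections: sequences of (x, y, z) centroids in metres.
    gate: maximum Euclidean distance allowed for a valid match.
    Returns (matches, unmatched_detections), where matches is a list of
    (track_index, detection_index) pairs and unmatched detections would
    spawn new tracks.
    """
    # Enumerate all track/detection pairs that pass the distance gate.
    candidates = []
    for ti, t in enumerate(tracks):
        for di, d in enumerate(detections):
            dist = math.dist(t, d)
            if dist <= gate:
                candidates.append((dist, ti, di))
    # Greedy assignment: closest pairs first, each index used once.
    candidates.sort()
    matches, used_t, used_d = [], set(), set()
    for _, ti, di in candidates:
        if ti in used_t or di in used_d:
            continue
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    unmatched = [di for di in range(len(detections)) if di not in used_d]
    return matches, unmatched

tracks = [(0.10, 0.20, 1.00), (0.50, 0.60, 1.10)]
detections = [(0.11, 0.21, 1.01), (0.90, 0.90, 1.50)]
matches, new_tracks = associate(tracks, detections)
# → matches == [(0, 0)], new_tracks == [1]
```

Production trackers usually replace the greedy loop with optimal assignment (e.g. the Hungarian algorithm) and add a motion or appearance model; a single-stage method such as MOT-DETR instead learns this association end-to-end, which is what the paper argues helps under full occlusion.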
- G. Kootstra, X. Wang, P. M. Blok, J. Hemming, and E. van Henten, “Selective Harvesting Robotics: Current Research, Trends, and Future Directions,” Current Robotics Reports, vol. 2, no. 1, pp. 95–104, Mar. 2021. [Online]. Available: https://doi.org/10.1007/s43154-020-00034-1
- J. Crowley, “Dynamic world modeling for an intelligent mobile robot using a rotating ultra-sonic ranging device,” in Proceedings. 1985 IEEE International Conference on Robotics and Automation, vol. 2. St. Louis, MO, USA: Institute of Electrical and Electronics Engineers, 1985, pp. 128–135. [Online]. Available: http://ieeexplore.ieee.org/document/1087380/
- J. Elfring, S. van den Dries, M. van de Molengraft, and M. Steinbuch, “Semantic world modeling using probabilistic multiple hypothesis anchoring,” Robotics and Autonomous Systems, vol. 61, no. 2, pp. 95–105, Feb. 2013. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S0921889012002163
- B. Arad, J. Balendonck, R. Barth, O. Ben-Shahar, Y. Edan, T. Hellström, J. Hemming, P. Kurtser, O. Ringdahl, T. Tielen, and B. v. Tuijl, “Development of a sweet pepper harvesting robot,” Journal of Field Robotics, 2020. [Online]. Available: https://www.onlinelibrary.wiley.com/doi/abs/10.1002/rob.21937
- A. K. Burusa, J. Scholten, D. R. Rincon, X. Wang, E. J. van Henten, and G. Kootstra, “Efficient Search and Detection of Relevant Plant Parts using Semantics-Aware Active Vision,” June 2023, arXiv:2306.09801 [cs]. [Online]. Available: http://arxiv.org/abs/2306.09801
- A. Persson, P. Z. D. Martires, A. Loutfi, and L. De Raedt, “Semantic Relational Object Tracking,” IEEE Transactions on Cognitive and Developmental Systems, vol. 12, no. 1, pp. 84–97, Mar. 2020, arXiv: 1902.09937. [Online]. Available: http://arxiv.org/abs/1902.09937
- D. Rapado-Rincón, E. J. van Henten, and G. Kootstra, “Development and evaluation of automated localisation and reconstruction of all fruits on tomato plants in a greenhouse based on multi-view perception and 3D multi-object tracking,” Biosystems Engineering, vol. 231, pp. 78–91, July 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1537511023001162
- A. Bewley, Z. Ge, L. Ott, F. Ramos, and B. Upcroft, “Simple Online and Realtime Tracking,” 2016 IEEE International Conference on Image Processing (ICIP), pp. 3464–3468, Sept. 2016, arXiv: 1602.00763. [Online]. Available: http://arxiv.org/abs/1602.00763
- N. Wojke, A. Bewley, and D. Paulus, “Simple online and realtime tracking with a deep association metric,” in 2017 IEEE International Conference on Image Processing (ICIP), Sept. 2017, pp. 3645–3649, ISSN: 2381-8549.
- Y. Zhang, C. Wang, X. Wang, W. Zeng, and W. Liu, “FairMOT: On the Fairness of Detection and Re-identification in Multiple Object Tracking,” International Journal of Computer Vision, vol. 129, no. 11, pp. 3069–3087, Nov. 2021. [Online]. Available: https://doi.org/10.1007/s11263-021-01513-4
- M. Halstead, C. McCool, S. Denman, T. Perez, and C. Fookes, “Fruit Quantity and Ripeness Estimation Using a Robotic Vision System,” IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 2995–3002, Oct. 2018. [Online]. Available: https://ieeexplore.ieee.org/document/8392450/
- R. Kirk, M. Mangan, and G. Cielniak, “Robust Counting of Soft Fruit Through Occlusions with Re-identification,” in Computer Vision Systems, ser. Lecture Notes in Computer Science, M. Vincze, T. Patten, H. I. Christensen, L. Nalpantidis, and M. Liu, Eds. Cham: Springer International Publishing, 2021, pp. 211–222.
- M. Halstead, A. Ahmadi, C. Smitt, O. Schmittmann, and C. McCool, “Crop Agnostic Monitoring Driven by Deep Learning,” Frontiers in Plant Science, vol. 12, 2021. [Online]. Available: https://www.frontiersin.org/article/10.3389/fpls.2021.786702
- N. Hu, D. Su, S. Wang, P. Nyamsuren, and Y. Qiao, “LettuceTrack: Detection and tracking of lettuce for robotic precision spray in agriculture,” Frontiers in Plant Science, vol. 13, Sept. 2022, publisher: Frontiers. [Online]. Available: https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2022.1003243/full
- J. Villacrés, M. Viscaino, J. Delpiano, S. Vougioukas, and F. Auat Cheein, “Apple orchard production estimation using deep learning strategies: A comparison of tracking-by-detection algorithms,” Computers and Electronics in Agriculture, vol. 204, p. 107513, Jan. 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0168169922008213
- D. Rapado-Rincón, E. J. van Henten, and G. Kootstra, “MinkSORT: A 3D deep feature extractor using sparse convolutions to improve 3D multi-object tracking in greenhouse tomato plants,” July 2023, arXiv:2307.05219 [cs]. [Online]. Available: http://arxiv.org/abs/2307.05219
- D. Rapado-Rincón, H. Nap, K. Smolenova, E. J. van Henten, and G. Kootstra, “MOT-DETR: 3D Single Shot Detection and Tracking with Transformers to build 3D representations for Agro-Food Robots,” Feb. 2024, arXiv:2311.15674 [cs]. [Online]. Available: http://arxiv.org/abs/2311.15674
- “ultralytics/ultralytics: NEW - YOLOv8 in PyTorch > ONNX > OpenVINO > CoreML > TFLite.” [Online]. Available: https://github.com/ultralytics/ultralytics
- N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, “End-to-End Object Detection with Transformers,” May 2020, arXiv:2005.12872 [cs]. [Online]. Available: http://arxiv.org/abs/2005.12872