
Exploring Optical Flow Inclusion into nnU-Net Framework for Surgical Instrument Segmentation (2403.10216v1)

Published 15 Mar 2024 in cs.CV and cs.AI

Abstract: Surgical instrument segmentation in laparoscopy is essential for computer-assisted surgical systems. Despite the progress of deep learning in recent years, the dynamic setting of laparoscopic surgery still presents challenges for precise segmentation. The nnU-Net framework has excelled at semantic segmentation of single frames, without temporal information. The framework's ease of use, including its automatic configuration and low expertise requirements, has made it a popular base framework for comparisons. Optical flow (OF) is a tool commonly used in video tasks to estimate motion and represent it in a single frame, thereby encoding temporal information. This work employs OF maps as an additional input to the nnU-Net architecture to improve its performance on the surgical instrument segmentation task, exploiting the fact that instruments are the main moving objects in the surgical field. With this new input, the temporal component is added indirectly, without modifying the architecture. Using the CholecSeg8k dataset, three different representations of movement were estimated and used as new inputs, and compared against a baseline model. Results showed that the use of OF maps improves the detection of classes with high movement, even when these are scarce in the dataset. To further improve performance, future work may focus on implementing other OF-preserving augmentations.
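The core idea of the paper — feeding a motion representation alongside the RGB frame so the network receives temporal information without architectural changes — can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the flow field is assumed to come from any dense optical-flow estimator, and the magnitude/angle encoding and the 5-channel stacking are one plausible choice of "representation of movement" (the paper compares three, which it does not name in the abstract). The function names are hypothetical.

```python
import numpy as np

def flow_to_channels(flow):
    """Convert a dense optical-flow field of shape (H, W, 2) into two
    single-channel maps, motion magnitude and motion angle, both
    normalized to [0, 1] so they match image-like input statistics."""
    fx, fy = flow[..., 0], flow[..., 1]
    mag = np.sqrt(fx**2 + fy**2)
    ang = np.arctan2(fy, fx)                 # angle in [-pi, pi]
    mag = mag / (mag.max() + 1e-8)           # scale magnitude to [0, 1]
    ang = (ang + np.pi) / (2 * np.pi)        # shift angle to [0, 1]
    return mag.astype(np.float32), ang.astype(np.float32)

def stack_frame_with_flow(frame_rgb, flow):
    """Append flow-derived channels to an RGB frame, yielding an
    (H, W, 5) array. nnU-Net auto-configures to the number of input
    channels declared in the dataset, so no architecture change is
    needed — only the input specification."""
    mag, ang = flow_to_channels(flow)
    return np.dstack([frame_rgb.astype(np.float32) / 255.0, mag, ang])

# Toy example: a 4x4 frame with uniform rightward motion of 1 px.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
flow = np.zeros((4, 4, 2), dtype=np.float32)
flow[..., 0] = 1.0
stacked = stack_frame_with_flow(frame, flow)
print(stacked.shape)  # (4, 4, 5)
```

In practice the flow between consecutive frames would be estimated with an off-the-shelf method (e.g., a classical dense estimator or a learned one), and the stacked array saved as a multi-channel training case; the paper's observation is that classes with high movement, such as instruments, benefit most from the extra channels.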

