Live Laparoscopic Video Retrieval with Compressed Uncertainty (2203.04301v2)
Abstract: Searching through large volumes of medical data to retrieve relevant information is a challenging yet crucial task for clinical care. However, the most basic and most common approach to retrieval, which relies on text in the form of keywords, is severely limited when dealing with complex media formats. Content-based retrieval offers a way to overcome this limitation by using rich media as the query itself. Surgical video-to-video retrieval in particular is a new and largely unexplored research problem with high clinical value, especially in the real-time case: with real-time video hashing, search can be performed directly inside the operating room. Hashing converts large data entries into compact binary arrays, or hashes, enabling large-scale search operations at a very fast rate. However, due to fluctuations over the course of a video, not all bits in a given hash are equally reliable. In this work, we propose a method capable of mitigating this uncertainty while maintaining a light computational footprint. We present superior retrieval results (3-4% top-10 mean average precision) on a multi-task evaluation protocol for surgery, covering cholecystectomy phases, bypass phases, and, from an entirely new dataset introduced here, critical events across six different surgery types. Success on this multi-task benchmark shows the generalizability of our approach for surgical video retrieval.
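To make the hashing idea in the abstract concrete, the following is a minimal sketch of retrieval over binary hashes by Hamming distance. It is not the method proposed in the paper: the function names (`pack_hashes`, `hamming_distances`, `top_k`), the toy 8-bit codes, and the optional `mask` that simply ignores bits flagged as unreliable are all assumptions made for illustration.

```python
import numpy as np

def pack_hashes(bits):
    """Pack an (n, n_bits) array of 0/1 values into uint8 bytes (8 bits per byte)."""
    return np.packbits(np.asarray(bits, dtype=np.uint8), axis=1)

def hamming_distances(query, database, mask=None):
    """Hamming distance from one packed query hash to every packed database hash.

    mask (optional, packed like the hashes) keeps only the bits set to 1;
    bits set to 0 are ignored. This is a hypothetical stand-in for handling
    unreliable bits, not the paper's approach.
    """
    diff = np.bitwise_xor(database, query)   # differing bits are 1
    if mask is not None:
        diff = np.bitwise_and(diff, mask)    # drop bits flagged as unreliable
    return np.unpackbits(diff, axis=1).sum(axis=1)

def top_k(query, database, k, mask=None):
    """Indices of the k database entries closest to the query."""
    distances = hamming_distances(query, database, mask)
    return np.argsort(distances, kind="stable")[:k]

# Toy database of three 8-bit hashes (hypothetical values).
database = pack_hashes([
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
])
query = pack_hashes([[0, 0, 0, 0, 1, 1, 0, 0]])[0]

# Plain Hamming search: every bit counts equally.
print(hamming_distances(query, database))
print(top_k(query, database, k=2))

# Same search, ignoring the last two bits as if they were unreliable.
mask = pack_hashes([[1, 1, 1, 1, 1, 1, 0, 0]])[0]
print(hamming_distances(query, database, mask))
```

Because the comparison reduces to XOR and bit counting over packed bytes, the cost per database entry stays small, which is what makes hash-based search attractive at scale. In this toy example the mask changes the ranking, illustrating why the reliability of individual bits matters.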
Authors: Tong Yu, Pietro Mascagni, Juan Verde, Jacques Marescaux, Didier Mutter, Nicolas Padoy