Multi Kernel Positional Embedding ConvNeXt for Polyp Segmentation (2301.06673v2)
Abstract: Medical image segmentation helps doctors view images and reach a precise diagnosis, particularly for colorectal cancer. As the number of cases increases, diagnosis and identification need to be faster and more accurate for many patients. In endoscopic images, segmentation is vital for helping the doctor correctly locate polyps or other affected areas in the digestive system. As a result, many efforts have been made to apply deep learning to automate polyp segmentation, mostly by improving the U-shaped structure. However, the simple skip connection scheme in UNet leads to deficient context information and a semantic gap between the feature maps of the encoder and the decoder. To deal with this problem, we propose a novel framework composed of a ConvNeXt backbone and a Multi Kernel Positional Embedding block. Thanks to the proposed module, our method attains better accuracy and generalization in the polyp segmentation task. Extensive experiments show that our model achieves a Dice coefficient of 0.8818 and an IoU score of 0.8163 on the Kvasir-SEG dataset. Furthermore, on various datasets, our results are competitive with previous state-of-the-art methods.
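The two reported metrics, the Dice coefficient and IoU, both measure the overlap between a predicted mask and a ground-truth mask. The following is a minimal sketch of how they are commonly computed for a single pair of binary masks. It is not the authors' evaluation code: the function name `dice_and_iou`, the smoothing term `eps`, and the toy masks are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def dice_and_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,  # assumed smoothing term; avoids division by zero on empty masks
) -> Tuple[float, float]:
    """Return (Dice, IoU) for two binary masks (values 0 or 1) of the same shape."""
    intersection = 0
    pred_area = 0
    target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p & t  # pixel counted only if both masks mark it
            pred_area += p
            target_area += t
    union = pred_area + target_area - intersection
    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou


# Toy 3x4 masks (hypothetical data, not taken from Kvasir-SEG).
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
target = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single mask pair the two scores are tied by Dice = 2·IoU / (1 + IoU). Scores reported on a dataset are typically averaged over images, so the averaged values need not satisfy that identity exactly.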
Authors:
- Trong-Hieu Nguyen Mau
- Quoc-Huy Trinh
- Nhat-Tan Bui
- Minh-Triet Tran
- Hai-Dang Nguyen