
Referring Camouflaged Object Detection (2306.07532v2)

Published 13 Jun 2023 in cs.CV

Abstract: We consider the problem of referring camouflaged object detection (Ref-COD), a new task that aims to segment specified camouflaged objects based on a small set of referring images with salient target objects. We first assemble a large-scale dataset, called R2C7K, which consists of 7K images covering 64 object categories in real-world scenarios. Then, we develop a simple but strong dual-branch framework, dubbed R2CNet, with a reference branch embedding the common representations of target objects from referring images and a segmentation branch identifying and segmenting camouflaged objects under the guidance of the common representations. In particular, we design a Referring Mask Generation module to generate a pixel-level prior mask and a Referring Feature Enrichment module to enhance the capability of identifying specified camouflaged objects. Extensive experiments show the superiority of our Ref-COD methods over their COD counterparts in segmenting specified camouflaged objects and identifying the main body of target objects. Our code and dataset are publicly available at https://github.com/zhangxuying1004/RefCOD.
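The abstract's dual-branch idea can be sketched in miniature: a reference branch pools features from the referring images into a common target representation, and a mask-generation step scores each query pixel against that representation to produce a pixel-level prior. The sketch below is a toy illustration of this flow, not the paper's R2CNet implementation; the function names (`reference_branch`, `referring_mask`, `segment`), the cosine-similarity scoring, and the fixed threshold are all illustrative assumptions, and pixel "features" here are plain Python vectors rather than CNN activations.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors (illustrative scoring choice).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1e-8
    nv = math.sqrt(sum(b * b for b in v)) or 1e-8
    return dot / (nu * nv)

def reference_branch(ref_feats):
    """Embed a common representation of the target: average-pool the pixel
    features of each referring image, then average across referring images."""
    c = len(ref_feats[0][0][0])
    rep = [0.0] * c
    for img in ref_feats:                 # img: H x W grid of C-dim vectors
        pooled = [0.0] * c
        n = 0
        for row in img:
            for px in row:
                for i, v in enumerate(px):
                    pooled[i] += v
                n += 1
        for i in range(c):
            rep[i] += pooled[i] / n
    return [v / len(ref_feats) for v in rep]

def referring_mask(query_feats, common_rep):
    """Pixel-level prior mask: similarity of each query-pixel feature to the
    common representation (stand-in for the Referring Mask Generation step)."""
    return [[cosine(px, common_rep) for px in row] for row in query_feats]

def segment(query_feats, common_rep, thr=0.5):
    # Binarize the prior mask; the real segmentation branch refines it instead.
    mask = referring_mask(query_feats, common_rep)
    return [[1 if s > thr else 0 for s in row] for row in mask]
```

As a usage example, with referring features all pointing along one direction, only the query pixel aligned with that direction survives the threshold:

```python
ref = [[[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]]]]  # one 2x2 ref image
rep = reference_branch(ref)
query = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
segment(query, rep)  # → [[1, 0], [0, 0]]
```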

