
Towards End-to-End Unsupervised Saliency Detection with Self-Supervised Top-Down Context (2310.09533v1)

Published 14 Oct 2023 in cs.CV

Abstract: Unsupervised salient object detection (USOD) aims to detect salient objects without supervision signals, eliminating the tedious task of manually labeling salient objects. To improve training efficiency, end-to-end methods for USOD have been proposed as a promising alternative. However, current solutions rely heavily on noisy handcrafted labels and fail to mine rich semantic information from deep features. In this paper, we propose a self-supervised end-to-end salient object detection framework via top-down context. Specifically, motivated by contrastive learning, we exploit self-localization from the deepest feature to construct location maps, which are then leveraged to learn the most instructive segmentation guidance. Because the deepest features lack detailed information, we further use a detail-boosting refiner module to enrich the location labels with details. Moreover, we observe that, due to the lack of supervision, current unsupervised saliency models tend to detect non-salient objects that are salient in some other samples of corresponding scenarios. To address this widespread issue, we design a novel Unsupervised Non-Salient Suppression (UNSS) method that develops the ability to ignore non-salient objects. Extensive experiments on benchmark datasets demonstrate that our method achieves leading performance among recent end-to-end methods and outperforms most multi-stage solutions. The code is available.
