Unlocking the Power of Open Set: A New Perspective for Open-Set Noisy Label Learning (2305.04203v2)
Abstract: Learning from noisy data has attracted much attention, where most methods focus on closed-set label noise. However, a more common scenario in the real world is the presence of both open-set and closed-set noise. Existing methods typically identify and handle these two types of label noise separately, designing a specific strategy for each type. However, in many real-world scenarios, it is challenging to identify open-set examples, especially when the dataset has been severely corrupted. Unlike previous works, we explore how models behave when faced with open-set examples, and find that \emph{a part of open-set examples gradually get integrated into certain known classes}, which is beneficial for the separation among known classes. Motivated by this phenomenon, we propose a novel two-step contrastive learning method, CECL (Class Expansion Contrastive Learning), which aims to deal with both types of label noise by exploiting the useful information of open-set examples. Specifically, we incorporate some open-set examples into closed-set classes to enhance performance, while treating the others as delimiters to improve representation ability. Extensive experiments on synthetic and real-world datasets with diverse label noise demonstrate the effectiveness of CECL.
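As a rough illustration of the class-expansion idea described in the abstract (a minimal sketch, not the authors' released implementation), the snippet below assigns confidently predicted open-set examples to a known class while giving the remaining open-set examples singleton "delimiter" labels, and then applies a standard supervised contrastive loss. The names `assign_pseudo_labels`, `conf_threshold`, and the specific loss form are illustrative assumptions.

```python
# Sketch only: class expansion + delimiter labels for a supervised contrastive loss.
import torch
import torch.nn.functional as F

def assign_pseudo_labels(probs, is_open_set, conf_threshold=0.9, num_classes=10):
    """Confident open-set samples join their predicted known class ("class expansion");
    the rest each get a unique label so they act as delimiters."""
    conf, pred = probs.max(dim=1)
    labels = pred.clone()
    # Low-confidence open-set samples become singleton classes: they have no
    # positives and only repel other samples in the contrastive loss.
    delimiter_mask = is_open_set & (conf < conf_threshold)
    n_delim = int(delimiter_mask.sum())
    labels[delimiter_mask] = num_classes + torch.arange(n_delim, device=labels.device)
    return labels

def sup_con_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalized features."""
    features = F.normalize(features, dim=1)
    sim = features @ features.t() / temperature
    mask = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    logits_mask = 1.0 - torch.eye(len(labels), device=features.device)
    mask = mask * logits_mask  # exclude self-pairs
    exp_sim = torch.exp(sim) * logits_mask
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)
    pos_per_anchor = mask.sum(dim=1).clamp(min=1.0)
    return -(mask * log_prob).sum(dim=1).div(pos_per_anchor).mean()

# Toy usage
feats = torch.randn(8, 128)
probs = torch.softmax(torch.randn(8, 10), dim=1)
is_open = torch.tensor([0, 0, 0, 1, 1, 0, 1, 0], dtype=torch.bool)
loss = sup_con_loss(feats, assign_pseudo_labels(probs, is_open))
```

Delimiter samples contribute no positive pairs themselves but still appear as negatives for every other anchor, which is one plausible reading of "treating others as delimiters to improve representation ability."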