UNICON: Combating Label Noise Through Uniform Selection and Contrastive Learning (2203.14542v4)

Published 28 Mar 2022 in cs.CV and cs.LG

Abstract: Supervised deep learning methods require a large repository of annotated data; hence, label noise is inevitable. Training with such noisy data negatively impacts the generalization performance of deep neural networks. To combat label noise, recent state-of-the-art methods employ some sort of sample selection mechanism to select a possibly clean subset of data. Next, an off-the-shelf semi-supervised learning method is used for training where rejected samples are treated as unlabeled data. Our comprehensive analysis shows that current selection methods disproportionately select samples from easy (fast learnable) classes while rejecting those from relatively harder ones. This creates class imbalance in the selected clean set and in turn, deteriorates performance under high label noise. In this work, we propose UNICON, a simple yet effective sample selection method which is robust to high label noise. To address the disproportionate selection of easy and hard samples, we introduce a Jensen-Shannon divergence based uniform selection mechanism which does not require any probabilistic modeling and hyperparameter tuning. We complement our selection method with contrastive learning to further combat the memorization of noisy labels. Extensive experimentation on multiple benchmark datasets demonstrates the effectiveness of UNICON; we obtain an 11.4% improvement over the current state-of-the-art on CIFAR100 dataset with a 90% noise rate. Our code is publicly available.

Citations (82)

Summary

  • The paper introduces a uniform sample selection integrated with contrastive learning to counter label noise and class imbalance in deep learning.
  • It achieves an 11.4% accuracy boost on CIFAR100 at a 90% noise rate, outperforming existing state-of-the-art methods.
  • The approach enhances pseudo-label quality and offers broader applicability in domains like text analysis and bioinformatics.

An Evaluation of "UniCon: Combating Label Noise Through Uniform Selection and Contrastive Learning"

The paper "UniCon: Combating Label Noise Through Uniform Selection and Contrastive Learning" presents a novel strategy for addressing the pervasive issue of label noise in supervised deep learning. The research highlights the deleterious effects of label noise, particularly in large datasets sourced from the web. The authors focus on sample selection methods and the inherent bias these methods have towards selecting samples from easier, fast learnable classes, which results in class imbalance. This imbalance exacerbates performance degradation, especially under high label noise conditions.

Methodology and Contributions

UniCon introduces a uniform sample selection technique designed to mitigate the preferential selection of samples from easier classes. The method combines this selection with contrastive learning, an approach that is robust against the memorization of noisy labels. The contributions of the paper are:

  • Uniform Sample Selection: The proposed method enforces class balance when selecting the clean data subset, ensuring that samples from all classes are represented equally regardless of their innate difficulty (see the first sketch after this list).
  • Contrastive Learning Integration: Leveraging contrastive learning, the authors reduce the risk of memorizing noisy labels. Because contrastive learning does not rely on labels at all, it is inherently resistant to label errors (see the second sketch after this list).
  • Improved Performance Metrics: UniCon demonstrates a significant improvement over state-of-the-art methods, achieving an 11.4% increase in accuracy on the CIFAR100 dataset with a 90% noise rate.
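
Below is a minimal sketch of the uniform selection idea, assuming PyTorch. Each sample is scored by the Jensen-Shannon divergence JSD(p, q) = 0.5 KL(p||m) + 0.5 KL(q||m), with m = 0.5 (p + q), between the model's softmax prediction p and the given one-hot label q. The fixed `select_ratio` below is an illustrative simplification: the paper derives its cut-off from the divergence statistics themselves rather than from a tuned hyperparameter, and this is not the authors' released implementation.

```python
import torch

def js_divergence(pred_probs, label_onehot, eps=1e-8):
    # JSD is symmetric and bounded in [0, ln 2], so per-sample scores are
    # comparable without any per-dataset rescaling or probabilistic modeling.
    m = 0.5 * (pred_probs + label_onehot)
    kl_pm = (pred_probs * (torch.log(pred_probs + eps) - torch.log(m + eps))).sum(dim=1)
    kl_qm = (label_onehot * (torch.log(label_onehot + eps) - torch.log(m + eps))).sum(dim=1)
    return 0.5 * kl_pm + 0.5 * kl_qm

def uniform_select(js_scores, labels, num_classes, select_ratio=0.5):
    # Draw the same number of lowest-divergence ("likely clean") samples from
    # every class instead of thresholding globally; a global threshold
    # over-selects easy classes and starves hard ones.
    per_class = int(select_ratio * len(js_scores) / num_classes)
    clean_idx = []
    for c in range(num_classes):
        class_idx = (labels == c).nonzero(as_tuple=True)[0]
        order = js_scores[class_idx].argsort()  # ascending: low JSD first
        clean_idx.append(class_idx[order[:per_class]])
    return torch.cat(clean_idx)
```

Samples outside the returned index set are treated as unlabeled and handed to the semi-supervised branch, matching the selection-then-SSL pipeline described in the abstract.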
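
For the contrastive component, the following is a sketch of the standard NT-Xent (SimCLR-style) loss, the generic form of the label-free objective on which such training is typically built; the paper's exact projection head and augmentation pipeline are not reproduced here. Because the loss is computed purely from two augmented views of each image, corrupted annotations cannot influence it.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    # z1, z2: (N, d) projections of two augmented views of the same batch.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # (2N, d)
    sim = torch.mm(z, z.t()) / temperature         # cosine-similarity logits
    n = z1.size(0)
    eye = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(eye, float('-inf'))           # exclude self-pairs
    # Positives: row i matches row i + n, and row i + n matches row i.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```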

Evaluation and Results

Extensive experiments conducted across multiple benchmark datasets, including CIFAR10, CIFAR100, Tiny-ImageNet, Clothing1M, and WebVision, provide robust evidence of the efficacy of UniCon. The performance gains are particularly pronounced under severe label noise, underscoring the approach's utility in such challenging settings. Notably, UniCon consistently reduces class imbalance in the selected clean set and enhances pseudo-label quality during the semi-supervised learning phase.

Implications and Future Work

The strong numerical results presented in the paper suggest the potential of UniCon to be applied in areas beyond traditional image classification. The uniform sample selection mechanism could be adapted for other domains where data class balance and label noise are concerns, such as text analysis and bioinformatics. Meanwhile, the integration with contrastive learning opens avenues for further exploration into unsupervised learning techniques that may benefit from the robustness offered against label noise.

Looking ahead, future research could investigate automated adjustments of the selection parameters based on dataset characteristics, enhancing the adaptability of UniCon. Additionally, exploring the synergy between UniCon's methodology and advanced semi-supervised learning techniques could yield further improvements, particularly in low-resource settings.

In conclusion, "UniCon: Combating Label Noise Through Uniform Selection and Contrastive Learning" presents a compelling framework that effectively addresses the challenges posed by label noise and class imbalance. The paper contributes meaningfully to the broader discourse on improving deep learning models' resilience and performance in noisy environments, ensuring that they remain robust and accurate across diverse applications.
