Evaluating Multi-label Classifiers with Noisy Labels (2102.08427v1)

Published 16 Feb 2021 in cs.LG and stat.ML

Abstract: Multi-label classification (MLC) is a generalization of standard classification where multiple labels may be assigned to a given sample. In the real world, it is more common to deal with noisy datasets than clean datasets, given how modern datasets are labeled by a large group of annotators on crowdsourcing platforms, but little attention has been given to evaluating multi-label classifiers with noisy labels. Exploiting label correlations now becomes a standard component of a multi-label classifier to achieve competitive performance. However, this component makes the classifier more prone to poor generalization - it overfits labels as well as label dependencies. We identify three common real-world label noise scenarios and show how previous approaches per-form poorly with noisy labels. To address this issue, we present a Context-Based Multi-LabelClassifier (CbMLC) that effectively handles noisy labels when learning label dependencies, without requiring additional supervision. We compare CbMLC against other domain-specific state-of-the-art models on a variety of datasets, under both the clean and the noisy settings. We show CbMLC yields substantial improvements over the previous methods in most cases.

Authors (2)

Wenting Zhao (44 papers)
Carla Gomes (26 papers)

Citations (14)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Evaluating Multi-label Classifiers with Noisy Labels (2102.08427v1)

Summary

Related Papers