Papers
Topics
Authors
Recent
2000 character limit reached

An Active Learning Approach for Jointly Estimating Worker Performance and Annotation Reliability with Crowdsourced Data (1401.3836v1)

Published 16 Jan 2014 in cs.LG and cs.HC

Abstract: Crowdsourcing platforms offer a practical solution to the problem of affordably annotating large datasets for training supervised classifiers. Unfortunately, poor worker performance frequently threatens to compromise annotation reliability, and requesting multiple labels for every instance can lead to large cost increases without guaranteeing good results. Minimizing the required training samples using an active learning selection procedure reduces the labeling requirement but can jeopardize classifier training by focusing on erroneous annotations. This paper presents an active learning approach in which worker performance, task difficulty, and annotation reliability are jointly estimated and used to compute the risk function guiding the sample selection procedure. We demonstrate that the proposed approach, which employs active learning with Bayesian networks, significantly improves training accuracy and correctly ranks the expertise of unknown labelers in the presence of annotation noise.

Citations (13)

Summary

We haven't generated a summary for this paper yet.

Whiteboard

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.