Minimax Optimal Convergence Rates for Estimating Ground Truth from Crowdsourced Labels (1310.5764v6)
Abstract: Crowdsourcing has become a primary means for label collection in many real-world machine learning applications. A classical method for inferring the true labels from the noisy labels provided by crowdsourcing workers is Dawid-Skene estimator. In this paper, we prove convergence rates of a projected EM algorithm for the Dawid-Skene estimator. The revealed exponent in the rate of convergence is shown to be optimal via a lower bound argument. Our work resolves the long standing issue of whether Dawid-Skene estimator has sound theoretical guarantees besides its good performance observed in practice. In addition, a comparative study with majority voting illustrates both advantages and pitfalls of the Dawid-Skene estimator.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.