On the Relationship between Data Efficiency and Error for Uncertainty Sampling

(1806.06123)
Published Jun 15, 2018 in cs.LG and stat.ML

Abstract

While active learning offers potential cost savings, the actual data efficiency (the reduction in the amount of labeled data needed to obtain the same error rate) observed in practice is mixed. This paper poses a basic question: when is active learning actually helpful? We provide an answer for logistic regression with the popular active learning algorithm, uncertainty sampling. Empirically, on 21 datasets from OpenML, we find a strong inverse correlation between data efficiency and the error rate of the final classifier. Theoretically, we show that for a variant of uncertainty sampling, the asymptotic data efficiency is within a constant factor of the inverse error rate of the limiting classifier.
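
The following is a minimal sketch of pool-based uncertainty sampling for logistic regression, the setting the abstract refers to. The synthetic dataset, scikit-learn usage, pool/test split, and query budget are illustrative assumptions, not details from the paper; data efficiency would then be measured by comparing how many labels this loop needs to reach a target error versus a random-sampling baseline.

```python
# Sketch of pool-based uncertainty sampling with logistic regression.
# The dataset, split sizes, and query budget below are illustrative, not from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_pool, y_pool = X[:4000], y[:4000]   # unlabeled pool (labels revealed on query)
X_test, y_test = X[4000:], y[4000:]   # held-out set for measuring error

def uncertainty_sampling(n_init=20, n_queries=200):
    """At each step, label the pool point whose predicted probability is closest to 0.5."""
    labeled = list(rng.choice(len(X_pool), size=n_init, replace=False))
    errors = []
    for _ in range(n_queries):
        clf = LogisticRegression(max_iter=1000).fit(X_pool[labeled], y_pool[labeled])
        errors.append(1.0 - clf.score(X_test, y_test))
        proba = clf.predict_proba(X_pool)[:, 1]
        # Most uncertain point = smallest margin |p - 0.5|, excluding already-labeled points.
        margins = np.abs(proba - 0.5)
        margins[labeled] = np.inf
        labeled.append(int(np.argmin(margins)))
    return errors

errors = uncertainty_sampling()
print(f"test error after {len(errors)} queries: {errors[-1]:.3f}")
```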
