Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 186 tok/s
Gemini 2.5 Pro 55 tok/s Pro
GPT-5 Medium 36 tok/s Pro
GPT-5 High 41 tok/s Pro
GPT-4o 124 tok/s Pro
Kimi K2 184 tok/s Pro
GPT OSS 120B 440 tok/s Pro
Claude Sonnet 4.5 35 tok/s Pro
2000 character limit reached

Mitigating Bias in Calibration Error Estimation (2012.08668v3)

Published 15 Dec 2020 in cs.LG, cs.AI, cs.CV, and stat.ML

Abstract: For an AI system to be reliable, the confidence it expresses in its decisions must match its accuracy. To assess the degree of match, examples are typically binned by confidence and the per-bin mean confidence and accuracy are compared. Most research in calibration focuses on techniques to reduce this empirical measure of calibration error, ECE_bin. We instead focus on assessing statistical bias in this empirical measure, and we identify better estimators. We propose a framework through which we can compute the bias of a particular estimator for an evaluation data set of a given size. The framework involves synthesizing model outputs that have the same statistics as common neural architectures on popular data sets. We find that binning-based estimators with bins of equal mass (number of instances) have lower bias than estimators with bins of equal width. Our results indicate two reliable calibration-error estimators: the debiased estimator (Brocker, 2012; Ferro and Fricker, 2012) and a method we propose, ECE_sweep, which uses equal-mass bins and chooses the number of bins to be as large as possible while preserving monotonicity in the calibration function. With these estimators, we observe improvements in the effectiveness of recalibration methods and in the detection of model miscalibration.

Citations (83)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.