Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage? (2402.06160v3)
Abstract: This paper questions the effectiveness of a modern predictive uncertainty quantification approach, called \emph{evidential deep learning} (EDL), in which a single neural network model is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite their perceived strong empirical performance on downstream tasks, a recent line of studies by Bengs et al. identifies limitations of the existing methods and concludes that their learned epistemic uncertainties are unreliable, e.g., non-vanishing even with infinite data. Building on and sharpening such analyses, we 1) provide a sharper understanding of the asymptotic behavior of a wide class of EDL methods by unifying their various objective functions; 2) reveal that EDL methods can be better interpreted as out-of-distribution detection algorithms based on energy-based models; and 3) conduct extensive ablation studies to better assess their empirical effectiveness with real-world datasets. Through all these analyses, we conclude that even when EDL methods are empirically effective on downstream tasks, this occurs despite their poor uncertainty quantification capabilities. Our investigation suggests that incorporating model uncertainty can help EDL methods faithfully quantify uncertainties and further improve performance on representative downstream tasks, albeit at the cost of additional computational complexity.
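To make the two ideas in the abstract concrete, the following is a minimal toy sketch (not the paper's actual method) of an EDL-style classification head: per-class evidence induces a Dirichlet meta distribution over categorical predictive distributions, and the common "vacuity" score serves as an epistemic-uncertainty proxy. With the exponential evidence map assumed here, total evidence equals `exp(logsumexp(logits))`, so vacuity is a monotone transform of the energy-based OOD score, which is the kind of EDL/energy-based-model connection the paper draws. The evidence map and the example inputs are illustrative assumptions.

```python
import numpy as np

def dirichlet_uncertainty(logits):
    """Toy EDL-style head: treat exp(logits) as per-class evidence,
    form Dirichlet concentrations alpha = evidence + 1, and return
    (mean_probs, vacuity). Vacuity K / sum(alpha) is a common
    epistemic-uncertainty proxy: large when total evidence is low."""
    evidence = np.exp(logits)       # one common non-negative evidence map (an assumption here)
    alpha = evidence + 1.0          # Dirichlet concentration parameters
    alpha0 = alpha.sum()
    mean_probs = alpha / alpha0     # expected categorical distribution
    vacuity = len(alpha) / alpha0
    return mean_probs, vacuity

def energy_score(logits):
    """Energy-based OOD score, -logsumexp(logits): lower energy means
    more total evidence. With the exp(.) evidence map above, vacuity is
    a monotone transform of this score."""
    return -np.log(np.sum(np.exp(logits)))

# A high-evidence input gets low vacuity; a low-evidence (OOD-like)
# input gets high vacuity, and the energy score orders them the same way.
in_dist = np.array([6.0, 1.0, 0.5])
ood_like = np.array([0.1, 0.0, -0.2])
_, v_in = dirichlet_uncertainty(in_dist)
_, v_ood = dirichlet_uncertainty(ood_like)
```

Under this sketch, thresholding vacuity and thresholding the energy score yield the same OOD ranking of inputs, which illustrates the paper's reinterpretation of EDL uncertainty scores as OOD detectors rather than faithful epistemic uncertainties.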
- A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Inf. Fusion, 76:243–297, 2021.
- The need for uncertainty quantification in machine-assisted medical decision making. Nature Machine Intelligence, 1(1):20–23, 2019.
- Pitfalls of epistemic uncertainty quantification through loss minimisation. In Adv. Neural Inf. Proc. Syst., 35, pp. 29205–29216, 2022.
- A general framework for updating belief distributions. J. R. Stat. Soc. B, 78(5):1103–1130, November 2016.
- Weight uncertainty in neural network. In Proc. Int. Conf. Mach. Learn., 37, pp. 1613–1622. PMLR, July 2015. URL https://proceedings.mlr.press/v37/blundell15.html.
- Posterior network: Uncertainty estimation without OOD samples via Density-Based Pseudo-Counts. In Adv. Neural Inf. Proc. Syst., 33, pp. 1356–1367, June 2020.
- Natural posterior network: Deep Bayesian uncertainty for exponential family distributions. In Int. Conf. Learn. Repr., 2022.
- A Variational Dirichlet Framework for Out-of-Distribution Detection. arXiv preprint arXiv:1811.07308, November 2018.
- Elements of information theory. John Wiley & Sons, 2006.
- Uncertainty in clinical medicine. In Philosophy of medicine, pp. 299–356. Elsevier, 2011.
- Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In Proc. Int. Conf. Mach. Learn., 48, pp. 1050–1059. PMLR, June 2016. URL https://proceedings.mlr.press/v48/gal16.html.
- Gal, Y. Uncertainty in deep learning. PhD thesis, University of Cambridge, 2016.
- A survey of uncertainty in deep neural networks. arXiv preprint arXiv:2107.03342, 2021.
- Multi-source Domain Adaptation with Mixture of Experts. arXiv preprint, September 2018.
- Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016.
- Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods. Mach. Learn., 110(3):457–506, March 2021. ISSN 0885-6125, 1573-0565. doi: 10.1007/s10994-021-05946-3.
- Being Bayesian about categorical probability. In Proc. Int. Conf. Mach. Learn., 119, pp. 4950–4961, February 2020.
- Normalizing flows: An introduction and review of current methods. IEEE Trans. Pattern Anal. Mach. Intell., 43(11):3964–3979, 2020.
- Simple and scalable predictive uncertainty estimation using deep ensembles. In Adv. Neural Inf. Proc. Syst., 30, 2017.
- Do we really need to access the source data? source hypothesis transfer for unsupervised domain adaptation. In Proc. Int. Conf. Mach. Learn., 119, pp. 6028–6039. PMLR, July 2020. URL https://proceedings.mlr.press/v119/liang20a.html.
- Predictive uncertainty estimation via prior networks. In Adv. Neural Inf. Proc. Syst., 31, February 2018.
- Reverse KL-divergence training of prior networks: Improved uncertainty and adversarial robustness. In Adv. Neural Inf. Proc. Syst., 32, May 2019.
- Ensemble distribution distillation. In Int. Conf. Learn. Repr., 2020. URL https://openreview.net/forum?id=BygSP6Vtvr.
- Towards maximizing the representation gap between in-domain & out-of-distribution examples. In Adv. Neural Inf. Proc. Syst., 33, pp. 9239–9250, 2020.
- Neal, R. M. Bayesian learning for neural networks, volume 118. Springer Science & Business Media, 2012.
- Dataset shift in machine learning. Mit Press, 2008.
- Approximation analysis of stochastic gradient Langevin dynamics by using Fokker-Planck equation and Ito process. In Proc. Int. Conf. Mach. Learn., 32(2), pp. 982–990. PMLR, June 2014. URL https://proceedings.mlr.press/v32/satoa14.html.
- Post-hoc uncertainty learning using a dirichlet meta-model. In Proc. AAAI Conf. Artif. Intell., volume 37, pp. 9772–9781, 2023.
- Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
- A new distribution on the simplex with auto-encoding applications. In Adv. Neural Inf. Proc. Syst., 32, 2019. URL https://proceedings.neurips.cc/paper_files/paper/2019/file/43207fd5e34f87c48d584fc5c11befb8-Paper.pdf.
- Prior and posterior networks: A survey on evidential deep learning methods for uncertainty estimation. Trans. Mach. Learn. Res., 2023. ISSN 2835-8856. https://www.jmlr.org/tmlr/papers/.