
Deployment of a Robust and Explainable Mortality Prediction Model: The COVID-19 Pandemic and Beyond (2311.17133v1)

Published 28 Nov 2023 in cs.LG and cs.AI

Abstract: This study investigated the performance, explainability, and robustness of deployed AI models in predicting mortality during the COVID-19 pandemic and beyond. In the first study of its kind, we found that Bayesian Neural Networks (BNNs) and intelligent training techniques allowed our models to maintain performance amidst significant data shifts. Our results emphasize the importance of developing robust AI models capable of matching or surpassing clinician predictions, even under challenging conditions. Our exploration of model explainability revealed that stochastic models generate more diverse and personalized explanations, highlighting the need for AI models that provide detailed and individualized insights in real-world clinical settings. Furthermore, we underscored the importance of quantifying uncertainty in AI models, which enables clinicians to make better-informed decisions based on reliable predictions. Our study advocates for prioritizing implementation science in AI research for healthcare and for ensuring that AI solutions are practical, beneficial, and sustainable in real-world clinical environments. By addressing the unique challenges and complexities of healthcare settings, researchers can develop AI models that effectively improve clinical practice and patient outcomes.
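The abstract centers on stochastic (Bayesian) models that report predictive uncertainty alongside a mortality risk. As an illustration only, and not the authors' implementation, the sketch below uses Monte Carlo dropout in PyTorch as a stand-in for a Bayesian Neural Network: repeated stochastic forward passes give a mean mortality risk per patient plus a spread that can flag low-confidence predictions for clinician review. The network shape, feature count, and number of Monte Carlo samples are hypothetical choices, not values from the paper.

```python
# Minimal sketch (assumptions noted above): a stochastic mortality classifier
# using Monte Carlo dropout as a proxy for a Bayesian Neural Network, with a
# predictive-uncertainty estimate from repeated stochastic forward passes.
import torch
import torch.nn as nn


class StochasticMortalityNet(nn.Module):
    def __init__(self, n_features: int = 32, hidden: int = 64, p_drop: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Probability of in-hospital mortality for each patient in the batch.
        return torch.sigmoid(self.net(x))


@torch.no_grad()
def predict_with_uncertainty(model: nn.Module, x: torch.Tensor, n_samples: int = 50):
    """Return mean mortality risk and its standard deviation across stochastic passes."""
    model.train()  # keep dropout active so each pass samples a different sub-network
    probs = torch.stack([model(x) for _ in range(n_samples)])  # (n_samples, batch, 1)
    return probs.mean(dim=0), probs.std(dim=0)


if __name__ == "__main__":
    model = StochasticMortalityNet()
    x = torch.randn(4, 32)  # 4 synthetic patients with 32 illustrative features
    mean_risk, risk_std = predict_with_uncertainty(model, x)
    print(mean_risk.squeeze(), risk_std.squeeze())
```

A high risk_std for a patient signals an uncertain prediction, which is the kind of per-patient uncertainty signal the abstract argues clinicians need before acting on a model's output.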

