Why do explanations fail? A typology and discussion on failures in XAI (2405.13474v1)

Published 22 May 2024 in cs.LG, cs.AI, and cs.HC

Abstract: As Machine Learning (ML) models achieve unprecedented levels of performance, the XAI domain aims to make these models understandable by presenting end-users with intelligible explanations. Yet, some existing XAI approaches fail to meet expectations: several issues have been reported in the literature, generally pointing out either technical limitations or misinterpretations by users. In this paper, we argue that the resulting harms arise from a complex overlap of multiple failures in XAI, which existing ad-hoc studies fail to capture. This work therefore advocates for a holistic perspective, presenting a systematic investigation of the limitations of current XAI methods and their impact on the interpretation of explanations. By distinguishing between system-specific and user-specific failures, we propose a typological framework that helps reveal the nuanced complexities of explanation failures. Leveraging this typology, we also discuss some research directions to help AI practitioners better understand the limitations of XAI systems and enhance the quality of ML explanations.
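As a concrete illustration of the kind of system-specific failure the paper catalogues, the sketch below (not taken from the paper, and assuming scikit-learn and the `lime` package are installed) shows how a sampling-based post-hoc explainer such as LIME can return different top-feature rankings on two runs over the same instance. This well-documented stability issue can mislead users who take a single explanation at face value.

```python
# Minimal sketch (not from the paper): two LIME runs on the same instance can
# rank features differently, because the local surrogate model is fit on
# randomly sampled perturbations of that instance.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

instance = data.data[0]
for seed in (1, 2):
    np.random.seed(seed)  # LIME draws perturbations from the global NumPy RNG by default
    exp = explainer.explain_instance(instance, model.predict_proba, num_features=5)
    print(f"seed {seed}:", [feature for feature, _ in exp.as_list()])
```

If the two printed rankings differ, the explanation itself is unstable even though the underlying model has not changed; the paper's typology frames such behaviour as a failure of the XAI system rather than of the user interpreting it.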

Authors (5)
  1. Clara Bove (1 paper)
  2. Thibault Laugel (18 papers)
  3. Marie-Jeanne Lesot (22 papers)
  4. Charles Tijus (2 papers)
  5. Marcin Detyniecki (41 papers)
Citations (1)
