Emergent Mind

Benchmarking Counterfactual Algorithms for XAI: From White Box to Black Box

(2203.02399)
Published Mar 4, 2022 in cs.LG and cs.AI

Abstract

This study investigates the impact of machine learning models on the generation of counterfactual explanations by conducting a benchmark evaluation over three different types of models: decision-tree (fully transparent, interpretable, white-box model), a random forest (a semi-interpretable, grey-box model), and a neural network (a fully opaque, black-box model). We tested the counterfactual generation process using four algorithms (DiCE, WatcherCF, prototype, and GrowingSpheresCF) in the literature in five different datasets (COMPAS, Adult, German, Diabetes, and Breast Cancer). Our findings indicate that: (1) Different machine learning models have no impact on the generation of counterfactual explanations; (2) Counterfactual algorithms based uniquely on proximity loss functions are not actionable and will not provide meaningful explanations; (3) One cannot have meaningful evaluation results without guaranteeing plausibility in the counterfactual generation process. Algorithms that do not consider plausibility in their internal mechanisms will lead to biased and unreliable conclusions if evaluated with the current state-of-the-art metrics; (4) A qualitative analysis is strongly recommended (together with a quantitative analysis) to ensure a robust analysis of counterfactual explanations and the potential identification of biases.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.