Today, artificial intelligence (AI) systems make data-driven decisions in a multitude of applications and industries. Explanations of the reasoning behind these decisions are crucial to the success and acceptance of such systems, say HKUST researcher Carlos Fernández-Loría and colleagues. Rising to this challenge, the authors present and rigorously test a novel counterfactual framework that can be used to provide explanations for decisions made by general data-driven AI systems. Their analyses not only provide an important generalization of the literature but also offer practical insights for managers and other professionals across industries.
Stakeholders who benefit from a solid understanding of AI systems’ data-driven decisions include customers, managers, analysts, data scientists, and machine learning engineers. However, if an AI system’s decisions cannot be adequately explained, these stakeholders may “be skeptical and reluctant” to adopt it, say the researchers—even if the system has been demonstrated to provide improved decision-making performance.
“Many data-rich organizations struggle when adopting AI decision-making systems because of managerial and cultural challenges rather than issues related to data and technology,” the researchers note. The development of methods to explain predictive models and their predictions to improve stakeholders’ understanding of AI systems has thus become an important line of research.
The researchers therefore endeavor to explain the data-driven decisions made by AI systems from a causal perspective. They seek to answer the question “why did the system make a specific decision?” by analyzing the inputs that caused the system to make that decision. Doing so has a number of advantages, the researchers say: “it standardizes the form that an explanation can take; it does not require all features to be part of the explanation, and the explanations can be separated from the specifics of the model.”
To this end, the researchers adopt a counterfactual approach and propose a generalized framework that produces context-dependent explanations for the decisions made by model-based AI systems. This approach, as the researchers explain, “defines an explanation as a set of the system’s data inputs that causally drives the decision (i.e., changing the inputs in the set changes the decision) and is irreducible (i.e., changing only a proper subset of the inputs does not change the decision).” They also propose a heuristic procedure for identifying the explanations that are most useful in a particular context.
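To make this definition concrete, the sketch below shows how such an explanation could be computed for a toy targeted-advertising system. The scoring rule, feature names, counterfactual values, and the `explain`/`flips` helpers are illustrative assumptions rather than the authors' procedure; the heuristic here simply grows a set of inputs until changing them flips the decision, then shrinks that set until no proper subset flips it.

```python
# Minimal sketch of a causal, irreducible explanation for a toy decision system.
# All names, weights, and counterfactual values below are illustrative assumptions.
from itertools import combinations

def decision(x):
    # Toy AI system: target the customer when the model score exceeds a threshold.
    score = 0.3 * x["recent_visits"] + 0.5 * x["past_purchases"] + 0.2 * x["email_clicks"]
    return score > 1.0

def flips(x, cf, subset):
    # Does replacing the inputs in `subset` with their counterfactual values change the decision?
    x_cf = {**x, **{f: cf[f] for f in subset}}
    return decision(x_cf) != decision(x)

def explain(x, cf):
    # Causal: grow a set of inputs until changing them flips the decision.
    causal_set = []
    for feature in x:
        causal_set.append(feature)
        if flips(x, cf, causal_set):
            break
    else:
        return None  # even changing every input does not flip the decision
    # Irreducible: shrink until no proper subset of the set flips the decision.
    while True:
        smaller = next((list(s)
                        for r in range(1, len(causal_set))
                        for s in combinations(causal_set, r)
                        if flips(x, cf, s)), None)
        if smaller is None:
            return causal_set
        causal_set = smaller

instance = {"recent_visits": 3, "past_purchases": 1, "email_clicks": 2}
counterfactuals = {"recent_visits": 0, "past_purchases": 0, "email_clicks": 0}
print(explain(instance, counterfactuals))  # -> ['recent_visits']
```

In this toy case, changing recent_visits alone is enough to reverse the targeting decision, so that single input constitutes an explanation; no feature outside the set needs to appear in it.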
The researchers further contribute to the literature by demonstrating that explaining model predictions and explaining the decisions of a “system in practice” are different types of tasks. “Features that have a large impact on predictions may not have an important influence on decisions,” they note.
The researchers show that importance weighting methods, which are among the most popular methods for explaining model predictions, are not actually well suited to explaining system decisions. Notably, the researchers demonstrate that “importance weights are insufficient to communicate whether and how features affect decisions,” as they can be ambiguous or misleading. In contrast, the researchers say, their counterfactual framework is “an alternative specifically designed for such a task.”
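A toy numerical illustration of this point (the model, threshold, and numbers below are invented for this article, not taken from the paper): income carries the largest weight in the score, yet zeroing it leaves one applicant's approval untouched, while low-weight tenure is what keeps another applicant approved.

```python
def score(x):
    # Toy credit-scoring model: income has by far the largest weight.
    return 0.6 * x["income"] + 0.1 * x["tenure"]

def approve(x):
    # The system's decision: approve when the score clears the threshold.
    return score(x) > 0.5

a = {"income": 2.0, "tenure": 6.0}
# Income moves a's score by 1.2, yet the approval stands either way:
# large prediction impact, no influence on the decision.
print(score(a) - score({**a, "income": 0.0}), approve(a) == approve({**a, "income": 0.0}))

b = {"income": 0.5, "tenure": 2.2}
# Tenure moves b's score by only ~0.22, yet zeroing it revokes the approval:
# small prediction impact, decisive for the decision.
print(score(b) - score({**b, "tenure": 0.0}), approve(b) == approve({**b, "tenure": 0.0}))
```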
By explaining system decisions rather than model predictions, and by not requiring a specific method to produce counterfactual values, the researchers generalize prior work on counterfactual explanations in important ways. Indeed, the researchers show, counterfactual explanations “can be applied much more broadly to more problems and systems than many prior authors seem to have realized.” These include numerous business settings in which AI systems can be useful, such as targeted advertising and credit scoring. Offering invaluable practical insights, the researchers further demonstrate ways in which end users can tailor counterfactual explanations to their specific contexts.
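Continuing the hypothetical sketch above, one way such tailoring might look in practice is to restrict the counterfactual search to inputs the stakeholder can actually act on and to substitute context-appropriate counterfactual values; the feature list and values below are assumptions for illustration, not recommendations from the paper.

```python
# Reuses the hypothetical `explain` helper and `instance` from the earlier sketch.
# Which features are searched and which counterfactual values are used are user
# choices made here for illustration.
actionable = ["email_clicks", "recent_visits"]  # inputs the marketing team can influence
typical_values = {"recent_visits": 1, "past_purchases": 0, "email_clicks": 0}

# Restrict the search to actionable inputs by leaving the others at their factual values.
tailored_cf = {f: (typical_values[f] if f in actionable else instance[f]) for f in instance}
print(explain(instance, tailored_cf))  # -> ['recent_visits', 'email_clicks']
```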