Emergent Mind

Does AI help humans make better decisions? A methodological framework for experimental evaluation

(2403.12108)
Published Mar 18, 2024 in cs.AI, econ.GN, q-fin.EC, stat.AP, and stat.ME

Abstract

The use of AI based on data-driven algorithms has become ubiquitous in today's society. Yet, in many cases and especially when stakes are high, humans still make final decisions. The critical question, therefore, is whether AI helps humans make better decisions as compared to a human alone or AI alone. We introduce a new methodological framework that can be used to experimentally answer this question with no additional assumptions. We measure a decision maker's ability to make correct decisions using standard classification metrics based on the baseline potential outcome. We consider a single-blinded experimental design, in which the provision of AI-generated recommendations is randomized across cases with a human making final decisions. Under this experimental design, we show how to compare the performance of three alternative decision-making systems--human-alone, human-with-AI, and AI-alone. We apply the proposed methodology to the data from our own randomized controlled trial of a pretrial risk assessment instrument. We find that AI recommendations do not improve the classification accuracy of a judge's decision to impose cash bail. Our analysis also shows that AI-alone decisions generally perform worse than human decisions with or without AI assistance. Finally, AI recommendations tend to impose cash bail on non-white arrestees more often than necessary when compared to white arrestees.

Overview

  • The paper introduces a methodological framework for experimentally evaluating the influence of AI recommendations on human decision-making.

  • A randomized controlled trial (RCT) assessed the impact of AI-generated risk assessments on judges' decisions in criminal hearings, finding no significant improvement in decision accuracy.

  • The study highlights shortcomings in AI recommendations, including racial disparities, challenging the assumption that AI naturally enhances decision accuracy.

  • It underscores the importance of context-specific evaluations of AI in decision-making and lays the groundwork for future research in dynamic and non-binary decision settings.

Evaluating the Impact of AI Recommendations on Human Decision-Making: Experimental Evidence from Pretrial Decisions

Introduction to the Methodological Framework and Experimental Design

A novel methodological framework is introduced to experimentally evaluate whether AI-generated recommendations improve human decision-making compared with decisions made by humans alone or by AI alone. This work navigates the challenging terrain of selective labels, in which the outcome of interest is observed only under certain decisions (for example, failure to appear can be observed only if the arrestee is released). Leveraging a single-blinded experimental design, the study randomizes whether AI recommendations are provided to human decision-makers, preserving the integrity of the experimental setup and ensuring that AI recommendations affect outcomes only through their influence on human decisions.
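The logic of the design can be sketched in a few lines. The sketch below uses entirely simulated, hypothetical data (the outcome model, the judges' decision rule, and all rates are invented for illustration); it shows only the structure of the comparison: randomize provision of the AI recommendation, then evaluate each decision-making system against the baseline potential outcome with a standard classification metric.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Hypothetical data: y is the baseline potential outcome
# (1 = a negative outcome would occur if the arrestee were released).
y = rng.binomial(1, 0.3, size=n)

# Single-blinded design: randomize whether the judge sees the AI recommendation.
z = rng.binomial(1, 0.5, size=n)          # 1 = recommendation shown

# Stylized behavior (purely illustrative): the AI's recommendation,
# the judge's own inclination, and a judge who sometimes defers to the AI.
ai_rec = rng.binomial(1, 0.4, size=n)     # 1 = AI recommends cash bail
human = rng.binomial(1, 0.35, size=n)     # judge's unassisted decision
d = np.where((z == 1) & (rng.random(n) < 0.3), ai_rec, human)

def misclassification(decision, outcome):
    """Share of cases where the decision disagrees with the baseline outcome."""
    return np.mean(decision != outcome)

# Compare the three systems: human-with-AI (treated cases),
# human-alone (control cases), and AI-alone.
print("human-with-AI:", misclassification(d[z == 1], y[z == 1]))
print("human-alone:  ", misclassification(d[z == 0], y[z == 0]))
print("AI-alone:     ", misclassification(ai_rec, y))
```

Because provision of the recommendation is randomized, the treated and control cases are comparable, so differences in the metric between the two arms can be attributed to the AI recommendation rather than to case selection.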

The Experimental Context and Findings

The study is grounded in an empirical analysis involving a randomized controlled trial (RCT) assessing the impact of an AI-generated pretrial risk assessment (the Public Safety Assessment, or PSA) on judges' decisions between cash bail and signature bond at a criminal first appearance hearing. The findings reveal no significant improvement in the classification accuracy of judges' decisions when AI recommendations are provided. Moreover, decisions made solely by the AI were generally found to underperform those involving human judgment, with or without AI input. Notably, a substantial disparity was identified in AI-alone decisions, where a higher false positive rate was observed for non-white arrestees than for their white counterparts.
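The disparity finding rests on a group-wise false positive rate: among cases where the baseline outcome indicates cash bail was unnecessary, how often did the AI nonetheless recommend it? A minimal sketch, using a tiny invented dataset (the values and group labels are illustrative only, not the paper's data):

```python
import numpy as np

# Hypothetical evaluation data: y is the baseline potential outcome
# (1 = negative outcome if released), ai_dec is the AI-alone decision
# (1 = recommend cash bail), group is the arrestee's recorded race category.
y      = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
ai_dec = np.array([1, 0, 1, 0, 1, 1, 1, 1, 1, 0])
group  = np.array(["white"] * 5 + ["nonwhite"] * 5)

def false_positive_rate(decision, outcome):
    """P(decision = 1 | outcome = 0): cash bail imposed when unnecessary."""
    negatives = outcome == 0
    return decision[negatives].mean()

for g in ("white", "nonwhite"):
    m = group == g
    print(g, false_positive_rate(ai_dec[m], y[m]))
```

A higher false positive rate for non-white arrestees, as in this toy data, is exactly the pattern the study reports for AI-alone decisions: cash bail imposed more often than necessary for that group.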

Implications of the Study

The outcomes of this research have both theoretical and practical significance. Theoretically, it highlights the intricate dynamics between human decision-makers and AI-based recommendations, challenging the assumption that AI integration naturally enhances decision accuracy. Practically, the findings signal to policymakers and practitioners the need for a cautious approach toward implementing AI in sensitive decision-making arenas like the judicial system. By revealing specific shortcomings in AI recommendations—particularly around racial disparities—the study underscores the urgency for rigorous, context-specific evaluations before widespread deployment.

Future Directions in AI and Human Decision-Making Research

Looking forward, this study lays a foundation for subsequent research paths that could explore various dimensions of AI-assisted decision-making. One potential avenue is extending the proposed methodological framework to non-binary decision-making settings, thereby expanding its applicability. Investigating the joint potential outcomes, rather than focusing solely on the baseline potential outcome, could also yield deeper insights into the nuanced impacts of AI on decision quality. Dynamic settings, where decisions and outcomes evolve over time, offer another rich context for future exploration. Lastly, the practical deployment of AI decision-making systems across different sectors presents an ongoing opportunity to refine and validate the framework introduced in this study.

Conclusion

This research provides a methodologically robust, empirically grounded critique of the integration of AI recommendations into human decision-making processes, particularly within the judicial context. By systematically examining the influence of AI on human judgment through a carefully designed RCT, the study offers valuable insights into the limitations and potential risks associated with AI assistance. It serves as a crucial reminder of the need for comprehensive evaluation and cautious implementation of AI technologies in decision-making processes that significantly affect human lives.
