Evaluating human-AI team performance

Deferring decisions: Effects on human-AI team performance

Artificial intelligence (AI) technologies, such as biometrics and machine learning models (MLMs), are increasingly deployed in national security, defense, and criminal justice. In these high-stakes applications, it is critical that the technology predict accurately, fairly, and responsibly. As more of these technologies reach the field, it is imperative to understand how humans use them and how they affect human performance.

In the interest of both efficiency and effective decision-making under legal and ethical constraints, deployed AI technology represents a compromise between full automation and complete human agency. Current homeland security examples of AI and humans working together include voice and face recognition systems trained to match travelers against their IDs or existing databases; when the system's uncertainty exceeds a set threshold, it defers the final identification of a traveler to a human agent.
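The deferral mechanism described above can be sketched as a simple confidence-threshold routing rule. The names, structure, and threshold value below are illustrative assumptions, not the actual system studied in this project:

```python
from dataclasses import dataclass

@dataclass
class MatchResult:
    traveler_id: str
    score: float  # model's match confidence in [0, 1]

# Hypothetical confidence threshold below which the AI defers to a human.
DEFERRAL_THRESHOLD = 0.90

def route_decision(result: MatchResult) -> str:
    """Return who makes the final identification decision."""
    if result.score >= DEFERRAL_THRESHOLD:
        return "ai_accept"       # high-confidence match: AI decides
    return "defer_to_human"      # uncertain case: human agent decides

# A low-confidence match is routed to a human agent.
print(route_decision(MatchResult("T-1001", 0.62)))  # defer_to_human
```

Raising the threshold sends more cases to human agents (higher workload, potentially higher accountability); lowering it automates more decisions, which is exactly the trade-off this project examines.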


Little is known about joint AI-human decision-making under a deferral structure, including its impact on workers' perceived responsibility and task engagement. The central focus of this project is to study an AI deferral structure and its potential impacts on overall AI-human system performance. Various interaction structures will be tested as potential design interventions that are sensitive to worker accountability and appropriate trust in automation.

This research evaluates the joint performance of an AI-human system in a real-world use case and estimates the potential performance costs of implementing deferential AI with human decision-makers.

The goals of this project are to determine:

  • The effects of AI deferral rates and AI performance information provided in a face matching task on human task engagement and overall system performance
  • The impact of other relevant human factors on system performance, such as accountability for system outcomes, penalty-based versus reward-based consequences, and trust in automation
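One simple way to reason about the first goal, how the deferral rate interacts with overall system performance, is to view team accuracy as a mixture of AI-handled and human-handled cases. This toy model is our illustration under stated assumptions, not the project's methodology:

```python
def system_accuracy(deferral_rate: float,
                    ai_accuracy: float,
                    human_accuracy: float) -> float:
    """Overall accuracy of the AI-human team, assuming the AI decides
    (1 - deferral_rate) of cases and humans decide the deferred rest."""
    return (1 - deferral_rate) * ai_accuracy + deferral_rate * human_accuracy

# Example: AI handles 80% of cases at 98% accuracy; humans handle
# the deferred 20% at 90% accuracy.
print(round(system_accuracy(0.20, 0.98, 0.90), 3))  # 0.964
```

A key premise of this research is that the human terms in such a model are not fixed: deferral rate, performance information, and accountability structures can themselves change human accuracy and engagement.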


This project assists DHS in formulating tactical solutions for deploying biometric-capable processes by:

  • Testing AI-human decision-making scenarios to improve predictive performance and accountable decision-making in AI-human teaming
  • Providing design recommendations for AI deferral systems that take into account the human component of such systems, thus assisting in work and process flow design and optimization

Research Leadership Team

Principal Investigator: Mickey Mancenido, Arizona State University
Co-PI: Erin Chiou, Arizona State University