Evaluating trustworthiness of AI-enabled systems


Overview

Trust in AI-Enabled Decision Support Systems

The advancement of artificial intelligence (AI), including machine learning (ML) and increasingly autonomous systems, has prompted a push for technical standards that can assess the trustworthiness of these technologies. Emerging standards have the potential to drive the design and development of trustworthy systems and to make the evaluation of AI-enabled systems during technology acquisition or regulation more cost-effective. To aid in this effort, the Multisource AI Scorecard Table for System Evaluation (MAST) was developed as a standard checklist to help developers and users of AI/ML systems promote more understandable and trustworthy AI outputs. However, a critical gap remains: MAST has not been validated with respect to user trust, nor as a tool for evaluating AI systems more broadly.

Solution 

This project researches the extent to which the Multisource AI Scorecard Table for System Evaluation (MAST) criteria can serve as a standard checklist for assessing trust in AI-enabled decision support systems. It merges recent human factors research on trust in AI-enabled decision systems with recent efforts in AI standards development to test and validate the extent to which MAST can be used to evaluate trust in AI outputs. Prior work by the project team has produced several use cases and testbeds for human-AI decision-making that can be used to evaluate the MAST criteria. Such test cases may include:

  • How to visualize uncertainty in predicting college acceptance (MAST criteria 2, Uncertainty, and 9, Visualization); see the sketch after this list
  • How to visualize classification accuracy in face recognition (MAST criteria 8, Accuracy, and 9, Visualization)
  • Natural language processing (NLP) text analysis that distinguishes between data, assumptions, and judgments (MAST criterion 3, Distinguishing), under varying levels of user skill/experience and varying types of work environments
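
To make the first test case concrete, a display like the following could present a model's predicted acceptance probability together with its uncertainty. This is a minimal illustrative sketch in Python; the applicant labels, probabilities, and interval half-widths are invented for illustration and are not data from the project's testbeds.

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical predicted acceptance probabilities for five applicants,
    # with 90% uncertainty intervals reported by the model.
    # All values are invented for illustration.
    applicants = ["A", "B", "C", "D", "E"]
    p_accept = np.array([0.82, 0.64, 0.55, 0.31, 0.12])
    half_width = np.array([0.05, 0.11, 0.18, 0.09, 0.04])

    # Plot point estimates with error bars so the user sees both the
    # prediction and how confident the model is in it.
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.errorbar(applicants, p_accept, yerr=half_width, fmt="o", capsize=4)
    ax.set_ylim(0, 1)
    ax.set_xlabel("Applicant")
    ax.set_ylabel("Predicted probability of acceptance")
    ax.set_title("Acceptance predictions with 90% uncertainty intervals")
    plt.tight_layout()
    plt.show()

Whether such an interval display actually improves users' calibrated trust, relative to showing the point estimate alone, is the kind of question the MAST Uncertainty and Visualization criteria are meant to probe.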

Impact 

Recent joint work by TSA and CBP has included piloting or deploying AI technology for traveler identity-matching tasks. Addressing the specific knowledge gap targeted by this project will clarify the degree to which MAST can be used to evaluate the trustworthiness of advanced AI decision support systems.

Research Leadership Team 

Principal Investigator:  Erin Chiou, Arizona State University
Co-PI: Mickey (Michelle) Mancenido, Arizona State University 

Research area: Data analytics

Project status: Past
