Objective evaluation of off-the-shelf AI technologies

Data Analytics

Overview

Development of a General Testing Methodology for Evaluating the Performance of Commercial Artificial Intelligence Technologies

Artificial intelligence (AI) is being embraced both in the private sector and within homeland security efforts. The use cases for AI technology are wide-ranging, from helping people navigate immigration systems, to predicting and pre-empting threats, to making critical infrastructure more resilient against possible attacks. The private sector and homeland security are partnering in many areas to develop and implement new AI solutions. With the fast pace of innovation in AI, it is vital that DHS have tools in place to manage and evaluate AI technologies for precision, accuracy, and speed in varied environments.

Solution

The CAOE is developing a testing methodology for evaluating the performance of commercial AI solutions using a design framework that will:

  • identify the various test factors that could affect the performance of AI
  • outline performance metrics that matter to the user, such as precision (e.g., repeatability and reproducibility of classification), accuracy (e.g., absence of implicit and explicit bias in classification), and speed of execution
  • determine the test factors that significantly affect performance (Figure 1)

 

Figure 1: Test factors and performance chart
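
The three steps of the design framework can be illustrated with a small designed experiment. The sketch below is not the CAOE's methodology; it is a minimal illustration under stated assumptions. The factor names (lighting, resolution, occlusion), their levels, and the simulated_accuracy function are all hypothetical stand-ins for a real AI system and its operating conditions. The sketch enumerates every combination of factor levels (a full factorial design), records a metric for each run, and ranks the factors by how much the average metric changes across their levels. A real study would likely replace this simple ranking with a formal significance test such as an analysis of variance.

```python
import itertools
import random
from statistics import mean

# Hypothetical test factors and levels, chosen only for illustration.
FACTORS = {
    "lighting": ["low", "high"],
    "resolution": ["low", "high"],
    "occlusion": ["none", "partial"],
}


def simulated_accuracy(condition, rng):
    """Hypothetical stand-in for scoring an AI system under one condition."""
    score = 0.90
    if condition["lighting"] == "low":
        score -= 0.15  # assumed large effect
    if condition["occlusion"] == "partial":
        score -= 0.05  # assumed small effect
    # Resolution is assumed to have no effect; add small run-to-run noise.
    return score + rng.gauss(0, 0.01)


def full_factorial(factors):
    """Yield every combination of factor levels as a dict."""
    names = list(factors)
    for levels in itertools.product(*(factors[name] for name in names)):
        yield dict(zip(names, levels))


def main_effects(factors, results):
    """For each factor, the spread of the mean metric across its levels."""
    effects = {}
    for name, levels in factors.items():
        level_means = [
            mean(score for cond, score in results if cond[name] == level)
            for level in levels
        ]
        effects[name] = max(level_means) - min(level_means)
    return effects


def run_experiment(seed=0, replicates=5):
    """Run each condition several times and summarize the main effects."""
    rng = random.Random(seed)
    results = []
    for cond in full_factorial(FACTORS):
        for _ in range(replicates):
            results.append((cond, simulated_accuracy(cond, rng)))
    return main_effects(FACTORS, results)


if __name__ == "__main__":
    ranked = sorted(run_experiment().items(), key=lambda kv: -kv[1])
    for name, effect in ranked:
        print(f"{name}: {effect:.3f}")
```

Because every factor level appears equally often, each factor's effect can be estimated from the same set of runs, which is one reason designed experiments can need far less data than ad hoc testing.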

 

This project will develop a testing methodology that can be implemented throughout DHS components to objectively evaluate commercial AI performance. The methodology will support targeted improvement without the expense of collecting and annotating large volumes of data or developing a new testing methodology for each new AI technology.

Impact

The CAOE is creating objective methodologies for DHS to test new AI technologies, ensuring that they provide fair results and meet outlined operational requirements in a variety of environments. This will result in greater accountability in the deployment and use of AI technology.

Research Leadership Team 

Principal Investigator:  Mickey Mancenido, ASU