Defining AI system task and technical requirements: NIST AI RMF Map 2.1
The NIST AI Risk Management Framework (AI RMF) is a voluntary framework applicable to any organisation involved in the design, development, deployment, or use of artificial intelligence systems. It is not a mandatory compliance requirement in the way that some regulations are (for example, the EU AI Act). It does, however, offer practical guidance: think of it as a guide to help your organisation realise the benefits of AI responsibly.
What is Map 2.1 about?
Is your AI system’s task clearly defined? Avoid these common pitfalls when aligning with NIST AI RMF Map 2.1:
- The definition is too vague
Too vague: “The system will learn to improve customer satisfaction.” How will you measure the improvement? What actions will the AI system take?
Better: “The AI will learn to reduce customer wait times on phone support by predicting call volume, analysing expected call types, and optimising agent scheduling.”
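To make the better definition above concrete, here is a minimal sketch of one component it implies: forecasting call volume from historical patterns so that agent schedules can be planned. The data, features, and model choice are invented for illustration and are not prescribed by Map 2.1.

```python
# Illustrative sketch only: forecast hourly call volume so that agent
# schedules can be planned. The data, features, and model are hypothetical.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)

# Toy stand-in for historical data: hour of day, day of week, last week's volume.
hours = rng.integers(0, 24, size=500)
days = rng.integers(0, 7, size=500)
last_week = rng.poisson(30, size=500)
X = np.column_stack([hours, days, last_week])
y = last_week + 5 * (days < 5) + rng.normal(0, 3, size=500)  # synthetic volumes

# Hold out the most recent observations as a test set (keep time order).
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

model = GradientBoostingRegressor().fit(X_train, y_train)
forecast = model.predict(X_test)

# A measurable objective: forecast error small enough for scheduling to rely on.
print("MAE (calls per hour):", mean_absolute_error(y_test, forecast))
```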
- The scope is too ambitious
Too ambitious: “The system will solve all our customer service problems.”
Really? Narrow the scope to a task you can actually build, measure, and evaluate.
- Ignoring assumptions and limitations (related to the data, environment, or user behaviour)
Hidden assumptions can lead to unexpected failures. Limitations are not failures.
Assumptions:
- The patient’s medical history and symptoms are accurately recorded in the electronic health record.
- The images used for diagnosis are of sufficient quality and resolution.
- The language used in spam emails will be predominantly the languages the filter was trained on.
Limitation:
- The system may struggle with questions that require nuanced understanding of sarcasm or humor.
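Documented assumptions can also be enforced at run time. The sketch below is a hedged illustration for the image-quality assumption above: the resolution threshold and function names are hypothetical, and the point is simply that a violated assumption should surface as an error rather than a silent misprediction.

```python
# Illustrative sketch: enforce a documented assumption (minimum image
# resolution) before the diagnostic model is allowed to run.
MIN_WIDTH, MIN_HEIGHT = 512, 512  # hypothetical minimum acceptable resolution

def check_image_assumption(width: int, height: int) -> None:
    """Raise if the input violates the documented quality assumption."""
    if width < MIN_WIDTH or height < MIN_HEIGHT:
        raise ValueError(
            f"Image {width}x{height} is below the documented minimum of "
            f"{MIN_WIDTH}x{MIN_HEIGHT}; prediction refused."
        )

# Usage: run the check before inference so violations surface as errors,
# not as silent mispredictions.
check_image_assumption(width=1024, height=768)   # passes
# check_image_assumption(width=300, height=200)  # would raise ValueError
```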
- Technical specifications and requirements are not clearly defined
Too vague: “The AI should be fast and accurate.”
Better: “The AI-powered fraud detection system must achieve a minimum F1-score of X on a held-out test dataset of at least Y transactions. The system must be able to handle a transaction volume of up to Z transactions per second.”
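Once the requirement is written down as a number, it can be checked automatically. The following minimal sketch assumes placeholder values for X and Z and uses toy labels; it relies on scikit-learn's f1_score and a simple timer to illustrate verifying both requirements.

```python
# Illustrative sketch: verify a written requirement ("F1 >= X on a held-out
# test set; throughput of Z transactions per second"). Values are placeholders.
import time
from sklearn.metrics import f1_score

REQUIRED_F1 = 0.90     # placeholder for "X"
REQUIRED_TPS = 1_000   # placeholder for "Z"

def meets_f1_requirement(y_true, y_pred) -> bool:
    return f1_score(y_true, y_pred) >= REQUIRED_F1

def meets_throughput_requirement(predict_fn, transactions) -> bool:
    start = time.perf_counter()
    predict_fn(transactions)
    elapsed = time.perf_counter() - start
    return len(transactions) / elapsed >= REQUIRED_TPS

# Toy held-out labels (0 = legitimate, 1 = fraudulent) and a dummy model.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]
print("F1 requirement met:", meets_f1_requirement(y_true, y_pred))

def toy_predict(transactions):
    return [0 for _ in transactions]

print("Throughput requirement met:",
      meets_throughput_requirement(toy_predict, list(range(10_000))))
```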
- No documentation of development, testing, metrics, and performance
Too vague: “The system was trained on a large dataset and tested extensively.”
Better: “The translation system was developed using X architecture. The training data consisted of parallel texts in English and Spanish, sourced from publicly available datasets and our internal corpus. We employed the BLEU score as the primary evaluation metric. The system achieved an average BLEU score of 0.85 on the test set.”
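As a hedged illustration of how the reported metric might be produced, the sketch below computes a corpus-level BLEU score with NLTK on made-up, tokenised English-to-Spanish sentences. The real architecture, data sources, and score are whatever your own documentation records.

```python
# Illustrative sketch: compute the corpus-level BLEU score reported in the
# documentation. Sentences are toy examples, not the system's real outputs.
from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu

# One (or more) reference translations per system output, tokenised.
references = [
    [["el", "gato", "está", "en", "la", "alfombra"]],
    [["me", "gusta", "el", "café"]],
]
hypotheses = [
    ["el", "gato", "está", "en", "la", "alfombra"],
    ["me", "gusta", "café"],
]

smoothing = SmoothingFunction().method1  # avoids zero scores on short sentences
score = corpus_bleu(references, hypotheses, smoothing_function=smoothing)
print(f"Corpus BLEU: {score:.2f}")  # record this figure alongside the test set used
```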
- No reference to your accountability-based data management practices
Better: “Our organization adheres to the OECD Privacy Principles. We have implemented the following data management and protection practices: …”
- No alignment of specifications with goals and objectives
Better: “The technical specifications for the system include real-time inventory tracking with 99% accuracy and automated ordering based on predefined thresholds, optimized for cost and delivery time. These specifications directly support the goals of reducing stockouts and minimizing waste by enabling more accurate demand prediction, better inventory visibility, and optimized ordering processes.”
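To show how a specification such as “automated ordering based on predefined thresholds” traces back to the goal of reducing stockouts, here is a minimal sketch; the item names, thresholds, and order quantities are invented for illustration.

```python
# Illustrative sketch: reorder automatically when stock falls below a
# predefined threshold. Items, thresholds, and quantities are invented.
from dataclasses import dataclass

@dataclass
class ReorderRule:
    threshold: int       # reorder when stock drops below this level
    order_quantity: int  # how much to order, sized to expected demand

rules = {
    "widget-a": ReorderRule(threshold=50, order_quantity=200),
    "widget-b": ReorderRule(threshold=20, order_quantity=100),
}

def orders_to_place(stock_levels: dict[str, int]) -> dict[str, int]:
    """Return the quantity to order for each item below its threshold."""
    return {
        item: rules[item].order_quantity
        for item, level in stock_levels.items()
        if item in rules and level < rules[item].threshold
    }

print(orders_to_place({"widget-a": 42, "widget-b": 75}))  # {'widget-a': 200}
```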
Review these points: by clearly and narrowly defining your system’s tasks, you make it easier to map benefits and risks, which improves risk management.