Model Selection & Model Card


Purpose

A standardized model card documenting the model's purpose, capabilities, limitations, performance, and ethical considerations.

Related Controls

  • ISO A.4
  • NIST MP-2

1. Model Overview

Provide basic identification and purpose information.

Model Name: [MODEL NAME]

Version: [VERSION]

Date: [DATE]

Model Type: Classification / Regression / Generation / Embedding / Agent / Other

Provider: Internal / [VENDOR NAME]

Owner: [NAME], [ROLE TITLE]

Purpose: [One paragraph describing what business problem this model solves]

Intended Users: [Who will use this model — roles, teams]

Intended Use Cases: [Specific approved use cases]

Out-of-Scope Uses: [Explicitly prohibited or unsupported uses]
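Some teams keep these overview fields machine-readable alongside the prose card so they can be validated and queried. A minimal sketch in Python, assuming a simple in-house schema (all field names and example values are illustrative, not a standard):

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ModelOverview:
    """Machine-readable mirror of the Section 1 fields (illustrative schema)."""
    model_name: str
    version: str
    model_type: str          # e.g. "Classification", "Generation"
    provider: str            # "Internal" or a vendor name
    owner: str
    purpose: str
    intended_users: list = field(default_factory=list)
    intended_use_cases: list = field(default_factory=list)
    out_of_scope_uses: list = field(default_factory=list)

# Hypothetical example values, for illustration only.
card = ModelOverview(
    model_name="fraud-scorer",
    version="1.2.0",
    model_type="Classification",
    provider="Internal",
    owner="Jane Doe, ML Lead",
    purpose="Flags likely-fraudulent transactions for manual review.",
    intended_users=["Fraud operations team"],
    intended_use_cases=["Transaction triage"],
    out_of_scope_uses=["Automated account closure"],
)
print(asdict(card)["model_name"])
```

Keeping the card as a dataclass (or equivalent YAML/JSON) makes it easy to lint for missing fields before approval.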

2. Model Architecture & Training

Document technical details about the model's architecture and training.

Architecture: [Model architecture — e.g., Transformer, CNN, Gradient Boosting, LLM API]

Base Model: [If fine-tuned, identify the base model — e.g., Claude 3.5 Sonnet, GPT-4o, Llama 3]

Training Data:

  • Sources: [List data sources used for training]
  • Size: [Dataset size — records, tokens, images]
  • Date Range: [Time period covered by training data]
  • Data Classification: [Classification level of training data]

Fine-tuning:

  • Method: [Full fine-tune, LoRA, RLHF, prompt tuning, none]
  • Dataset: [Fine-tuning dataset description]
  • Hyperparameters: [Key hyperparameters if applicable]

Inference Requirements:

  • Compute: [CPU/GPU requirements]
  • Memory: [RAM/VRAM requirements]
  • Latency: [Expected response time]
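The latency figure above (and the p50/p99 rows in Section 3) can be computed directly from raw request timings. A sketch using the nearest-rank percentile convention, one common choice among several (the sample values are illustrative):

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest value at or above rank ceil(pct% of n)."""
    ordered = sorted(samples)
    k = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[k - 1]

# Illustrative request latencies in milliseconds.
latencies_ms = [120, 135, 110, 480, 125, 140, 115, 130, 138, 122]
print(percentile(latencies_ms, 50))  # p50 for these samples: 125
print(percentile(latencies_ms, 99))  # p99 for these samples: 480
```

Note that p99 is dominated by the single slow request, which is exactly why the card asks for tail latency rather than only the median.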

3. Performance Metrics

Document quantitative performance on relevant benchmarks and test sets.

| Metric        | Test Set   | Score   | Threshold    | Pass/Fail   |
|---------------|------------|---------|--------------|-------------|
| Accuracy      | [TEST SET] | [SCORE] | [MIN]        | [PASS/FAIL] |
| Precision     | [TEST SET] | [SCORE] | [MIN]        | [PASS/FAIL] |
| Recall        | [TEST SET] | [SCORE] | [MIN]        | [PASS/FAIL] |
| F1 Score      | [TEST SET] | [SCORE] | [MIN]        | [PASS/FAIL] |
| Latency (p50) | Production | [SCORE] | [MAX ms]     | [PASS/FAIL] |
| Latency (p99) | Production | [SCORE] | [MAX ms]     | [PASS/FAIL] |
| Throughput    | Load Test  | [SCORE] | [MIN req/s]  | [PASS/FAIL] |

Performance Notes: [Any caveats, known performance degradation scenarios, or edge cases]
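The Pass/Fail column reduces to a comparison against each threshold: "min" thresholds (accuracy, throughput) pass when the score meets or exceeds them, while "max" thresholds (latency) pass when the score stays at or below them. A minimal sketch of that logic, with illustrative scores and thresholds:

```python
def evaluate(metrics):
    """metrics maps name -> (score, threshold, 'min' or 'max'); returns name -> pass?"""
    results = {}
    for name, (score, threshold, kind) in metrics.items():
        results[name] = score >= threshold if kind == "min" else score <= threshold
    return results

# Illustrative values only; real scores come from the test sets named above.
report = evaluate({
    "accuracy":    (0.91, 0.90, "min"),
    "f1":          (0.88, 0.85, "min"),
    "latency_p99": (420.0, 500.0, "max"),  # milliseconds
})
print(report)  # every metric passes for these illustrative values
```

Automating this check keeps the table's Pass/Fail column honest across re-evaluations.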

4. Limitations & Risks

Be transparent about what the model cannot do and about known risks.

Known Limitations

  • [LIMITATION — e.g., "Model performs poorly on languages other than English"]
  • [LIMITATION — e.g., "Accuracy degrades for inputs longer than 4096 tokens"]
  • [LIMITATION — e.g., "Model may hallucinate citations and references"]

Bias Assessment

  • Tested for bias across: [Protected attributes tested — gender, race, age, etc.]
  • Results: [Summary of bias testing results]
  • Mitigations: [What was done to address identified biases]
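One common way to quantify the bias testing described above is a demographic parity gap: the spread in positive-prediction rates across groups. A sketch assuming binary predictions and per-record group labels (the data and group names are illustrative, and this is one fairness metric among many):

```python
def demographic_parity_gap(preds, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = {}
    for g in set(groups):
        group_preds = [p for p, gg in zip(preds, groups) if gg == g]
        rates[g] = sum(group_preds) / len(group_preds)
    return max(rates.values()) - min(rates.values())

# Illustrative data: group A is predicted positive far more often than group B.
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = demographic_parity_gap(preds, groups)
print(round(gap, 2))  # 0.5: group A rate 0.75 vs group B rate 0.25
```

A large gap does not prove unfairness on its own, but it flags where the deeper assessment above should focus.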

Risk Classification

  • Risk Tier: Low / Medium / High
  • Data Exposure Risk: [Assessment]
  • Fairness Risk: [Assessment]
  • Safety Risk: [Assessment]
  • Mitigations in Place: [List active mitigations]

5. Approval & Lifecycle

Document approval status and ongoing lifecycle management.

Selection Justification

Why this model was selected over alternatives:

  1. [REASON — e.g., "Best accuracy/cost tradeoff for our use case"]
  2. [REASON — e.g., "Vendor meets our security and privacy requirements"]
  3. [REASON — e.g., "Compatible with existing infrastructure"]

Alternatives Considered:

  • [MODEL] — Rejected because [REASON]
  • [MODEL] — Rejected because [REASON]

Approvals

  • Technical Review: [NAME] — [DATE]
  • Security Review: [NAME] — [DATE]
  • Business Approval: [NAME] — [DATE]

Lifecycle

  • Deployment Date: [DATE]
  • Next Review Date: [DATE]
  • Retirement Criteria: [What would trigger model replacement or decommission]
  • Monitoring: [How model performance is tracked in production]
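The Monitoring item above implies an automated production check. A minimal sketch that triggers a review when rolling accuracy over recent predictions falls below the documented threshold (the window size, threshold, and class name are illustrative assumptions):

```python
from collections import deque

class AccuracyMonitor:
    """Rolling accuracy check over the most recent `window` labeled outcomes."""

    def __init__(self, threshold, window=100):
        self.threshold = threshold
        self.outcomes = deque(maxlen=window)

    def record(self, correct):
        """Record one outcome; returns True when a review should be triggered."""
        self.outcomes.append(correct)
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough data to judge yet
        accuracy = sum(self.outcomes) / len(self.outcomes)
        return accuracy < self.threshold

monitor = AccuracyMonitor(threshold=0.90, window=4)
flags = [monitor.record(ok) for ok in [True, True, False, False]]
print(flags)  # last entry True: rolling accuracy 0.5 fell below 0.90
```

In practice the trigger would feed the review process in the Lifecycle section rather than act automatically.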