BUILD
Owner: Engineering Lead / ML Engineers / Security Champions
Secure AI Development & Integration
Ensure AI systems are developed, integrated, and tested with security, quality, and compliance built into the pipeline from day one.
Framework Mapping
Controls from each source framework that map to this domain.
| Framework | Mapped Controls |
|---|---|
| ISO 42001 |
A.4 Resources for AI Systems
A.5 Assessing AI System Impact
A.6 AI System Lifecycle
|
| NIST AI RMF |
MP-1 Criteria
MP-2 Data
MP-3 Build
MP-4 Acquire
MP-5 Management
|
| OWASP |
LLM01 Prompt Injection
LLM02 Insecure Output
LLM03 Training Data Poisoning
LLM09 Overreliance
ASI02 Misaligned Objectives
ASI05 Identity Exploitation
ASI09 Operational Disruption
|
Controls
7 controls across Tier 1 (essential) and Tier 2 (advanced).
Tier 1
ISO A.5
OWASP LLM09
AI-Generated Code Review Standards
Tier 1
ISO A.5
OWASP ASI09
"Vibe Coding" Risk Assessment
Tier 1
NIST MP-3
OWASP LLM09
Mandatory Human Review Gates
Tier 2
ISO A.6
NIST MP-4
OWASP LLM03
Data Pipeline Validation
Tier 2
ISO A.4
NIST MP-2
Model Selection & Documentation
Tier 1
OWASP LLM01
OWASP ASI02
Prompt Engineering Security
Tier 2
ISO A.5
Test Requirements for AI Code
Audit Checklist
Quick-reference checklist items grouped by control.
- ☐ Code review policy requires human review for AI-generated code
- ☐ Sample of recent PRs shows human review and AI-assisted tagging
- ☐ SAST tools integrated into CI/CD and enforced for all PRs
- ☐ Training delivered to engineering team on AI code review best practices
- ☐ Metrics tracked for AI-assisted code (volume, defect rate) and reviewed quarterly
- ☐ Risk assessment completed identifying high-risk scenarios where vibe coding is prohibited
- ☐ Test coverage minimums defined and enforced via CI/CD
- ☐ Peer review requirements documented and sample PRs show compliance
- ☐ Post-incident reviews include analysis of AI code contribution to issues
- ☐ Training delivered to engineers on vibe coding risks and guardrails
- ☐ All five review gates defined with clear criteria and artifacts
- ☐ Sample projects show all gates completed before production deployment
- ☐ Exception process documented and any exceptions logged with executive approval
- ☐ CI/CD pipeline enforces gate checks and blocks deployment if incomplete
- ☐ Gate completion metrics tracked and reviewed quarterly
- ☐ Data validation tests exist covering quality, bias, and compliance
- ☐ Validation integrated into CI/CD and blocks deployment on failure
- ☐ Sample pipeline runs show validation executed and logged
- ☐ Lineage tracked for all production data sources feeding AI systems
- ☐ Compliance validation (consent, retention) documented for regulated datasets
- ☐ Evaluation criteria defined for all production AI systems
- ☐ Benchmark results documented comparing 2+ alternatives
- ☐ Selection rationale exists explaining chosen model and trade-offs
- ☐ Model Cards published for all custom models with required sections
- ☐ Documentation reviewed and approved by architect or technical lead
- ☐ Prompt security guidelines exist and cover input validation, context isolation, privilege boundaries
- ☐ System prompts undergo peer review before deployment
- ☐ Input validation implemented for all user-controllable prompt components
- ☐ Context isolation demonstrated in sample prompts (delimiters, structured formats)
- ☐ Output sanitization tests exist and pass for XSS/injection scenarios
- ☐ Test requirements defined with coverage minimums and required test types
- ☐ CI/CD enforces coverage minimums and blocks merge if not met
- ☐ Sample PRs show comprehensive tests covering unit, integration, security, edge cases
- ☐ Code review checklist includes test verification steps
- ☐ Coverage metrics tracked over time showing sustained compliance