Security Controls
A directory of cross-cloud security control mappings for healthcare AI implementation
Best Practices for Healthcare AI Security
Implementing AI in regulated healthcare environments requires a comprehensive approach to security that addresses both traditional infrastructure concerns and AI-specific risks.
1. Data Protection & Privacy
- Encryption Everywhere: Enable encryption at rest for all PHI/PII data stores using customer-managed keys (CMKs), not provider-managed defaults, so your organization retains key control and can revoke access independently of the cloud provider (see Data Encryption at Rest). Enforce TLS 1.2+ for all data in transit (see Data Encryption in Transit). Apply the same encryption requirements to model artifacts and training datasets stored in object storage.
- Data Minimization: Collect and process only the minimum data necessary for each AI use case. Implement automated lifecycle policies on your object storage and databases to expire or archive data beyond its retention window. Implement field-level filtering in your ingestion pipelines and document the minimum necessary standard per model in your model registry (see AI Governance & Model Lineage).
- De-identification Before Training: Apply HIPAA Safe Harbor or Expert Determination de-identification before any data enters a training pipeline (see PII/PHI Redaction in AI Pipelines). ⚠ Requires Custom Code: Automated tools have known recall gaps for rare clinical identifiers. Supplement with a custom validation step that samples redacted outputs and flags residual PHI for human review before data is approved for training.
- Data Lineage: Maintain end-to-end audit trails covering data origin, transformations, model training runs, and inference outputs (see AI Governance & Model Lineage). Pair with audit logging to capture data access events (see Audit Logging). Cross-system lineage stitching (e.g., from EHR source to model output) typically requires a custom metadata store or data catalog integration to bridge across services.
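The sampling-based validation step called out above (flagging residual PHI in redacted outputs for human review) might look like the following sketch. The identifier patterns, sample size, and record format are illustrative assumptions, not a complete Safe Harbor detector:

```python
import random
import re

# Illustrative patterns for residual identifiers; a real deployment would
# cover the full set of HIPAA Safe Harbor identifier categories.
RESIDUAL_PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:\s]*\d{6,}\b", re.IGNORECASE),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def sample_for_review(redacted_records, sample_size=100, seed=0):
    """Sample redacted records and flag any that still match PHI patterns.

    Returns (flagged, sampled_count); flagged records go to human review
    before the dataset is approved for training.
    """
    rng = random.Random(seed)
    sample = rng.sample(redacted_records, min(sample_size, len(redacted_records)))
    flagged = []
    for record in sample:
        hits = [name for name, pat in RESIDUAL_PHI_PATTERNS.items() if pat.search(record)]
        if hits:
            flagged.append({"record": record, "matched": hits})
    return flagged, len(sample)
```

A clean sample does not prove the dataset is clean; it bounds the residual-PHI rate at a confidence level determined by the sample size, which should be documented in the approval record.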
2. Access Control & Authentication
- Zero Trust Architecture: Apply least-privilege access at every layer (network, service, and data) and continuously re-verify identity rather than relying on perimeter trust (see Identity & Access Management). Pair with network controls to ensure workloads cannot communicate laterally without explicit allow rules (see Network Isolation).
- Role-Based Access Control (RBAC): Define distinct roles for data scientists, clinicians, ML engineers, auditors, and automated pipelines, each with the minimum permissions required. Use privileged access tooling to detect and remediate over-permissioned roles (see Privileged Access Management). Review role assignments quarterly and remove access immediately upon role change or offboarding.
- Multi-Factor Authentication: Enforce MFA for all human access to AI systems, model registries, and any environment that can reach PHI. AWS IAM Identity Center, Microsoft Entra Conditional Access (formerly Azure AD), and GCP Cloud Identity all support MFA enforcement at the organization level (see Identity & Access Management).
- Service-to-Service Authentication: Use platform-managed identities (AWS IAM Roles, Azure Managed Identities, GCP Service Accounts) for all automated AI pipeline components. Never embed long-lived credentials in code or environment variables. Store them in a secrets manager and rotate them automatically on a defined schedule (see Secrets Management).
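As one illustration of the over-permissioned-role detection described above, a simple check can diff each role's granted permissions against those actually exercised during a review window. The input shapes here are assumptions; in practice they would come from your provider's access analyzer or audit logs:

```python
def find_over_permissioned(granted: dict, used: dict, min_unused: int = 1):
    """Flag roles whose granted permissions exceed observed usage.

    granted: role -> set of granted permission strings
    used:    role -> set of permissions exercised in the review window
    Returns role -> sorted list of unused permissions to remediate.
    """
    findings = {}
    for role, perms in granted.items():
        unused = perms - used.get(role, set())
        if len(unused) >= min_unused:
            findings[role] = sorted(unused)
    return findings
```

Unused permissions are remediation candidates, not automatic removals: some (e.g., break-glass access) are intentionally held but rarely exercised, so findings should feed the quarterly review rather than an automated revoke.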
3. Responsible AI & Guardrails
- Content Safety Guardrails: Configure platform-native guardrails before any LLM output reaches a clinical user (see AI Content Safety & Guardrails). Set thresholds conservatively for clinical contexts. A false positive (blocked output) is safer than a false negative (harmful output delivered to a patient).
- Prompt Injection Defense: Treat all user-supplied input as untrusted. Use prompt attack filters and prompt shields to detect direct and indirect injection attempts (see Prompt Injection & Jailbreak Prevention). ⚠ Requires Custom Code: Structural defenses such as separating system prompts from user input, enforcing output schemas, and sandboxing tool-calling agents must be implemented in your application layer.
- PII/PHI Redaction in Pipelines: Run all data through automated redaction before it enters a model (see PII/PHI Redaction in AI Pipelines). Configure entity types explicitly. Do not rely on default settings, as healthcare data contains identifiers (device IDs, dates, geographic subdivisions) that require custom entity definitions beyond standard PII categories.
- Output Filtering: Apply post-generation validation to catch hallucinations, unsupported clinical claims, or inadvertent PHI in model responses. ⚠ Requires Custom Code: No cloud service provides out-of-the-box hallucination detection for clinical content. You must implement a custom validation layer (for example, a secondary LLM judge, a retrieval-grounding check, or a rules-based filter) that evaluates outputs against a trusted knowledge base before delivery.
- Grounding & RAG Controls: When using retrieval-augmented generation, restrict the retrieval corpus to validated, access-controlled knowledge bases. ⚠ Requires Custom Code: Enforcing patient-level access scoping within a RAG retrieval step (e.g., ensuring a query only retrieves documents the requesting user is authorized to see) requires custom authorization logic in your retrieval pipeline. Cloud vector stores do not natively enforce row-level security tied to clinical access controls.
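The patient-level scoping requirement above can be sketched as a post-retrieval authorization filter applied after vector search and before any document reaches the prompt. The `Doc` shape and ACL mapping are hypothetical; a real implementation would derive authorization from the EHR's treatment-relationship records:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    patient_id: str
    text: str

def authorize_retrieval(docs, requesting_user, acl):
    """Drop retrieved documents the requesting user is not authorized to see.

    acl maps user -> set of patient IDs they may access. Runs after the
    vector store returns candidates, before prompt assembly, so unauthorized
    content never enters the model context.
    """
    allowed = acl.get(requesting_user, set())
    return [d for d in docs if d.patient_id in allowed]
```

Filtering after retrieval is the simplest placement; where the vector store supports metadata filters, pushing the patient-ID constraint into the query as well reduces the blast radius of a filter bug.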
4. Bias, Fairness & Explainability
- Pre-deployment Bias Audits: Before any model enters clinical use, evaluate it for disparate performance across protected attributes (age, race, sex, insurance status, language). Use bias detection tooling to compute pre- and post-training bias metrics and disaggregated performance views (see AI Bias Detection & Fairness). Document results in a model card and require sign-off before deployment.
- Ongoing Fairness Monitoring: Bias can emerge or worsen after deployment as patient populations shift. Track performance metrics disaggregated by subgroup in production (see AI Model Monitoring & Drift Detection). Set alert thresholds that trigger review when performance gaps exceed acceptable bounds.
- Explainable Predictions: For clinical decision support models, provide feature-level explanations alongside predictions so clinicians can evaluate and challenge AI recommendations (see AI Model Explainability). Surfacing explanations in a clinical UI in a format meaningful to non-technical users (e.g., natural language summaries of contributing factors) requires custom presentation logic beyond what these services provide.
- Human-in-the-Loop: Require qualified clinician review for high-stakes AI outputs (diagnosis, triage, medication recommendations). Build escalation logic that routes low-confidence predictions or out-of-distribution inputs to a human reviewer queue, with audit logging of the review decision.
- Disparate Impact Reporting: Produce and retain fairness reports as part of every model governance review cycle. Attach fairness metrics to model versions in your model registry (see AI Governance & Model Lineage). Reports should cover all subgroups relevant to the clinical use case and be retained for the duration of the model's deployment.
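A minimal version of the subgroup gap check used in both pre-deployment audits and ongoing monitoring might compute disaggregated recall and flag groups that trail the best performer. The metric and threshold here are illustrative; clinical teams would choose metrics appropriate to the use case:

```python
def subgroup_recall_gaps(records, threshold=0.05):
    """Compute recall per subgroup and flag gaps above threshold.

    records: iterable of (group, y_true, y_pred) with binary labels.
    Returns (recalls, flagged) where flagged lists groups whose recall
    trails the best-performing group by more than `threshold`.
    """
    tp, fn = {}, {}
    for group, y_true, y_pred in records:
        if y_true == 1:
            if y_pred == 1:
                tp[group] = tp.get(group, 0) + 1
            else:
                fn[group] = fn.get(group, 0) + 1
    recalls = {}
    for group in set(tp) | set(fn):
        positives = tp.get(group, 0) + fn.get(group, 0)
        recalls[group] = tp.get(group, 0) / positives
    if not recalls:
        return {}, []
    best = max(recalls.values())
    flagged = [g for g, r in recalls.items() if best - r > threshold]
    return recalls, flagged
```

In production the same function can run on rolling windows of labeled outcomes, with flagged groups triggering the review workflow described above.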
5. Model Security & Integrity
- Model Versioning & Signing: Every model artifact promoted to staging or production must be versioned and cryptographically signed to detect tampering. Use your model registry to enforce versioned promotion gates (see AI Governance & Model Lineage). ⚠ Requires Custom Code: Cryptographic signing of model artifacts (e.g., SHA-256 checksums stored in a tamper-evident log) is not provided natively by these registries and must be implemented as a step in your CI/CD pipeline.
- Adversarial Robustness Testing: Before deployment, test models against adversarial inputs (perturbed images, edge-case lab values, or crafted text) to identify brittleness that could be exploited. Use open-source libraries (e.g., IBM ART, CleverHans) and build adversarial test suites tailored to your clinical domain as part of your pre-deployment validation process.
- Model Monitoring & Drift Detection: Deploy continuous monitoring to catch data drift, concept drift, and anomalous prediction patterns that may indicate data poisoning or population shift (see AI Model Monitoring & Drift Detection). Configure alerts that trigger retraining or rollback workflows when drift metrics exceed defined thresholds.
- Secure Training Pipelines: Isolate training environments in dedicated VPCs with no public internet egress (see Network Isolation). Validate all training data sources against a known-good manifest before ingestion. Scan training container images for vulnerabilities before execution (see Vulnerability Scanning).
- Model Lineage Tracking: Record full provenance (training datasets, preprocessing steps, hyperparameters, evaluation results, and approvers) for every model version (see AI Governance & Model Lineage). This record is required for FDA SaMD submissions and HIPAA audit responses.
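The artifact-integrity step flagged above could be sketched as a CI/CD stage that records and verifies SHA-256 checksums in an append-only manifest. The file layout and manifest format are assumptions; in practice the manifest would live in a tamper-evident store such as a write-once bucket or transparency log:

```python
import hashlib
import json
from pathlib import Path

def fingerprint(path: Path) -> str:
    """SHA-256 checksum of a model artifact, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_release(manifest_path: Path, model_version: str, artifact_path: Path):
    """Append the artifact checksum to a release manifest (JSONL)."""
    entry = {"version": model_version, "sha256": fingerprint(artifact_path)}
    with open(manifest_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

def verify_release(manifest_path: Path, model_version: str, artifact_path: Path) -> bool:
    """Re-hash the artifact and compare against the recorded checksum."""
    with open(manifest_path) as f:
        recorded = {e["version"]: e["sha256"] for e in map(json.loads, f)}
    return recorded.get(model_version) == fingerprint(artifact_path)
```

Verification runs at deploy time: a mismatch blocks promotion and triggers the incident response runbook, since it indicates either corruption or tampering.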
6. Compliance & Governance
- Regulatory Alignment: Map each AI model to the specific regulations it must satisfy: HIPAA/HITECH for PHI handling, FDA SaMD guidance for clinical decision support, and state-level AI transparency laws. Use compliance monitoring tooling to continuously assess control coverage (see Compliance Monitoring).
- Business Associate Agreements: Confirm BAAs are executed with every cloud provider and third-party AI vendor that processes PHI, including foundation model API providers. AWS, Azure, and GCP all offer BAAs for their HIPAA-eligible services. Verify that the specific services in each row of the table above are covered under your active BAA. Note that BAA coverage does not extend to open-source models you self-host.
- AI Model Registry & Approval Workflows: Require formal sign-off before any model reaches production. Use your model registry to gate promotions and capture the approver identity, evaluation results, bias audit outcome, and regulatory classification for each release (see AI Governance & Model Lineage).
- Audit Readiness: Configure audit logging to capture all PHI access, model invocations, and administrative actions, and retain logs for a minimum of six years per HIPAA requirements (see Audit Logging). ⚠ Requires Custom Code: Logging AI-specific events (prompt/response pairs, model version used per inference, confidence scores) requires instrumentation in your application code. Cloud audit logs capture infrastructure events only.
- Risk Assessments: Conduct formal risk assessments before deploying each AI model, covering security, bias, safety, and unintended use. Use compliance monitoring tooling to automate infrastructure risk scoring and supplement with a structured algorithmic impact assessment process (see Compliance Monitoring).
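The AI-specific audit instrumentation noted above might emit structured events like the following. Hashing the prompt and response, rather than storing them raw, is one design choice to keep the audit log from becoming a secondary PHI store; the field names are illustrative:

```python
import hashlib
from datetime import datetime, timezone

def inference_audit_event(user_id, model_version, prompt, response, confidence):
    """Build a structured audit record for one model invocation.

    Prompt and response are stored as SHA-256 digests so the log can prove
    what was sent and returned without itself containing PHI; raw payloads,
    if retained, would live separately under stricter access controls.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": "model_inference",
        "user_id": user_id,
        "model_version": model_version,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode()).hexdigest(),
        "confidence": confidence,
    }
```

Events in this shape can be shipped to the same centralized aggregation used for infrastructure audit logs, inheriting the six-year retention policy.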
7. Infrastructure Security
- Network Segmentation: Deploy AI workloads in dedicated VPCs or VNets with no default internet egress. Enforce deny-by-default inbound and outbound rules (see Network Isolation). Separate training, inference, and data storage subnets with explicit allow rules only for required traffic paths.
- Secure APIs: Protect all AI service endpoints with an API gateway that enforces rate limiting, request size limits, and authentication (OAuth 2.0 / API keys) at the edge (see API Security). Pair with a web application firewall to block OWASP Top 10 attack patterns (see Web Application Firewall).
- Container Security: Scan all container images before deployment (see Vulnerability Scanning). Use minimal base images, run containers as non-root, and enforce read-only root filesystems. Apply runtime security policies through your container orchestration platform (see Container Security).
- Secrets Management: Store all API keys, database credentials, model signing keys, and certificates in a dedicated secrets manager (see Secrets Management). Enable automatic rotation for all secrets with a defined rotation period. Scan repositories and object storage for accidentally committed credentials using data loss prevention tooling (see Data Loss Prevention).
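A rotation-compliance check over secrets metadata, sketched under the assumption that last-rotation timestamps are available from your secrets manager's API, can flag secrets that have outlived their rotation period:

```python
from datetime import datetime, timedelta, timezone

def overdue_secrets(secrets, max_age_days=90, now=None):
    """Flag secrets whose last rotation exceeds the allowed rotation period.

    secrets: mapping of secret name -> last-rotated datetime (UTC).
    Returns a sorted list of names needing immediate rotation.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = timedelta(days=max_age_days)
    return sorted(name for name, rotated in secrets.items() if now - rotated > cutoff)
```

Running this as a scheduled job provides a safety net for secrets that were excluded from, or silently dropped out of, automatic rotation.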
8. Incident Response & Recovery
- Incident Response Plan: Develop and test an AI-specific incident response runbook that covers PHI breaches, model compromise (e.g., poisoned weights, unauthorized model swap), and harmful output events. Use incident response tooling to automate detection-to-response workflows (see Incident Response). ⚠ Requires Custom Code: AI-specific response actions such as automatically disabling a model endpoint, rolling back to a prior version, or quarantining a training dataset must be implemented as custom automation scripts triggered by your alerting pipeline.
- Backup & Recovery: Automate backups of model artifacts, training datasets, pipeline configurations, and inference logs (see Backup & Recovery). Test recovery procedures at least quarterly. Verify that a model can be restored to a known-good version and that restored data passes integrity checks before re-entering production.
- Breach Notification: Establish documented processes for HIPAA breach notification: notify affected individuals without unreasonable delay and no later than 60 days after discovery; notify HHS concurrently for breaches affecting 500 or more individuals (smaller breaches may be reported to HHS annually); and notify prominent media outlets when a breach affects more than 500 residents of a state or jurisdiction. ⚠ Requires Custom Code: Breach scope determination (identifying which patients were affected by a PHI exposure) requires custom tooling that correlates audit logs (see Audit Logging) with patient record identifiers.
- Post-Incident Analysis: Conduct a structured post-mortem within 30 days of any security incident. Use threat detection findings as inputs to the root cause analysis (see Threat Detection). Document control gaps, assign remediation owners, and track closure in your compliance monitoring system (see Compliance Monitoring).
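The custom response automation called out in the incident response bullet above (disable or roll back a model endpoint on alert) can be sketched as a small state machine. The severity-to-action mapping is an assumption; a real runbook would map specific alert types to actions and page a human in parallel:

```python
class ModelEndpoint:
    """Minimal stand-in for a deployed endpoint with a version history."""

    def __init__(self, versions):
        # Versions oldest -> newest; the last entry is the active release.
        self.versions = list(versions)
        self.active = self.versions[-1]
        self.enabled = True

    def handle_alert(self, severity):
        """Disable on critical alerts; roll back to the prior version on high."""
        if severity == "critical":
            self.enabled = False
            return "disabled"
        if severity == "high" and len(self.versions) > 1:
            self.versions.pop()
            self.active = self.versions[-1]
            return f"rolled_back_to:{self.active}"
        return "no_action"
```

Every automated action taken here should itself emit an audit event, so the post-incident analysis can reconstruct the full response timeline.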
9. Transparency & Explainability
- Model Documentation (Model Cards): Produce a model card for every model entering clinical use. It should cover intended use, out-of-scope uses, training data sources, evaluation results (including subgroup performance), known limitations, and the regulatory classification (see AI Model Documentation & Cards).
- Explainable AI in Clinical Interfaces: Surface feature-level explanations alongside AI predictions in clinical tools. Use explainability services to generate explanations at inference time (see AI Model Explainability).
- Patient Rights (Access, Correction, Deletion): Implement mechanisms for patients to request access to their data used in AI systems, correct inaccuracies, and request deletion under HIPAA and applicable state laws. This includes request intake, fulfillment, and audit trail capabilities, as well as the ability to identify and remove a specific patient's records from training datasets and document the impact on affected models.
- Clinical Validation Before Deployment: Require prospective clinical validation by qualified healthcare professionals before any AI model is used in patient care decisions. Clinical validation workflows, including study design, outcome tracking, and sign-off by clinical leadership, must be managed through your organization's clinical governance processes.
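The deletion-request fulfillment described in the patient rights bullet above needs, at minimum, a way to excise one patient's rows from a training dataset and record the impact for governance review. A sketch, assuming rows carry a `patient_id` tag:

```python
def remove_patient_records(dataset, patient_id):
    """Remove one patient's records from a training dataset.

    Returns the filtered dataset plus an impact manifest recording how many
    rows were removed, so affected model versions can be identified and
    scheduled for retraining under the governance process.
    """
    kept = [row for row in dataset if row.get("patient_id") != patient_id]
    manifest = {
        "patient_id": patient_id,
        "rows_removed": len(dataset) - len(kept),
        "rows_remaining": len(kept),
    }
    return kept, manifest
```

The manifest, joined against model lineage records, identifies which deployed model versions were trained on the removed rows and therefore need a documented remediation decision.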
10. Supply Chain Security
- Dependency Scanning & SBOM: Generate a Software Bill of Materials (SBOM) for every AI application and pipeline. Scan all open-source and third-party dependencies for known vulnerabilities before deployment (see Vulnerability Scanning). Integrate dependency scanning into CI/CD gates so that builds fail when critical CVEs are detected.
- Artifact Integrity Verification: Require cryptographic signatures on all artifacts promoted through your pipeline, including container images, model files, and infrastructure-as-code modules (see Secure Software Supply Chain). Use binary authorization or admission controllers to block unsigned artifacts from reaching production environments.
- Third-Party Model Provenance: Before using any pre-trained or foundation model from an external source (Hugging Face, model zoos, vendor APIs), verify its provenance: publisher identity, training data disclosures, license terms, and known vulnerability advisories. Build an internal review and approval process that evaluates model origin, licensing compatibility, and security posture before any external model enters your environment.
- CI/CD Pipeline Hardening: Treat your AI build and deployment pipelines as high-value targets. Enforce branch protection, require code review for pipeline configuration changes, use ephemeral build environments, and restrict pipeline service accounts to least-privilege permissions (see Privileged Access Management). Audit all pipeline executions and retain logs (see Audit Logging).
- Private Package Registries: Host internal copies of approved dependencies in private artifact repositories (see Secure Software Supply Chain). Block direct pulls from public registries in production environments to prevent dependency confusion and typosquatting attacks. Regularly sync and scan approved packages against updated vulnerability databases.
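The CI/CD gate described in the dependency-scanning bullet above can be sketched as a check over scanner output. The findings shape mirrors what JSON-emitting scanners such as Grype or Trivy can produce, but is an assumption here:

```python
def cve_gate(scan_findings, blocking_severities=("CRITICAL",)):
    """Fail the build when the dependency scan reports blocking CVEs.

    scan_findings: list of dicts like {"id": "CVE-...", "severity": "HIGH"}.
    Returns (passed, blocking) so the CI step can print the offending CVEs
    before exiting nonzero.
    """
    blocking = [f for f in scan_findings if f.get("severity", "").upper() in blocking_severities]
    return len(blocking) == 0, blocking
```

Teams commonly start by blocking only critical severities, then ratchet `blocking_severities` down to high as the dependency baseline is cleaned up.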
Implementation Notes
HIPAA & BAA Coverage
All listed services support HIPAA-compliant configurations when properly implemented. Verify that each service you adopt is covered under your active Business Associate Agreement: AWS HIPAA Eligible Services, Azure HIPAA Offering, GCP HIPAA Compliance.
FDA SaMD & AI Regulation
Clinical decision support models may qualify as Software as a Medical Device (SaMD) under FDA AI/ML guidance. Determine each model's regulatory classification early, since SaMD status drives validation, documentation, and change-control requirements for subsequent model updates.
Multi-Cloud Strategy
Use this mapping to maintain consistent security postures across cloud providers or to facilitate migration projects. Each provider publishes a shared responsibility model (AWS, Azure, GCP) that defines the boundary between provider-managed and customer-managed controls.
Defense in Depth
No single control is sufficient. Layer network isolation, encryption, identity, monitoring, and guardrails so that a failure in one control does not expose PHI or compromise model integrity. The NIST AI RMF provides a structured approach to identifying and mitigating AI-specific risks across these layers.
Logging & Audit Retention
HIPAA requires audit logs covering PHI access to be retained for a minimum of six years. Configure centralized log aggregation early; retrofitting retention policies after launch is significantly harder. Each provider offers long-term log storage options: CloudTrail log integrity, Azure Monitor retention, Cloud Logging routing.
State-Level AI Laws
Beyond federal requirements, several U.S. states have enacted or proposed AI-specific transparency and accountability laws. Colorado's SB 24-205 requires deployers of high-risk AI to conduct impact assessments, and similar legislation is advancing in other states. Track applicable state obligations alongside HIPAA and FDA requirements.