Glossary

Comprehensive definitions of key terms and concepts in healthcare AI implementation

AI/ML

A/B Testing

Experimental approach comparing two versions of a model or system to determine which performs better.
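
One common way to judge which version "performs better" is a two-proportion z-test on a success metric. The sketch below is illustrative only: the function name and the example counts are assumptions, and it presumes large, independent samples.

```python
import math

def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference in success rates B - A."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both versions perform the same.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p_value

# Hypothetical counts: correct predictions out of total cases for each version.
z, p = two_proportion_z_test(420, 1000, 465, 1000)
print(f"z={z:.2f}, p={p:.4f}")
```

A small p-value suggests the difference is unlikely to be chance alone; whether the difference matters clinically is a separate question.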

Active Learning

Machine learning approach where the algorithm selectively queries the most informative data points for labeling.
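
A simple query strategy is uncertainty sampling: ask for labels on the records the model is least sure about. This is a minimal sketch for a binary classifier; the function name and the example scores are hypothetical.

```python
def select_most_uncertain(probabilities, k):
    """Return indices of the k samples whose predicted probability is closest to 0.5."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]

# Hypothetical model outputs for five unlabeled records.
scores = [0.97, 0.52, 0.10, 0.45, 0.80]
print(select_most_uncertain(scores, 2))  # indices of the two least certain records
```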

AI/ML (Artificial Intelligence/Machine Learning)

Technologies that enable computers to learn from data and make predictions or decisions without being explicitly programmed for each task.

Algorithmic Impact Assessment

Structured evaluation of the potential risks, biases, and societal effects of deploying an AI system before it enters production.

AUC-ROC (Area Under the Receiver Operating Characteristic Curve)

Performance metric summarizing a classification model's ability to distinguish between classes across all decision thresholds; 0.5 corresponds to random guessing and 1.0 to perfect separation.
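
AUC-ROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that pairwise interpretation; it is quadratic in the number of samples and meant for illustration, not production use. The function name is an assumption.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 3 of 4 pairs ranked correctly -> 0.75
```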

Computer Vision

AI discipline enabling systems to interpret and analyze visual data such as medical images, pathology slides, and radiology scans.

Concept Drift

Change in the statistical relationship between input features and target outcomes over time, degrading model accuracy.

Confusion Matrix

Table used to evaluate classification model performance by comparing predicted and actual values.
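
For a binary classifier the table has four cells: true positives, false positives, false negatives, and true negatives. A minimal sketch of the counting, with a hypothetical function name and example labels (1 = positive, 0 = negative):

```python
def confusion_matrix(actual, predicted):
    """Count outcomes for binary labels (1 = positive, 0 = negative)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 1 and p == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
print(confusion_matrix(actual, predicted))
```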

Cross-Validation

Technique for assessing model performance by repeatedly partitioning data into training and test subsets and averaging the results across partitions.
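
In k-fold cross-validation, each record is used for testing exactly once. The sketch below only generates the index splits; the function name is an assumption, and it does not shuffle, so records would normally be shuffled (or grouped by patient) beforehand.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

for train, test in k_fold_indices(10, 3):
    print(f"train={train} test={test}")
```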

Drift Detection

Monitoring for changes in model performance or data distribution over time that may indicate degradation.
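
One widely used check for data distribution change is the Population Stability Index (PSI), which compares the binned distribution of a feature in production against a baseline. The sketch below is a simplified version under stated assumptions: equal-width bins taken from the baseline range, a small floor `eps` to avoid division by zero, and a hypothetical function name. Alerting thresholds are left to the reader.

```python
import math

def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a baseline sample (expected) and a recent sample (actual)."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the baseline range
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))

baseline = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in baseline]
print(population_stability_index(baseline, baseline))  # identical samples give 0.0
print(population_stability_index(baseline, shifted))   # a shifted sample gives a larger value
```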

Ensemble Learning

Technique combining multiple models to produce better predictive performance than individual models.
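
Majority voting is one of the simplest ways to combine classifiers. A minimal sketch, assuming each model has already produced a list of labels for the same samples (the function name and example predictions are hypothetical):

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists by picking the most common label for each sample."""
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

model_a = [1, 0, 1]
model_b = [1, 1, 0]
model_c = [0, 0, 1]
print(majority_vote([model_a, model_b, model_c]))
```

Using an odd number of models avoids ties for binary labels.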

Explainable AI (XAI)

AI systems designed to provide understandable explanations for their decisions and predictions.

F1 Score

Harmonic mean of precision and recall, providing a single metric for model performance evaluation.
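
The harmonic mean penalizes imbalance between the two inputs more than an arithmetic mean would. A minimal sketch (the function name is an assumption, and returning 0.0 when both inputs are zero is a common convention rather than part of the definition):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.5, 1.0))  # lower than the arithmetic mean of 0.75
```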

Feature Store

Centralized repository for storing, managing, and serving machine learning features for training and inference.

Federated Learning

Machine learning approach where models are trained across multiple decentralized devices or servers without exchanging raw data.

Guardrails

Configurable safety filters applied to AI model inputs and outputs to block harmful, biased, or non-compliant content.

Hallucination

AI model output that appears plausible but is factually incorrect, fabricated, or unsupported by the input data or training corpus.

Human-in-the-Loop (HITL)

AI system design requiring human review and approval at critical decision points before actions are taken or outputs are delivered.
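
One common pattern is to gate outputs on model confidence and send anything below a threshold to a reviewer. The sketch below is illustrative: the function name, status strings, and the 0.90 threshold are assumptions, and high-risk decisions may warrant human review regardless of confidence.

```python
def route_prediction(label, confidence, threshold=0.90):
    """Return the prediction with a status saying whether a human must review it."""
    if confidence >= threshold:
        return {"label": label, "status": "auto_accepted"}
    return {"label": label, "status": "needs_human_review"}

print(route_prediction("no_finding", 0.97))
print(route_prediction("possible_nodule", 0.62))
```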

Hyperparameter Tuning

Process of optimizing the configuration settings that control the learning process of ML algorithms.

Inference

The process of using a trained AI model to make predictions on new, unseen data.

LLM (Large Language Model)

AI model trained on vast amounts of text data capable of generating, summarizing, and reasoning over natural language.

MLOps (Machine Learning Operations)

Set of practices combining ML, DevOps, and data engineering to deploy and maintain ML systems in production.

Model Bias

Systematic errors in AI model predictions that unfairly favor or disadvantage certain groups or outcomes.

Model Card

Standardized document describing a machine learning model's intended use, performance metrics, limitations, and ethical considerations.

Model Fairness

Principle ensuring AI models make equitable predictions across different demographic groups and populations.

Model Governance

Framework for managing AI/ML models throughout their lifecycle, ensuring compliance, performance, and accountability.

Model Registry

Centralized repository for storing, versioning, and managing machine learning models throughout their lifecycle.

Model Versioning

The practice of tracking and managing different versions of AI models throughout their lifecycle.

NLP (Natural Language Processing)

Branch of AI focused on enabling computers to understand, interpret, and generate human language, including clinical notes and other medical text.

Overfitting

Modeling error where a model fits the training data too closely, including its noise, and therefore generalizes poorly to new data.

Precision and Recall

Precision measures the fraction of positive predictions that are correct; recall measures the fraction of actual positive instances the model finds.
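
Both metrics follow directly from counts of true positives (TP), false positives (FP), and false negatives (FN): precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch, with a hypothetical function name and example counts:

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall); each falls back to 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical screening result: 8 true positives, 2 false alarms, 4 missed cases.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```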

RAG (Retrieval-Augmented Generation)

Technique that enhances LLM outputs by retrieving relevant information from external knowledge bases before generating a response.

Responsible AI

Practice of designing, developing, and deploying AI systems that are fair, transparent, accountable, and aligned with ethical principles.

Shadow Mode Deployment

Deployment strategy where a new model runs alongside production without affecting user-facing results.
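
The idea can be sketched as a request handler that always returns the production result and only logs what the shadow model would have said. All names here are hypothetical, and models are assumed to be plain callables; a real system would typically run the shadow call asynchronously.

```python
def handle_request(features, production_model, shadow_model, shadow_log):
    """Serve the production prediction; record the shadow prediction for later comparison."""
    result = production_model(features)
    try:
        shadow_log.append({"features": features,
                           "production": result,
                           "shadow": shadow_model(features)})
    except Exception as exc:
        # A failing shadow model must never affect the user-facing response.
        shadow_log.append({"features": features,
                           "production": result,
                           "shadow_error": repr(exc)})
    return result

log = []
answer = handle_request({"age": 70}, lambda f: "low_risk", lambda f: "high_risk", log)
print(answer, log)
```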

Synthetic Data

Artificially generated data that mimics real data characteristics without containing actual patient information.

Transfer Learning

Machine learning technique where a model trained on one task is adapted for a related task.

Underfitting

Modeling error where a model is too simple to capture the underlying patterns in the data.

Cloud Computing

API Gateway

Server that acts as an intermediary for requests from clients seeking resources from backend services.

Blue-Green Deployment

Deployment strategy maintaining two identical production environments to enable zero-downtime releases.

Canary Deployment

Deployment strategy that gradually rolls out changes to a small subset of users before full deployment.
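
A simple way to pick the "small subset of users" is to hash a stable identifier into a bucket from 0 to 99 and compare it with the rollout percentage, so the same user always lands on the same version. A minimal sketch; the function name and the string user ID are assumptions.

```python
import hashlib

def use_canary(user_id, canary_percent):
    """Return True if this user should be routed to the canary version."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    return bucket < canary_percent

routed = sum(use_canary(f"user-{i}", 10) for i in range(1000))
print(f"{routed} of 1000 hypothetical users routed to the canary")
```

Raising `canary_percent` over time widens the rollout without reshuffling users who were already on the canary.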

CI/CD (Continuous Integration/Continuous Deployment)

Software development practice that automates building, testing, and deploying code changes to production environments.

Container Orchestration

Automated management of containerized applications including deployment, scaling, and networking.

Data Lake

Centralized repository that stores structured and unstructured data at any scale in its native format.

Data Mesh

Decentralized data architecture that treats data as a product with domain-oriented ownership.

Data Residency

Requirement, typically legal, regulatory, or contractual, that data be stored and processed within specific geographic boundaries or jurisdictions.

Disaster Recovery (DR)

Set of policies and procedures to recover and protect IT infrastructure in the event of a disaster.

Edge Computing

Distributed computing paradigm that brings computation and data storage closer to data sources.

Immutable Infrastructure

Infrastructure paradigm where servers are never modified after deployment, only replaced with new versions.

Infrastructure as Code (IaC)

Managing and provisioning infrastructure through machine-readable definition files rather than manual processes.

Multi-Cloud Strategy

Approach using multiple cloud service providers to avoid vendor lock-in and optimize performance, cost, and compliance.

RPO (Recovery Point Objective)

Maximum acceptable amount of data loss, expressed as the length of time between the last recoverable copy of the data and the moment of failure.

RTO (Recovery Time Objective)

Maximum acceptable time that a system can be down after a failure or disaster.

Serverless Computing

Cloud computing model where the cloud provider manages infrastructure, allowing developers to focus on code.

Service Mesh

Infrastructure layer that handles service-to-service communication in microservices architectures.

VPC (Virtual Private Cloud)

Isolated virtual network within a public cloud that provides enhanced security and control.

Data Governance

Anonymization

Process of irreversibly removing identifying information from data so individuals cannot be re-identified.

Audit Trail

Chronological record of system activities that enables reconstruction and examination of events.

Consent Management

Process of obtaining, recording, and managing patient consent for data collection and usage.

Data Catalog

Organized inventory of data assets using metadata to help organizations find and manage their data.

Data Classification

Process of categorizing data by sensitivity level and regulatory requirements to apply appropriate security and handling controls.

Data Lineage

The tracking of data flow from its origin through various transformations to its final destination.

Data Sovereignty

Concept that data is subject to the laws and governance structures of the nation where it is collected or stored.

Data Stewardship

Management and oversight of an organization's data assets to ensure quality, security, and compliance.

De-identification

The process of removing or obscuring personal identifiers from data to protect individual privacy.

PII (Personally Identifiable Information)

Information that can be used to identify, contact, or locate a specific individual.

Pseudonymization

Data processing technique that replaces identifying information with artificial identifiers or pseudonyms.
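
One way to generate consistent pseudonyms is a keyed hash (HMAC), so the same identifier always maps to the same pseudonym but cannot be recomputed without the secret key. This is a sketch only: the function name and key are hypothetical, the key would need to be stored separately from the data, and whether this approach satisfies a given regulation is not something the code can establish.

```python
import hashlib
import hmac

def pseudonymize(identifier, secret_key):
    """Return a stable pseudonym for an identifier using HMAC-SHA256."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"example-key-do-not-use-in-production"  # hypothetical key for illustration
print(pseudonymize("MRN-0012345", key))
```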

Healthcare Regulations

21 CFR Part 11

FDA regulation establishing the criteria under which electronic records and electronic signatures are considered trustworthy, reliable, and equivalent to paper records and handwritten signatures.

BAA (Business Associate Agreement)

A contract between a HIPAA-covered entity and a business associate that outlines how PHI will be handled and protected.

Compliance Monitoring

Continuous automated assessment of systems and configurations against regulatory requirements and organizational policies.

Data Portability

Right of individuals to receive their personal data in a structured, commonly used, machine-readable format and to transmit it to another controller (GDPR Article 20).

FDA (Food and Drug Administration)

U.S. federal agency responsible for regulating medical devices, including AI/ML-based medical software.

GDPR (General Data Protection Regulation)

European Union regulation governing data protection and privacy for individuals within the EU and EEA.

GxP (Good Practice Regulations)

Collection of quality guidelines and regulations including GMP, GLP, and GCP that govern pharmaceutical, medical device, and clinical trial activities.

HIPAA (Health Insurance Portability and Accountability Act)

U.S. federal law that establishes standards for protecting sensitive patient health information.

HITECH Act

Health Information Technology for Economic and Clinical Health Act, which strengthened HIPAA privacy and security protections.

HITRUST CSF

Comprehensive security framework that harmonizes healthcare regulations and standards into a single overarching framework.

ISO 27001

International standard for information security management systems (ISMS).

NIST (National Institute of Standards and Technology)

U.S. federal agency that develops cybersecurity frameworks, standards, and guidelines widely adopted in healthcare IT.

PHI (Protected Health Information)

Any individually identifiable health information held or transmitted by a covered entity or business associate.

Right to Erasure

Individual's right to have their personal data deleted under certain conditions (GDPR Article 17).

SaMD (Software as a Medical Device)

Software intended for medical purposes that performs these purposes without being part of a hardware medical device.

SOC 2 (System and Organization Controls 2)

AICPA attestation framework for reporting on a service organization's controls relevant to security, availability, processing integrity, confidentiality, and privacy.

Healthcare Standards

Clinical Decision Support System (CDSS)

Health information technology that provides clinicians with patient-specific assessments or recommendations.

DICOM (Digital Imaging and Communications in Medicine)

International standard for transmitting, storing, and sharing medical imaging information.

EHR (Electronic Health Record)

Digital version of a patient's medical history maintained by healthcare providers, including diagnoses, treatments, and test results.

HL7 FHIR (Fast Healthcare Interoperability Resources)

Standard for exchanging healthcare information electronically, enabling interoperability between healthcare systems.

Interoperability

Ability of different healthcare IT systems, devices, and applications to exchange, interpret, and use data seamlessly.

Security

Certificate Authority (CA)

Trusted entity that issues digital certificates for verifying identities on the internet.

Cloud Access Security Broker (CASB)

Security policy enforcement point between cloud service consumers and providers to monitor and control data access.

CMK (Customer-Managed Key)

Encryption key that is created, owned, and managed by the customer rather than the cloud provider.

Data Loss Prevention (DLP)

Security strategy and tooling that detects and prevents unauthorized transmission or exfiltration of sensitive data.

Data Poisoning

Attack that compromises AI model integrity by injecting malicious or misleading data into the training dataset.

DDoS Protection

Security measures to prevent distributed denial-of-service attacks that overwhelm systems with traffic.

Defense in Depth

A security strategy that employs multiple layers of security controls to protect systems and data.

Differential Privacy

Mathematical framework for quantifying and limiting privacy loss when sharing aggregate information about datasets.
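
A standard building block is the Laplace mechanism: add noise scaled to sensitivity / epsilon to a query result. The sketch below applies it to a counting query, whose sensitivity is 1 because one person changes the count by at most 1. Function names and the example values are assumptions, and a production system would also need privacy budget accounting and a cryptographically sound noise source.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(true_count, epsilon, rng):
    """Release a count with epsilon-differential privacy (sensitivity 1)."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random()
# Hypothetical query: number of patients with a given diagnosis.
print(private_count(128, epsilon=0.5, rng=rng))
```

Smaller epsilon values give stronger privacy but noisier answers.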

Encryption at Rest

Encryption of data when it is stored on disk or other storage media.

Encryption in Transit

Encryption of data as it moves between systems or across networks.

Hardware Security Module (HSM)

Physical device that safeguards and manages cryptographic keys and performs cryptographic operations such as encryption, decryption, and signing.

Homomorphic Encryption

Encryption method that allows computations on encrypted data without decrypting it first.

IAM (Identity and Access Management)

Framework of policies and technologies for ensuring that the right individuals have appropriate access to resources.

Incident Response

Organized approach to detecting, containing, and recovering from security breaches or system compromises.

Intrusion Detection System (IDS)

Device or software that monitors network traffic or host activity for suspicious behavior and policy violations.

Key Management Service (KMS)

Cloud service for creating and controlling encryption keys used to encrypt data.

Least Privilege

Security principle that users and systems should have only the minimum access rights needed to perform their functions.

MFA (Multi-Factor Authentication)

Security method requiring two or more verification factors to gain access to a resource.

mTLS (Mutual TLS)

Authentication method where both client and server verify each other's identity using certificates.

Penetration Testing

Authorized simulated cyberattack to evaluate the security of a system.

Privacy by Design

Approach embedding privacy into the design and architecture of IT systems and business practices.

Prompt Injection

Adversarial attack where malicious input is crafted to manipulate an LLM into ignoring its instructions or producing unintended outputs.

RBAC (Role-Based Access Control)

Access control method that assigns permissions to users based on their role within an organization.
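
At its core, RBAC is a lookup from roles to permissions. A minimal sketch; the role names, permission names, and function name are all hypothetical.

```python
# Hypothetical role-to-permission mapping for illustration.
ROLE_PERMISSIONS = {
    "clinician": {"read_record", "write_note"},
    "billing": {"read_invoice"},
    "auditor": {"read_audit_log"},
}

def is_allowed(user_roles, permission):
    """Return True if any of the user's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)

print(is_allowed(["clinician"], "write_note"))
print(is_allowed(["billing"], "read_record"))
```

Unknown roles grant nothing, so access is denied by default.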

SBOM (Software Bill of Materials)

Comprehensive inventory of all software components, libraries, and dependencies used in an application or system.

Secrets Management

Practice of securely storing, accessing, and rotating sensitive credentials such as API keys, passwords, and encryption keys.

Secure Multi-Party Computation (SMPC)

Cryptographic protocol enabling parties to jointly compute functions over their inputs while keeping those inputs private.

Security Information and Event Management (SIEM)

Solution providing real-time analysis of security alerts generated by applications and network hardware.

TLS/SSL (Transport Layer Security/Secure Sockets Layer)

Cryptographic protocols providing secure communication over computer networks. SSL is deprecated; TLS is its successor.

Tokenization

Process of replacing sensitive data with non-sensitive placeholder tokens that can be mapped back to the original data only through a secured lookup, often called a token vault.
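
The mapping between tokens and original values can be sketched as a small in-memory vault. This is illustrative only: the class name and token format are assumptions, and a real vault would be a hardened, access-controlled, persistent service.

```python
import secrets

class TokenVault:
    """Toy token vault: random tokens with a private lookup back to the original value."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to the same token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        return self._token_to_value[token]

vault = TokenVault()
token = vault.tokenize("123-45-6789")
print(token, vault.detokenize(token))
```

Because tokens are random rather than derived from the value, they reveal nothing about the original data without access to the vault.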

Vulnerability Assessment

Process of identifying, quantifying, and prioritizing security vulnerabilities in a system.

Web Application Firewall (WAF)

Security solution that filters and monitors HTTP traffic between a web application and the internet.

Zero Trust Architecture

Security model based on the principle of 'never trust, always verify' regardless of network location.

Additional Resources

Regulatory Guidance

For official definitions and detailed regulatory requirements, consult the HHS HIPAA website, FDA guidance documents, and relevant state healthcare privacy laws.

Cloud Provider Documentation

AWS, Azure, and GCP each provide comprehensive documentation on their security and compliance services referenced throughout this glossary.

Industry Standards

Many terms align with industry standards from NIST, ISO, and other standards bodies. Refer to these organizations for authoritative technical definitions.