Glossary
Comprehensive definitions of key terms and concepts in healthcare AI implementation
Categories
AI/ML
A/B Testing
Experimental approach comparing two versions of a model or system to determine which performs better.
Active Learning
Machine learning approach where the algorithm selectively queries the most informative data points for labeling.
AI/ML (Artificial Intelligence/Machine Learning)
Technologies that enable computers to learn from data and make predictions or decisions without explicit programming.
Algorithmic Impact Assessment
Structured evaluation of the potential risks, biases, and societal effects of deploying an AI system before it enters production.
AUC-ROC (Area Under the Receiver Operating Characteristic Curve)
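Performance metric summarizing a binary classifier's ability to rank positive cases above negative ones across all decision thresholds; 0.5 corresponds to random guessing and 1.0 to perfect separation.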
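As an illustration, AUC-ROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name `auc_roc` and the sample data are hypothetical, and the quadratic loop is meant for clarity rather than large datasets.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = condition present) and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(labels, scores))  # 0.75: three of four pairs are ranked correctly
```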
Computer Vision
AI discipline enabling systems to interpret and analyze visual data such as medical images, pathology slides, and radiology scans.
Concept Drift
Change in the statistical relationship between input features and target outcomes over time, degrading model accuracy.
Confusion Matrix
Table used to evaluate classification model performance by comparing predicted and actual values.
Cross-Validation
Technique for assessing model performance by partitioning data into subsets for training and testing.
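A minimal sketch of k-fold partitioning, one common form of cross-validation: each sample lands in the test set exactly once. The helper name `k_fold_indices` is hypothetical, and the sketch assumes the data has already been shuffled.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(10, 3):
    print(len(train), test)
```

With clinical data it is usually safer to partition by patient rather than by record, so that the same patient never appears in both the training and test sets.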
Drift Detection
Monitoring for changes in model performance or data distribution over time that may indicate degradation.
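One possible way to monitor a shift in data distribution is the Population Stability Index (PSI), which compares binned frequencies of a feature between a baseline sample and a recent sample. This is a sketch of one technique among several; the function name, bin count, and smoothing constant `eps` are assumptions, and any alerting threshold would need to be chosen per deployment.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a baseline sample (expected) and a recent sample (actual)."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # eps avoids log(0) for empty bins
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


baseline = [i / 100 for i in range(100)]       # hypothetical training-time feature values
recent = [0.5 + i / 200 for i in range(100)]   # hypothetical production values, shifted upward
print(population_stability_index(baseline, baseline))  # 0.0 for identical samples
print(population_stability_index(baseline, recent))    # larger value signals a shift
```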
Ensemble Learning
Technique combining multiple models to produce better predictive performance than individual models.
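Majority voting is one of the simplest ways to combine classifiers. The sketch below assumes each model has already produced a label for the same case; the function name and labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models (first seen wins ties)."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of three models for one imaging study.
print(majority_vote(["benign", "malignant", "benign"]))  # benign
```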
Explainable AI (XAI)
AI systems designed to provide understandable explanations for their decisions and predictions.
F1 Score
Harmonic mean of precision and recall, providing a single metric for model performance evaluation.
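As a worked sketch, precision, recall, and F1 can all be derived from three confusion-matrix counts: true positives, false positives, and false negatives. The function name and the example counts are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # Harmonic mean: dominated by the smaller of the two values.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical screening model: 8 true positives, 2 false alarms, 8 missed cases.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(precision, recall, round(f1, 3))  # 0.8 0.5 0.615
```

Because the harmonic mean penalizes imbalance, the F1 of 0.615 sits closer to the weaker recall than the arithmetic mean of 0.65 would.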
Feature Store
Centralized repository for storing, managing, and serving machine learning features for training and inference.
Federated Learning
Machine learning approach where models are trained across multiple decentralized devices or servers without exchanging raw data.
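A minimal sketch of the aggregation step, assuming a federated-averaging style scheme in which each site sends only its locally trained weights and sample count, never patient records. The function name and the two-site example are hypothetical.

```python
def federated_average(client_updates):
    """Average weight vectors, weighting each client by its number of samples.

    client_updates: list of (weights, n_samples) tuples, one per participating site.
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


# Hypothetical updates from two hospitals with 100 and 300 local samples.
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300)]
print(federated_average(updates))  # [2.5, 3.5]
```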
Guardrails
Configurable safety filters applied to AI model inputs and outputs to block harmful, biased, or non-compliant content.
Hallucination
AI model output that appears plausible but is factually incorrect, fabricated, or unsupported by the input data or training corpus.
Human-in-the-Loop (HITL)
AI system design requiring human review and approval at critical decision points before actions are taken or outputs are delivered.
Hyperparameter Tuning
Process of optimizing the configuration settings that control the learning process of ML algorithms.
Inference
Process of using a trained AI model to make predictions on new, unseen data.
LLM (Large Language Model)
AI model trained on vast amounts of text data capable of generating, summarizing, and reasoning over natural language.
MLOps (Machine Learning Operations)
Set of practices combining ML, DevOps, and data engineering to deploy and maintain ML systems in production.
Model Bias
Systematic errors in AI model predictions that unfairly favor or disadvantage certain groups or outcomes.
Model Card
Standardized document describing a machine learning model's intended use, performance metrics, limitations, and ethical considerations.
Model Fairness
Principle ensuring AI models make equitable predictions across different demographic groups and populations.
Model Governance
Framework for managing AI/ML models throughout their lifecycle, ensuring compliance, performance, and accountability.
Model Registry
Centralized repository for storing, versioning, and managing machine learning models throughout their lifecycle.
Model Versioning
The practice of tracking and managing different versions of AI models throughout their lifecycle.
NLP (Natural Language Processing)
Branch of AI focused on enabling computers to understand, interpret, and generate human language, including clinical notes and other medical text.
Overfitting
Modeling error where a model learns training data too well, including noise, reducing generalization ability.
Precision and Recall
Paired classification metrics: precision is the fraction of positive predictions that are correct, and recall is the fraction of actual positive instances the model finds.
RAG (Retrieval-Augmented Generation)
Technique that enhances LLM outputs by retrieving relevant information from external knowledge bases before generating a response.
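The retrieve-then-generate flow can be sketched without calling any model. The toy example below ranks documents by word overlap with the query and assembles a grounded prompt; the function names, the overlap scoring, and the sample documents are all hypothetical, and real systems typically use embedding-based search instead.

```python
def retrieve(query, documents, k=2):
    """Return the k documents sharing the most words with the query."""
    query_terms = set(query.lower().split())
    return sorted(
        documents,
        key=lambda d: len(query_terms & set(d.lower().split())),
        reverse=True,
    )[:k]


def build_prompt(query, documents, k=2):
    """Place retrieved passages ahead of the question so the LLM can ground its answer."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


knowledge_base = [
    "Metformin is a first-line therapy for type 2 diabetes.",
    "DICOM is a standard for medical imaging.",
    "HIPAA protects patient health information.",
]
print(build_prompt("what is the first-line therapy for type 2 diabetes", knowledge_base, k=1))
```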
Responsible AI
Practice of designing, developing, and deploying AI systems that are fair, transparent, accountable, and aligned with ethical principles.
Shadow Mode Deployment
Deployment strategy where a new model receives live production inputs alongside the current model, with its predictions logged for comparison but never returned to users.
Synthetic Data
Artificially generated data that mimics real data characteristics without containing actual patient information.
Transfer Learning
Machine learning technique where a model trained on one task is adapted for a related task.
Underfitting
Modeling error where a model is too simple to capture the underlying patterns in the data.
Cloud Computing
API Gateway
Server that acts as an intermediary for requests from clients seeking resources from backend services.
Blue-Green Deployment
Deployment strategy maintaining two identical production environments to enable zero-downtime releases.
Canary Deployment
Deployment strategy that gradually rolls out changes to a small subset of users before full deployment.
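One common way to pick the canary subset is to hash a stable user identifier into a bucket, so the same user consistently sees the same version and raising the percentage only adds users. This is a sketch under that assumption; the function name and identifier format are hypothetical.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically assign a user to the canary group for a given rollout percentage."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < percent


# Hypothetical rollout: roughly 10% of users are routed to the new version.
users = [f"user-{i}" for i in range(10000)]
print(sum(in_canary(u, 10) for u in users))
```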
CI/CD (Continuous Integration/Continuous Deployment)
Software development practice that automates building, testing, and deploying code changes to production environments.
Container Orchestration
Automated management of containerized applications including deployment, scaling, and networking.
Data Lake
Centralized repository that stores structured and unstructured data at any scale in its native format.
Data Mesh
Decentralized data architecture that treats data as a product with domain-oriented ownership.
Data Residency
Legal requirement that data must be stored within specific geographic boundaries or jurisdictions.
Disaster Recovery (DR)
Set of policies and procedures to recover and protect IT infrastructure in the event of a disaster.
Edge Computing
Distributed computing paradigm that brings computation and data storage closer to data sources.
Immutable Infrastructure
Infrastructure paradigm where servers are never modified after deployment, only replaced with new versions.
Infrastructure as Code (IaC)
Managing and provisioning infrastructure through machine-readable definition files rather than manual processes.
Multi-Cloud Strategy
Approach using multiple cloud service providers to avoid vendor lock-in and optimize performance, cost, and compliance.
RPO (Recovery Point Objective)
Maximum acceptable amount of data loss, expressed as the time between the last recoverable copy of the data and the moment of failure.
RTO (Recovery Time Objective)
Maximum acceptable time that a system can be down after a failure or disaster.
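RPO and RTO measure different intervals around a failure: RPO looks backward to the last recoverable copy, RTO looks forward to restored service. The sketch below checks a hypothetical incident against both targets; the function name and all timestamps are illustrative.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, failure, restored, rpo, rto):
    """Return (rpo_met, rto_met) for a single incident."""
    data_loss_window = failure - last_backup  # work done since the last recoverable copy
    downtime = restored - failure             # time the system was unavailable
    return data_loss_window <= rpo, downtime <= rto


# Hypothetical incident: backup at 02:00, failure at 02:45, service restored at 05:15.
result = meets_objectives(
    last_backup=datetime(2024, 1, 1, 2, 0),
    failure=datetime(2024, 1, 1, 2, 45),
    restored=datetime(2024, 1, 1, 5, 15),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=2),
)
print(result)  # (True, False): 45 minutes of data lost, but 2.5 hours of downtime
```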
Serverless Computing
Cloud computing model where the cloud provider manages infrastructure, allowing developers to focus on code.
Service Mesh
Infrastructure layer that handles service-to-service communication in microservices architectures.
VPC (Virtual Private Cloud)
Isolated virtual network within a public cloud that provides enhanced security and control.
Data Governance
Anonymization
Process of irreversibly removing identifying information from data so individuals cannot be re-identified.
Audit Trail
Chronological record of system activities that enables reconstruction and examination of events.
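One way to make such a record tamper-evident is to chain entries by hash, so that altering any past entry invalidates everything after it. Hash chaining is an assumption here, not a requirement of the definition, and the field names and helper functions are hypothetical.

```python
import hashlib
import json


def _digest(body):
    """Hash an entry body in a stable, key-sorted serialization."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, actor, action, timestamp):
    """Append an entry whose hash covers its content and the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"actor": actor, "action": action, "timestamp": timestamp, "prev_hash": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify(log):
    """Return True only if every entry is intact and correctly linked to its predecessor."""
    prev_hash = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, "dr_smith", "viewed record 1001", "2024-01-01T09:00:00Z")
append_entry(log, "dr_smith", "updated record 1001", "2024-01-01T09:05:00Z")
print(verify(log))  # True
log[0]["action"] = "viewed record 2002"  # simulate tampering
print(verify(log))  # False
```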
Consent Management
Process of obtaining, recording, and managing patient consent for data collection and usage.
Data Catalog
Organized inventory of data assets using metadata to help organizations find and manage their data.
Data Classification
Process of categorizing data by sensitivity level and regulatory requirements to apply appropriate security and handling controls.
Data Lineage
Tracking of data flow from its origin through various transformations to its final destination.
Data Sovereignty
Concept that data is subject to the laws and governance structures of the nation where it is collected or stored.
Data Stewardship
Management and oversight of an organization's data assets to ensure quality, security, and compliance.
De-identification
The process of removing or obscuring personal identifiers from data to protect individual privacy.
PII (Personally Identifiable Information)
Information that can be used to identify, contact, or locate a specific individual.
Pseudonymization
Data processing technique that replaces identifying information with artificial identifiers or pseudonyms.
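A minimal sketch of one pseudonymization approach: a keyed hash (HMAC-SHA256) maps each identifier to a stable pseudonym, so records can still be linked without exposing the original value. The function name, key, and identifier format are hypothetical; in practice the key would be held in a secrets manager, since anyone holding it can recompute pseudonyms for known identifiers.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"example-key-do-not-use"  # placeholder key for illustration only
print(pseudonymize("MRN-0012345", key))
print(pseudonymize("MRN-0012345", key) == pseudonymize("MRN-0012345", key))  # True
```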
Healthcare Regulations
21 CFR Part 11
FDA regulation establishing the criteria under which electronic records and electronic signatures are considered trustworthy, reliable, and equivalent to paper records and handwritten signatures.
BAA (Business Associate Agreement)
Contract between a HIPAA-covered entity and a business associate that outlines how PHI will be handled and protected.
Compliance Monitoring
Continuous automated assessment of systems and configurations against regulatory requirements and organizational policies.
Data Portability
Right of individuals to receive their personal data in a structured, machine-readable format and transmit it to another controller (GDPR Article 20).
FDA (Food and Drug Administration)
U.S. federal agency responsible for regulating medical devices, including AI/ML-based medical software.
GDPR (General Data Protection Regulation)
European Union regulation governing data protection and privacy for individuals within the EU and EEA.
GxP (Good Practice Regulations)
Collection of quality guidelines and regulations including GMP, GLP, and GCP that govern pharmaceutical, medical device, and clinical trial activities.
HIPAA (Health Insurance Portability and Accountability Act)
U.S. federal law that establishes standards for protecting sensitive patient health information.
HITECH Act
Health Information Technology for Economic and Clinical Health Act, which strengthened HIPAA privacy and security protections.
HITRUST CSF
Comprehensive security framework that harmonizes healthcare regulations and standards into a single overarching framework.
ISO 27001
International standard for information security management systems (ISMS).
NIST (National Institute of Standards and Technology)
U.S. federal agency that develops cybersecurity frameworks, standards, and guidelines widely adopted in healthcare IT.
PHI (Protected Health Information)
Any individually identifiable health information held or transmitted by a covered entity or business associate.
Right to Erasure
Individual's right to have their personal data deleted under certain conditions (GDPR Article 17).
SaMD (Software as a Medical Device)
Software intended for medical purposes that performs these purposes without being part of a hardware medical device.
SOC 2 (Service Organization Control 2)
AICPA auditing framework that evaluates a service organization's controls for security, availability, processing integrity, confidentiality, and privacy of customer data.
Healthcare Standards
Clinical Decision Support System (CDSS)
Health information technology that provides clinicians with patient-specific assessments or recommendations.
DICOM (Digital Imaging and Communications in Medicine)
International standard for transmitting, storing, and sharing medical imaging information.
EHR (Electronic Health Record)
Digital version of a patient's medical history maintained by healthcare providers, including diagnoses, treatments, and test results.
HL7 FHIR (Fast Healthcare Interoperability Resources)
Standard for exchanging healthcare information electronically, enabling interoperability between healthcare systems.
Interoperability
Ability of different healthcare IT systems, devices, and applications to exchange, interpret, and use data seamlessly.
Security
Certificate Authority (CA)
Trusted entity that issues digital certificates for verifying identities on the internet.
Cloud Access Security Broker (CASB)
Security policy enforcement point between cloud service consumers and providers to monitor and control data access.
CMK (Customer-Managed Key)
Encryption keys that are created, owned, and managed by the customer rather than the cloud provider.
Data Loss Prevention (DLP)
Security strategy and tooling that detects and prevents unauthorized transmission or exfiltration of sensitive data.
Data Poisoning
Attack that compromises AI model integrity by injecting malicious or misleading data into the training dataset.
DDoS Protection
Security measures to prevent distributed denial-of-service attacks that overwhelm systems with traffic.
Defense in Depth
Security strategy that employs multiple layers of security controls to protect systems and data.
Differential Privacy
Mathematical framework for quantifying and limiting privacy loss when sharing aggregate information about datasets.
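The Laplace mechanism is a standard way to release a count under differential privacy: noise scaled to sensitivity divided by epsilon is added to the true answer. The sketch below assumes a counting query (sensitivity 1); the function names are hypothetical, and Python's `random` module is used for illustration only, not as a secure noise source.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count, epsilon, rng=None):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical query: number of patients with a given diagnosis.
print(private_count(100, epsilon=0.5))
```

Smaller values of epsilon add more noise, giving stronger privacy at the cost of accuracy.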
Encryption at Rest
Encryption of data when it is stored on disk or other storage media.
Encryption in Transit
Encryption of data as it moves between systems or across networks.
Hardware Security Module (HSM)
Physical device that safeguards and manages cryptographic keys and performs encryption operations.
Homomorphic Encryption
Encryption method that allows computations on encrypted data without decrypting it first.
IAM (Identity and Access Management)
Framework of policies and technologies for ensuring that the right individuals have appropriate access to resources.
Incident Response
Organized approach to detecting, containing, and recovering from security breaches or system compromises.
Intrusion Detection System (IDS)
Device or software that monitors network traffic for suspicious activity and policy violations.
Key Management Service (KMS)
Cloud service for creating and controlling encryption keys used to encrypt data.
Least Privilege
Security principle that users and systems should have only the minimum access rights needed to perform their functions.
MFA (Multi-Factor Authentication)
Security method requiring two or more verification factors to gain access to a resource.
mTLS (Mutual TLS)
Authentication method where both client and server verify each other's identity using certificates.
Penetration Testing
Authorized simulated cyberattack to evaluate the security of a system.
Privacy by Design
Approach embedding privacy into the design and architecture of IT systems and business practices.
Prompt Injection
Adversarial attack where malicious input is crafted to manipulate an LLM into ignoring its instructions or producing unintended outputs.
RBAC (Role-Based Access Control)
Access control method that assigns permissions to users based on their role within an organization.
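In its simplest form, RBAC reduces to two mappings: users to roles, and roles to permissions. The roles, users, and permission names below are hypothetical, and the check denies by default when a user or role is unknown.

```python
ROLE_PERMISSIONS = {
    "physician": {"read_chart", "write_order"},
    "billing_clerk": {"read_invoice"},
}

USER_ROLES = {
    "alice": {"physician"},
    "bob": {"billing_clerk"},
}


def is_allowed(user, permission):
    """Grant access only if one of the user's roles carries the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("alice", "read_chart"))  # True
print(is_allowed("bob", "read_chart"))    # False
```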
SBOM (Software Bill of Materials)
Comprehensive inventory of all software components, libraries, and dependencies used in an application or system.
Secrets Management
Practice of securely storing, accessing, and rotating sensitive credentials such as API keys, passwords, and encryption keys.
Secure Multi-Party Computation (SMPC)
Cryptographic protocol enabling parties to jointly compute functions over their inputs while keeping those inputs private.
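Additive secret sharing is one of the simplest building blocks for this kind of protocol. In the sketch below each party splits its private value into random shares, and only sums of shares are ever combined, so the total is revealed without exposing any single input. The function names, modulus, and example values are assumptions, and the simulation runs all parties in one process for illustration.

```python
import secrets

PRIME = 2**61 - 1  # modulus for share arithmetic; inputs and their sum must stay below it


def share(value, n_parties):
    """Split a value into n random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_inputs):
    """Simulate parties jointly computing a sum without revealing individual inputs."""
    n = len(private_inputs)
    all_shares = [share(v, n) for v in private_inputs]
    # Party j receives the j-th share of every input and publishes only their sum.
    partial_sums = [sum(s[j] for s in all_shares) % PRIME for j in range(n)]
    return sum(partial_sums) % PRIME


# Hypothetical case counts held privately by three hospitals.
print(secure_sum([120, 85, 240]))  # 445
```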
Security Information and Event Management (SIEM)
Solution providing real-time analysis of security alerts generated by applications and network hardware.
TLS/SSL (Transport Layer Security/Secure Sockets Layer)
Cryptographic protocols providing secure communication over computer networks; SSL is the deprecated predecessor of TLS.
Tokenization
Process of replacing sensitive data with non-sensitive placeholder tokens that can be mapped back to original data.
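A minimal in-memory sketch of a token vault: tokens are random and carry no information about the original value, and only the vault's mapping can reverse them. The class name, token prefix, and sample value are hypothetical; a real vault would persist the mapping in a secured, access-controlled store.

```python
import secrets


class TokenVault:
    """Maps sensitive values to random tokens and back."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        """Return the existing token for a value, or issue a new random one."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._token_to_value[token]


vault = TokenVault()
token = vault.tokenize("123-45-6789")
print(token)
print(vault.detokenize(token))  # 123-45-6789
```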
Vulnerability Assessment
Process of identifying, quantifying, and prioritizing security vulnerabilities in a system.
Web Application Firewall (WAF)
Security solution that filters and monitors HTTP traffic between a web application and the internet.
Zero Trust Architecture
Security model based on the principle of 'never trust, always verify' regardless of network location.
Additional Resources
Regulatory Guidance
For official definitions and detailed regulatory requirements, consult the HHS HIPAA website, FDA guidance documents, and relevant state healthcare privacy laws.