AI Continuous Learning Safety Certification Gap
Traditional safety certification frameworks (IEC 61508, ISO 26262, DO-178C) require demonstrating that a specific, frozen version of software meets safety requirements. Machine learning systems that learn continuously after deployment — updating model parameters based on new data — fundamentally violate this assumption. No certification framework exists that can verify safety properties are preserved across model updates. The result: AI systems in safety-critical applications (medical devices, autonomous vehicles, industrial control) are either locked at certification time (preventing beneficial updates) or deployed without certification because no pathway exists.
The EU AI Act classifies medical AI, autonomous vehicles, and industrial control as "high-risk" applications requiring conformity assessment, but the assessment methodology assumes a fixed software artifact. The FDA's framework for AI/ML-based Software as a Medical Device (SaMD) requires documented algorithm change protocols, yet no method exists to "validate the validator" when the model itself changes. Only 18% of enterprises using AI have implemented governance frameworks, even though 90% use AI in daily operations. The gap between the speed of AI deployment and the capacity of safety certification is widening.
The FDA's 2021 AI/ML SaMD Action Plan proposed a "predetermined change control plan" where manufacturers declare in advance how a model will evolve, but this cannot address truly adaptive systems where the specific updates are unpredictable. ISO/PAS 8800:2024 provides guidance on AI safety but explicitly notes that existing safety standards' assumptions about deterministic software behavior don't hold. Runtime monitoring approaches (checking outputs for anomalies) can catch some failures but cannot provide the pre-deployment assurance that safety certification requires. Formal verification methods work for traditional software but scale poorly with neural network size and break down entirely for continuously updated parameters.
What is needed is a paradigm shift from "certify the artifact" to "certify the process and monitor the behavior", with standardized metrics for detecting when an updated model has drifted outside its certified operating envelope. This requires (1) defining safety invariants that must be preserved across updates, (2) lightweight runtime verification that can detect invariant violations without excessive computational overhead, and (3) a regulatory framework that accepts process-based certification as equivalent to artifact-based certification.
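One way to make points (1) and (2) concrete is to gate every model update on a battery of invariant checks evaluated over a fixed, certified probe set, accepting the update only if all invariants still hold. The sketch below is illustrative only: the names (`SafetyInvariant`, `gate_update`) and the toy invariants (bounded outputs, monotonicity) are assumptions standing in for real certified safety requirements, not part of any standard.

```python
"""Illustrative sketch: gating model updates on safety invariants.

A "safety invariant" here is a predicate over the model's outputs on a
fixed probe set certified at deployment time. All names and invariants
are hypothetical stand-ins for real certified requirements.
"""
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SafetyInvariant:
    name: str
    # Predicate over the model's outputs on the certified probe inputs.
    check: Callable[[Sequence[float]], bool]


def gate_update(predict: Callable[[float], float],
                probe_inputs: Sequence[float],
                invariants: Sequence[SafetyInvariant]) -> list:
    """Return names of violated invariants; an empty list means accept."""
    outputs = [predict(x) for x in probe_inputs]
    return [inv.name for inv in invariants if not inv.check(outputs)]


# Toy invariants: outputs stay inside a certified envelope, and respond
# monotonically to the ordered probe inputs.
invariants = [
    SafetyInvariant("bounded_output",
                    lambda ys: all(0.0 <= y <= 1.0 for y in ys)),
    SafetyInvariant("monotone",
                    lambda ys: all(a <= b for a, b in zip(ys, ys[1:]))),
]
probe = [0.0, 0.25, 0.5, 0.75, 1.0]

ok_model = lambda x: x          # preserves both invariants
bad_model = lambda x: 1.5 * x   # drifts outside the certified envelope

print(gate_update(ok_model, probe, invariants))   # []
print(gate_update(bad_model, probe, invariants))  # ['bounded_output']
```

The point of the sketch is that the check is cheap (one forward pass per probe point), so it can run as part of every update cycle rather than as a one-off certification activity.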
A team could implement a simple continuously-learning system (e.g., anomaly detection for manufacturing quality) and design and evaluate runtime monitors that detect when model updates degrade safety properties. Alternatively, a team could propose a formal framework for defining and checking "safety invariants" for a specific ML architecture. Relevant skills: machine learning, formal methods, safety engineering.
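As a toy version of the manufacturing example, the sketch below pairs a continuously-learning anomaly detector (a running-mean threshold updated online) with a runtime monitor that replays known-defective reference parts after every update and rejects any update that pushes recall below a certified floor. All class names, thresholds, and data values are illustrative assumptions chosen to show the failure mode, not a real quality-control system.

```python
"""Illustrative sketch: runtime monitoring of an online anomaly detector.

The detector drifts its running mean toward incoming data (continuous
learning); the monitor checks a safety invariant, recall on known-
anomalous reference samples, after each update. Everything here is a
hypothetical toy, not a production design.
"""


class OnlineDetector:
    """Flags a measurement as anomalous if it deviates from a running mean."""

    def __init__(self, mean: float, tolerance: float, lr: float = 0.1):
        self.mean, self.tolerance, self.lr = mean, tolerance, lr

    def is_anomaly(self, x: float) -> bool:
        return abs(x - self.mean) > self.tolerance

    def update(self, x: float) -> None:
        # Continuous learning: the running mean drifts toward new data.
        self.mean += self.lr * (x - self.mean)


def monitor_recall(det: OnlineDetector, anomalous_refs: list) -> float:
    """The safety invariant: recall on known-anomalous reference parts."""
    hits = sum(det.is_anomaly(x) for x in anomalous_refs)
    return hits / len(anomalous_refs)


det = OnlineDetector(mean=10.0, tolerance=2.0)
refs = [13.0, 14.0, 15.0]   # known-defective reference measurements
CERTIFIED_RECALL = 1.0      # certified floor: every reference must be caught

violation = None
# A stream of gradually drifting measurements that the learner absorbs,
# silently shifting its decision boundary toward the defect region.
for x in [12.5, 13.0, 13.5, 14.0, 14.5]:
    det.update(x)
    recall = monitor_recall(det, refs)
    if recall < CERTIFIED_RECALL:
        violation = (x, recall)
        print(f"update after x={x} rejected: recall {recall:.2f}, "
              f"mean drifted to {det.mean:.2f}")
        break
```

Running the loop shows the core hazard the brief describes: each individual update looks harmless, but their cumulative effect moves the detector outside its certified operating envelope, and only the replayed reference set reveals it.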
Distinct from `digital-ai-trustworthiness-heterogeneous-verification` (which covers compositional trust across heterogeneous AI components) and `digital-autonomous-system-runtime-resilience` (which covers autonomous system runtime resilience generally). This brief focuses specifically on the certification paradigm mismatch — the inability of existing safety certification frameworks to handle models that change post-certification. The ISO/PAS 8800 publication in 2024 and EU AI Act implementation timeline make this increasingly urgent.
ISO/PAS 8800:2024 (Safety and AI); ISO/IEC TR 5469:2024 (Functional safety and AI systems); ISO/IEC 42001:2023 (AI management system); NIST IR 8527 (AI standards landscape), 2024. Accessed 2026-02-24.