Only 56% of FDA-Approved AI Medical Devices Have Published Clinical Evidence
The FDA and clinicians lack systematic infrastructure to monitor how medical devices perform after they reach the market. Premarket clinical trials are typically small, short, and conducted in controlled settings, so real-world device performance can differ dramatically from trial results. Only 55.9% of FDA-approved AI-enabled medical devices have publicly available clinical performance data at the time of clearance. The National Evaluation System for health Technology Coordinating Center (NESTcc), intended as the backbone of a national active surveillance network, had only 19 collaborators as of March 2024, far short of what population-scale device monitoring would require. The FDA's passive adverse event reporting system, built on Medical Device Reports (MDRs), suffers from well-documented under-reporting and delayed reporting, and it cannot support incidence-rate calculations because no denominator data exist: the number of devices in use is not captured.
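The denominator problem can be made concrete with a small sketch. The function name and all figures below are hypothetical and chosen only for illustration; the point is that an identical report count implies very different rates depending on how many devices are in use, and that no rate can be computed when that number is unknown.

```python
from typing import Optional


def incidence_rate(event_count: int, units_in_use: Optional[int]) -> Optional[float]:
    """Return events per device in use, or None when the denominator is unknown.

    Hypothetical helper: passive reports supply only event_count, which is
    itself a lower bound because of under-reporting.
    """
    if units_in_use is None or units_in_use <= 0:
        return None
    return event_count / units_in_use


# Illustrative numbers only: two devices with the same number of reports.
rate_small_base = incidence_rate(50, 1_000)    # device with a small installed base
rate_large_base = incidence_rate(50, 100_000)  # device with a large installed base
rate_unknown = incidence_rate(50, None)        # passive reporting: no denominator
```

With report counts alone, the first two cases are indistinguishable from each other, which is why passive reporting cannot rank devices by risk.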
Approximately 257,000 different types of medical devices are on the U.S. market. For implantable devices with long service lives (hip implants, cardiac devices, hernia mesh), postmarket performance failures may take years to emerge and affect millions of patients. The metal-on-metal hip implant crisis, pelvic mesh litigation, and breast implant-associated anaplastic large cell lymphoma (BIA-ALCL) all illustrate scenarios where problems were detected years after widespread adoption. One-fifth of devices granted De Novo authorization were never evaluated in pivotal studies, and one-third failed to meet their primary effectiveness endpoints but were still authorized with postmarket study requirements, requirements that often go uncompleted.
In December 2025, the FDA finalized guidance permitting the use of de-identified real-world data from registries, EHRs, and claims databases in regulatory submissions without requiring patient-level identification — a significant policy change. However, this addresses data use for regulatory submissions, not systematic active surveillance. NESTcc and the Medical Device Epidemiology Network (MDEpiNet) continue developing coordinated data networks, but coverage remains incomplete. Post-approval study compliance rates for PMA devices have historically been poor, with many required studies never completed. The FDA requested $3 million in additional funding for active postmarket surveillance in FY 2024 but did not receive it. The fundamental barrier is structural: building a national active surveillance system requires sustained federal funding, standardized device identification across heterogeneous hospital EHR systems, and a regulatory mandate that does not currently exist.
A federated real-world evidence network that links device registries, EHR data, and claims databases, using the Unique Device Identifier (UDI) as the common key, could enable active surveillance without centralizing sensitive data. This requires completing UDI integration into clinical workflows (see health-device-recall-udi-tracking), developing standardized device performance outcome measures, and establishing governance models that address privacy concerns while enabling population-level analysis. Adjacent models include the FDA Sentinel System for drugs, which achieved active surveillance at scale using distributed data networks.
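A minimal sketch of the federated pattern, under stated assumptions: each site summarizes its own records locally and shares only per-device aggregate counts, keyed by the device identifier portion of the UDI, and a coordinator pools those counts. The record fields, function names, and the single "revision" outcome are hypothetical; a real network would also need a common data model, small-cell suppression, and governance that this sketch omits.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ImplantRecord:
    udi_di: str         # device identifier portion of the UDI (hypothetical field)
    had_revision: bool  # hypothetical outcome flag, e.g. revision surgery


# Maps UDI-DI -> (number of implants, number of outcome events)
Summary = Dict[str, Tuple[int, int]]


def local_summary(records: Iterable[ImplantRecord]) -> Summary:
    """Runs inside one site; only aggregate counts leave the site."""
    implants: Counter = Counter()
    events: Counter = Counter()
    for record in records:
        implants[record.udi_di] += 1
        if record.had_revision:
            events[record.udi_di] += 1
    return {di: (implants[di], events[di]) for di in implants}


def pool(summaries: List[Summary]) -> Summary:
    """Runs at the coordinator; sums counts across sites per UDI-DI."""
    total: Summary = {}
    for summary in summaries:
        for di, (n, e) in summary.items():
            prev_n, prev_e = total.get(di, (0, 0))
            total[di] = (prev_n + n, prev_e + e)
    return total


def event_rates(pooled: Summary, min_implants: int = 1) -> Dict[str, float]:
    """Events per implant, skipping devices below a minimum count."""
    return {di: e / n for di, (n, e) in pooled.items() if n >= min_implants}
```

Because every site reports implant counts along with events, the pooled result carries the denominator that passive reporting lacks, while patient-level rows never leave the contributing site.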
A student team could prototype a device performance dashboard that links publicly available FDA clearance data, MAUDE adverse event reports, and recall data for a specific device category, demonstrating how existing data sources could be combined for signal detection. Another approach would be to analyze post-approval study completion rates for a category of PMA devices and quantify the evidence gap. Relevant disciplines include health informatics, data science, epidemiology, and database engineering.
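As a starting point for such a prototype, the sketch below joins three already-downloaded tables on a shared product code and flags categories for manual review. Everything here is an assumption for illustration: the `product_code` field name is a placeholder rather than the schema of any FDA dataset, and the threshold is arbitrary. Reports per clearance is a crude screening ratio, not an incidence rate, since it still lacks the number of devices in use.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def link_by_product_code(
    clearances: Iterable[Mapping[str, str]],
    adverse_events: Iterable[Mapping[str, str]],
    recalls: Iterable[Mapping[str, str]],
) -> Dict[str, Dict[str, int]]:
    """Count clearances, adverse event reports, and recalls per product code."""
    table: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"clearances": 0, "reports": 0, "recalls": 0}
    )
    for row in clearances:
        table[row["product_code"]]["clearances"] += 1
    for row in adverse_events:
        table[row["product_code"]]["reports"] += 1
    for row in recalls:
        table[row["product_code"]]["recalls"] += 1
    return dict(table)


def flag_for_review(
    table: Mapping[str, Mapping[str, int]],
    min_reports_per_clearance: float = 5.0,  # arbitrary illustrative threshold
) -> List[str]:
    """Return product codes with a high report ratio or any recall."""
    flagged = []
    for code, counts in sorted(table.items()):
        if counts["clearances"] == 0:
            continue  # no clearance record to anchor the ratio
        ratio = counts["reports"] / counts["clearances"]
        if ratio >= min_reports_per_clearance or counts["recalls"] > 0:
            flagged.append(code)
    return flagged
```

The same join could feed the second project idea: replacing the adverse event table with a list of required post-approval studies and their status would give a completion rate per device category.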
This brief draws on a 2024 GAO report on FDA postmarket surveillance, Applied Radiology analysis of clinical evidence gaps for AI-enabled devices, FDA real-world evidence guidance finalized in December 2025, and FDA Map reporting. Closely related to health-device-recall-udi-tracking (both depend on UDI infrastructure in clinical settings), health-ai-device-clinical-evidence-gap (AI device evidence specifically), and health-510k-predicate-creep (premarket evidence standards). Tagged `domain:digital` alongside `domain:health` because AI-enabled devices are a major driver of the evidence gap. Tagged `failure:not-attempted` because no serious national active surveillance system for devices has been built, despite the model existing for drugs (Sentinel). Tagged `failure:adoption-barrier` for UDI integration into clinical workflows. Tagged `temporal:worsening` because the number of devices, especially AI-enabled devices, entering the market is accelerating faster than surveillance capacity.
GAO, "Medical Devices: FDA Has Begun Building an Active Postmarket Surveillance System" (2024), https://www.gao.gov/assets/gao-24-106699.pdf, accessed 2026-02-19