Quantum Computing Cross-Platform Benchmarking Gap
No agreed method exists to compare quantum computers across different hardware platforms (superconducting, trapped ion, photonic, neutral atom). Existing metrics like Quantum Volume are misleading: QV requires classical simulation to verify, which becomes intractable as qubit counts grow beyond ~50. Vendors cherry-pick favorable benchmarks; in one survey, only 3 of 31 trapped-ion QPUs even reported gate speed. The ratio of coherence time to gate duration, which sets how many operations a qubit can perform before decohering and thus largely determines real performance, has no agreed measurement protocol. Worse, time-varying noise means the same benchmark run twice on the same machine yields different results.
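A minimal sketch of why that ratio resists cross-platform comparison, using rough order-of-magnitude public figures (the numbers below are illustrative assumptions, not measurements from any specific device):

```python
# Illustrative only: T2 and gate times are order-of-magnitude placeholders;
# real values vary by device, by qubit pair, and over time.
PLATFORMS = {
    # name: (T2 coherence time in seconds, two-qubit gate duration in seconds)
    "superconducting": (100e-6, 300e-9),
    "trapped_ion":     (1.0,    200e-6),
    "neutral_atom":    (1.0,    500e-9),
}

for name, (t2, t_gate) in PLATFORMS.items():
    # Rough count of sequential two-qubit gates that fit within coherence.
    ops_per_lifetime = t2 / t_gate
    print(f"{name:>16}: T2/t_gate ~ {ops_per_lifetime:,.0f} gates")
```

Even with placeholder numbers, the ratios span orders of magnitude, so any single wall-clock or depth benchmark implicitly favors whichever modality it was designed around.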
Governments are investing billions annually in quantum computing (the U.S. National Quantum Initiative alone exceeds $1.2B/year) based on vendor performance claims that cannot be independently compared. Investment decisions, research priorities, and national technology roadmaps are distorted by incomparable metrics, and the field cannot identify which hardware approaches to scale, delaying the timeline to practical quantum advantage by years.
IBM introduced Quantum Volume in 2019 as a single-number metric, but it requires random circuit sampling validated by classical simulation, which is intractable beyond ~50 qubits. IonQ proposed Algorithmic Qubits, criticized for inflating results by combining runs and using tailored gate compilations. Google's linear cross-entropy benchmarking (used in its 2019 supremacy claim) sparked years of debate about whether classical algorithms could match it. Application-specific benchmarks (quantum chemistry, optimization) depend on compiler quality as much as hardware quality, making it hard to separate software from hardware contributions. The IEEE P7131 and ISO/IEC JTC 3 working groups are attempting standardization but face the fundamental problem that different qubit modalities have gate durations differing by orders of magnitude and noise profiles that cannot be characterized in a unified way.
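For concreteness, here is a minimal sketch of the two verification-bound estimators just mentioned (function names and structure are illustrative, not any vendor's reference implementation). Both depend on the classically simulated ideal output distribution, which is exactly the piece that becomes intractable at scale:

```python
import numpy as np

def heavy_output_probability(ideal_probs, samples):
    """Fraction of device samples that are 'heavy' outputs: bitstrings whose
    ideal probability exceeds the median of the ideal distribution. The
    Quantum Volume protocol declares a width-n test passed when this
    fraction exceeds 2/3 with statistical confidence."""
    median_p = np.median(list(ideal_probs.values()))
    return np.mean([ideal_probs[x] > median_p for x in samples])

def linear_xeb_fidelity(ideal_probs, samples, n_qubits):
    """Linear cross-entropy benchmark: F = 2^n * <P_ideal(x_i)> - 1,
    averaged over sampled bitstrings x_i. F is near 1 for a perfect
    device and near 0 for uniformly random noise."""
    mean_p = np.mean([ideal_probs[x] for x in samples])
    return (2 ** n_qubits) * mean_p - 1

# Both estimators consume ideal_probs: the classically simulated output
# distribution of the random circuit. Producing that distribution is what
# becomes intractable beyond roughly 50 qubits.
```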
What is needed is a layered benchmarking framework that separates hardware-level metrics (gate fidelity, coherence, connectivity) from application-level performance, with standardized inter-laboratory comparison protocols. The key missing piece is a method to characterize time-varying noise that works across all hardware platforms, analogous to how the SPEC CPU benchmark suite standardized computer performance comparison despite radical differences in processor architecture.
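One way to make the layering concrete is a reporting schema in which hardware-level numbers are timestamped calibration snapshots and application-level results carry their compiler provenance. The field names below are hypothetical, intended only to show the separation of concerns:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class HardwareMetrics:
    """Layer 1: device physics, reported per calibration snapshot so that
    time-varying noise is captured rather than averaged away."""
    timestamp: datetime
    two_qubit_gate_fidelity: float   # e.g. from randomized benchmarking
    t1_seconds: float                # energy relaxation time
    t2_seconds: float                # dephasing time
    connectivity_degree: float       # mean couplings per qubit

@dataclass
class ApplicationMetrics:
    """Layer 2: end-to-end task performance, reported separately because
    it mixes compiler quality with hardware quality."""
    task: str                        # e.g. "H2 ground-state energy"
    success_metric: float
    compiler: str                    # toolchain provenance keeps layers separable
    hardware_snapshots: list[HardwareMetrics] = field(default_factory=list)
```

Recording the compiler at the application layer is what would let a study attribute performance differences to hardware rather than toolchain.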
A team could design and implement a benchmark comparison study using cloud-accessible quantum hardware (IBM Quantum, IonQ, Amazon Braket) to quantify how different metrics rank the same platforms differently; alternatively, it could develop a noise characterization protocol for a specific platform and validate whether it predicts application-level performance. Relevant skills: quantum computing, statistics, experimental design.
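The core analysis of the first study is small: given each metric's score for each platform, compare the rankings the metrics induce. A sketch with made-up placeholder scores (all numbers and device names below are hypothetical):

```python
import numpy as np
from scipy.stats import kendalltau

# Hypothetical scores; real values would come from the cloud runs above.
platforms = ["ibm_device", "ionq_device", "braket_device"]
scores = {
    "quantum_volume":     [64, 4096, 32],        # placeholder values
    "algorithmic_qubits": [12, 25, 9],
    "xeb_fidelity":       [0.02, 0.01, 0.015],
}

# Pairwise Kendall tau between the rankings each metric induces:
# tau = +1 means two metrics order the platforms identically, -1 fully reversed.
metrics = list(scores)
for i, a in enumerate(metrics):
    for b in metrics[i + 1:]:
        tau, _ = kendalltau(scores[a], scores[b])
        print(f"{a} vs {b}: tau = {tau:+.2f}")
```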
Related to `physics-quantum-algorithms-excited-state-dynamics` (quantum algorithm challenges) but distinct: that brief covers algorithmic barriers, while this one covers the measurement infrastructure needed to evaluate hardware. IEEE P7131 and ISO/IEC JTC 3 are working in parallel on this problem. The five areas identified as needing standardization are: categories of metrics, agreed metric sets, hardware-specific metrics, inter-laboratory comparison studies, and reporting standards.
IEEE P7131 Working Group (Standard for Quantum Computing Performance Metrics & Performance Benchmarking), PAR approved September 2023; ISO/IEC JTC 3 (Quantum Technologies), established January 2024; "Benchmarking Quantum Computers," arXiv:2407.10941v3, 2024. Accessed 2026-02-24.