Software Bills of Materials Cannot Accurately Capture Transitive Dependencies in Open-Source Ecosystems
A Software Bill of Materials (SBOM) lists all components in a software product and is intended to enable vulnerability tracking. US Executive Order 14028 (2021) mandates SBOMs for software sold to the federal government. However, modern software depends on deep transitive dependency chains: the average npm package has 79 transitive dependencies, and the average Java project pulls in more than 150. Current SBOM generators, which emit the SPDX and CycloneDX formats, produce snapshots that are incomplete (missing build-time, test, and optional dependencies), inaccurate (version pinning varies by ecosystem), and instantly stale (dependencies update daily). When Log4Shell (CVE-2021-44228) was disclosed, most organizations could not determine within 72 hours whether they were affected, because they did not know which of their applications transitively included Log4j.
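The Log4Shell question ("which of our applications transitively include Log4j?") is a reachability query over a dependency graph. The following is a minimal sketch, assuming the direct dependencies of every package are already known and stored in a plain dictionary; the package names, the `DIRECT_DEPS` graph, and both function names are hypothetical and chosen only for illustration. In practice, obtaining that complete and correct graph is exactly the hard part this brief describes.

```python
from collections import deque


def transitive_dependencies(direct_deps, root):
    """Return every package reachable from `root` through declared dependencies."""
    seen = set()
    queue = deque(direct_deps.get(root, ()))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue  # already visited; also guards against dependency cycles
        seen.add(pkg)
        queue.extend(direct_deps.get(pkg, ()))
    return seen


def affected_applications(direct_deps, applications, vulnerable):
    """List the applications whose transitive closure contains `vulnerable`."""
    return sorted(
        app
        for app in applications
        if vulnerable in transitive_dependencies(direct_deps, app)
    )


# Hypothetical dependency graph: package name -> direct dependencies.
DIRECT_DEPS = {
    "billing-app": ["web-framework", "pdf-export"],
    "reporting-app": ["pdf-export"],
    "web-framework": ["logging-facade"],
    "logging-facade": ["log4j-core"],
    "pdf-export": ["font-utils"],
}

if __name__ == "__main__":
    apps = ["billing-app", "reporting-app"]
    print(affected_applications(DIRECT_DEPS, apps, "log4j-core"))
```

Neither application declares `log4j-core` directly, yet `billing-app` reaches it three levels down. The query itself is cheap; it is only as trustworthy as the graph it runs on.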
Open-source software underpins ~97% of commercial codebases. Software supply chain attacks increased 742% between 2019 and 2022 (Sonatype). The SolarWinds, Codecov, and Log4Shell incidents demonstrated that a single compromised dependency can propagate to thousands of downstream applications. SBOMs are supposed to be the foundational tool for supply chain visibility, but if they can't accurately represent transitive dependencies, they provide false confidence rather than real security. The gap is growing as dependency graphs deepen (npm, PyPI, and Maven all show increasing average dependency depth over time) and as attackers specifically target transitive dependencies (typosquatting, dependency confusion attacks).
Static analysis tools (Snyk, Dependabot, Trivy) scan declared dependency manifests but miss dependencies pulled in at build time, vendored (copied) code, and dynamically loaded plugins. SPDX and CycloneDX formats can represent dependency graphs but rely on generators that produce incomplete trees. Build-time instrumentation (recording actual artifacts downloaded during build) is more accurate but is build-system-specific, slows CI/CD pipelines, and doesn't capture runtime dynamic loading. The fundamental challenge is that dependency resolution is ecosystem-specific (npm, Maven, pip, Go modules each have different resolution algorithms), version-dependent (different versions of the same package may pull different transitive dependencies), and context-dependent (development, test, and production dependency sets differ). No single tool handles all ecosystems and resolution modes.
A runtime-observable SBOM — generated by monitoring actual library loading during execution rather than static manifest analysis — would capture what software components are actually present, regardless of how they were declared. This is analogous to the shift from "what should be running" (configuration management) to "what is running" (runtime observability) in infrastructure security. Alternatively, dependency ecosystems could enforce reproducible dependency resolution (as Go modules and Cargo/Rust already do) and require cryptographic signing of all packages — but retrofitting this onto npm and PyPI without breaking backward compatibility is a governance challenge as much as a technical one.
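As a rough illustration of the "what is running" idea, the sketch below lists the shared libraries currently mapped into a process by parsing the Linux `/proc/<pid>/maps` file. This is a simpler, point-in-time technique than continuous runtime monitoring, and it is an assumption of this example rather than a method the brief prescribes: it is Linux-only, sees only native shared objects (not modules loaded by an interpreter or JVM), and misses libraries that were loaded and unloaded between snapshots. The function names and the `SAMPLE_MAPS` text are hypothetical.

```python
import os


def shared_objects_from_maps(maps_text):
    """Extract unique shared-library paths from /proc/<pid>/maps content."""
    libs = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        name = os.path.basename(path)
        # Keep file-backed mappings that look like shared objects,
        # e.g. libfoo.so or libfoo.so.1.2 (a naming heuristic, not a guarantee).
        if path.startswith("/") and (name.endswith(".so") or ".so." in name):
            libs.add(path)
    return sorted(libs)


def loaded_shared_objects(pid="self"):
    """Snapshot the shared libraries mapped into a running process (Linux only)."""
    with open(f"/proc/{pid}/maps") as handle:
        return shared_objects_from_maps(handle.read())


# Hypothetical excerpt in the /proc/<pid>/maps format, for illustration.
SAMPLE_MAPS = """\
7f1c2a000000-7f1c2a028000 r--p 00000000 08:01 1234 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2a028000-7f1c2a1bd000 r-xp 00028000 08:01 1234 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2a400000-7f1c2a421000 rw-p 00000000 00:00 0
7f1c2a600000-7f1c2a610000 r--p 00000000 08:01 5678 /usr/lib/x86_64-linux-gnu/libssl.so.3
7ffd5e000000-7ffd5e021000 rw-p 00000000 00:00 0 [stack]
"""

if __name__ == "__main__":
    for lib in shared_objects_from_maps(SAMPLE_MAPS):
        print(lib)
```

A library that appears in such a snapshot but not in the static SBOM is a concrete instance of the gap between declared and actual components; mapping each path back to a package name and version would still require a separate lookup that this sketch does not attempt.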
A team could select 10 popular open-source projects and compare SBOMs generated by different tools (Syft, Tern, CycloneDX generators) against a ground-truth SBOM built by instrumenting the actual build process, quantifying completeness and accuracy. Alternatively, a team could prototype a runtime SBOM generator using eBPF or dynamic library interposition to log all shared libraries loaded during execution and compare the result to static SBOMs. Skills: software engineering, security, systems programming, data analysis.
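The tool comparison could be scored along the following lines. This is a sketch under stated assumptions: each SBOM has already been reduced to a set of (name, version) pairs, "completeness" is taken to mean the share of ground-truth components the tool reported (recall), and "accuracy" the share of reported components that are correct (precision). Those metric definitions, the function name `compare_sbom`, and the sample components are assumptions of this example, not something the brief specifies.

```python
def compare_sbom(generated, ground_truth):
    """Score a tool-generated SBOM against a ground-truth component set.

    Both arguments are iterables of (name, version) tuples.
    """
    generated = set(generated)
    ground_truth = set(ground_truth)
    correct = generated & ground_truth

    # Assumed definitions: completeness = recall, accuracy = precision.
    completeness = len(correct) / len(ground_truth) if ground_truth else 1.0
    accuracy = len(correct) / len(generated) if generated else 1.0

    return {
        "completeness": completeness,
        "accuracy": accuracy,
        "missing": sorted(ground_truth - generated),   # present in build, not reported
        "spurious": sorted(generated - ground_truth),  # reported, not present in build
    }


# Hypothetical component sets for a single project.
GROUND_TRUTH = {
    ("libalpha", "1.2.0"),
    ("libbeta", "0.9.1"),
    ("libgamma", "3.0.0"),
    ("libdelta", "2.4.5"),
}
TOOL_OUTPUT = {
    ("libalpha", "1.2.0"),
    ("libbeta", "0.9.1"),
    ("libgamma", "2.9.0"),  # right package, wrong version
}

if __name__ == "__main__":
    report = compare_sbom(TOOL_OUTPUT, GROUND_TRUTH)
    print(report)
```

Matching on exact (name, version) pairs counts a wrong version as both a missing and a spurious component, which is deliberately strict: for vulnerability tracking, a component reported at the wrong version can be as misleading as one that is absent.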
The "not-attempted" failure mode reflects that complete transitive dependency enumeration was not seriously addressed until the post-Log4Shell policy mandate: the problem was known but deprioritized. The "worsening" temporal tag reflects both deepening dependency graphs and increasing attack sophistication. This brief is distinct from digital-computational-reproducibility-dependency-rot, which covers scientific software reproducibility; it addresses security visibility across enterprise software supply chains. Cross-references: digital-computational-reproducibility-dependency-rot (dependency management in science), digital-food-chain-interoperability-failure (supply chain data visibility).
OpenSSF (Open Source Security Foundation), "The SBOM Landscape," 2023; NTIA Minimum Elements for SBOM, 2021; Log4Shell incident analysis (CVE-2021-44228); Linux Foundation Research, "The State of Software Bill of Materials," 2022; Enck & Williams, "Top Five Challenges in Software Supply Chain Security," IEEE Security & Privacy 20(2), 2022