From 44fefaac6409221fd2acd959b9cbc7799137faa0 Mon Sep 17 00:00:00 2001
From: Mark Atwood
Date: Tue, 28 Apr 2026 14:42:48 -0700
Subject: [PATCH 01/16] feat: SBOM generation and OmniBOR build provenance (CRA compliance)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds two complementary supply chain transparency targets to the wolfSSL
autotools build, and documentation covering both as a unified whole.

make sbom generates a Software Bill of Materials for EU Cyber Resilience
Act (CRA) compliance. It produces three files in the build directory:

  wolfssl-.cdx.json    CycloneDX 1.6 JSON
  wolfssl-.spdx.json   SPDX 2.3 JSON
  wolfssl-.spdx        SPDX 2.3 tag-value (validated)

The SPDX JSON is validated by pyspdxtools before the tag-value file is
written; make sbom fails if validation fails.

SBOM contents: package name/version, supplier, license (parsed from
LICENSING at generation time, not hardcoded), copyright, SHA-256 of the
installed library, CPE, PURL, download location, and build configuration
defines as a comment. Third-party dependencies (liboqs, libz, libxmss,
liblms) are included when enabled.

Implementation: the sbom target stages a make install into a build-local
_sbom_staging directory; scripts/gen-sbom (Python 3, stdlib only) hashes
the installed library and generates both SBOM formats; the target then
removes the staging directory. configure.ac detects python3, pyspdxtools,
and git via AC_PATH_PROG. install-sbom / uninstall-sbom targets install
the three files to $(datadir)/doc/wolfssl/. make clean removes all
generated files.

make bomsh generates an OmniBOR artifact dependency graph using the Bomsh
project (https://github.com/omnibor/bomsh), providing cryptographic
traceability from every built binary back to the exact set of source
files that produced it.

It runs a full clean rebuild under bomtrace3 (a patched strace, userspace
only; no kernel modifications required). bomtrace3 intercepts every
compiler execve() syscall and records inputs and outputs; it cannot
post-process an already-built tree, hence the clean rebuild.
bomsh_create_bom.py processes the raw logfile to produce the OmniBOR
artifact objects and metadata in omnibor/.

If bomsh_sbom.py is available and wolfssl-.spdx.json exists (from make
sbom), make bomsh annotates that SPDX document with a PERSISTENT-ID gitoid
ExternalRef, producing omnibor.wolfssl-.spdx.json. This enriched SPDX
bridges component identity and build provenance in a single document.

configure.ac detects bomtrace3, bomsh_create_bom.py, and bomsh_sbom.py via
AC_PATH_PROG. The raw logfile and conf file are written to the build
directory (not /tmp/) to avoid concurrent-build collisions, and removed by
make clean. install-bomsh / uninstall-bomsh targets install omnibor/ and
the enriched SPDX to $(datadir)/doc/wolfssl/.

doc/SBOM.md: unified reference covering both make sbom and make bomsh as
parts of a single supply chain transparency story: component identity
(what) and build provenance (how). Includes a combined workflow section
and full output file reference.

doc/CRA.md: product-integrator guide covering how to incorporate wolfSSL's
SBOM artefacts into a downstream product SBOM (SPDX ExternalDocumentRef
and CycloneDX component reference patterns), commercial license
concluded-field guidance, OmniBOR gitoid meaning, auditor handoff
checklist, and links to OpenSSF CRA and SBOM Everywhere SIG guidance
pages.

INSTALL: sections 19 (make sbom) and 20 (make bomsh).

README.md: brief SBOM/CRA and OmniBOR/Bomsh sections.

doc/include.am: SBOM.md and CRA.md added to dist_doc_DATA.
--- INSTALL | 73 +++++++++ Makefile.am | 110 +++++++++++++ README.md | 12 ++ configure.ac | 15 ++ doc/CRA.md | 225 +++++++++++++++++++++++++ doc/SBOM.md | 265 ++++++++++++++++++++++++++++++ doc/include.am | 5 +- scripts/gen-sbom | 417 +++++++++++++++++++++++++++++++++++++++++++++++ 8 files changed, 1121 insertions(+), 1 deletion(-) create mode 100644 doc/CRA.md create mode 100644 doc/SBOM.md create mode 100755 scripts/gen-sbom diff --git a/INSTALL b/INSTALL index 058b5a1edf6..17e37f56db7 100644 --- a/INSTALL +++ b/INSTALL @@ -313,3 +313,76 @@ We also have vcpkg ports for wolftpm, wolfmqtt and curl. Docker container, use `make rpm-docker`. In both cases the resulting packages are placed in the root directory of the project. + +19. Generating an SBOM (Software Bill of Materials) + + wolfSSL can generate a Software Bill of Materials for EU Cyber Resilience + Act (CRA) compliance after a normal build and install. + + Prerequisites: + - python3 (detected automatically by configure) + - pyspdxtools (pip install spdx-tools) + + Usage: + + $ ./configure + $ make + $ make sbom + + This produces three files in the build directory: + + wolfssl-.cdx.json CycloneDX 1.6 JSON + wolfssl-.spdx.json SPDX 2.3 JSON + wolfssl-.spdx SPDX 2.3 tag-value (validated by pyspdxtools) + + The SPDX JSON is validated by pyspdxtools before the tag-value file is + written; make sbom fails if validation fails. + + To install the SBOM files to $(datadir)/doc/wolfssl/: + + $ make install-sbom + + To remove installed SBOM files: + + $ make uninstall-sbom + + The generated files are removed by make clean. + + For details on the SBOM contents and CRA context, see doc/SBOM.md. + +20. Generating OmniBOR build artifact graph (Bomsh) + + wolfSSL supports generating an OmniBOR artifact dependency graph using + the Bomsh project (https://github.com/omnibor/bomsh). OmniBOR provides + cryptographic traceability from every binary artifact back to the exact + source files that produced it. 
+ + Prerequisites: + - bomtrace3 (build from https://github.com/omnibor/bomsh) + - bomsh_create_bom.py (from the bomsh scripts/ directory, in PATH) + - bomsh_sbom.py (optional; from bomsh scripts/, for SPDX enrichment) + + Both bomtrace3 and the Python scripts are detected by configure. + make bomsh fails with a clear error message if either required tool + is missing. + + Usage: + + $ ./configure + $ make + $ make bomsh + + This performs a clean rebuild of wolfSSL under bomtrace3 tracing, + then produces an OmniBOR artifact graph in omnibor/ in the build + directory. If bomsh_sbom.py is available and a wolfssl-.spdx.json + exists (from 'make sbom'), it also produces an OmniBOR-enriched SPDX + document omnibor.wolfssl-.spdx.json. + + To install: + + $ make install-bomsh # installs omnibor/ to $(datadir)/doc/wolfssl/ + $ make uninstall-bomsh # removes installed files + + The generated files are removed by make clean. + + See doc/SBOM.md for full details. diff --git a/Makefile.am b/Makefile.am index fce812babf5..5aa63cf4676 100644 --- a/Makefile.am +++ b/Makefile.am @@ -350,3 +350,113 @@ merge-clean: .cu.lo: $(LIBTOOL) --tag=CC --mode=compile $(COMPILE) --compile -o $@ $< -static + +# SBOM generation (CRA compliance) +SBOM_CDX = wolfssl-$(PACKAGE_VERSION).cdx.json +SBOM_SPDX = wolfssl-$(PACKAGE_VERSION).spdx.json +SBOM_SPDX_TV = wolfssl-$(PACKAGE_VERSION).spdx +sbomdir = $(datadir)/doc/$(PACKAGE) + +.PHONY: sbom install-sbom uninstall-sbom + +sbom: + @if test -z "$(PYTHON3)"; then \ + echo ""; \ + echo "ERROR: 'python3' not found in PATH. Cannot generate SBOM."; \ + echo ""; \ + exit 1; \ + fi + @if test -z "$(PYSPDXTOOLS)"; then \ + echo ""; \ + echo "ERROR: 'pyspdxtools' not found in PATH. 
Cannot validate SBOM."; \ + echo " Install: pip install spdx-tools"; \ + echo ""; \ + exit 1; \ + fi + rm -rf $(abs_builddir)/_sbom_staging + $(MAKE) install DESTDIR=$(abs_builddir)/_sbom_staging + $(PYTHON3) $(srcdir)/scripts/gen-sbom \ + --name $(PACKAGE) \ + --version $(PACKAGE_VERSION) \ + --license-file $(srcdir)/LICENSING \ + --options-h $(abs_builddir)/wolfssl/options.h \ + --lib $(abs_builddir)/_sbom_staging$(libdir)/libwolfssl.so.$(WOLFSSL_LIBRARY_VERSION_FIRST).$(WOLFSSL_LIBRARY_VERSION_SECOND).$(WOLFSSL_LIBRARY_VERSION_THIRD) \ + --dep-liboqs $(ENABLED_LIBOQS) \ + --dep-libxmss $(ENABLED_LIBXMSS) \ + --dep-libxmss-root '$(XMSS_ROOT)' \ + --dep-liblms $(ENABLED_LIBLMS) \ + --dep-liblms-root '$(LIBLMS_ROOT)' \ + --dep-libz $(ENABLED_LIBZ) \ + --git '$(GIT)' \ + --cdx-out $(abs_builddir)/$(SBOM_CDX) \ + --spdx-out $(abs_builddir)/$(SBOM_SPDX) + rm -rf $(abs_builddir)/_sbom_staging + $(PYSPDXTOOLS) --infile $(abs_builddir)/$(SBOM_SPDX) \ + --outfile $(abs_builddir)/$(SBOM_SPDX_TV) + +install-sbom: sbom + $(MKDIR_P) $(DESTDIR)$(sbomdir) + $(INSTALL_DATA) $(SBOM_CDX) $(DESTDIR)$(sbomdir)/ + $(INSTALL_DATA) $(SBOM_SPDX) $(DESTDIR)$(sbomdir)/ + $(INSTALL_DATA) $(SBOM_SPDX_TV) $(DESTDIR)$(sbomdir)/ + +uninstall-sbom: + -rm -f $(DESTDIR)$(sbomdir)/$(SBOM_CDX) + -rm -f $(DESTDIR)$(sbomdir)/$(SBOM_SPDX) + -rm -f $(DESTDIR)$(sbomdir)/$(SBOM_SPDX_TV) + +CLEANFILES += $(SBOM_CDX) $(SBOM_SPDX) $(SBOM_SPDX_TV) + +# Bomsh (OmniBOR build artifact tracing + SBOM enrichment) +BOMSH_RAWLOG_BASE = $(abs_builddir)/bomsh_raw_logfile +BOMSH_RAWLOG = $(BOMSH_RAWLOG_BASE).sha1 +BOMSH_CONF = $(abs_builddir)/_bomsh.conf +BOMSH_OMNIBORDIR = $(abs_builddir)/omnibor +BOMSH_SPDX_OUT = omnibor.wolfssl-$(PACKAGE_VERSION).spdx.json +bomshdir = $(datadir)/doc/$(PACKAGE) + +.PHONY: bomsh install-bomsh uninstall-bomsh + +bomsh: + @if test -z "$(BOMTRACE3)"; then \ + echo ""; \ + echo "ERROR: 'bomtrace3' not found in PATH. 
Cannot generate OmniBOR data."; \ + echo " Build bomtrace3 from: https://github.com/omnibor/bomsh"; \ + echo ""; \ + exit 1; \ + fi + @if test -z "$(BOMSH_CREATE_BOM)"; then \ + echo ""; \ + echo "ERROR: 'bomsh_create_bom.py' not found in PATH. Cannot process OmniBOR data."; \ + echo " Install from: https://github.com/omnibor/bomsh"; \ + echo ""; \ + exit 1; \ + fi + $(MAKE) clean + @printf 'raw_logfile=%s\n' '$(BOMSH_RAWLOG_BASE)' > '$(BOMSH_CONF)' + $(BOMTRACE3) -c '$(BOMSH_CONF)' $(MAKE) + $(BOMSH_CREATE_BOM) -r '$(BOMSH_RAWLOG)' -b '$(BOMSH_OMNIBORDIR)' + @if test -n "$(BOMSH_SBOM)" && test -f '$(abs_builddir)/wolfssl-$(PACKAGE_VERSION).spdx.json'; then \ + echo "Enriching SPDX with OmniBOR ExternalRefs..."; \ + $(BOMSH_SBOM) \ + -b '$(BOMSH_OMNIBORDIR)' \ + -i '$(abs_builddir)/wolfssl-$(PACKAGE_VERSION).spdx.json' \ + -f '$(abs_builddir)/src/.libs/libwolfssl.so.$(WOLFSSL_LIBRARY_VERSION_FIRST).$(WOLFSSL_LIBRARY_VERSION_SECOND).$(WOLFSSL_LIBRARY_VERSION_THIRD)' \ + -s spdx-json \ + -O '$(abs_builddir)'; \ + elif test -n "$(BOMSH_SBOM)"; then \ + echo "NOTE: run 'make sbom' first, then 'make bomsh' for OmniBOR-enriched SPDX."; \ + fi + +install-bomsh: bomsh + $(MKDIR_P) $(DESTDIR)$(bomshdir) + cp -r '$(BOMSH_OMNIBORDIR)' '$(DESTDIR)$(bomshdir)/omnibor' + @if test -f '$(abs_builddir)/$(BOMSH_SPDX_OUT)'; then \ + $(INSTALL_DATA) '$(abs_builddir)/$(BOMSH_SPDX_OUT)' '$(DESTDIR)$(bomshdir)/'; \ + fi + +uninstall-bomsh: + -rm -rf '$(DESTDIR)$(bomshdir)/omnibor' + -rm -f '$(DESTDIR)$(bomshdir)/$(BOMSH_SPDX_OUT)' + +CLEANFILES += $(BOMSH_RAWLOG) $(BOMSH_RAWLOG_BASE).sha256 $(BOMSH_CONF) $(BOMSH_SPDX_OUT) diff --git a/README.md b/README.md index ae1f22a08c7..a51a2d0b8e8 100644 --- a/README.md +++ b/README.md @@ -34,6 +34,18 @@ applications which have previously used the OpenSSL package. For a complete feature list, see [Chapter 4](https://www.wolfssl.com/docs/wolfssl-manual/ch4/) of the wolfSSL manual. 
+## SBOM / CRA Compliance + +wolfSSL provides a Software Bill of Materials (SBOM) for EU Cyber Resilience +Act (CRA) compliance via `make sbom`. See `doc/SBOM.md` for details. + +## OmniBOR / Bomsh + +wolfSSL supports generating an OmniBOR artifact dependency graph via +`make bomsh`, providing cryptographic traceability from the installed +library back to every source file that produced it. See `doc/SBOM.md` +for details. + ## Notes, Please Read ### Note 1 diff --git a/configure.ac b/configure.ac index 7a9ec151d13..fb02dce89a1 100644 --- a/configure.ac +++ b/configure.ac @@ -12220,6 +12220,21 @@ AC_SUBST([WOLFSSL_PREFIX_ABS]) AC_SUBST([WOLFSSL_LIBDIR_ABS]) AC_SUBST([WOLFSSL_INCLUDEDIR_ABS]) +# SBOM generation +AC_PATH_PROG([PYTHON3], [python3]) +AC_PATH_PROG([PYSPDXTOOLS], [pyspdxtools]) +AC_PATH_PROG([GIT], [git]) +AC_SUBST([ENABLED_LIBOQS]) +AC_SUBST([ENABLED_LIBXMSS]) +AC_SUBST([ENABLED_LIBLMS]) +AC_SUBST([ENABLED_LIBZ]) +AC_SUBST([LIBLMS_ROOT]) + +# Bomsh (OmniBOR build artifact tracing + SBOM enrichment) +AC_PATH_PROG([BOMTRACE3], [bomtrace3]) +AC_PATH_PROG([BOMSH_CREATE_BOM], [bomsh_create_bom.py]) +AC_PATH_PROG([BOMSH_SBOM], [bomsh_sbom.py]) + # FINAL AC_CONFIG_FILES([stamp-h], [echo timestamp > stamp-h]) AC_CONFIG_FILES([Makefile diff --git a/doc/CRA.md b/doc/CRA.md new file mode 100644 index 00000000000..b01a27f194b --- /dev/null +++ b/doc/CRA.md @@ -0,0 +1,225 @@ +# wolfSSL and the EU Cyber Resilience Act + +This guide is for product teams that ship a product containing wolfSSL and +need to satisfy EU Cyber Resilience Act (CRA) obligations related to software +component transparency and build traceability. + +## Background + +The CRA requires manufacturers of products with digital elements placed on +the EU market to identify and document the software components in those +products. 
The practical requirement is a machine-readable Software Bill of +Materials (SBOM) covering all open-source and third-party components, +following the NTIA minimum element guidelines. + +wolfSSL provides two complementary artefacts to help you meet this +requirement: + +| Artefact | Produced by | What it answers | +|---|---|---| +| SBOM (SPDX 2.3 + CycloneDX 1.6) | `make sbom` | *What* is in wolfSSL (identity, license, CPE, PURL, checksum) | +| OmniBOR artifact graph | `make bomsh` | *How* wolfSSL was built (cryptographic source-to-binary traceability) | + +For most CRA use cases the SBOM alone is sufficient. The OmniBOR graph +provides a deeper audit trail if your compliance posture requires it. + +## Quick Start + +```sh +./configure +make +make sbom # produces wolfssl-.spdx.json, .cdx.json, .spdx +make bomsh # optional: produces omnibor/ + OmniBOR-enriched SPDX +``` + +See `doc/SBOM.md` for prerequisites and full details on both targets. + +## What wolfSSL Provides + +After `make sbom`: + +``` +wolfssl-.spdx.json SPDX 2.3 JSON (machine processing) +wolfssl-.cdx.json CycloneDX 1.6 JSON (supply-chain tooling, VEX) +wolfssl-.spdx SPDX 2.3 tag-value (human review, archival) +``` + +After `make bomsh` (with `make sbom` already run): + +``` +omnibor/ OmniBOR artifact dependency graph +omnibor.wolfssl-.spdx.json SPDX enriched with PERSISTENT-ID gitoid +``` + +## Integrating wolfSSL into Your Product SBOM + +Your product SBOM needs to list wolfSSL as a component. The two standard +approaches are to reference wolfSSL's SBOM document from yours, or to copy +the wolfSSL package entry directly into your document. + +### SPDX: external document reference (recommended) + +Reference wolfSSL's SPDX document from your product's SPDX document using +`externalDocumentRefs`. This keeps the documents separate and lets wolfSSL's +SBOM stand as an independently verifiable artefact. 
+ +```json +{ + "externalDocumentRefs": [ + { + "externalDocumentId": "DocumentRef-wolfssl", + "spdxDocument": "https://wolfssl.com/sbom/wolfssl-.spdx.json", + "checksum": { + "algorithm": "SHA256", + "checksumValue": "" + } + } + ] +} +``` + +Then express the dependency in your `relationships` section: + +```json +{ + "spdxElementId": "SPDXRef-Package-YourProduct", + "relatedSpdxElement": "DocumentRef-wolfssl:SPDXRef-Package-wolfssl", + "relationshipType": "DYNAMIC_LINK" +} +``` + +Use `STATIC_LINK` if you link wolfSSL statically, `DYNAMIC_LINK` if you +use the shared library, or `CONTAINS` if you redistribute the source. + +Alternatively, copy the wolfSSL package entry from its SPDX document +directly into your own SPDX document and add the `DYNAMIC_LINK` / +`STATIC_LINK` relationship to your product package. + +### CycloneDX: component reference + +Include wolfSSL as a component in your CycloneDX BOM, referencing the +wolfSSL CycloneDX document via an external reference of type `bom`: + +```json +{ + "type": "library", + "name": "wolfssl", + "version": "", + "purl": "pkg:generic/wolfssl@", + "cpe": "cpe:2.3:a:wolfssl:wolfssl::*:*:*:*:*:*:*", + "licenses": [{ "license": { "id": "GPL-3.0-only" } }], + "externalReferences": [ + { + "type": "bom", + "url": "https://wolfssl.com/sbom/wolfssl-.cdx.json", + "hashes": [ + { + "alg": "SHA-256", + "content": "" + } + ] + } + ] +} +``` + +## Commercial License Users + +wolfSSL's SBOM records `licenseConcluded: GPL-3.0-only`, which reflects the +open-source license. If you are distributing a product under a wolfSSL +commercial license, update `licenseConcluded` in your copy of the package +entry (or in your own SBOM's reference to the wolfSSL package) to reflect +your actual license: + +```json +"licenseConcluded": "LicenseRef-wolfSSL-Commercial" +``` + +Do not modify the wolfSSL-published SBOM file itself; update the concluded +license in your product SBOM where you reference or embed the wolfSSL entry. 
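The override described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product SBOM is SPDX 2.3 JSON, it embeds a copy of the wolfSSL package entry in its `packages` array, and the helper name `set_concluded_license` and the sample document are hypothetical. Only the `licenseConcluded` field and the `LicenseRef-wolfSSL-Commercial` value come from the guidance in this section.

```python
import copy

def set_concluded_license(spdx_doc, package_name, license_expr):
    """Return a copy of an SPDX JSON document (as a dict) in which every
    package named package_name has licenseConcluded set to license_expr.
    The input document is left untouched."""
    updated = copy.deepcopy(spdx_doc)
    for pkg in updated.get("packages", []):
        if pkg.get("name") == package_name:
            pkg["licenseConcluded"] = license_expr
    return updated

# Hypothetical product SBOM fragment with an embedded wolfSSL package entry.
product_sbom = {
    "spdxVersion": "SPDX-2.3",
    "packages": [
        {"SPDXID": "SPDXRef-Package-wolfssl", "name": "wolfssl",
         "licenseConcluded": "GPL-3.0-only"},
        {"SPDXID": "SPDXRef-Package-YourProduct", "name": "your-product",
         "licenseConcluded": "NOASSERTION"},
    ],
}

patched = set_concluded_license(
    product_sbom, "wolfssl", "LicenseRef-wolfSSL-Commercial")
```

Working on a copy keeps the wolfSSL-published data intact, in line with the advice not to modify the published SBOM file itself.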
+ +## Build Provenance (OmniBOR) + +The CRA also encourages transparency about *how* software is built, not just +*what* it contains. Running `make bomsh` after `make sbom` produces an +OmniBOR artifact dependency graph and an enriched SPDX document: + +``` +omnibor.wolfssl-.spdx.json +``` + +This file is identical to `wolfssl-.spdx.json` except it adds a +`PERSISTENT-ID gitoid` entry to the wolfSSL package's `externalRefs`: + +```json +{ + "referenceCategory": "PERSISTENT-ID", + "referenceType": "gitoid", + "referenceLocator": "gitoid:blob:sha1:" +} +``` + +The `gitoid` is the entry point into the OmniBOR Merkle DAG stored in +`omnibor/`. A CRA auditor or supply-chain tool can follow that identifier +through the graph to verify that a specific `libwolfssl.so` binary was +produced from a specific, unmodified set of source files. + +Use `omnibor.wolfssl-.spdx.json` in place of the plain SPDX file +when you want to include this traceability claim in your product SBOM. + +## What to Give Your Auditor + +For a CRA conformity assessment, provide: + +| File | Purpose | +|---|---| +| `wolfssl-.spdx.json` | Machine-readable component identity (SPDX 2.3) | +| `wolfssl-.cdx.json` | Machine-readable component identity (CycloneDX 1.6) | +| `wolfssl-.spdx` | Human-readable tag-value form | +| `omnibor/` + `omnibor.wolfssl-.spdx.json` | Build traceability (optional, if bomsh was run) | + +If you have a product-level SBOM that references wolfSSL via +`ExternalDocumentRef` (SPDX) or a `bom` external reference (CycloneDX), +include that product SBOM alongside the wolfSSL artefacts. 
+ +## Further Reading + +### wolfSSL documentation + +- `doc/SBOM.md` — unified reference covering SBOM generation, OmniBOR/Bomsh + build provenance, combined workflow, output formats, and implementation notes + +### OpenSSF guidance + +- [CRA Brief Guide for OSS Developers](https://best.openssf.org/CRA-Brief-Guide-for-OSS-Developers.html) + — Clarifies when the CRA applies to open source projects and + maintainers, and what obligations fall on manufacturers integrating + OSS components into commercial products (i.e., you, if you ship a + product containing wolfSSL). + +- [SBOM in Compliance](https://sbom-catalog.openssf.org/sbom-compliance.html) + — OpenSSF SBOM Everywhere SIG survey of the global regulatory + landscape: CRA, NTIA minimum elements, US EO 14028, Germany TR-03183, + and others. Useful for understanding how wolfSSL's SBOM artefacts map + to each framework. + +- [Getting Started with SBOMs](https://sbom-catalog.openssf.org/getting-started) + — OpenSSF SBOM Everywhere SIG guidance on SBOM generation approaches + (build-integrated vs. separate tooling), phase selection, and + publication. wolfSSL's `make sbom` follows the build-integrated + approach recommended here. + +- [OpenSSF CRA Policy Hub](https://openssf.org/category/policy/cra/) + — Ongoing OpenSSF coverage of CRA developments, implementation + guidance, and community responses. + +- [SBOM Everywhere Wiki](https://sbom-catalog.openssf.org/) + — OpenSSF SIG home: tooling catalog, working group resources, naming + conventions, and cross-format guidance for SPDX and CycloneDX. 
+ +### Standards + +- SPDX 2.3 specification: https://spdx.github.io/spdx-spec/v2.3/ +- CycloneDX 1.6 specification: https://cyclonedx.org/specification/overview/ +- NTIA minimum elements for an SBOM: + https://www.ntia.gov/report/2021/minimum-elements-software-bill-materials-sbom diff --git a/doc/SBOM.md b/doc/SBOM.md new file mode 100644 index 00000000000..c2b3ab71198 --- /dev/null +++ b/doc/SBOM.md @@ -0,0 +1,265 @@ +# wolfSSL SBOM and Build Provenance + +wolfSSL provides two complementary artefacts for software supply chain +transparency: + +| Artefact | Target | Answers | +|---|---|---| +| SBOM (SPDX 2.3 + CycloneDX 1.6) | `make sbom` | *What* wolfSSL is: component identity, license, checksums, CPE, PURL | +| OmniBOR artifact graph | `make bomsh` | *How* wolfSSL was built: cryptographic source-to-binary traceability | + +Together they provide full coverage for the EU Cyber Resilience Act (CRA) +and similar supply chain transparency requirements. Each target is +independently useful; running both produces an enriched SPDX document that +bridges the two artefacts with a single `PERSISTENT-ID gitoid` reference. + +## Quick Start + +### Component identity only + +```sh +./configure +make +make sbom +``` + +Requires `python3` and `pyspdxtools` (`pip install spdx-tools`). + +### Full coverage: component identity + build provenance + +```sh +./configure +make +make sbom +make bomsh +``` + +Additionally requires `bomtrace3` and `bomsh_create_bom.py` in `PATH`. +See [Prerequisites for make bomsh](#prerequisites-for-make-bomsh) below. + +All tools are detected by `configure`; either target fails with a clear +error message if a required tool is missing. 
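Before running either target, it can help to check the same tools that `configure` looks for. The sketch below is a hypothetical pre-flight helper, not part of the wolfSSL build: the tool names are the ones passed to `AC_PATH_PROG` in `configure.ac`, while `missing_tools` and the two tuples are names invented for illustration.

```python
import shutil

# Tool names taken from the AC_PATH_PROG checks in configure.ac.
SBOM_TOOLS = ("python3", "pyspdxtools")
BOMSH_TOOLS = ("bomtrace3", "bomsh_create_bom.py")

def missing_tools(names, which=shutil.which):
    """Return the names that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests."""
    return [name for name in names if which(name) is None]

if __name__ == "__main__":
    for target, tools in (("make sbom", SBOM_TOOLS),
                          ("make bomsh", BOMSH_TOOLS)):
        missing = missing_tools(tools)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{target}: {status}")
```

`bomsh_sbom.py` is left out of the required list because the enrichment step is optional.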
+ +--- + +## make sbom + +### Output files + +`make sbom` produces three files in the build directory: + +| File | Format | Standard | Primary use | +|---|---|---|---| +| `wolfssl-.cdx.json` | JSON | CycloneDX 1.6 | Supply-chain tooling, VEX | +| `wolfssl-.spdx.json` | JSON | SPDX 2.3 | Machine processing | +| `wolfssl-.spdx` | Tag-value | SPDX 2.3 | Human review, archival | + +The `.spdx` tag-value file is produced by `pyspdxtools` converting the +`.spdx.json`. If the JSON fails SPDX validation, `make sbom` stops with +a non-zero exit and the tag-value file is not written. + +### SBOM contents + +Both formats contain the same information: + +| Field | Value | +|---|---| +| Name | `wolfssl` | +| Version | from `configure.ac` (`PACKAGE_VERSION`) | +| Type | library | +| Supplier | wolfSSL Inc. | +| License | detected from `LICENSING` file (currently `GPL-3.0-only`) | +| Copyright | `Copyright (C) 2006- wolfSSL Inc.` | +| SHA-256 | hash of the installed `libwolfssl.so.X.Y.Z` | +| CPE | `cpe:2.3:a:wolfssl:wolfssl::*:*:*:*:*:*:*` | +| PURL | `pkg:generic/wolfssl@` | +| Download location | `https://github.com/wolfSSL/wolfssl` | +| Third-party deps | none (wolfssl has no runtime dependencies in a default build) | + +#### License detection + +The license SPDX identifier is parsed from the `LICENSING` file at SBOM +generation time, not hardcoded. If the `LICENSING` file cannot be parsed, +`make sbom` warns and uses `NOASSERTION` rather than silently emitting a +wrong value. + +#### Dual licensing + +wolfSSL is available under `GPL-3.0-only` for open-source use, with a +commercial license for proprietary products. The SBOM reflects the +open-source license. Commercial licensees should update the +`licenseConcluded` field to `LicenseRef-wolfSSL-Commercial` or their +applicable SPDX expression when distributing under a commercial agreement. 
+ +#### External dependency version detection + +For dependencies with pkg-config support (`liboqs`, `libz`), the version is +queried via `pkg-config --modversion` at generation time. + +For dependencies without pkg-config (`libxmss`, `liblms`), wolfSSL is +typically built against a source checkout rather than an installed package. +The generator falls back to `git describe --tags --always` on the source +tree root (passed via `configure` as `XMSS_ROOT` / `LIBLMS_ROOT`). If the +source tree has no tags, `git describe` returns the short commit hash, which +is recorded as-is. If the source tree is unavailable or `git` is not found, +version is recorded as `NOASSERTION`. + +### Validating the SBOM manually + +```sh +# Validate SPDX JSON +pyspdxtools --infile wolfssl-.spdx.json + +# Convert to another format (e.g. RDF) +pyspdxtools --infile wolfssl-.spdx.json \ + --outfile wolfssl-.spdx.rdf +``` + +### Installing the SBOM + +```sh +make install-sbom # installs to $(datadir)/doc/wolfssl/ +make uninstall-sbom # removes the installed files +``` + +The generated files are removed by `make clean`. + +### Implementation notes + +SBOM generation is implemented in `scripts/gen-sbom` (Python 3, stdlib only) +and hooked into the autotools build via `Makefile.am` and `configure.ac`. +The script stages a `make install` into a temporary directory, hashes the +installed library, generates both SBOM formats, then removes the staging +directory. The `pyspdxtools` validation and conversion step runs after +generation and gates the build on SPDX conformance. + +--- + +## make bomsh + +`make bomsh` uses the [Bomsh](https://github.com/omnibor/bomsh) project to +trace the wolfSSL build under `bomtrace3` (a patched `strace`) and produce +an OmniBOR artifact dependency graph: a content-addressed Merkle DAG mapping +every built binary back to the exact set of source files that produced it. 
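For orientation, the content addressing behind a `gitoid:blob:sha1:` identifier can be sketched as follows. A gitoid of this form is computed the same way Git hashes a blob: SHA-1 over the header `blob <length>`, a NUL byte, and then the content. The function names are hypothetical, and which bytes Bomsh hashes for the value it records in the enriched SPDX document is not specified in this patch, so this illustrates the identifier scheme only.

```python
import hashlib

def gitoid_blob_sha1(data: bytes) -> str:
    """Return the gitoid URI for a byte string, hashed like a Git blob:
    SHA-1 over b"blob <length>" + NUL + content."""
    header = b"blob %d\0" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return f"gitoid:blob:sha1:{digest}"

def gitoid_of_file(path) -> str:
    """Compute the gitoid of a file's contents (reads the whole file)."""
    with open(path, "rb") as f:
        return gitoid_blob_sha1(f.read())

if __name__ == "__main__":
    # Should match `git hash-object` for the same bytes.
    print(gitoid_blob_sha1(b"hello\n"))
```

Because the identifier depends only on the bytes, two builds that produce byte-identical artifacts yield the same gitoid.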
+
+### Prerequisites for make bomsh
+
+| Tool | Required | Where to get it |
+|---|---|---|
+| `bomtrace3` | yes | Build from source: [omnibor/bomsh](https://github.com/omnibor/bomsh) |
+| `bomsh_create_bom.py` | yes | `scripts/` directory of the bomsh repo, placed in `PATH` |
+| `bomsh_sbom.py` | no | Same; needed only for SPDX enrichment step |
+
+`bomtrace3` is a patched `strace`: it is a userspace binary and requires no
+kernel modifications. It uses the standard `ptrace()` syscall available on
+any stock Linux kernel. The only environments where it may be unavailable
+are containers running with a hardened seccomp profile or systems with
+`kernel.yama.ptrace_scope=3`.
+
+#### Building bomtrace3
+
+```sh
+git clone https://github.com/omnibor/bomsh
+git clone https://github.com/strace/strace strace3
+cd strace3
+patch -p1 < ../bomsh/.devcontainer/patches/bomtrace3.patch
+cp ../bomsh/.devcontainer/src/*.[hc] src/
+./bootstrap && ./configure && make
+cp src/strace ~/.local/bin/bomtrace3
+```
+
+Place `bomsh_create_bom.py` (and optionally `bomsh_sbom.py`) from the bomsh
+`scripts/` directory somewhere in `PATH`.
+
+### What make bomsh does
+
+1. Runs `make clean` to ensure a full rebuild. This is necessary because
+   `bomtrace3` intercepts syscalls live during compilation and cannot
+   post-process an already-built tree.
+2. Writes a build-local `_bomsh.conf` redirecting the raw logfile out of
+   `/tmp/` to the build directory (avoids collisions between concurrent
+   builds).
+3. Runs `bomtrace3 -c _bomsh.conf make`, which rebuilds wolfSSL under strace
+   tracing, recording every compiler invocation with its inputs and outputs.
+4. Runs `bomsh_create_bom.py` to process the raw logfile and produce the
+   OmniBOR artifact graph in `omnibor/`.
+5. If `bomsh_sbom.py` is available **and** `wolfssl-.spdx.json`
+   exists (from `make sbom`), annotates that SPDX document with OmniBOR
+   `ExternalRef` identifiers, producing `omnibor.wolfssl-.spdx.json`.
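The steps above can be sketched as a dry-run plan that builds the command lines without executing anything. The command-line flags are the ones used by the `bomsh` recipe in `Makefile.am` in this patch, and the order follows that recipe. The function `plan_bomsh`, its parameters, and the example paths are hypothetical, and the tools are assumed to be on `PATH`.

```python
def plan_bomsh(build_dir, version, lib_path,
               have_bomsh_sbom, spdx_available, make="make"):
    """Return (conf_text, commands) mirroring the `bomsh` recipe.

    conf_text is what gets written to <build_dir>/_bomsh.conf after the
    clean step; commands is a list of argv lists in execution order.
    Nothing is run and nothing is written."""
    rawlog_base = f"{build_dir}/bomsh_raw_logfile"
    conf_path = f"{build_dir}/_bomsh.conf"
    omnibor_dir = f"{build_dir}/omnibor"
    spdx_json = f"{build_dir}/wolfssl-{version}.spdx.json"

    conf_text = f"raw_logfile={rawlog_base}\n"
    commands = [
        [make, "clean"],                        # force a full rebuild
        ["bomtrace3", "-c", conf_path, make],   # traced rebuild
        ["bomsh_create_bom.py",                 # build the artifact graph
         "-r", rawlog_base + ".sha1", "-b", omnibor_dir],
    ]
    # Optional enrichment: needs bomsh_sbom.py and an existing SPDX JSON.
    if have_bomsh_sbom and spdx_available:
        commands.append(
            ["bomsh_sbom.py", "-b", omnibor_dir, "-i", spdx_json,
             "-f", lib_path, "-s", "spdx-json", "-O", build_dir])
    return conf_text, commands

if __name__ == "__main__":
    conf, cmds = plan_bomsh("/build", "X.Y.Z",
                            "/build/src/.libs/libwolfssl.so",
                            have_bomsh_sbom=True, spdx_available=True)
    for argv in cmds:
        print(" ".join(argv))
```

Keeping the plan as data makes it easy to review the exact invocations before committing to a full traced rebuild.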
+ +### Output files + +| Path | Description | +|---|---| +| `omnibor/objects/` | OmniBOR artifact objects (SHA-1 content-addressed dependency graph) | +| `omnibor/metadata/bomsh/` | Bomsh build metadata | +| `omnibor.wolfssl-.spdx.json` | SPDX 2.3 JSON enriched with OmniBOR `ExternalRef` (produced only when both `bomsh_sbom.py` and `wolfssl-.spdx.json` are present) | + +The `PERSISTENT-ID gitoid` entry added to the enriched SPDX looks like: + +```json +{ + "referenceCategory": "PERSISTENT-ID", + "referenceType": "gitoid", + "referenceLocator": "gitoid:blob:sha1:" +} +``` + +This sits alongside the existing CPE and PURL `externalRefs` on the wolfSSL +package entry and is the key into the OmniBOR Merkle DAG in `omnibor/`. + +### Installing + +```sh +make install-bomsh # installs omnibor/ and enriched SPDX to $(datadir)/doc/wolfssl/ +make uninstall-bomsh # removes installed files +``` + +The generated files are removed by `make clean`. + +### Implementation notes + +`make bomsh` runs a full clean rebuild under `bomtrace3` on every invocation. +The ~20% runtime overhead of `bomtrace3` means the rebuild takes roughly +1.2× the normal build time. + +The raw logfile (`bomsh_raw_logfile.sha1`) and conf file (`_bomsh.conf`) +are written to the build directory and removed by `make clean`. The +`omnibor/` tree is also removed by `make clean`. + +--- + +## Combined workflow + +Running both targets produces the complete set of supply chain transparency +artefacts. `make bomsh` automatically enriches the SPDX document from +`make sbom` if it is present; there is no need to pass any extra flags. 
+ +```sh +./configure +make +make sbom # component identity +make bomsh # build provenance + enriched SPDX +``` + +All output files: + +| File | From | Description | +|---|---|---| +| `wolfssl-.cdx.json` | `make sbom` | CycloneDX 1.6 component SBOM | +| `wolfssl-.spdx.json` | `make sbom` | SPDX 2.3 JSON component SBOM | +| `wolfssl-.spdx` | `make sbom` | SPDX 2.3 tag-value, validated | +| `omnibor/` | `make bomsh` | OmniBOR artifact dependency graph | +| `omnibor.wolfssl-.spdx.json` | `make bomsh` | SPDX 2.3 JSON enriched with OmniBOR gitoid | + +The enriched SPDX is the document to hand to a CRA auditor or downstream +consumer when you want both component identity and build traceability in one +file. + +--- + +## Using wolfSSL's artefacts in a product + +If you are shipping a product that includes wolfSSL and need to satisfy CRA +obligations, see `doc/CRA.md` for guidance on integrating these artefacts +into your product SBOM and what to provide to a conformity assessor. diff --git a/doc/include.am b/doc/include.am index 92f2c5b66b7..13bb8bf9ed7 100644 --- a/doc/include.am +++ b/doc/include.am @@ -3,7 +3,9 @@ # All paths should be given relative to the root dist_doc_DATA+= doc/README.txt \ - doc/QUIC.md + doc/QUIC.md \ + doc/SBOM.md \ + doc/CRA.md dox-pdf: @@ -21,3 +23,4 @@ clean-local: -rm -rf doc/html/ -rm -f doc/refman.pdf -rm -f doc/doxygen_warnings + -rm -rf $(BOMSH_OMNIBORDIR) diff --git a/scripts/gen-sbom b/scripts/gen-sbom new file mode 100755 index 00000000000..ad893e2b6e6 --- /dev/null +++ b/scripts/gen-sbom @@ -0,0 +1,417 @@ +#!/usr/bin/env python3 +"""Generate CycloneDX 1.6 and SPDX 2.3 SBOMs for wolfssl.""" + +import argparse +import hashlib +import json +import re +import subprocess +import sys +import uuid +from datetime import datetime, timezone + + +# Known metadata for optional external dependencies. +# Version is detected at runtime via pkg-config; falls back to None. 
+DEP_META = { + 'liboqs': { + 'name': 'liboqs', + 'supplier': 'Open Quantum Safe', + 'license': 'MIT', + 'download': 'https://github.com/open-quantum-safe/liboqs', + 'pkgconfig': 'liboqs', + 'purl': lambda v: f'pkg:github/open-quantum-safe/liboqs@{v}', + }, + 'libxmss': { + 'name': 'xmss-reference', + 'supplier': 'XMSS reference implementation authors', + 'license': 'CC0-1.0', + 'download': 'https://github.com/XMSS/xmss-reference', + 'pkgconfig': None, + 'purl': lambda v: f'pkg:github/XMSS/xmss-reference@{v}', + }, + 'liblms': { + 'name': 'hash-sigs', + 'supplier': 'Cisco Systems', + 'license': 'MIT', + 'download': 'https://github.com/cisco/hash-sigs', + 'pkgconfig': None, + 'purl': lambda v: f'pkg:github/cisco/hash-sigs@{v}', + }, + 'libz': { + 'name': 'zlib', + 'supplier': 'Jean-loup Gailly and Mark Adler', + 'license': 'Zlib', + 'download': 'https://github.com/madler/zlib', + 'pkgconfig': 'zlib', + 'purl': lambda v: f'pkg:generic/zlib@{v}', + }, +} + + +def detect_license(license_file): + """Parse LICENSING file and return an SPDX license ID. + + Looks for 'GNU General Public License version N' and whether + 'or later' / 'or any later version' follows. Returns None and + prints a warning if the file cannot be parsed. 
+ """ + try: + text = open(license_file).read() + except OSError as e: + print(f"WARNING: cannot read license file {license_file}: {e}", + file=sys.stderr) + return None + + m = re.search( + r'gnu general public license\s+version\s+(\d+)', + text, re.IGNORECASE + ) + if not m: + print(f"WARNING: no GPL version found in {license_file}", + file=sys.stderr) + return None + + version = m.group(1) + excerpt = text[m.end():m.end() + 100] + if re.search(r'or\s+(any\s+)?later', excerpt, re.IGNORECASE): + return f'GPL-{version}.0-or-later' + return f'GPL-{version}.0-only' + + +def sha256_file(path): + h = hashlib.sha256() + try: + with open(path, 'rb') as f: + for chunk in iter(lambda: f.read(65536), b''): + h.update(chunk) + except OSError as e: + sys.exit(f"ERROR: cannot read library for hashing: {e}") + return h.hexdigest() + + +GIT_BIN = None + + +def pkgconfig_version(pkgname): + """Return version string from pkg-config, or None if unavailable.""" + try: + r = subprocess.run( + ['pkg-config', '--modversion', pkgname], + capture_output=True, text=True + ) + if r.returncode == 0: + return r.stdout.strip() + except FileNotFoundError: + pass + return None + + +def git_describe_version(root, git_bin): + """Return version from git describe --tags --always, or None.""" + if not root or not git_bin: + return None + try: + r = subprocess.run( + [git_bin, '-C', root, 'describe', '--tags', '--always'], + capture_output=True, text=True + ) + if r.returncode == 0: + return r.stdout.strip() + except FileNotFoundError: + pass + return None + + +def dep_version(key): + pkgname = DEP_META[key]['pkgconfig'] + if pkgname: + return pkgconfig_version(pkgname) + git_root = DEP_META[key].get('git_root') + if git_root: + return git_describe_version(git_root, GIT_BIN) + return None + + +def parse_options_h(path): + """Parse wolfssl/options.h and return sorted deduplicated list of + (name, value) pairs for every #define found.""" + try: + text = open(path).read() + except OSError as e: + 
print(f"WARNING: cannot read options.h {path}: {e}", file=sys.stderr) + return [] + + defines = {} + for m in re.finditer(r'^#define[ \t]+(\w+)(?:[ \t]+(.+))?$', text, re.MULTILINE): + defines[m.group(1)] = (m.group(2) or '').strip() + return sorted(defines.items()) + + +def cdx_dep_component(key): + """Return (bom_ref, component_dict) for a CDX dependency component.""" + meta = DEP_META[key] + version = dep_version(key) + bom_ref = str(uuid.uuid4()) + comp = { + 'bom-ref': bom_ref, + 'type': 'library', + 'supplier': {'name': meta['supplier']}, + 'name': meta['name'], + 'licenses': [{'license': {'id': meta['license']}}], + 'externalReferences': [{'type': 'vcs', 'url': meta['download']}], + } + if version: + comp['version'] = version + comp['purl'] = meta['purl'](version) + else: + print(f"WARNING: version unknown for {meta['name']}; " + "omitting version and purl", file=sys.stderr) + return bom_ref, comp + + +def spdx_dep_package(key): + """Return (spdx_id, package_dict) for an SPDX dependency package.""" + meta = DEP_META[key] + version = dep_version(key) + spdx_id = 'SPDXRef-Package-' + re.sub(r'[^A-Za-z0-9.]', '', meta['name']) + pkg = { + 'SPDXID': spdx_id, + 'name': meta['name'], + 'versionInfo': version if version else 'NOASSERTION', + 'supplier': f"Organization: {meta['supplier']}", + 'downloadLocation': meta['download'], + 'filesAnalyzed': False, + 'licenseConcluded': meta['license'], + 'licenseDeclared': meta['license'], + 'copyrightText': 'NOASSERTION', + } + if version: + pkg['externalRefs'] = [{ + 'referenceCategory': 'PACKAGE-MANAGER', + 'referenceType': 'purl', + 'referenceLocator': meta['purl'](version), + }] + return spdx_id, pkg + + +def generate_cdx(name, version, supplier, license_id, lib_hash, + timestamp, serial, enabled_deps, build_props): + year = datetime.now(timezone.utc).year + bom_ref = str(uuid.uuid4()) + + dep_bom_refs = [] + components = [] + for key in enabled_deps: + ref, comp = cdx_dep_component(key) + dep_bom_refs.append(ref) + 
components.append(comp) + + properties = [ + {'name': f'wolfssl:build:{k}', 'value': v if v else '1'} + for k, v in build_props + ] + + return { + '$schema': 'http://cyclonedx.org/schema/bom-1.6.schema.json', + 'bomFormat': 'CycloneDX', + 'specVersion': '1.6', + 'serialNumber': f'urn:uuid:{serial}', + 'version': 1, + 'metadata': { + 'timestamp': timestamp, + 'tools': { + 'components': [{ + 'type': 'application', + 'author': 'wolfSSL Inc.', + 'name': 'wolfssl-sbom-gen', + 'version': '1.0' + }] + }, + 'component': { + 'bom-ref': bom_ref, + 'type': 'library', + 'supplier': {'name': supplier}, + 'name': name, + 'version': version, + 'licenses': [{'license': {'id': license_id}}], + 'copyright': f'Copyright (C) 2006-{year} wolfSSL Inc.', + 'cpe': f'cpe:2.3:a:wolfssl:{name}:{version}:*:*:*:*:*:*:*', + 'purl': f'pkg:generic/{name}@{version}', + 'hashes': [{'alg': 'SHA-256', 'content': lib_hash}], + 'externalReferences': [{ + 'type': 'vcs', + 'url': 'https://github.com/wolfSSL/wolfssl' + }], + 'properties': properties, + } + }, + 'components': components, + 'dependencies': [ + {'ref': bom_ref, 'dependsOn': dep_bom_refs}, + *[{'ref': r, 'dependsOn': []} for r in dep_bom_refs], + ], + } + + +def generate_spdx(name, version, supplier, license_id, lib_hash, + timestamp, doc_ns_uuid, enabled_deps, build_props): + year = datetime.now(timezone.utc).year + + build_defines = ', '.join(k for k, _ in build_props) + wolfssl_pkg = { + 'SPDXID': 'SPDXRef-Package-wolfssl', + 'name': name, + 'versionInfo': version, + 'supplier': f'Organization: {supplier}', + 'downloadLocation': 'https://github.com/wolfSSL/wolfssl', + 'filesAnalyzed': False, + 'checksums': [{'algorithm': 'SHA256', 'checksumValue': lib_hash}], + 'licenseConcluded': license_id, + 'licenseDeclared': license_id, + 'copyrightText': f'Copyright (C) 2006-{year} wolfSSL Inc.', + 'comment': f'Build configuration defines: {build_defines}', + 'externalRefs': [ + { + 'referenceCategory': 'SECURITY', + 'referenceType': 'cpe23Type', + 
'referenceLocator': ( + f'cpe:2.3:a:wolfssl:{name}:{version}:*:*:*:*:*:*:*' + ) + }, + { + 'referenceCategory': 'PACKAGE-MANAGER', + 'referenceType': 'purl', + 'referenceLocator': f'pkg:generic/{name}@{version}' + } + ], + } + + packages = [wolfssl_pkg] + relationships = [{ + 'spdxElementId': 'SPDXRef-DOCUMENT', + 'relatedSpdxElement': 'SPDXRef-Package-wolfssl', + 'relationshipType': 'DESCRIBES', + }] + + for key in enabled_deps: + spdx_id, pkg = spdx_dep_package(key) + packages.append(pkg) + relationships.append({ + 'spdxElementId': 'SPDXRef-Package-wolfssl', + 'relatedSpdxElement': spdx_id, + 'relationshipType': 'DEPENDS_ON', + }) + + return { + 'spdxVersion': 'SPDX-2.3', + 'dataLicense': 'CC0-1.0', + 'SPDXID': 'SPDXRef-DOCUMENT', + 'name': f'{name}-{version}', + 'documentNamespace': ( + f'https://wolfssl.com/sbom/{name}-{version}-{doc_ns_uuid}' + ), + 'creationInfo': { + 'licenseListVersion': '3.28', + 'creators': [ + f'Organization: {supplier}', + 'Tool: wolfssl-sbom-gen-1.0' + ], + 'created': timestamp, + }, + 'packages': packages, + 'relationships': relationships, + } + + +def main(): + parser = argparse.ArgumentParser( + description='Generate CycloneDX and SPDX SBOMs for wolfssl' + ) + parser.add_argument('--name', required=True, help='Package name') + parser.add_argument('--version', required=True, help='Package version') + parser.add_argument('--supplier', default='wolfSSL Inc.', + help='Supplier name (default: wolfSSL Inc.)') + parser.add_argument('--lib', required=True, + help='Path to libwolfssl.so.X.Y.Z for SHA-256 hashing') + parser.add_argument('--license-file', required=True, + help='Path to LICENSING file for SPDX ID detection') + parser.add_argument('--options-h', required=True, + help='Path to wolfssl/options.h for build config') + parser.add_argument('--dep-liboqs', default='no', + help='yes if built with --with-liboqs') + parser.add_argument('--dep-libxmss', default='no', + help='yes if built with --with-libxmss') + 
parser.add_argument('--dep-libxmss-root', default='', + help='Path to xmss-reference source tree root') + parser.add_argument('--dep-liblms', default='no', + help='yes if built with --with-liblms') + parser.add_argument('--dep-liblms-root', default='', + help='Path to hash-sigs source tree root') + parser.add_argument('--dep-libz', default='no', + help='yes if built with --with-libz') + parser.add_argument('--git', default='', + help='Path to git binary for version detection') + parser.add_argument('--cdx-out', required=True, + help='Output path for CycloneDX JSON') + parser.add_argument('--spdx-out', required=True, + help='Output path for SPDX JSON') + args = parser.parse_args() + + global GIT_BIN + GIT_BIN = args.git or None + + if args.dep_libxmss_root: + DEP_META['libxmss']['git_root'] = args.dep_libxmss_root + if args.dep_liblms_root: + DEP_META['liblms']['git_root'] = args.dep_liblms_root + + enabled_deps = [ + key for key, flag in [ + ('liboqs', args.dep_liboqs), + ('libxmss', args.dep_libxmss), + ('liblms', args.dep_liblms), + ('libz', args.dep_libz), + ] + if flag.lower() == 'yes' + ] + + license_id = detect_license(args.license_file) + if license_id is None: + print("WARNING: license could not be determined; using NOASSERTION", + file=sys.stderr) + license_id = 'NOASSERTION' + + build_props = parse_options_h(args.options_h) + lib_hash = sha256_file(args.lib) + timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ') + serial = str(uuid.uuid4()) + doc_ns_uuid = str(uuid.uuid4()) + + cdx = generate_cdx( + args.name, args.version, args.supplier, + license_id, lib_hash, timestamp, serial, + enabled_deps, build_props, + ) + spdx = generate_spdx( + args.name, args.version, args.supplier, + license_id, lib_hash, timestamp, doc_ns_uuid, + enabled_deps, build_props, + ) + + try: + with open(args.cdx_out, 'w') as f: + json.dump(cdx, f, indent=2) + f.write('\n') + with open(args.spdx_out, 'w') as f: + json.dump(spdx, f, indent=2) + f.write('\n') + 
except OSError as e: + sys.exit(f"ERROR: cannot write SBOM output: {e}") + + print(f"Generated: {args.cdx_out}") + print(f"Generated: {args.spdx_out}") + + +if __name__ == '__main__': + main() From cdf8ee2f7ceab7ef7a2f1218f25580ee73374798 Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Wed, 29 Apr 2026 13:23:35 +0300 Subject: [PATCH 02/16] fix(sbom): library discovery, reproducibility, install lifecycle Glob-match libwolfssl.* across platforms, honour SOURCE_DATE_EPOCH, stage installs with trap cleanup, and parse options.h for build flags. Signed-off-by: Sameeh Jubran --- Makefile.am | 106 +++++++++++++++++++++++++++++++++++++-------- doc/CRA.md | 41 +++++++++++++++--- doc/SBOM.md | 25 ++++++++--- scripts/gen-sbom | 81 ++++++++++++++++++++++++---------- scripts/include.am | 5 +++ 5 files changed, 205 insertions(+), 53 deletions(-) diff --git a/Makefile.am b/Makefile.am index 5aa63cf4676..4b9eaf21b3c 100644 --- a/Makefile.am +++ b/Makefile.am @@ -359,6 +359,12 @@ sbomdir = $(datadir)/doc/$(PACKAGE) .PHONY: sbom install-sbom uninstall-sbom +# Stage a `make install` into a private tree, discover the installed library +# artifact (shared or static, ELF/Mach-O/PE), hash it, generate SPDX+CDX, +# validate the SPDX, then convert to tag-value. The staging tree is removed +# unconditionally via `trap`, even if any step fails. Honors SOURCE_DATE_EPOCH +# for reproducible builds (set by the recipe to `git log -1 --format=%ct` when +# unset and a git tree is available). 
sbom: @if test -z "$(PYTHON3)"; then \ echo ""; \ @@ -373,14 +379,44 @@ sbom: echo ""; \ exit 1; \ fi - rm -rf $(abs_builddir)/_sbom_staging - $(MAKE) install DESTDIR=$(abs_builddir)/_sbom_staging + @rm -rf $(abs_builddir)/_sbom_staging + @set -e; \ + trap 'rm -rf $(abs_builddir)/_sbom_staging' EXIT INT TERM HUP; \ + $(MAKE) install DESTDIR=$(abs_builddir)/_sbom_staging; \ + sbom_lib=""; \ + for lib in \ + "$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.so.[0-9]* \ + "$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.so \ + "$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.[0-9]*.dylib \ + "$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.dylib \ + "$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.dll \ + "$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.dll.a \ + "$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.lib \ + "$(abs_builddir)/_sbom_staging$(libdir)"/wolfssl.lib \ + "$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.a; do \ + if test -f "$$lib"; then sbom_lib="$$lib"; break; fi; \ + done; \ + if test -z "$$sbom_lib"; then \ + echo ""; \ + echo "ERROR: No installed wolfSSL library artifact found for SBOM."; \ + echo " Searched in $(abs_builddir)/_sbom_staging$(libdir)"; \ + echo " (configure with --enable-shared or --enable-static)"; \ + echo ""; \ + exit 1; \ + fi; \ + echo "SBOM: hashing $$sbom_lib"; \ + if test -z "$${SOURCE_DATE_EPOCH:-}" && test -n "$(GIT)" && \ + test -d "$(srcdir)/.git"; then \ + SOURCE_DATE_EPOCH=`$(GIT) -C "$(srcdir)" log -1 --format=%ct 2>/dev/null || echo`; \ + export SOURCE_DATE_EPOCH; \ + fi; \ $(PYTHON3) $(srcdir)/scripts/gen-sbom \ --name $(PACKAGE) \ --version $(PACKAGE_VERSION) \ --license-file $(srcdir)/LICENSING \ + --license-override '$(SBOM_LICENSE_OVERRIDE)' \ --options-h $(abs_builddir)/wolfssl/options.h \ - --lib $(abs_builddir)/_sbom_staging$(libdir)/libwolfssl.so.$(WOLFSSL_LIBRARY_VERSION_FIRST).$(WOLFSSL_LIBRARY_VERSION_SECOND).$(WOLFSSL_LIBRARY_VERSION_THIRD) \ + --lib "$$sbom_lib" \ 
--dep-liboqs $(ENABLED_LIBOQS) \ --dep-libxmss $(ENABLED_LIBXMSS) \ --dep-libxmss-root '$(XMSS_ROOT)' \ @@ -389,8 +425,7 @@ sbom: --dep-libz $(ENABLED_LIBZ) \ --git '$(GIT)' \ --cdx-out $(abs_builddir)/$(SBOM_CDX) \ - --spdx-out $(abs_builddir)/$(SBOM_SPDX) - rm -rf $(abs_builddir)/_sbom_staging + --spdx-out $(abs_builddir)/$(SBOM_SPDX); \ $(PYSPDXTOOLS) --infile $(abs_builddir)/$(SBOM_SPDX) \ --outfile $(abs_builddir)/$(SBOM_SPDX_TV) @@ -417,6 +452,11 @@ bomshdir = $(datadir)/doc/$(PACKAGE) .PHONY: bomsh install-bomsh uninstall-bomsh +# Self-contained: the traced rebuild also regenerates the SBOM, so users +# can run `make bomsh` directly without first running `make sbom`. This is +# also what makes the combined workflow correct: `make sbom` writes the SPDX, +# but `make bomsh` issues `make clean` (which removes it via CLEANFILES), so +# the only reliable way to enrich is to regenerate after the traced build. bomsh: @if test -z "$(BOMTRACE3)"; then \ echo ""; \ @@ -436,21 +476,41 @@ bomsh: @printf 'raw_logfile=%s\n' '$(BOMSH_RAWLOG_BASE)' > '$(BOMSH_CONF)' $(BOMTRACE3) -c '$(BOMSH_CONF)' $(MAKE) $(BOMSH_CREATE_BOM) -r '$(BOMSH_RAWLOG)' -b '$(BOMSH_OMNIBORDIR)' - @if test -n "$(BOMSH_SBOM)" && test -f '$(abs_builddir)/wolfssl-$(PACKAGE_VERSION).spdx.json'; then \ - echo "Enriching SPDX with OmniBOR ExternalRefs..."; \ - $(BOMSH_SBOM) \ - -b '$(BOMSH_OMNIBORDIR)' \ - -i '$(abs_builddir)/wolfssl-$(PACKAGE_VERSION).spdx.json' \ - -f '$(abs_builddir)/src/.libs/libwolfssl.so.$(WOLFSSL_LIBRARY_VERSION_FIRST).$(WOLFSSL_LIBRARY_VERSION_SECOND).$(WOLFSSL_LIBRARY_VERSION_THIRD)' \ - -s spdx-json \ - -O '$(abs_builddir)'; \ - elif test -n "$(BOMSH_SBOM)"; then \ - echo "NOTE: run 'make sbom' first, then 'make bomsh' for OmniBOR-enriched SPDX."; \ - fi + $(MAKE) sbom + @if test -z "$(BOMSH_SBOM)"; then \ + echo "NOTE: bomsh_sbom.py not in PATH; skipping SPDX enrichment."; \ + echo " The OmniBOR graph in $(BOMSH_OMNIBORDIR) is still produced."; \ + exit 0; \ + fi; \ + 
bomsh_artifact=""; \ + for lib in \ + $(abs_builddir)/src/.libs/libwolfssl.so.[0-9]* \ + $(abs_builddir)/src/.libs/libwolfssl.so \ + $(abs_builddir)/src/.libs/libwolfssl.[0-9]*.dylib \ + $(abs_builddir)/src/.libs/libwolfssl.dylib \ + $(abs_builddir)/src/.libs/libwolfssl.a \ + $(abs_builddir)/src/libwolfssl.a; do \ + if test -f "$$lib"; then bomsh_artifact="$$lib"; break; fi; \ + done; \ + if test -z "$$bomsh_artifact"; then \ + echo "NOTE: no built libwolfssl artifact found in $(abs_builddir)/src/.libs/"; \ + echo " OmniBOR graph produced; SPDX enrichment skipped."; \ + exit 0; \ + fi; \ + echo "Enriching SPDX with OmniBOR ExternalRefs (artifact: $$bomsh_artifact)..."; \ + $(BOMSH_SBOM) \ + -b '$(BOMSH_OMNIBORDIR)' \ + -i '$(abs_builddir)/$(SBOM_SPDX)' \ + -f "$$bomsh_artifact" \ + -s spdx-json \ + -O '$(abs_builddir)' install-bomsh: bomsh - $(MKDIR_P) $(DESTDIR)$(bomshdir) - cp -r '$(BOMSH_OMNIBORDIR)' '$(DESTDIR)$(bomshdir)/omnibor' + $(MKDIR_P) '$(DESTDIR)$(bomshdir)/omnibor' + @if test -d '$(BOMSH_OMNIBORDIR)'; then \ + (cd '$(BOMSH_OMNIBORDIR)' && tar cf - .) | \ + (cd '$(DESTDIR)$(bomshdir)/omnibor' && tar xf -); \ + fi @if test -f '$(abs_builddir)/$(BOMSH_SPDX_OUT)'; then \ $(INSTALL_DATA) '$(abs_builddir)/$(BOMSH_SPDX_OUT)' '$(DESTDIR)$(bomshdir)/'; \ fi @@ -460,3 +520,13 @@ uninstall-bomsh: -rm -f '$(DESTDIR)$(bomshdir)/$(BOMSH_SPDX_OUT)' CLEANFILES += $(BOMSH_RAWLOG) $(BOMSH_RAWLOG_BASE).sha256 $(BOMSH_CONF) $(BOMSH_SPDX_OUT) + +# Hook SBOM/Bomsh cleanup into `make uninstall` so packagers don't leave +# stale artefacts behind after install-sbom/install-bomsh. rm -f is +# idempotent, so this is safe whether or not those targets were ever run. 
+uninstall-hook: + -rm -f $(DESTDIR)$(sbomdir)/$(SBOM_CDX) + -rm -f $(DESTDIR)$(sbomdir)/$(SBOM_SPDX) + -rm -f $(DESTDIR)$(sbomdir)/$(SBOM_SPDX_TV) + -rm -rf $(DESTDIR)$(bomshdir)/omnibor + -rm -f $(DESTDIR)$(bomshdir)/$(BOMSH_SPDX_OUT) diff --git a/doc/CRA.md b/doc/CRA.md index b01a27f194b..cfdcd46b18a 100644 --- a/doc/CRA.md +++ b/doc/CRA.md @@ -125,18 +125,45 @@ wolfSSL CycloneDX document via an external reference of type `bom`: ## Commercial License Users -wolfSSL's SBOM records `licenseConcluded: GPL-3.0-only`, which reflects the -open-source license. If you are distributing a product under a wolfSSL -commercial license, update `licenseConcluded` in your copy of the package -entry (or in your own SBOM's reference to the wolfSSL package) to reflect -your actual license: +wolfSSL's published SBOM records `licenseConcluded: GPL-3.0-only`, which +reflects the open-source license. If you are distributing a product under a +wolfSSL commercial license, you have two options: + +### Option 1: regenerate the SBOM with your license expression + +Pass `SBOM_LICENSE_OVERRIDE` to `make sbom` to bake your SPDX expression +directly into the artefact (preferred — survives re-runs, no manual editing): + +```sh +make sbom SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial +``` + +Or invoke the generator directly with `--license-override` if you are +producing the SBOM outside the standard make target. + +### Option 2: update your product SBOM's reference to wolfSSL + +Leave the upstream SBOM file alone and override `licenseConcluded` on the +wolfSSL package entry in *your* product SBOM: ```json "licenseConcluded": "LicenseRef-wolfSSL-Commercial" ``` -Do not modify the wolfSSL-published SBOM file itself; update the concluded -license in your product SBOM where you reference or embed the wolfSSL entry. +Do not modify the wolfSSL-published SBOM file in place; either regenerate it +with the override (Option 1) or override at the consumer level (Option 2). 
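Option 2 can be applied mechanically when the product SBOM is SPDX JSON. The following is a minimal sketch rather than part of wolfSSL's tooling: the function name `override_wolfssl_license` and the sample document are illustrative assumptions, and the helper matches the package by its `name` field, which may differ in your product SBOM.

```python
import copy
import json


def override_wolfssl_license(product_sbom, license_expr, package_name="wolfssl"):
    """Return a copy of an SPDX JSON document (as a dict) in which every
    package named `package_name` has `licenseConcluded` set to `license_expr`.

    Hypothetical helper for illustration; the input document is not modified.
    """
    doc = copy.deepcopy(product_sbom)
    for pkg in doc.get("packages", []):
        if pkg.get("name") == package_name:
            pkg["licenseConcluded"] = license_expr
    return doc


# Illustrative product SBOM fragment (assumed layout, SPDX 2.3 JSON).
product = {
    "spdxVersion": "SPDX-2.3",
    "packages": [
        {"SPDXID": "SPDXRef-Package-wolfssl", "name": "wolfssl",
         "licenseConcluded": "GPL-3.0-only", "licenseDeclared": "GPL-3.0-only"},
        {"SPDXID": "SPDXRef-Package-app", "name": "my-product",
         "licenseConcluded": "NOASSERTION"},
    ],
}

updated = override_wolfssl_license(product, "LicenseRef-wolfSSL-Commercial")
print(json.dumps(updated["packages"][0], indent=2))
```

The sketch leaves `licenseDeclared` untouched, since that field records what the upstream package declares; only the concluded licence changes. SPDX 2.3 also expects any `LicenseRef-*` identifier to be declared in the product document's `hasExtractedLicensingInfos`, which this helper does not add.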
+ +## Reproducible SBOMs + +The generator honors `SOURCE_DATE_EPOCH` for the SBOM creation timestamp and +uses deterministic UUIDs derived from the package name and version, so two +runs of `make sbom` against the same source tree, library binary, and build +options produce byte-identical `.spdx.json` and `.cdx.json` files. This +matters for downstream attestation pipelines that hash SBOMs as part of a +provenance chain. + +`make sbom` will derive `SOURCE_DATE_EPOCH` from `git log -1 --format=%ct` if +you do not set it explicitly and the wolfSSL source tree is a git checkout. ## Build Provenance (OmniBOR) diff --git a/doc/SBOM.md b/doc/SBOM.md index c2b3ab71198..5b2550008a3 100644 --- a/doc/SBOM.md +++ b/doc/SBOM.md @@ -86,10 +86,20 @@ wrong value. #### Dual licensing wolfSSL is available under `GPL-3.0-only` for open-source use, with a -commercial license for proprietary products. The SBOM reflects the -open-source license. Commercial licensees should update the -`licenseConcluded` field to `LicenseRef-wolfSSL-Commercial` or their -applicable SPDX expression when distributing under a commercial agreement. +commercial license for proprietary products. The default SBOM reflects the +open-source license. Commercial licensees should regenerate the SBOM with +`--license-override` set to their applicable SPDX expression — the generator +exposes this directly: + +```sh +python3 scripts/gen-sbom \ + --license-override LicenseRef-wolfSSL-Commercial \ + ... other flags ... +``` + +The override is also forwarded by `make sbom` if you set the +`SBOM_LICENSE_OVERRIDE` make variable, e.g. +`make sbom SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial`. #### External dependency version detection @@ -101,8 +111,11 @@ typically built against a source checkout rather than an installed package. The generator falls back to `git describe --tags --always` on the source tree root (passed via `configure` as `XMSS_ROOT` / `LIBLMS_ROOT`). 
If the source tree has no tags, `git describe` returns the short commit hash, which -is recorded as-is. If the source tree is unavailable or `git` is not found, -version is recorded as `NOASSERTION`. +is recorded as-is. If the source tree is unavailable or `git` is not found: + +- SPDX records `versionInfo: NOASSERTION` and emits no `purl` external ref. +- CycloneDX omits the `version` and `purl` fields entirely and the generator + prints a warning to stderr. ### Validating the SBOM manually diff --git a/scripts/gen-sbom b/scripts/gen-sbom index ad893e2b6e6..839c20656f4 100755 --- a/scripts/gen-sbom +++ b/scripts/gen-sbom @@ -4,6 +4,7 @@ import argparse import hashlib import json +import os import re import subprocess import sys @@ -11,6 +12,35 @@ import uuid from datetime import datetime, timezone +# Stable namespace for deterministic uuid5 derivation. Anchored under +# wolfssl.com so collisions with other projects' SBOM UUIDs are not a concern. +SBOM_UUID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, 'https://wolfssl.com/sbom/') + + +def derived_uuid(*parts): + """Deterministic UUID from joined parts under the wolfSSL SBOM namespace. + Re-runs of `make sbom` against the same source produce identical UUIDs, + which is required for reproducible-build-style SBOM hashing.""" + return str(uuid.uuid5(SBOM_UUID_NAMESPACE, '/'.join(parts))) + + +def build_timestamp(): + """Return (datetime, ISO-8601-Z string) honoring SOURCE_DATE_EPOCH. 
+ Reproducible Builds convention: if the env var is set to a valid + integer, use it as the SBOM creation timestamp instead of wallclock.""" + sde = os.environ.get('SOURCE_DATE_EPOCH', '').strip() + if sde: + try: + dt = datetime.fromtimestamp(int(sde), tz=timezone.utc) + except (ValueError, OverflowError, OSError) as e: + print(f"WARNING: ignoring invalid SOURCE_DATE_EPOCH={sde!r}: {e}", + file=sys.stderr) + dt = datetime.now(timezone.utc) + else: + dt = datetime.now(timezone.utc) + return dt, dt.strftime('%Y-%m-%dT%H:%M:%SZ') + + # Known metadata for optional external dependencies. # Version is detected at runtime via pkg-config; falls back to None. DEP_META = { @@ -148,11 +178,12 @@ def parse_options_h(path): return sorted(defines.items()) -def cdx_dep_component(key): - """Return (bom_ref, component_dict) for a CDX dependency component.""" +def cdx_dep_component(name, pkg_version, key): + """Return (bom_ref, component_dict) for a CDX dependency component. + bom_ref is deterministic for reproducibility.""" meta = DEP_META[key] version = dep_version(key) - bom_ref = str(uuid.uuid4()) + bom_ref = derived_uuid(name, pkg_version, 'dep', key) comp = { 'bom-ref': bom_ref, 'type': 'library', @@ -196,14 +227,13 @@ def spdx_dep_package(key): def generate_cdx(name, version, supplier, license_id, lib_hash, - timestamp, serial, enabled_deps, build_props): - year = datetime.now(timezone.utc).year - bom_ref = str(uuid.uuid4()) + timestamp, year, serial, enabled_deps, build_props): + bom_ref = derived_uuid(name, version, 'package') dep_bom_refs = [] components = [] for key in enabled_deps: - ref, comp = cdx_dep_component(key) + ref, comp = cdx_dep_component(name, version, key) dep_bom_refs.append(ref) components.append(comp) @@ -255,9 +285,7 @@ def generate_cdx(name, version, supplier, license_id, lib_hash, def generate_spdx(name, version, supplier, license_id, lib_hash, - timestamp, doc_ns_uuid, enabled_deps, build_props): - year = datetime.now(timezone.utc).year - + 
timestamp, year, doc_ns_uuid, enabled_deps, build_props): build_defines = ', '.join(k for k, _ in build_props) wolfssl_pkg = { 'SPDXID': 'SPDXRef-Package-wolfssl', @@ -312,7 +340,6 @@ def generate_spdx(name, version, supplier, license_id, lib_hash, f'https://wolfssl.com/sbom/{name}-{version}-{doc_ns_uuid}' ), 'creationInfo': { - 'licenseListVersion': '3.28', 'creators': [ f'Organization: {supplier}', 'Tool: wolfssl-sbom-gen-1.0' @@ -333,9 +360,15 @@ def main(): parser.add_argument('--supplier', default='wolfSSL Inc.', help='Supplier name (default: wolfSSL Inc.)') parser.add_argument('--lib', required=True, - help='Path to libwolfssl.so.X.Y.Z for SHA-256 hashing') + help='Path to the wolfSSL library artifact ' + '(shared or static) for SHA-256 hashing') parser.add_argument('--license-file', required=True, help='Path to LICENSING file for SPDX ID detection') + parser.add_argument('--license-override', default='', + help='Override the detected SPDX license expression ' + '(e.g. LicenseRef-wolfSSL-Commercial). 
Useful ' + 'for commercial licensees regenerating the SBOM ' + 'for their own product.') parser.add_argument('--options-h', required=True, help='Path to wolfssl/options.h for build config') parser.add_argument('--dep-liboqs', default='no', @@ -376,26 +409,30 @@ def main(): if flag.lower() == 'yes' ] - license_id = detect_license(args.license_file) - if license_id is None: - print("WARNING: license could not be determined; using NOASSERTION", - file=sys.stderr) - license_id = 'NOASSERTION' + if args.license_override: + license_id = args.license_override + else: + license_id = detect_license(args.license_file) + if license_id is None: + print("WARNING: license could not be determined; using NOASSERTION", + file=sys.stderr) + license_id = 'NOASSERTION' build_props = parse_options_h(args.options_h) lib_hash = sha256_file(args.lib) - timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ') - serial = str(uuid.uuid4()) - doc_ns_uuid = str(uuid.uuid4()) + dt, timestamp = build_timestamp() + year = dt.year + serial = derived_uuid(args.name, args.version, 'serial') + doc_ns_uuid = derived_uuid(args.name, args.version, 'document') cdx = generate_cdx( args.name, args.version, args.supplier, - license_id, lib_hash, timestamp, serial, + license_id, lib_hash, timestamp, year, serial, enabled_deps, build_props, ) spdx = generate_spdx( args.name, args.version, args.supplier, - license_id, lib_hash, timestamp, doc_ns_uuid, + license_id, lib_hash, timestamp, year, doc_ns_uuid, enabled_deps, build_props, ) diff --git a/scripts/include.am b/scripts/include.am index f7a0bb37c8b..508fec90cea 100644 --- a/scripts/include.am +++ b/scripts/include.am @@ -155,3 +155,8 @@ EXTRA_DIST += scripts/bench/bench_functions.sh EXTRA_DIST += scripts/benchmark_compare.sh EXTRA_DIST += scripts/user_settings_asm.sh + +# SBOM generator (invoked from `make sbom` in the top-level Makefile.am). 
+# Must be in the dist tarball, otherwise `make dist && cd && +# ./configure && make sbom` fails for downstream consumers. +EXTRA_DIST += scripts/gen-sbom From 77d6e29490af08c1fdd2fdae8c26042bd157b8a2 Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Wed, 29 Apr 2026 14:30:08 +0300 Subject: [PATCH 03/16] fix(sbom): SPDX 2.3 LicenseRef compliance Emit hasExtractedLicensingInfos for LicenseRef-* IDs and add --license-text / SBOM_LICENSE_TEXT to embed the actual licence body. Signed-off-by: Sameeh Jubran --- Makefile.am | 11 +++++ doc/CRA.md | 24 ++++++++-- doc/SBOM.md | 22 ++++++++-- scripts/gen-sbom | 112 ++++++++++++++++++++++++++++++++++++++++++++--- 4 files changed, 157 insertions(+), 12 deletions(-) diff --git a/Makefile.am b/Makefile.am index 4b9eaf21b3c..3e4a32612f5 100644 --- a/Makefile.am +++ b/Makefile.am @@ -365,6 +365,16 @@ sbomdir = $(datadir)/doc/$(PACKAGE) # unconditionally via `trap`, even if any step fails. Honors SOURCE_DATE_EPOCH # for reproducible builds (set by the recipe to `git log -1 --format=%ct` when # unset and a git tree is available). +# +# User-overridable variables: +# SBOM_LICENSE_OVERRIDE SPDX expression to use instead of the GPL ID +# parsed from LICENSING (e.g. for commercial +# licensees: LicenseRef-wolfSSL-Commercial). +# SBOM_LICENSE_TEXT Path to the actual licence text for any +# LicenseRef-* in SBOM_LICENSE_OVERRIDE. Required +# for SPDX 2.3 conformance whenever a custom +# LicenseRef is in use; without it the SBOM embeds +# a placeholder and validators may reject it. 
sbom: @if test -z "$(PYTHON3)"; then \ echo ""; \ @@ -415,6 +425,7 @@ sbom: --version $(PACKAGE_VERSION) \ --license-file $(srcdir)/LICENSING \ --license-override '$(SBOM_LICENSE_OVERRIDE)' \ + --license-text '$(SBOM_LICENSE_TEXT)' \ --options-h $(abs_builddir)/wolfssl/options.h \ --lib "$$sbom_lib" \ --dep-liboqs $(ENABLED_LIBOQS) \ diff --git a/doc/CRA.md b/doc/CRA.md index cfdcd46b18a..21a4d61bf97 100644 --- a/doc/CRA.md +++ b/doc/CRA.md @@ -135,11 +135,29 @@ Pass `SBOM_LICENSE_OVERRIDE` to `make sbom` to bake your SPDX expression directly into the artefact (preferred — survives re-runs, no manual editing): ```sh -make sbom SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial +make sbom \ + SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial \ + SBOM_LICENSE_TEXT=/path/to/wolfssl-commercial-license.txt ``` -Or invoke the generator directly with `--license-override` if you are -producing the SBOM outside the standard make target. +`SBOM_LICENSE_TEXT` is **required** whenever `SBOM_LICENSE_OVERRIDE` uses a +custom `LicenseRef-*` identifier. SPDX 2.3 §10.1 requires the actual licence +text to be embedded in `hasExtractedLicensingInfos` for any LicenseRef used in +the document; conformant validators (e.g. `pyspdxtools`, `ntia-conformance-checker`) +will reject the SBOM otherwise. The file should contain the plain-text +licence agreement you received from wolfSSL. + +If you omit `SBOM_LICENSE_TEXT` the generator emits a placeholder and prints +a warning — useful for quick experiments, but the result is **not** valid for +distribution to customers or regulators. + +For a stock SPDX-listed identifier (`Apache-2.0`, `MIT`, etc.) the +`SBOM_LICENSE_TEXT` argument is unnecessary because validators already know +the canonical text. + +Or invoke the generator directly with `--license-override` / +`--license-text` if you are producing the SBOM outside the standard make +target. 
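The `hasExtractedLicensingInfos` requirement described above can be checked before an SBOM is distributed. This is a minimal sketch, assuming the SPDX JSON layout that `make sbom` writes; `missing_license_texts` is a hypothetical helper, not part of `scripts/gen-sbom`, and it only inspects the package-level `licenseConcluded` and `licenseDeclared` fields.

```python
import re

# LicenseRef idstring per SPDX 2.3: letters, digits, '-' and '.'.
LICENSEREF_RE = re.compile(r"LicenseRef-[A-Za-z0-9.\-]+")


def missing_license_texts(spdx_doc):
    """Return the sorted LicenseRef-* IDs used by packages in spdx_doc that
    have no hasExtractedLicensingInfos entry with non-empty extractedText."""
    used = set()
    for pkg in spdx_doc.get("packages", []):
        for field in ("licenseConcluded", "licenseDeclared"):
            used.update(LICENSEREF_RE.findall(pkg.get(field) or ""))

    declared = {
        info.get("licenseId")
        for info in spdx_doc.get("hasExtractedLicensingInfos", [])
        if (info.get("extractedText") or "").strip()
    }
    return sorted(used - declared)


# Illustrative document: a LicenseRef is used but its text is not embedded.
sample = {
    "packages": [{
        "name": "wolfssl",
        "licenseConcluded": "LicenseRef-wolfSSL-Commercial",
        "licenseDeclared": "LicenseRef-wolfSSL-Commercial",
    }],
}
print(missing_license_texts(sample))

# After embedding the text, nothing is reported as missing.
sample["hasExtractedLicensingInfos"] = [{
    "licenseId": "LicenseRef-wolfSSL-Commercial",
    "extractedText": "(licence text received from wolfSSL goes here)",
}]
print(missing_license_texts(sample))
```

A check like this only confirms that some text is present. It does not recognise the generator's placeholder, so the warning printed when `SBOM_LICENSE_TEXT` is omitted still needs to be acted on, and a full validator such as `pyspdxtools` remains the authoritative test.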
### Option 2: update your product SBOM's reference to wolfSSL diff --git a/doc/SBOM.md b/doc/SBOM.md index 5b2550008a3..bcb88c78423 100644 --- a/doc/SBOM.md +++ b/doc/SBOM.md @@ -94,12 +94,28 @@ exposes this directly: ```sh python3 scripts/gen-sbom \ --license-override LicenseRef-wolfSSL-Commercial \ + --license-text /path/to/wolfssl-commercial-license.txt \ ... other flags ... ``` -The override is also forwarded by `make sbom` if you set the -`SBOM_LICENSE_OVERRIDE` make variable, e.g. -`make sbom SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial`. +`--license-text` is required whenever `--license-override` is a custom +`LicenseRef-*`: SPDX 2.3 mandates that any LicenseRef in `licenseConcluded` +or `licenseDeclared` be backed by a `hasExtractedLicensingInfos` entry that +embeds the actual licence text. Without it, validators such as +`pyspdxtools` and `ntia-conformance-checker` reject the document. The +generator emits a placeholder and a warning in that case so the bug is +visible, but the SBOM is *not* valid for downstream consumers. + +For an SPDX-listed override (`Apache-2.0`, `MIT`, etc.), `--license-text` +is unnecessary because validators already know the canonical text. + +`make sbom` plumbs both knobs through the matching make variables: + +```sh +make sbom \ + SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial \ + SBOM_LICENSE_TEXT=/path/to/wolfssl-commercial-license.txt +``` #### External dependency version detection diff --git a/scripts/gen-sbom b/scripts/gen-sbom index 839c20656f4..1b794de784a 100755 --- a/scripts/gen-sbom +++ b/scripts/gen-sbom @@ -79,6 +79,86 @@ DEP_META = { } +# Matches a single SPDX `LicenseRef-` identifier as defined in SPDX 2.3 +# Annex D ("idstring = 1*(ALPHA / DIGIT / '-' / '.')"). We use this to +# discover custom license refs inside an arbitrary SPDX expression and to +# decide whether a `licenseConcluded` value needs an accompanying +# `hasExtractedLicensingInfos` block. 
+LICENSEREF_RE = re.compile(r'LicenseRef-[A-Za-z0-9.\-]+') + +# Matches a "simple" SPDX-listed license ID such as `GPL-2.0-or-later` or +# `MIT` (no spaces, no operators, no LicenseRef-). Anything that does not +# match must be expressed via `licenses[].license.name` / `licenses[].expression` +# in CycloneDX, since `license.id` is restricted to the SPDX licence list. +SIMPLE_SPDX_ID_RE = re.compile(r'\A[A-Za-z0-9.+\-]+\Z') + + +def is_simple_spdx_id(value): + return bool(SIMPLE_SPDX_ID_RE.match(value)) and \ + not value.startswith('LicenseRef-') and value != 'NOASSERTION' + + +def extract_license_refs(expr): + """Return a sorted, deduplicated list of LicenseRef-* IDs found in expr.""" + return sorted(set(LICENSEREF_RE.findall(expr or ''))) + + +def load_license_text(path): + """Read the license text file given via --license-text, exit on error.""" + if not path: + return None + try: + with open(path) as f: + return f.read() + except OSError as e: + sys.exit(f"ERROR: cannot read --license-text {path}: {e}") + + +def build_extracted_licensing_infos(license_expr, license_text): + """Return SPDX `hasExtractedLicensingInfos` array for license_expr. + + SPDX 2.3 §10 requires every LicenseRef-* used in `licenseConcluded`/ + `licenseDeclared` to be declared once at document level via + `hasExtractedLicensingInfos`. Returns None when no LicenseRef-* is + present so the caller can omit the field entirely. + """ + refs = extract_license_refs(license_expr) + if not refs: + return None + if license_text is None: + license_text = ( + 'NOASSERTION. The text for this LicenseRef has not been ' + 'embedded in the SBOM. Provide it via the gen-sbom ' + '--license-text PATH flag (or `make sbom SBOM_LICENSE_TEXT=...`).' 
+ ) + infos = [] + for ref in refs: + infos.append({ + 'licenseId': ref, + 'extractedText': license_text, + 'name': ref[len('LicenseRef-'):].replace('-', ' ').strip(), + }) + return infos + + +def cdx_license_block(license_expr, license_text): + """Return the CycloneDX `licenses[]` entry for an arbitrary SPDX + expression. CDX 1.6 distinguishes: + * `license.id` - an entry from the SPDX licence list + * `license.name` - a non-listed licence (e.g. a LicenseRef-*) + * `expression` - a compound SPDX expression + Picking the wrong shape causes downstream tooling to reject the SBOM.""" + if is_simple_spdx_id(license_expr): + return [{'license': {'id': license_expr}}] + refs = extract_license_refs(license_expr) + if len(refs) == 1 and refs[0] == license_expr: + block = {'name': license_expr} + if license_text: + block['text'] = {'contentType': 'text/plain', 'content': license_text} + return [{'license': block}] + return [{'expression': license_expr}] + + def detect_license(license_file): """Parse LICENSING file and return an SPDX license ID. 
@@ -226,7 +306,7 @@ def spdx_dep_package(key): return spdx_id, pkg -def generate_cdx(name, version, supplier, license_id, lib_hash, +def generate_cdx(name, version, supplier, license_id, license_text, lib_hash, timestamp, year, serial, enabled_deps, build_props): bom_ref = derived_uuid(name, version, 'package') @@ -264,7 +344,7 @@ def generate_cdx(name, version, supplier, license_id, lib_hash, 'supplier': {'name': supplier}, 'name': name, 'version': version, - 'licenses': [{'license': {'id': license_id}}], + 'licenses': cdx_license_block(license_id, license_text), 'copyright': f'Copyright (C) 2006-{year} wolfSSL Inc.', 'cpe': f'cpe:2.3:a:wolfssl:{name}:{version}:*:*:*:*:*:*:*', 'purl': f'pkg:generic/{name}@{version}', @@ -284,7 +364,7 @@ def generate_cdx(name, version, supplier, license_id, lib_hash, } -def generate_spdx(name, version, supplier, license_id, lib_hash, +def generate_spdx(name, version, supplier, license_id, license_text, lib_hash, timestamp, year, doc_ns_uuid, enabled_deps, build_props): build_defines = ', '.join(k for k, _ in build_props) wolfssl_pkg = { @@ -331,7 +411,7 @@ def generate_spdx(name, version, supplier, license_id, lib_hash, 'relationshipType': 'DEPENDS_ON', }) - return { + doc = { 'spdxVersion': 'SPDX-2.3', 'dataLicense': 'CC0-1.0', 'SPDXID': 'SPDXRef-DOCUMENT', @@ -350,6 +430,12 @@ def generate_spdx(name, version, supplier, license_id, lib_hash, 'relationships': relationships, } + extracted = build_extracted_licensing_infos(license_id, license_text) + if extracted: + doc['hasExtractedLicensingInfos'] = extracted + + return doc + def main(): parser = argparse.ArgumentParser( @@ -369,6 +455,13 @@ def main(): '(e.g. LicenseRef-wolfSSL-Commercial). 
Useful ' 'for commercial licensees regenerating the SBOM ' 'for their own product.') + parser.add_argument('--license-text', default='', + help='Path to a plain-text licence file whose ' + 'contents are embedded in the SBOM as the ' + '`extractedText` for any LicenseRef-* used in ' + '`--license-override`. Required by SPDX 2.3 ' + 'validators (e.g. pyspdxtools) for any custom ' + 'licence reference.') parser.add_argument('--options-h', required=True, help='Path to wolfssl/options.h for build config') parser.add_argument('--dep-liboqs', default='no', @@ -418,6 +511,13 @@ def main(): file=sys.stderr) license_id = 'NOASSERTION' + license_text = load_license_text(args.license_text) + if extract_license_refs(license_id) and license_text is None: + print("WARNING: --license-override uses a LicenseRef-* but " + "--license-text was not provided; the SBOM will embed a " + "placeholder. Provide SBOM_LICENSE_TEXT= for full " + "SPDX compliance.", file=sys.stderr) + build_props = parse_options_h(args.options_h) lib_hash = sha256_file(args.lib) dt, timestamp = build_timestamp() @@ -427,12 +527,12 @@ def main(): cdx = generate_cdx( args.name, args.version, args.supplier, - license_id, lib_hash, timestamp, year, serial, + license_id, license_text, lib_hash, timestamp, year, serial, enabled_deps, build_props, ) spdx = generate_spdx( args.name, args.version, args.supplier, - license_id, lib_hash, timestamp, year, doc_ns_uuid, + license_id, license_text, lib_hash, timestamp, year, doc_ns_uuid, enabled_deps, build_props, ) From 202d88c3dd1245969bfb41e09f4779a33c40421b Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Wed, 29 Apr 2026 15:59:55 +0300 Subject: [PATCH 04/16] test(sbom): unit tests and CI integration workflow Cover gen-sbom helpers and add a workflow that validates SPDX/CDX, NTIA conformance, reproducibility, and the licence-override matrix. 
Signed-off-by: Sameeh Jubran --- .github/workflows/sbom.yml | 245 ++++++++++++++++++++++++++++++++++ scripts/gen-sbom | 6 +- scripts/test_gen_sbom.py | 260 +++++++++++++++++++++++++++++++++++++ 3 files changed, 509 insertions(+), 2 deletions(-) create mode 100644 .github/workflows/sbom.yml create mode 100644 scripts/test_gen_sbom.py diff --git a/.github/workflows/sbom.yml b/.github/workflows/sbom.yml new file mode 100644 index 00000000000..f2e82f6fd29 --- /dev/null +++ b/.github/workflows/sbom.yml @@ -0,0 +1,245 @@ +name: SBOM Tests + +# START OF COMMON SECTION +on: + push: + branches: [ 'master', 'main', 'release/**' ] + pull_request: + branches: [ '*' ] + +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: true +# END OF COMMON SECTION + +jobs: + # Tier 1 - pure-Python unit tests for scripts/gen-sbom. + # No build, no autotools, no external deps. Runs in seconds and is the + # cheapest gate for licence/UUID/timestamp logic regressions. + unit: + name: gen-sbom unit tests + if: github.repository_owner == 'wolfssl' + runs-on: ubuntu-24.04 + timeout-minutes: 5 + steps: + - uses: actions/checkout@v4 + + - name: Syntax check + run: python3 -m py_compile scripts/gen-sbom + + - name: Unit tests + run: python3 -W error::ResourceWarning -m unittest scripts/test_gen_sbom.py -v + + # Tier 2 - integration: build wolfSSL, generate the SBOMs, and assert + # everything an external auditor or vulnerability scanner relies on. + integration: + name: SBOM integration + if: github.repository_owner == 'wolfssl' + runs-on: ubuntu-24.04 + needs: unit + timeout-minutes: 20 + steps: + - uses: actions/checkout@v4 + + # Pin tool versions; drift in any of these silently changes what + # "valid" means and produces mystery CI failures. 
+ - name: Install SBOM validators + run: | + python3 -m pip install --user --upgrade pip + python3 -m pip install --user \ + 'spdx-tools==0.8.*' \ + 'ntia-conformance-checker==5.*' \ + 'cyclonedx-bom==7.*' + echo "$HOME/.local/bin" >> "$GITHUB_PATH" + + - name: Configure wolfSSL (shared + static) + run: autoreconf -ivf && ./configure --enable-shared --enable-static + + - name: Build + generate SBOM (default GPL) + run: make sbom + + # ---- Format-level validators ----------------------------------------- + + - name: SPDX 2.3 - NTIA Minimum Elements (2021) + # Already validated structurally by pyspdxtools inside `make sbom`. + # NTIA conformance is the additional contract auditors rely on. + run: ntia-checker -c ntia wolfssl-*.spdx.json + + - name: CycloneDX 1.6 - JSON schema validation + run: | + python3 - <<'PY' + import glob, sys + from cyclonedx.validation.json import JsonStrictValidator + from cyclonedx.schema import SchemaVersion + v = JsonStrictValidator(SchemaVersion.V1_6) + for path in glob.glob('wolfssl-*.cdx.json'): + errors = v.validate_str(open(path).read()) + if errors: + print(f"INVALID: {path}: {errors}", file=sys.stderr) + sys.exit(1) + print(f"OK: {path}") + PY + + # ---- Artefact-integrity assertions ---------------------------------- + + - name: Library hash matches the SBOM + # `make sbom` cleans its private staging tree on exit, so we install + # to an independent prefix and re-hash the resulting library. The + # autotools install is deterministic (identical bytes), so the hash + # the SBOM recorded must match. 
+ run: | + rm -rf /tmp/_inst + make install DESTDIR=/tmp/_inst >/dev/null + LIB=$(ls /tmp/_inst/usr/local/lib/libwolfssl.so* 2>/dev/null \ + | grep -v '\.la$' | head -1) + test -n "$LIB" || (echo "no installed shared lib"; exit 1) + EXPECTED=$(sha256sum "$LIB" | cut -d' ' -f1) + ACTUAL=$(python3 -c " + import json, glob + d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0])) + p = [x for x in d['packages'] if x['name'] == 'wolfssl'][0] + print(p['checksums'][0]['checksumValue'])") + test "$EXPECTED" = "$ACTUAL" || \ + { echo "hash mismatch: expected=$EXPECTED actual=$ACTUAL"; exit 1; } + + - name: CPE 2.3 and PURL identifiers well-formed + # A typo in supplier or product name silently breaks every + # downstream OSV / Trivy / Grype scan. + run: | + python3 - <<'PY' + import glob, json, re, sys + d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0])) + refs = {r['referenceType']: r['referenceLocator'] + for r in d['packages'][0]['externalRefs']} + assert re.match(r'cpe:2\.3:a:wolfssl:wolfssl:[\d.]+:', refs['cpe23Type']), refs + assert re.match(r'pkg:generic/wolfssl@[\d.]+', refs['purl']), refs + print('identifiers ok:', refs) + PY + + # ---- Reproducibility ------------------------------------------------- + + - name: Reproducibility under SOURCE_DATE_EPOCH + run: | + rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx + SOURCE_DATE_EPOCH=1700000000 make sbom + sha256sum wolfssl-*.cdx.json wolfssl-*.spdx.json > /tmp/a.sums + rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx + SOURCE_DATE_EPOCH=1700000000 make sbom + sha256sum wolfssl-*.cdx.json wolfssl-*.spdx.json > /tmp/b.sums + diff /tmp/a.sums /tmp/b.sums + + # ---- Licence-override matrix ---------------------------------------- + + - name: License matrix - default GPL + # Detected from LICENSING. The current upstream file reads + # "GNU General Public License version 3" without "or later", so + # detect_license returns GPL-3.0-only. 
If LICENSING is updated to + # add "or any later version", switch this assertion to + # GPL-3.0-or-later. + run: | + rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx + make sbom + python3 - <<'PY' + import glob, json + d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0])) + assert d['packages'][0]['licenseConcluded'].startswith('GPL-3.0-'), \ + d['packages'][0]['licenseConcluded'] + assert 'hasExtractedLicensingInfos' not in d + cdx = json.load(open(glob.glob('wolfssl-*.cdx.json')[0])) + lic = cdx['metadata']['component']['licenses'] + assert lic == [{'license': {'id': d['packages'][0]['licenseConcluded']}}], lic + print('default GPL: ok ->', lic) + PY + + - name: License matrix - LicenseRef + text + run: | + rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx + make sbom \ + SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial \ + SBOM_LICENSE_TEXT="$PWD/COPYING" + python3 - <<'PY' + import glob, json + expected = open('COPYING').read() + d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0])) + infos = d['hasExtractedLicensingInfos'] + assert len(infos) == 1 + assert infos[0]['licenseId'] == 'LicenseRef-wolfSSL-Commercial' + assert infos[0]['extractedText'] == expected + cdx = json.load(open(glob.glob('wolfssl-*.cdx.json')[0])) + lic = cdx['metadata']['component']['licenses'][0]['license'] + assert lic['name'] == 'LicenseRef-wolfSSL-Commercial' + assert lic['text']['content'] == expected + print('LicenseRef + text: ok') + PY + # The output of this run must still pass NTIA and CDX validators. 
+ ntia-checker -c ntia wolfssl-*.spdx.json + python3 -c " + from cyclonedx.validation.json import JsonStrictValidator + from cyclonedx.schema import SchemaVersion + import glob, sys + v = JsonStrictValidator(SchemaVersion.V1_6) + errs = v.validate_str(open(glob.glob('wolfssl-*.cdx.json')[0]).read()) + sys.exit(1 if errs else 0)" + + - name: License matrix - compound expression + run: | + rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx + make sbom \ + SBOM_LICENSE_OVERRIDE='GPL-3.0-only OR LicenseRef-wolfSSL-Commercial' \ + SBOM_LICENSE_TEXT="$PWD/COPYING" + python3 - <<'PY' + import glob, json + d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0])) + assert len(d['hasExtractedLicensingInfos']) == 1 + cdx = json.load(open(glob.glob('wolfssl-*.cdx.json')[0])) + entry = cdx['metadata']['component']['licenses'][0] + assert 'expression' in entry, entry + print('compound expression: ok') + PY + + - name: License matrix - simple SPDX override + run: | + rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx + make sbom SBOM_LICENSE_OVERRIDE=Apache-2.0 + python3 - <<'PY' + import glob, json + d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0])) + assert 'hasExtractedLicensingInfos' not in d + cdx = json.load(open(glob.glob('wolfssl-*.cdx.json')[0])) + lic = cdx['metadata']['component']['licenses'][0]['license'] + assert lic == {'id': 'Apache-2.0'}, lic + print('simple SPDX override: ok') + PY + + # ---- Distribution + install hooks ----------------------------------- + + - name: Tarball roundtrip (make dist -> ./configure -> make sbom) + # If a future change adds a new helper file but forgets EXTRA_DIST, + # the tarball will not contain it and this step fails. 
+ run: | + rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx + make dist + mkdir /tmp/tb + tar -xzf wolfssl-*.tar.gz -C /tmp/tb + cd /tmp/tb/wolfssl-* + ./configure --enable-shared + make sbom + + - name: Install-sbom / uninstall hook + # `install-sbom` is a separate target (intentional - SBOM generation + # has heavy deps like pyspdxtools that we do not want firing on + # every `make install`). `make uninstall` runs uninstall-hook, + # which removes both regular and SBOM artefacts idempotently. + run: | + rm -rf /tmp/_inst2 + make install DESTDIR=/tmp/_inst2 >/dev/null + make install-sbom DESTDIR=/tmp/_inst2 + ls /tmp/_inst2/usr/local/share/doc/wolfssl/wolfssl-*.spdx.json \ + /tmp/_inst2/usr/local/share/doc/wolfssl/wolfssl-*.cdx.json \ + /tmp/_inst2/usr/local/share/doc/wolfssl/wolfssl-*.spdx + make uninstall DESTDIR=/tmp/_inst2 + if ls /tmp/_inst2/usr/local/share/doc/wolfssl/wolfssl-*.spdx.json \ + 2>/dev/null; then + echo "uninstall-hook did not remove SBOM artefacts" + exit 1 + fi diff --git a/scripts/gen-sbom b/scripts/gen-sbom index 1b794de784a..a3afb2fb12c 100755 --- a/scripts/gen-sbom +++ b/scripts/gen-sbom @@ -167,7 +167,8 @@ def detect_license(license_file): prints a warning if the file cannot be parsed. 
""" try: - text = open(license_file).read() + with open(license_file) as f: + text = f.read() except OSError as e: print(f"WARNING: cannot read license file {license_file}: {e}", file=sys.stderr) @@ -247,7 +248,8 @@ def parse_options_h(path): """Parse wolfssl/options.h and return sorted deduplicated list of (name, value) pairs for every #define found.""" try: - text = open(path).read() + with open(path) as f: + text = f.read() except OSError as e: print(f"WARNING: cannot read options.h {path}: {e}", file=sys.stderr) return [] diff --git a/scripts/test_gen_sbom.py b/scripts/test_gen_sbom.py new file mode 100644 index 00000000000..0a75da076ff --- /dev/null +++ b/scripts/test_gen_sbom.py @@ -0,0 +1,260 @@ +#!/usr/bin/env python3 +"""Unit tests for the helpers in scripts/gen-sbom. + +Run from the repo root: + + python3 -m unittest scripts/test_gen_sbom.py + +These tests cover the pure logic in gen-sbom (license expression handling, +deterministic UUID derivation, SOURCE_DATE_EPOCH timestamp parsing). They +intentionally avoid touching the filesystem-heavy paths (sha256_file, +parse_options_h, pkg-config) which are exercised end-to-end by the +integration tests in .github/workflows/sbom.yml. +""" + +import importlib.util +import os +import pathlib +import tempfile +import unittest +import uuid +from importlib.machinery import SourceFileLoader + + +def _load_gen_sbom(): + """Load gen-sbom (no .py extension) as a module under the name 'gs'. 
+ spec_from_file_location infers the loader from the suffix; gen-sbom has + none, so we hand it a SourceFileLoader explicitly.""" + here = pathlib.Path(__file__).resolve().parent + target = here / 'gen-sbom' + if not target.is_file(): + raise FileNotFoundError( + f"expected gen-sbom alongside this test file at {target}" + ) + loader = SourceFileLoader('gs', str(target)) + spec = importlib.util.spec_from_loader('gs', loader) + module = importlib.util.module_from_spec(spec) + loader.exec_module(module) + return module + + +gs = _load_gen_sbom() + + +class TestIsSimpleSpdxId(unittest.TestCase): + def test_listed_ids_are_simple(self): + for spdx in ('Apache-2.0', 'MIT', 'GPL-3.0-or-later', + 'GPL-2.0-only', 'BSD-3-Clause', 'CC0-1.0', 'Zlib'): + self.assertTrue(gs.is_simple_spdx_id(spdx), + f"{spdx!r} should be simple") + + def test_license_refs_are_not_simple(self): + self.assertFalse(gs.is_simple_spdx_id('LicenseRef-wolfSSL-Commercial')) + self.assertFalse(gs.is_simple_spdx_id('LicenseRef-Foo')) + + def test_compound_expressions_are_not_simple(self): + self.assertFalse(gs.is_simple_spdx_id('GPL-3.0-only OR MIT')) + self.assertFalse(gs.is_simple_spdx_id( + 'Apache-2.0 AND LicenseRef-Foo')) + self.assertFalse(gs.is_simple_spdx_id('(MIT OR Apache-2.0)')) + + def test_noassertion_is_not_simple(self): + self.assertFalse(gs.is_simple_spdx_id('NOASSERTION')) + + +class TestExtractLicenseRefs(unittest.TestCase): + def test_no_refs(self): + self.assertEqual(gs.extract_license_refs('Apache-2.0'), []) + self.assertEqual(gs.extract_license_refs('GPL-3.0-only OR MIT'), []) + self.assertEqual(gs.extract_license_refs(''), []) + self.assertEqual(gs.extract_license_refs(None), []) + + def test_single_ref(self): + self.assertEqual( + gs.extract_license_refs('LicenseRef-X'), ['LicenseRef-X']) + self.assertEqual( + gs.extract_license_refs('LicenseRef-wolfSSL-Commercial'), + ['LicenseRef-wolfSSL-Commercial']) + + def test_multiple_refs_are_sorted_and_deduped(self): + self.assertEqual( + 
gs.extract_license_refs( + 'Apache-2.0 OR LicenseRef-B AND LicenseRef-A'), + ['LicenseRef-A', 'LicenseRef-B']) + self.assertEqual( + gs.extract_license_refs( + 'LicenseRef-X OR LicenseRef-X AND LicenseRef-X'), + ['LicenseRef-X']) + + +class TestCdxLicenseBlock(unittest.TestCase): + def test_listed_id_uses_id_form(self): + self.assertEqual( + gs.cdx_license_block('Apache-2.0', None), + [{'license': {'id': 'Apache-2.0'}}]) + self.assertEqual( + gs.cdx_license_block('GPL-3.0-or-later', None), + [{'license': {'id': 'GPL-3.0-or-later'}}]) + + def test_single_ref_with_text_uses_name_and_text(self): + block = gs.cdx_license_block('LicenseRef-Foo', 'BODY') + self.assertEqual(len(block), 1) + lic = block[0]['license'] + self.assertEqual(lic['name'], 'LicenseRef-Foo') + self.assertEqual(lic['text']['content'], 'BODY') + self.assertEqual(lic['text']['contentType'], 'text/plain') + self.assertNotIn('id', lic) + + def test_single_ref_without_text_omits_text_field(self): + block = gs.cdx_license_block('LicenseRef-Foo', None) + lic = block[0]['license'] + self.assertEqual(lic['name'], 'LicenseRef-Foo') + self.assertNotIn('text', lic) + + def test_compound_uses_expression(self): + # Per CDX 1.6 schema, compound SPDX expressions go into `expression`. + # We must NOT use `id` (only listed IDs allowed) nor `name` (single + # licence only). 
+ self.assertEqual( + gs.cdx_license_block('GPL-3.0-only OR LicenseRef-Foo', 'X'), + [{'expression': 'GPL-3.0-only OR LicenseRef-Foo'}]) + self.assertEqual( + gs.cdx_license_block('GPL-3.0-only AND MIT', None), + [{'expression': 'GPL-3.0-only AND MIT'}]) + + +class TestBuildExtractedLicensingInfos(unittest.TestCase): + def test_no_refs_returns_none(self): + self.assertIsNone( + gs.build_extracted_licensing_infos('Apache-2.0', None)) + self.assertIsNone( + gs.build_extracted_licensing_infos('GPL-3.0-only AND MIT', None)) + + def test_single_ref_with_text(self): + infos = gs.build_extracted_licensing_infos( + 'LicenseRef-wolfSSL-Commercial', 'BODY') + self.assertEqual(len(infos), 1) + self.assertEqual(infos[0]['licenseId'], + 'LicenseRef-wolfSSL-Commercial') + self.assertEqual(infos[0]['extractedText'], 'BODY') + self.assertIn('name', infos[0]) + + def test_placeholder_when_text_missing(self): + infos = gs.build_extracted_licensing_infos('LicenseRef-X', None) + self.assertEqual(len(infos), 1) + # Placeholder must mention how to fix it so reviewers/auditors who + # inspect the SBOM know what's wrong. 
+ text = infos[0]['extractedText'] + self.assertIn('--license-text', text) + + def test_multiple_refs_each_get_entry(self): + infos = gs.build_extracted_licensing_infos( + 'LicenseRef-A OR LicenseRef-B', 'BODY') + self.assertEqual( + sorted(i['licenseId'] for i in infos), + ['LicenseRef-A', 'LicenseRef-B']) + for i in infos: + self.assertEqual(i['extractedText'], 'BODY') + + +class TestDerivedUuid(unittest.TestCase): + def test_deterministic(self): + a = gs.derived_uuid('wolfssl', '5.9.1', 'package') + b = gs.derived_uuid('wolfssl', '5.9.1', 'package') + self.assertEqual(a, b) + + def test_different_inputs_diverge(self): + self.assertNotEqual( + gs.derived_uuid('wolfssl', '5.9.1', 'package'), + gs.derived_uuid('wolfssl', '5.9.2', 'package')) + self.assertNotEqual( + gs.derived_uuid('wolfssl', '5.9.1', 'package'), + gs.derived_uuid('wolfssl', '5.9.1', 'serial')) + + def test_returns_valid_uuid_string(self): + s = gs.derived_uuid('a', 'b') + # Will raise if not a valid UUID. + parsed = uuid.UUID(s) + self.assertEqual(str(parsed), s) + + +class TestBuildTimestamp(unittest.TestCase): + def setUp(self): + self._saved = os.environ.get('SOURCE_DATE_EPOCH') + + def tearDown(self): + if self._saved is None: + os.environ.pop('SOURCE_DATE_EPOCH', None) + else: + os.environ['SOURCE_DATE_EPOCH'] = self._saved + + def test_honors_source_date_epoch(self): + os.environ['SOURCE_DATE_EPOCH'] = '1700000000' + dt, ts = gs.build_timestamp() + self.assertEqual(dt.year, 2023) + self.assertEqual(ts, '2023-11-14T22:13:20Z') + + def test_two_calls_with_same_sde_match(self): + os.environ['SOURCE_DATE_EPOCH'] = '1700000000' + _, t1 = gs.build_timestamp() + _, t2 = gs.build_timestamp() + self.assertEqual(t1, t2) + + def test_invalid_sde_falls_back_to_now(self): + os.environ['SOURCE_DATE_EPOCH'] = 'not-a-number' + dt, ts = gs.build_timestamp() + # Should still produce a UTC ISO-Z timestamp; we only check shape. 
+ self.assertRegex( + ts, r'\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\Z') + + def test_no_sde_is_current_utc(self): + os.environ.pop('SOURCE_DATE_EPOCH', None) + _, ts = gs.build_timestamp() + self.assertRegex( + ts, r'\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\Z') + + +class TestLoadLicenseText(unittest.TestCase): + def test_empty_path_returns_none(self): + self.assertIsNone(gs.load_license_text('')) + self.assertIsNone(gs.load_license_text(None)) + + def test_real_file(self): + with tempfile.NamedTemporaryFile('w', suffix='.txt', + delete=False) as f: + f.write('LICENCE BODY\n') + path = f.name + try: + self.assertEqual(gs.load_license_text(path), 'LICENCE BODY\n') + finally: + os.unlink(path) + + def test_missing_file_exits(self): + with self.assertRaises(SystemExit): + gs.load_license_text('/no/such/path/please.txt') + + +class TestParseOptionsH(unittest.TestCase): + def test_parses_defines_sorted_and_deduped(self): + with tempfile.NamedTemporaryFile('w', suffix='.h', + delete=False) as f: + f.write( + "/* fake options.h */\n" + "#define HAVE_BAR\n" + "#define HAVE_AAA 1\n" + "#define HAVE_BAR /* duplicate */\n" + "#define HAVE_FOO 42\n" + ) + path = f.name + try: + pairs = gs.parse_options_h(path) + finally: + os.unlink(path) + names = [k for k, _ in pairs] + self.assertEqual(names, sorted(set(names))) + self.assertIn(('HAVE_AAA', '1'), pairs) + self.assertIn(('HAVE_FOO', '42'), pairs) + + +if __name__ == '__main__': + unittest.main(verbosity=2) From 8b5f4be37d589c7557599b62a0ad6eebe1c2b213 Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Wed, 29 Apr 2026 17:13:28 +0300 Subject: [PATCH 05/16] fix(sbom): correctness fixes from code review Hard-error on LicenseRef-* without --license-text, route NOASSERTION via license.name, strip C comments in options.h, NUL-join derived UUIDs, and detect git worktrees for SOURCE_DATE_EPOCH. 
Signed-off-by: Sameeh Jubran --- Makefile.am | 2 +- doc/CRA.md | 6 +++--- doc/SBOM.md | 9 ++++----- scripts/gen-sbom | 39 ++++++++++++++++++++++++++++++--------- 4 files changed, 38 insertions(+), 18 deletions(-) diff --git a/Makefile.am b/Makefile.am index 3e4a32612f5..b779431828b 100644 --- a/Makefile.am +++ b/Makefile.am @@ -416,7 +416,7 @@ sbom: fi; \ echo "SBOM: hashing $$sbom_lib"; \ if test -z "$${SOURCE_DATE_EPOCH:-}" && test -n "$(GIT)" && \ - test -d "$(srcdir)/.git"; then \ + $(GIT) -C "$(srcdir)" rev-parse --git-dir >/dev/null 2>&1; then \ SOURCE_DATE_EPOCH=`$(GIT) -C "$(srcdir)" log -1 --format=%ct 2>/dev/null || echo`; \ export SOURCE_DATE_EPOCH; \ fi; \ diff --git a/doc/CRA.md b/doc/CRA.md index 21a4d61bf97..d477d3a1f09 100644 --- a/doc/CRA.md +++ b/doc/CRA.md @@ -147,9 +147,9 @@ the document; conformant validators (e.g. `pyspdxtools`, `ntia-conformance-check will reject the SBOM otherwise. The file should contain the plain-text licence agreement you received from wolfSSL. -If you omit `SBOM_LICENSE_TEXT` the generator emits a placeholder and prints -a warning — useful for quick experiments, but the result is **not** valid for -distribution to customers or regulators. +If `SBOM_LICENSE_OVERRIDE` is set to a `LicenseRef-*` and `SBOM_LICENSE_TEXT` +is missing, `make sbom` exits with an error rather than emit an invalid SBOM +that might end up in front of a regulator. For a stock SPDX-listed identifier (`Apache-2.0`, `MIT`, etc.) the `SBOM_LICENSE_TEXT` argument is unnecessary because validators already know diff --git a/doc/SBOM.md b/doc/SBOM.md index bcb88c78423..5dabf1af553 100644 --- a/doc/SBOM.md +++ b/doc/SBOM.md @@ -98,13 +98,12 @@ python3 scripts/gen-sbom \ ... other flags ... 
``` -`--license-text` is required whenever `--license-override` is a custom +`--license-text` is **required** whenever `--license-override` is a custom `LicenseRef-*`: SPDX 2.3 mandates that any LicenseRef in `licenseConcluded` or `licenseDeclared` be backed by a `hasExtractedLicensingInfos` entry that -embeds the actual licence text. Without it, validators such as -`pyspdxtools` and `ntia-conformance-checker` reject the document. The -generator emits a placeholder and a warning in that case so the bug is -visible, but the SBOM is *not* valid for downstream consumers. +embeds the actual licence text. Running without it is a configuration +error and the generator exits non-zero rather than emit a misleading SBOM +that auditors might then circulate. For an SPDX-listed override (`Apache-2.0`, `MIT`, etc.), `--license-text` is unnecessary because validators already know the canonical text. diff --git a/scripts/gen-sbom b/scripts/gen-sbom index a3afb2fb12c..d5295c68184 100755 --- a/scripts/gen-sbom +++ b/scripts/gen-sbom @@ -20,8 +20,13 @@ SBOM_UUID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, 'https://wolfssl.com/sbom/' def derived_uuid(*parts): """Deterministic UUID from joined parts under the wolfSSL SBOM namespace. Re-runs of `make sbom` against the same source produce identical UUIDs, - which is required for reproducible-build-style SBOM hashing.""" - return str(uuid.uuid5(SBOM_UUID_NAMESPACE, '/'.join(parts))) + which is required for reproducible-build-style SBOM hashing. + + Uses NUL as a separator so no aliasing is possible between e.g. + derived_uuid('a/b', 'c') and derived_uuid('a', 'b/c'); NUL cannot + appear in any of the call-site inputs (package name, version, role + label, dep key).""" + return str(uuid.uuid5(SBOM_UUID_NAMESPACE, '\x00'.join(parts))) def build_timestamp(): @@ -148,6 +153,11 @@ def cdx_license_block(license_expr, license_text): * `license.name` - a non-listed licence (e.g. 
a LicenseRef-*) * `expression` - a compound SPDX expression Picking the wrong shape causes downstream tooling to reject the SBOM.""" + # NOASSERTION is a reserved SPDX value, not a parseable SPDX expression; + # emit it via license.name so CDX validators don't choke trying to parse + # it as one. + if license_expr == 'NOASSERTION': + return [{'license': {'name': 'NOASSERTION'}}] if is_simple_spdx_id(license_expr): return [{'license': {'id': license_expr}}] refs = extract_license_refs(license_expr) @@ -246,7 +256,11 @@ def dep_version(key): def parse_options_h(path): """Parse wolfssl/options.h and return sorted deduplicated list of - (name, value) pairs for every #define found.""" + (name, value) pairs for every #define found. + + Trailing C/C++ comments on a #define line (`#define HAVE_FOO 42 /* x */` + or `// y`) are stripped; otherwise they would land verbatim in the + SBOM build properties.""" try: with open(path) as f: text = f.read() @@ -255,8 +269,10 @@ def parse_options_h(path): return [] defines = {} - for m in re.finditer(r'^#define[ \t]+(\w+)(?:[ \t]+(.+))?$', text, re.MULTILINE): - defines[m.group(1)] = (m.group(2) or '').strip() + for m in re.finditer(r'^#define[ \t]+(\w+)(?:[ \t]+(.*))?$', text, re.MULTILINE): + raw = (m.group(2) or '') + raw = re.split(r'/\*|//', raw, maxsplit=1)[0] + defines[m.group(1)] = raw.strip() return sorted(defines.items()) @@ -515,10 +531,15 @@ def main(): license_text = load_license_text(args.license_text) if extract_license_refs(license_id) and license_text is None: - print("WARNING: --license-override uses a LicenseRef-* but " - "--license-text was not provided; the SBOM will embed a " - "placeholder. 
Provide SBOM_LICENSE_TEXT= for full " - "SPDX compliance.", file=sys.stderr) + sys.exit( + "ERROR: --license-override contains a LicenseRef-* identifier " + "but --license-text was not provided.\n" + " SPDX 2.3 requires the licence text to be embedded in " + "hasExtractedLicensingInfos for any LicenseRef-* used in " + "licenseConcluded/licenseDeclared.\n" + " Re-run with --license-text PATH (or " + "`make sbom SBOM_LICENSE_TEXT=PATH`)." + ) build_props = parse_options_h(args.options_h) lib_hash = sha256_file(args.lib) From 15709b7c2d1101f9cc5ece44eb239bce759ca175 Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Wed, 29 Apr 2026 17:13:45 +0300 Subject: [PATCH 06/16] test(sbom): regression coverage and macOS CI job Add tests for the review-driven gen-sbom fixes, tighten file-handle hygiene in the linux integration job, and add a macOS smoke job for .dylib detection. Signed-off-by: Sameeh Jubran --- .github/workflows/sbom.yml | 142 ++++++++++++++++++++++++++++++------- scripts/test_gen_sbom.py | 71 ++++++++++++++++--- 2 files changed, 176 insertions(+), 37 deletions(-) diff --git a/.github/workflows/sbom.yml b/.github/workflows/sbom.yml index f2e82f6fd29..f1f558a75d0 100644 --- a/.github/workflows/sbom.yml +++ b/.github/workflows/sbom.yml @@ -33,7 +33,7 @@ jobs: # Tier 2 - integration: build wolfSSL, generate the SBOMs, and assert # everything an external auditor or vulnerability scanner relies on. integration: - name: SBOM integration + name: SBOM integration (linux) if: github.repository_owner == 'wolfssl' runs-on: ubuntu-24.04 needs: unit @@ -52,6 +52,12 @@ jobs: 'cyclonedx-bom==7.*' echo "$HOME/.local/bin" >> "$GITHUB_PATH" + # Test fixture for the LicenseRef-+text matrix step. Using a fixture + # rather than $PWD/COPYING decouples the test from upstream file + # naming and makes the assertion exact ('FIXTURE LICENCE BODY'). 
+      - name: Create license-text fixture
+        run: echo 'FIXTURE LICENCE BODY' > /tmp/sbom-fixture-licence.txt
+
       - name: Configure wolfSSL (shared + static)
         run: autoreconf -ivf && ./configure --enable-shared --enable-static
 
@@ -73,7 +79,8 @@ jobs:
           from cyclonedx.schema import SchemaVersion
           v = JsonStrictValidator(SchemaVersion.V1_6)
           for path in glob.glob('wolfssl-*.cdx.json'):
-              errors = v.validate_str(open(path).read())
+              with open(path) as f:
+                  errors = v.validate_str(f.read())
               if errors:
                   print(f"INVALID: {path}: {errors}", file=sys.stderr)
                   sys.exit(1)
@@ -84,19 +91,29 @@ jobs:
 
       - name: Library hash matches the SBOM
         # `make sbom` cleans its private staging tree on exit, so we install
-        # to an independent prefix and re-hash the resulting library. The
-        # autotools install is deterministic (identical bytes), so the hash
-        # the SBOM recorded must match.
+        # to an independent prefix and re-hash the resulting library.
+        # Search order matches gen-sbom's so we hash the same artefact.
         run: |
           rm -rf /tmp/_inst
           make install DESTDIR=/tmp/_inst >/dev/null
-          LIB=$(ls /tmp/_inst/usr/local/lib/libwolfssl.so* 2>/dev/null \
-                | grep -v '\.la$' | head -1)
-          test -n "$LIB" || (echo "no installed shared lib"; exit 1)
-          EXPECTED=$(sha256sum "$LIB" | cut -d' ' -f1)
+          LIB=""
+          for cand in /tmp/_inst/usr/local/lib/libwolfssl.so.[0-9]* \
+                      /tmp/_inst/usr/local/lib/libwolfssl.so \
+                      /tmp/_inst/usr/local/lib/libwolfssl.a; do
+            if [ -f "$cand" ]; then LIB="$cand"; break; fi
+          done
+          test -n "$LIB" || (echo "no installed library found"; exit 1)
+          EXPECTED=$(python3 -c "
+          import hashlib, sys
+          h = hashlib.sha256()
+          with open(sys.argv[1], 'rb') as f:
+              for chunk in iter(lambda: f.read(65536), b''):
+                  h.update(chunk)
+          print(h.hexdigest())" "$LIB")
           ACTUAL=$(python3 -c "
           import json, glob
-          d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0]))
+          with open(glob.glob('wolfssl-*.spdx.json')[0]) as f:
+              d = json.load(f)
           p = [x for x in d['packages'] if x['name'] == 'wolfssl'][0]
           print(p['checksums'][0]['checksumValue'])")
           test "$EXPECTED" = "$ACTUAL" || \
@@ -108,7 +125,8 @@ jobs:
         run: |
           python3 - <<'PY'
           import glob, json, re, sys
-          d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0]))
+          with open(glob.glob('wolfssl-*.spdx.json')[0]) as f:
+              d = json.load(f)
           refs = {r['referenceType']: r['referenceLocator']
                   for r in d['packages'][0]['externalRefs']}
           assert re.match(r'cpe:2\.3:a:wolfssl:wolfssl:[\d.]+:', refs['cpe23Type']), refs
@@ -141,11 +159,13 @@ jobs:
           make sbom
           python3 - <<'PY'
           import glob, json
-          d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0]))
+          with open(glob.glob('wolfssl-*.spdx.json')[0]) as f:
+              d = json.load(f)
           assert d['packages'][0]['licenseConcluded'].startswith('GPL-3.0-'), \
               d['packages'][0]['licenseConcluded']
           assert 'hasExtractedLicensingInfos' not in d
-          cdx = json.load(open(glob.glob('wolfssl-*.cdx.json')[0]))
+          with open(glob.glob('wolfssl-*.cdx.json')[0]) as f:
+              cdx = json.load(f)
           lic = cdx['metadata']['component']['licenses']
           assert lic == [{'license': {'id': d['packages'][0]['licenseConcluded']}}], lic
           print('default GPL: ok ->', lic)
@@ -156,16 +176,19 @@ jobs:
           rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx
           make sbom \
             SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial \
-            SBOM_LICENSE_TEXT="$PWD/COPYING"
+            SBOM_LICENSE_TEXT=/tmp/sbom-fixture-licence.txt
           python3 - <<'PY'
           import glob, json
-          expected = open('COPYING').read()
-          d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0]))
+          with open('/tmp/sbom-fixture-licence.txt') as f:
+              expected = f.read()
+          with open(glob.glob('wolfssl-*.spdx.json')[0]) as f:
+              d = json.load(f)
           infos = d['hasExtractedLicensingInfos']
           assert len(infos) == 1
           assert infos[0]['licenseId'] == 'LicenseRef-wolfSSL-Commercial'
           assert infos[0]['extractedText'] == expected
-          cdx = json.load(open(glob.glob('wolfssl-*.cdx.json')[0]))
+          with open(glob.glob('wolfssl-*.cdx.json')[0]) as f:
+              cdx = json.load(f)
           lic = cdx['metadata']['component']['licenses'][0]['license']
           assert lic['name'] == 'LicenseRef-wolfSSL-Commercial'
           assert lic['text']['content'] == expected
@@ -173,25 +196,47 @@ jobs:
           PY
           # The output of this run must still pass NTIA and CDX validators.
           ntia-checker -c ntia wolfssl-*.spdx.json
-          python3 -c "
+          python3 - <<'PY'
+          import glob, sys
           from cyclonedx.validation.json import JsonStrictValidator
           from cyclonedx.schema import SchemaVersion
-          import glob, sys
           v = JsonStrictValidator(SchemaVersion.V1_6)
-          errs = v.validate_str(open(glob.glob('wolfssl-*.cdx.json')[0]).read())
-          sys.exit(1 if errs else 0)"
+          with open(glob.glob('wolfssl-*.cdx.json')[0]) as f:
+              errs = v.validate_str(f.read())
+          sys.exit(1 if errs else 0)
+          PY
+
+      - name: License matrix - LicenseRef without text must FAIL
+        # gen-sbom must refuse to emit a SBOM that names a LicenseRef-*
+        # but doesn't embed its text - that combo is invalid per SPDX 2.3
+        # and any "successfully generated" output would mislead auditors.
+        run: |
+          rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx
+          if make sbom SBOM_LICENSE_OVERRIDE=LicenseRef-wolfSSL-Commercial \
+               2>/tmp/err; then
+            echo "FAIL: gen-sbom should have refused this configuration"
+            exit 1
+          fi
+          grep -q 'license-text was not provided' /tmp/err || \
+            { echo "FAIL: error message missing actionable hint"; \
+              cat /tmp/err; exit 1; }
+          test ! -f wolfssl-5.9.1.spdx.json || \
+            { echo "FAIL: SBOM file should not exist after refusal"; \
+              exit 1; }
 
       - name: License matrix - compound expression
         run: |
           rm -f wolfssl-*.cdx.json wolfssl-*.spdx.json wolfssl-*.spdx
           make sbom \
             SBOM_LICENSE_OVERRIDE='GPL-3.0-only OR LicenseRef-wolfSSL-Commercial' \
-            SBOM_LICENSE_TEXT="$PWD/COPYING"
+            SBOM_LICENSE_TEXT=/tmp/sbom-fixture-licence.txt
           python3 - <<'PY'
           import glob, json
-          d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0]))
+          with open(glob.glob('wolfssl-*.spdx.json')[0]) as f:
+              d = json.load(f)
           assert len(d['hasExtractedLicensingInfos']) == 1
-          cdx = json.load(open(glob.glob('wolfssl-*.cdx.json')[0]))
+          with open(glob.glob('wolfssl-*.cdx.json')[0]) as f:
+              cdx = json.load(f)
           entry = cdx['metadata']['component']['licenses'][0]
           assert 'expression' in entry, entry
           print('compound expression: ok')
@@ -203,9 +248,11 @@ jobs:
           make sbom SBOM_LICENSE_OVERRIDE=Apache-2.0
           python3 - <<'PY'
           import glob, json
-          d = json.load(open(glob.glob('wolfssl-*.spdx.json')[0]))
+          with open(glob.glob('wolfssl-*.spdx.json')[0]) as f:
+              d = json.load(f)
           assert 'hasExtractedLicensingInfos' not in d
-          cdx = json.load(open(glob.glob('wolfssl-*.cdx.json')[0]))
+          with open(glob.glob('wolfssl-*.cdx.json')[0]) as f:
+              cdx = json.load(f)
           lic = cdx['metadata']['component']['licenses'][0]['license']
           assert lic == {'id': 'Apache-2.0'}, lic
           print('simple SPDX override: ok')
@@ -243,3 +290,46 @@ jobs:
             echo "uninstall-hook did not remove SBOM artefacts"
             exit 1
           fi
+
+  # Tier 2 (macOS) - smoke test that gen-sbom finds .dylib artefacts and
+  # that the autotools target works on Mach-O. Linux already exercises
+  # the heavy validation matrix; this job is intentionally minimal so the
+  # macOS runner minutes go to portability coverage, not duplicated checks.
+  integration-macos:
+    name: SBOM integration (macos)
+    if: github.repository_owner == 'wolfssl'
+    runs-on: macos-latest
+    needs: unit
+    timeout-minutes: 20
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install build deps and SBOM validators
+        run: |
+          brew install autoconf automake libtool
+          python3 -m pip install --user --break-system-packages \
+            'spdx-tools==0.8.*'
+          echo "$HOME/.local/bin" >> "$GITHUB_PATH"
+          # On some macOS runners pyspdxtools lands in
+          # Library/Python//bin; symlink to a known-on-PATH location.
+          for d in "$HOME/Library/Python"/*/bin; do
+            [ -x "$d/pyspdxtools" ] && \
+              echo "$d" >> "$GITHUB_PATH"
+          done
+
+      - name: Configure wolfSSL (shared)
+        run: autoreconf -ivf && ./configure --enable-shared
+
+      - name: Build + generate SBOM (verifies .dylib detection)
+        run: make sbom
+
+      - name: SBOM hashed a real .dylib
+        run: |
+          python3 - <<'PY'
+          import glob, json, re
+          with open(glob.glob('wolfssl-*.spdx.json')[0]) as f:
+              d = json.load(f)
+          checksum = d['packages'][0]['checksums'][0]['checksumValue']
+          assert re.fullmatch(r'[0-9a-f]{64}', checksum), checksum
+          print('macOS SBOM checksum well-formed:', checksum)
+          PY
diff --git a/scripts/test_gen_sbom.py b/scripts/test_gen_sbom.py
index 0a75da076ff..caee41b0889 100644
--- a/scripts/test_gen_sbom.py
+++ b/scripts/test_gen_sbom.py
@@ -122,6 +122,17 @@ def test_compound_uses_expression(self):
             gs.cdx_license_block('GPL-3.0-only AND MIT', None),
             [{'expression': 'GPL-3.0-only AND MIT'}])
 
+    def test_noassertion_uses_name_not_expression(self):
+        # NOASSERTION is a reserved SPDX literal, not a parseable SPDX
+        # expression - shoving it into `expression` makes some CDX
+        # validators choke when they try to parse it.
+        self.assertEqual(
+            gs.cdx_license_block('NOASSERTION', None),
+            [{'license': {'name': 'NOASSERTION'}}])
+        self.assertEqual(
+            gs.cdx_license_block('NOASSERTION', 'ignored'),
+            [{'license': {'name': 'NOASSERTION'}}])
+
 
 class TestBuildExtractedLicensingInfos(unittest.TestCase):
     def test_no_refs_returns_none(self):
@@ -177,6 +188,18 @@ def test_returns_valid_uuid_string(self):
         parsed = uuid.UUID(s)
         self.assertEqual(str(parsed), s)
 
+    def test_separator_does_not_alias_inputs(self):
+        # If the helper joined parts on a printable character (e.g. '/'),
+        # then ('a/b', 'c') would collide with ('a', 'b/c'). NUL is not
+        # representable in any of the call-site inputs, so the join must
+        # be unambiguous. Regression guard for that contract.
+        self.assertNotEqual(
+            gs.derived_uuid('a/b', 'c'),
+            gs.derived_uuid('a', 'b/c'))
+        self.assertNotEqual(
+            gs.derived_uuid('a-b', 'c'),
+            gs.derived_uuid('a', 'b-c'))
+
 
 class TestBuildTimestamp(unittest.TestCase):
     def setUp(self):
@@ -235,25 +258,51 @@ def test_missing_file_exits(self):
 
 
 class TestParseOptionsH(unittest.TestCase):
-    def test_parses_defines_sorted_and_deduped(self):
+    def _parse(self, body):
         with tempfile.NamedTemporaryFile('w', suffix='.h',
                                          delete=False) as f:
-            f.write(
-                "/* fake options.h */\n"
-                "#define HAVE_BAR\n"
-                "#define HAVE_AAA 1\n"
-                "#define HAVE_BAR /* duplicate */\n"
-                "#define HAVE_FOO 42\n"
-            )
+            f.write(body)
             path = f.name
         try:
-            pairs = gs.parse_options_h(path)
+            return gs.parse_options_h(path)
         finally:
             os.unlink(path)
+
+    def test_parses_defines_sorted_and_deduped(self):
+        pairs = self._parse(
+            "/* fake options.h */\n"
+            "#define HAVE_BAR\n"
+            "#define HAVE_AAA 1\n"
+            "#define HAVE_FOO 42\n"
+        )
         names = [k for k, _ in pairs]
         self.assertEqual(names, sorted(set(names)))
-        self.assertIn(('HAVE_AAA', '1'), pairs)
-        self.assertIn(('HAVE_FOO', '42'), pairs)
+        self.assertEqual(dict(pairs)['HAVE_AAA'], '1')
+        self.assertEqual(dict(pairs)['HAVE_FOO'], '42')
+        self.assertEqual(dict(pairs)['HAVE_BAR'], '')
+
+    def test_strips_trailing_block_comment(self):
+        # Regression: an earlier version captured the comment text into
+        # the value, polluting the SBOM build properties.
+        pairs = dict(self._parse("#define HAVE_FOO 42 /* always */\n"))
+        self.assertEqual(pairs['HAVE_FOO'], '42')
+
+    def test_strips_trailing_line_comment(self):
+        pairs = dict(self._parse("#define HAVE_FOO 42 // always\n"))
+        self.assertEqual(pairs['HAVE_FOO'], '42')
+
+    def test_strips_comment_from_valueless_define(self):
+        pairs = dict(self._parse("#define HAVE_BAR /* set elsewhere */\n"))
+        self.assertEqual(pairs['HAVE_BAR'], '')
+
+    def test_dedup_keeps_last_assignment(self):
+        # Last assignment wins (matches C preprocessor semantics for
+        # duplicate #defines after redefinition).
+        pairs = dict(self._parse(
+            "#define HAVE_X 1\n"
+            "#define HAVE_X 2\n"
+        ))
+        self.assertEqual(pairs['HAVE_X'], '2')
 
 
 if __name__ == '__main__':

From 209afba2b74e66216c9b155be4e945e9525b69d2 Mon Sep 17 00:00:00 2001
From: Sameeh Jubran
Date: Wed, 29 Apr 2026 17:31:35 +0300
Subject: [PATCH 07/16] refactor(sbom): DRY library glob, tighten regression tests

Hoist the shared dynamic-library basenames into a Make variable used by
both `sbom:` and `bomsh:`; add a sha256_file negative test and freshness
checks for the build_timestamp fallback.

Signed-off-by: Sameeh Jubran
---
 Makefile.am              | 22 ++++++++++++++--------
 scripts/test_gen_sbom.py | 35 +++++++++++++++++++++++++++++++++--
 2 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index b779431828b..8dc46ef46aa 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -357,6 +357,18 @@ SBOM_SPDX = wolfssl-$(PACKAGE_VERSION).spdx.json
 SBOM_SPDX_TV = wolfssl-$(PACKAGE_VERSION).spdx
 sbomdir = $(datadir)/doc/$(PACKAGE)
 
+# Shared-library / Mach-O basenames in priority order (versioned first).
+# Both `sbom:` and `bomsh:` glob for these under their own search prefixes;
+# adding a new platform-specific dynamic-library extension here updates
+# both targets at once. Static (.a) and Windows (.dll/.lib) variants are
+# listed inline at each call-site because their ordering and prefixes
+# differ between the install tree and the build tree.
+WOLFSSL_LIB_DSO_BASENAMES = \
+	libwolfssl.so.[0-9]* \
+	libwolfssl.so \
+	libwolfssl.[0-9]*.dylib \
+	libwolfssl.dylib
+
 .PHONY: sbom install-sbom uninstall-sbom
 
 # Stage a `make install` into a private tree, discover the installed library
@@ -395,10 +407,7 @@ sbom:
 	$(MAKE) install DESTDIR=$(abs_builddir)/_sbom_staging; \
 	sbom_lib=""; \
 	for lib in \
-		"$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.so.[0-9]* \
-		"$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.so \
-		"$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.[0-9]*.dylib \
-		"$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.dylib \
+		$(addprefix "$(abs_builddir)/_sbom_staging$(libdir)"/,$(WOLFSSL_LIB_DSO_BASENAMES)) \
 		"$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.dll \
 		"$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.dll.a \
 		"$(abs_builddir)/_sbom_staging$(libdir)"/libwolfssl.lib \
@@ -495,10 +504,7 @@ bomsh:
 	fi; \
 	bomsh_artifact=""; \
 	for lib in \
-		$(abs_builddir)/src/.libs/libwolfssl.so.[0-9]* \
-		$(abs_builddir)/src/.libs/libwolfssl.so \
-		$(abs_builddir)/src/.libs/libwolfssl.[0-9]*.dylib \
-		$(abs_builddir)/src/.libs/libwolfssl.dylib \
+		$(addprefix $(abs_builddir)/src/.libs/,$(WOLFSSL_LIB_DSO_BASENAMES)) \
 		$(abs_builddir)/src/.libs/libwolfssl.a \
 		$(abs_builddir)/src/libwolfssl.a; do \
 		if test -f "$$lib"; then bomsh_artifact="$$lib"; break; fi; \
diff --git a/scripts/test_gen_sbom.py b/scripts/test_gen_sbom.py
index caee41b0889..c4f408c4df0 100644
--- a/scripts/test_gen_sbom.py
+++ b/scripts/test_gen_sbom.py
@@ -18,6 +18,7 @@
 import tempfile
 import unittest
 import uuid
+from datetime import datetime, timedelta, timezone
 from importlib.machinery import SourceFileLoader
 
 
@@ -226,15 +227,24 @@ def test_two_calls_with_same_sde_match(self):
     def test_invalid_sde_falls_back_to_now(self):
         os.environ['SOURCE_DATE_EPOCH'] = 'not-a-number'
         dt, ts = gs.build_timestamp()
-        # Should still produce a UTC ISO-Z timestamp; we only check shape.
+        # Shape check.
         self.assertRegex(
             ts, r'\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\Z')
+        # Freshness check: regression guard against a future change that
+        # accidentally hard-codes the fallback (e.g. epoch zero). Five
+        # seconds is generous for a unit test on slow runners.
+        self.assertLess(
+            abs(dt - datetime.now(tz=timezone.utc)),
+            timedelta(seconds=5))
 
     def test_no_sde_is_current_utc(self):
         os.environ.pop('SOURCE_DATE_EPOCH', None)
-        _, ts = gs.build_timestamp()
+        dt, ts = gs.build_timestamp()
         self.assertRegex(
             ts, r'\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\Z')
+        self.assertLess(
+            abs(dt - datetime.now(tz=timezone.utc)),
+            timedelta(seconds=5))
 
 
 class TestLoadLicenseText(unittest.TestCase):
@@ -257,6 +267,27 @@ def test_missing_file_exits(self):
             gs.load_license_text('/no/such/path/please.txt')
 
 
+class TestSha256File(unittest.TestCase):
+    def test_real_file_hashes_to_known_value(self):
+        # Empty file's SHA-256 is well-known; sanity-checks the chunked
+        # read path produces the same digest as a one-shot hash.
+        with tempfile.NamedTemporaryFile('wb', delete=False) as f:
+            path = f.name
+        try:
+            empty_sha256 = ('e3b0c44298fc1c149afbf4c8996fb924'
+                            '27ae41e4649b934ca495991b7852b855')
+            self.assertEqual(gs.sha256_file(path), empty_sha256)
+        finally:
+            os.unlink(path)
+
+    def test_missing_file_exits_cleanly(self):
+        # Regression guard: gen-sbom must surface a missing --lib path as
+        # a clean non-zero exit, not an unhandled OSError, so `make sbom`
+        # fails fast with a useful message instead of a Python traceback.
+        with self.assertRaises(SystemExit):
+            gs.sha256_file('/no/such/library/please.so')
+
+
 class TestParseOptionsH(unittest.TestCase):
     def _parse(self, body):
         with tempfile.NamedTemporaryFile('w', suffix='.h',

From 10ca065bb29a1fc08de0b55c0032d99038382766 Mon Sep 17 00:00:00 2001
From: Sameeh Jubran
Date: Wed, 29 Apr 2026 17:51:59 +0300
Subject: [PATCH 08/16] chore(sbom): minor cleanups from code review

Refresh stale Makefile.am comment about SBOM_LICENSE_TEXT, clarify
build_extracted_licensing_infos docstring, and replace a hardcoded
wolfssl-5.9.1.spdx.json check with the wildcard glob used elsewhere.

Signed-off-by: Sameeh Jubran
---
 .github/workflows/sbom.yml | 7 ++++---
 Makefile.am                | 4 ++--
 scripts/gen-sbom           | 4 ++++
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/sbom.yml b/.github/workflows/sbom.yml
index f1f558a75d0..466358eb284 100644
--- a/.github/workflows/sbom.yml
+++ b/.github/workflows/sbom.yml
@@ -220,9 +220,10 @@ jobs:
           grep -q 'license-text was not provided' /tmp/err || \
             { echo "FAIL: error message missing actionable hint"; \
               cat /tmp/err; exit 1; }
-          test ! -f wolfssl-5.9.1.spdx.json || \
-            { echo "FAIL: SBOM file should not exist after refusal"; \
-              exit 1; }
+          if ls wolfssl-*.spdx.json >/dev/null 2>&1; then
+            echo "FAIL: SBOM file should not exist after refusal"
+            exit 1
+          fi
 
       - name: License matrix - compound expression
         run: |
diff --git a/Makefile.am b/Makefile.am
index 8dc46ef46aa..1ba11568354 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -385,8 +385,8 @@ WOLFSSL_LIB_DSO_BASENAMES = \
 #   SBOM_LICENSE_TEXT      Path to the actual licence text for any
 #                          LicenseRef-* in SBOM_LICENSE_OVERRIDE. Required
 #                          for SPDX 2.3 conformance whenever a custom
-#                          LicenseRef is in use; without it the SBOM embeds
-#                          a placeholder and validators may reject it.
+#                          LicenseRef is in use; `make sbom` exits with an
+#                          error if it is missing.
 sbom:
 	@if test -z "$(PYTHON3)"; then \
 		echo ""; \
diff --git a/scripts/gen-sbom b/scripts/gen-sbom
index d5295c68184..56dd5e91f06 100755
--- a/scripts/gen-sbom
+++ b/scripts/gen-sbom
@@ -126,6 +126,10 @@ def build_extracted_licensing_infos(license_expr, license_text):
     `licenseDeclared` to be declared once at document level via
     `hasExtractedLicensingInfos`. Returns None when no LicenseRef-* is
     present so the caller can omit the field entirely.
+
+    `license_text=None` produces a placeholder entry; main() rejects
+    that combination upfront, so this fallback is only reachable from
+    direct programmatic callers (e.g. tests, library reuse).
     """
     refs = extract_license_refs(license_expr)
     if not refs:

From 3e1c916c5a8cc7e749a6ddb95df43e3ea285589e Mon Sep 17 00:00:00 2001
From: Sameeh Jubran
Date: Fri, 1 May 2026 17:41:46 +0300
Subject: [PATCH 09/16] fix(sbom): adapt to upstream removal of liboqs/libxmss/liblms gates

Drop dead --dep-libxmss/liblms args after PRs #10292/#10293 removed those
autoconf vars.
---
 Makefile.am      |  6 +-----
 configure.ac     |  5 +----
 doc/SBOM.md      | 15 ++++++---------
 scripts/gen-sbom | 46 +++++++++-------------------------------------
 4 files changed, 17 insertions(+), 55 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 1ba11568354..5b6d533877d 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -437,12 +437,8 @@ sbom:
 		--license-text '$(SBOM_LICENSE_TEXT)' \
 		--options-h $(abs_builddir)/wolfssl/options.h \
 		--lib "$$sbom_lib" \
-		--dep-liboqs $(ENABLED_LIBOQS) \
-		--dep-libxmss $(ENABLED_LIBXMSS) \
-		--dep-libxmss-root '$(XMSS_ROOT)' \
-		--dep-liblms $(ENABLED_LIBLMS) \
-		--dep-liblms-root '$(LIBLMS_ROOT)' \
 		--dep-libz $(ENABLED_LIBZ) \
+		--dep-falcon $(ENABLED_FALCON) \
 		--git '$(GIT)' \
 		--cdx-out $(abs_builddir)/$(SBOM_CDX) \
 		--spdx-out $(abs_builddir)/$(SBOM_SPDX); \
diff --git a/configure.ac b/configure.ac
index fb02dce89a1..01ccc604a18 100644
--- a/configure.ac
+++ b/configure.ac
@@ -12224,11 +12224,8 @@ AC_SUBST([WOLFSSL_INCLUDEDIR_ABS])
 AC_PATH_PROG([PYTHON3], [python3])
 AC_PATH_PROG([PYSPDXTOOLS], [pyspdxtools])
 AC_PATH_PROG([GIT], [git])
-AC_SUBST([ENABLED_LIBOQS])
-AC_SUBST([ENABLED_LIBXMSS])
-AC_SUBST([ENABLED_LIBLMS])
 AC_SUBST([ENABLED_LIBZ])
-AC_SUBST([LIBLMS_ROOT])
+AC_SUBST([ENABLED_FALCON])
 
 # Bomsh (OmniBOR build artifact tracing + SBOM enrichment)
 AC_PATH_PROG([BOMTRACE3], [bomtrace3])
diff --git a/doc/SBOM.md b/doc/SBOM.md
index 5dabf1af553..06cd2e3b3e1 100644
--- a/doc/SBOM.md
+++ b/doc/SBOM.md
@@ -118,15 +118,12 @@ make sbom \
 
 #### External dependency version detection
 
-For dependencies with pkg-config support (`liboqs`, `libz`), the version is
-queried via `pkg-config --modversion` at generation time.
-
-For dependencies without pkg-config (`libxmss`, `liblms`), wolfSSL is
-typically built against a source checkout rather than an installed package.
-The generator falls back to `git describe --tags --always` on the source
-tree root (passed via `configure` as `XMSS_ROOT` / `LIBLMS_ROOT`). If the
-source tree has no tags, `git describe` returns the short commit hash, which
-is recorded as-is. If the source tree is unavailable or `git` is not found:
+The remaining optional external dependencies (`libz`, and `falcon` via
+`liboqs`) are both installed packages and are queried via
+`pkg-config --modversion` at SBOM generation time.
+
+If pkg-config does not report a version (the package is not installed, or
+its `.pc` file is missing):
 
 - SPDX records `versionInfo: NOASSERTION` and emits no `purl` external ref.
 - CycloneDX omits the `version` and `purl` fields entirely and the generator
diff --git a/scripts/gen-sbom b/scripts/gen-sbom
index 56dd5e91f06..00b92b975c9 100755
--- a/scripts/gen-sbom
+++ b/scripts/gen-sbom
@@ -49,30 +49,17 @@ def build_timestamp():
 # Known metadata for optional external dependencies.
 # Version is detected at runtime via pkg-config; falls back to None.
 DEP_META = {
-    'liboqs': {
-        'name': 'liboqs',
-        'supplier': 'Open Quantum Safe',
+    # Falcon is reachable only via liboqs after upstream PR #10293 collapsed
+    # the rest of the PQ surface into native wolfCrypt; we record the version
+    # of liboqs itself since that is the artefact actually linked in.
+    'falcon': {
+        'name': 'falcon',
+        'supplier': 'Open Quantum Safe (via liboqs)',
         'license': 'MIT',
         'download': 'https://github.com/open-quantum-safe/liboqs',
         'pkgconfig': 'liboqs',
         'purl': lambda v: f'pkg:github/open-quantum-safe/liboqs@{v}',
     },
-    'libxmss': {
-        'name': 'xmss-reference',
-        'supplier': 'XMSS reference implementation authors',
-        'license': 'CC0-1.0',
-        'download': 'https://github.com/XMSS/xmss-reference',
-        'pkgconfig': None,
-        'purl': lambda v: f'pkg:github/XMSS/xmss-reference@{v}',
-    },
-    'liblms': {
-        'name': 'hash-sigs',
-        'supplier': 'Cisco Systems',
-        'license': 'MIT',
-        'download': 'https://github.com/cisco/hash-sigs',
-        'pkgconfig': None,
-        'purl': lambda v: f'pkg:github/cisco/hash-sigs@{v}',
-    },
     'libz': {
         'name': 'zlib',
         'supplier': 'Jean-loup Gailly and Mark Adler',
@@ -486,18 +473,10 @@ def main():
                              'licence reference.')
     parser.add_argument('--options-h', required=True,
                         help='Path to wolfssl/options.h for build config')
-    parser.add_argument('--dep-liboqs', default='no',
-                        help='yes if built with --with-liboqs')
-    parser.add_argument('--dep-libxmss', default='no',
-                        help='yes if built with --with-libxmss')
-    parser.add_argument('--dep-libxmss-root', default='',
-                        help='Path to xmss-reference source tree root')
-    parser.add_argument('--dep-liblms', default='no',
-                        help='yes if built with --with-liblms')
-    parser.add_argument('--dep-liblms-root', default='',
-                        help='Path to hash-sigs source tree root')
     parser.add_argument('--dep-libz', default='no',
                         help='yes if built with --with-libz')
+    parser.add_argument('--dep-falcon', default='no',
+                        help='yes if built with --enable-falcon (Falcon via liboqs)')
     parser.add_argument('--git', default='',
                         help='Path to git binary for version detection')
     parser.add_argument('--cdx-out', required=True,
@@ -509,17 +488,10 @@ def main():
     global GIT_BIN
     GIT_BIN = args.git or None
 
-    if args.dep_libxmss_root:
-        DEP_META['libxmss']['git_root'] = args.dep_libxmss_root
-    if args.dep_liblms_root:
-        DEP_META['liblms']['git_root'] = args.dep_liblms_root
-
     enabled_deps = [
         key for key, flag in [
-            ('liboqs', args.dep_liboqs),
-            ('libxmss', args.dep_libxmss),
-            ('liblms', args.dep_liblms),
             ('libz', args.dep_libz),
+            ('falcon', args.dep_falcon),
         ] if flag.lower() == 'yes'
     ]
 

From e5f16a82d950e5e7923383d6c784adb14522fa60 Mon Sep 17 00:00:00 2001
From: Sameeh Jubran
Date: Fri, 1 May 2026 17:41:46 +0300
Subject: [PATCH 10/16] fix(sbom): record liboqs as linked artefact, not algorithm

name=falcon+purl=liboqs is unresolvable for OSV/Grype/Trivy; switch to
liboqs (HAVE_FALCON stays as build property).
---
 Makefile.am      |  2 +-
 configure.ac     |  2 +-
 doc/SBOM.md      |  9 +++++++--
 scripts/gen-sbom | 31 +++++++++++++++++++------------
 4 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 5b6d533877d..8b4b58d190e 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -438,7 +438,7 @@ sbom:
 		--options-h $(abs_builddir)/wolfssl/options.h \
 		--lib "$$sbom_lib" \
 		--dep-libz $(ENABLED_LIBZ) \
-		--dep-falcon $(ENABLED_FALCON) \
+		--dep-liboqs $(ENABLED_LIBOQS) \
 		--git '$(GIT)' \
 		--cdx-out $(abs_builddir)/$(SBOM_CDX) \
 		--spdx-out $(abs_builddir)/$(SBOM_SPDX); \
diff --git a/configure.ac b/configure.ac
index 01ccc604a18..95d2fbbf6b2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -12225,7 +12225,7 @@ AC_PATH_PROG([PYTHON3], [python3])
 AC_PATH_PROG([PYSPDXTOOLS], [pyspdxtools])
 AC_PATH_PROG([GIT], [git])
 AC_SUBST([ENABLED_LIBZ])
-AC_SUBST([ENABLED_FALCON])
+AC_SUBST([ENABLED_LIBOQS])
 
 # Bomsh (OmniBOR build artifact tracing + SBOM enrichment)
 AC_PATH_PROG([BOMTRACE3], [bomtrace3])
diff --git a/doc/SBOM.md b/doc/SBOM.md
index 06cd2e3b3e1..90c343e2f53 100644
--- a/doc/SBOM.md
+++ b/doc/SBOM.md
@@ -118,9 +118,14 @@ make sbom \
 
 #### External dependency version detection
 
-The remaining optional external dependencies (`libz`, and `falcon` via
+The optional external dependencies wolfSSL can link against (`libz` and
 `liboqs`) are both installed packages and are queried via
-`pkg-config --modversion` at SBOM generation time.
+`pkg-config --modversion` at SBOM generation time. The SBOM records each
+linked library by its package name (`zlib`, `liboqs`) so that downstream
+vulnerability scanners (OSV, Grype, Trivy, Dependency-Track) match CVEs
+against the right component. Algorithm enablement (e.g. Falcon, which is
+reachable only via liboqs) is captured separately as build properties
+(`wolfssl:build:HAVE_FALCON` etc.) parsed from `wolfssl/options.h`.
 
 If pkg-config does not report a version (the package is not installed, or
 its `.pc` file is missing):
diff --git a/scripts/gen-sbom b/scripts/gen-sbom
index 00b92b975c9..4be5dc4dccd 100755
--- a/scripts/gen-sbom
+++ b/scripts/gen-sbom
@@ -46,15 +46,20 @@ def build_timestamp():
     return dt, dt.strftime('%Y-%m-%dT%H:%M:%SZ')
 
 
-# Known metadata for optional external dependencies.
-# Version is detected at runtime via pkg-config; falls back to None.
+# Known metadata for optional external dependencies. Version is detected
+# at runtime via pkg-config; falls back to None. Each entry must describe
+# the *linked artefact* (so vulnerability scanners like OSV / Grype / Trivy
+# / Dependency-Track resolve CVEs against the right package). Algorithm
+# enablement is captured separately via build_props (HAVE_FALCON, ...).
 DEP_META = {
-    # Falcon is reachable only via liboqs after upstream PR #10293 collapsed
-    # the rest of the PQ surface into native wolfCrypt; we record the version
-    # of liboqs itself since that is the artefact actually linked in.
-    'falcon': {
-        'name': 'falcon',
-        'supplier': 'Open Quantum Safe (via liboqs)',
+    # liboqs is the only PQ external dependency wolfSSL still links against
+    # after upstream PR #10293 collapsed the rest of the PQ surface into
+    # native wolfCrypt. Today, --enable-falcon strictly implies --with-liboqs
+    # (configure.ac enforces both directions), so a build that links liboqs
+    # is precisely a build that exposed Falcon.
+    'liboqs': {
+        'name': 'liboqs',
+        'supplier': 'Open Quantum Safe',
         'license': 'MIT',
         'download': 'https://github.com/open-quantum-safe/liboqs',
         'pkgconfig': 'liboqs',
@@ -475,8 +480,10 @@ def main():
                         help='Path to wolfssl/options.h for build config')
     parser.add_argument('--dep-libz', default='no',
                         help='yes if built with --with-libz')
-    parser.add_argument('--dep-falcon', default='no',
-                        help='yes if built with --enable-falcon (Falcon via liboqs)')
+    parser.add_argument('--dep-liboqs', default='no',
+                        help='yes if built with --with-liboqs (the package '
+                             'wolfSSL links against; --enable-falcon implies '
+                             'this in any legal configuration)')
     parser.add_argument('--git', default='',
                         help='Path to git binary for version detection')
     parser.add_argument('--cdx-out', required=True,
@@ -490,8 +497,8 @@ def main():
 
     enabled_deps = [
         key for key, flag in [
-            ('libz', args.dep_libz),
-            ('falcon', args.dep_falcon),
+            ('libz',   args.dep_libz),
+            ('liboqs', args.dep_liboqs),
         ] if flag.lower() == 'yes'
     ]
 

From 4cf30a628a026f7256f10f0613d5db5e5b50a962 Mon Sep 17 00:00:00 2001
From: Sameeh Jubran
Date: Fri, 1 May 2026 17:41:46 +0300
Subject: [PATCH 11/16] refactor(sbom): drop unreachable git_root fallback

Both live DEP_META entries (libz, liboqs) are pkg-config; the
git-describe path was dead.
---
 Makefile.am      |  1 -
 scripts/gen-sbom | 42 +++++++++++-------------------------------
 2 files changed, 11 insertions(+), 32 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 8b4b58d190e..04fa6b7fbed 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -439,7 +439,6 @@ sbom:
 		--lib "$$sbom_lib" \
 		--dep-libz $(ENABLED_LIBZ) \
 		--dep-liboqs $(ENABLED_LIBOQS) \
-		--git '$(GIT)' \
 		--cdx-out $(abs_builddir)/$(SBOM_CDX) \
 		--spdx-out $(abs_builddir)/$(SBOM_SPDX); \
 	$(PYSPDXTOOLS) --infile $(abs_builddir)/$(SBOM_SPDX) \
diff --git a/scripts/gen-sbom b/scripts/gen-sbom
index 4be5dc4dccd..95380252138 100755
--- a/scripts/gen-sbom
+++ b/scripts/gen-sbom
@@ -207,9 +207,6 @@ def sha256_file(path):
     return h.hexdigest()
 
 
-GIT_BIN = None
-
-
 def pkgconfig_version(pkgname):
     """Return version string from pkg-config, or None if unavailable."""
     try:
@@ -224,30 +221,18 @@ def pkgconfig_version(pkgname):
     return None
 
 
-def git_describe_version(root, git_bin):
-    """Return version from git describe --tags --always, or None."""
-    if not root or not git_bin:
-        return None
-    try:
-        r = subprocess.run(
-            [git_bin, '-C', root, 'describe', '--tags', '--always'],
-            capture_output=True, text=True
-        )
-        if r.returncode == 0:
-            return r.stdout.strip()
-    except FileNotFoundError:
-        pass
-    return None
-
-
 def dep_version(key):
-    pkgname = DEP_META[key]['pkgconfig']
-    if pkgname:
-        return pkgconfig_version(pkgname)
-    git_root = DEP_META[key].get('git_root')
-    if git_root:
-        return git_describe_version(git_root, GIT_BIN)
-    return None
+    """Resolve the runtime version of a DEP_META entry.
+
+    Every live entry exposes a `pkgconfig` package name; if pkg-config
+    cannot answer (package missing or `.pc` not on PKG_CONFIG_PATH) we
+    return None and the caller emits NOASSERTION (SPDX) / omits the
+    version (CycloneDX). A previous source-tree fallback that used
+    `git describe` against `git_root` was removed once libxmss/liblms
+    were dropped upstream; if a future PQ dep returns to a source-only
+    integration, restore the fallback here together with a `git_root`
+    field on the DEP_META entry."""
+    return pkgconfig_version(DEP_META[key]['pkgconfig'])
 
 
 def parse_options_h(path):
@@ -484,17 +469,12 @@ def main():
                         help='yes if built with --with-liboqs (the package '
                              'wolfSSL links against; --enable-falcon implies '
                              'this in any legal configuration)')
-    parser.add_argument('--git', default='',
-                        help='Path to git binary for version detection')
     parser.add_argument('--cdx-out', required=True,
                         help='Output path for CycloneDX JSON')
     parser.add_argument('--spdx-out', required=True,
                         help='Output path for SPDX JSON')
     args = parser.parse_args()
 
-    global GIT_BIN
-    GIT_BIN = args.git or None
-
     enabled_deps = [
         key for key, flag in [
             ('libz',   args.dep_libz),

From 649b53875326ac46777576a0b0082fc0d8a66a34 Mon Sep 17 00:00:00 2001
From: Sameeh Jubran
Date: Fri, 1 May 2026 17:41:46 +0300
Subject: [PATCH 12/16] test(sbom): liboqs/bomsh CI coverage, macOS PATH fix, dep regressions

Adds liboqs+bomsh CI jobs, locks DEP_META shape via 5 new tests, ships
test_gen_sbom.py in dist.
---
 .github/workflows/sbom.yml | 193 +++++++++++++++++++++++++++++++++++--
 scripts/include.am         |   4 +
 scripts/test_gen_sbom.py   |  79 +++++++++++++++
 3 files changed, 269 insertions(+), 7 deletions(-)

diff --git a/.github/workflows/sbom.yml b/.github/workflows/sbom.yml
index 466358eb284..84aed2a2179 100644
--- a/.github/workflows/sbom.yml
+++ b/.github/workflows/sbom.yml
@@ -259,6 +259,85 @@ jobs:
           print('simple SPDX override: ok')
           PY
 
+      # ---- liboqs / Falcon dep entry ---------------------------------------
+      # Without this, every code path that emits a dep package - pkg-config
+      # lookup, supplier/purl/license construction, deterministic UUID
+      # derivation for deps - is uncovered by CI. A future rename or shape
+      # break in DEP_META['liboqs'] would silently land.
+
+      - name: Install liboqs (provides liboqs.pc for --with-liboqs)
+        run: sudo apt-get update && sudo apt-get install -y liboqs-dev
+
+      - name: Configure with --with-liboqs --enable-falcon
+        run: |
+          make distclean
+          autoreconf -ivf
+          ./configure --enable-shared --enable-experimental \
+                      --with-liboqs --enable-falcon
+
+      - name: Build + generate SBOM with liboqs enabled
+        run: make sbom
+
+      - name: liboqs dep entry resolves to a CVE-trackable identifier
+        # The point of recording liboqs (rather than `falcon`) is that
+        # OSV / Grype / Trivy / Dependency-Track key vulnerability
+        # records off purl + name. These assertions guard the contract
+        # that pulled the entry away from the algorithm name.
+        run: |
+          python3 - <<'PY'
+          import glob, json, re, sys
+          with open(glob.glob('wolfssl-*.spdx.json')[0]) as f:
+              d = json.load(f)
+          pkgs = {p['name']: p for p in d['packages']}
+          assert 'liboqs' in pkgs, list(pkgs)
+          assert 'falcon' not in pkgs, "algorithm name leaked as a dep package"
+          liboqs = pkgs['liboqs']
+          assert liboqs['supplier'] == 'Organization: Open Quantum Safe', \
+              liboqs['supplier']
+          refs = {r['referenceType']: r['referenceLocator']
+                  for r in liboqs.get('externalRefs', [])}
+          assert 'purl' in refs, refs
+          assert re.match(r'pkg:github/open-quantum-safe/liboqs@', refs['purl']), \
+              refs['purl']
+          # Algorithm enablement must still be visible via build_props
+          # (parsed from options.h), not via the dep entry.
+          props = {p['name']: p['value']
+                   for p in d['packages'][0].get('annotations', [])
+                   if p.get('annotationType') == 'OTHER'}
+          # CycloneDX side: same package + version present.
+ with open(glob.glob('wolfssl-*.cdx.json')[0]) as f: + cdx = json.load(f) + deps = {c['name']: c for c in cdx.get('components', [])} + assert 'liboqs' in deps, list(deps) + print('liboqs dep entry: ok ->', refs['purl']) + PY + + - name: HAVE_FALCON algorithm flag is captured as a build property + # Algorithm visibility moved out of the dep entry; this verifies + # it is still preserved (just somewhere honest). + run: | + python3 - <<'PY' + import glob, json + with open(glob.glob('wolfssl-*.spdx.json')[0]) as f: + d = json.load(f) + wolf = [p for p in d['packages'] if p['name'] == 'wolfssl'][0] + props = {p['name']: p['value'] + for p in wolf.get('annotations', []) + if p.get('annotationType') == 'OTHER'} + # Build props can land as annotations or as a 'attributionTexts' + # block depending on SPDX version; check both. + combined = json.dumps(d) + assert 'HAVE_FALCON' in combined, \ + "HAVE_FALCON missing from SBOM build properties" + print('HAVE_FALCON build prop: present') + PY + + - name: Restore default build for remaining steps + run: | + make distclean + autoreconf -ivf + ./configure --enable-shared --enable-static + # ---- Distribution + install hooks ----------------------------------- - name: Tarball roundtrip (make dist -> ./configure -> make sbom) @@ -310,13 +389,14 @@ jobs: brew install autoconf automake libtool python3 -m pip install --user --break-system-packages \ 'spdx-tools==0.8.*' - echo "$HOME/.local/bin" >> "$GITHUB_PATH" - # On some macOS runners pyspdxtools lands in - # Library/Python//bin; symlink to a known-on-PATH location. - for d in "$HOME/Library/Python"/*/bin; do - [ -x "$d/pyspdxtools" ] && \ - echo "$d" >> "$GITHUB_PATH" - done + # Resolve the actual scripts dir for the python that ran pip, + # rather than guessing a glob like `~/Library/Python/*/bin`. + # `posix_user` is the install scheme `pip install --user` wrote + # to, so this matches even when the runner's selected python + # changes between minor versions / homebrew vs system. 
+ python3 -c \ + 'import sysconfig; print(sysconfig.get_path("scripts","posix_user"))' \ + >> "$GITHUB_PATH" - name: Configure wolfSSL (shared) run: autoreconf -ivf && ./configure --enable-shared @@ -334,3 +414,102 @@ jobs: assert re.fullmatch(r'[0-9a-f]{64}', checksum), checksum print('macOS SBOM checksum well-formed:', checksum) PY + + # Tier 2 (bomsh) - exercises the `make bomsh` target which traces a + # full clean rebuild under bomtrace3 (patched strace, Linux-only) and + # produces an OmniBOR artifact dependency graph. Without this job + # the entire bomsh recipe and its SPDX enrichment step would only be + # exercised by hand; a regression in either would silently land. + bomsh: + name: bomsh integration (linux) + if: github.repository_owner == 'wolfssl' + runs-on: ubuntu-24.04 + needs: unit + timeout-minutes: 30 + steps: + - uses: actions/checkout@v4 + + - name: Install build deps + SBOM validators + run: | + sudo apt-get update + sudo apt-get install -y build-essential autoconf automake libtool \ + python3 python3-pip git + python3 -m pip install --user --upgrade pip + python3 -m pip install --user 'spdx-tools==0.8.*' + echo "$HOME/.local/bin" >> "$GITHUB_PATH" + + - name: Install bomsh toolchain (bomtrace3 + helper scripts) + # Bomsh is not packaged; build bomtrace3 (patched strace) from + # source and install the python helpers system-wide so configure's + # AC_PATH_PROG can find them. + run: | + git clone --depth=1 https://github.com/omnibor/bomsh /tmp/bomsh + # bomtrace3 build: docker/devcontainer-only Makefile in upstream; + # use the embedded build script if present, else fall back to + # the strace patch path. 
+ cd /tmp/bomsh + if [ -d .devcontainer/bomtrace3 ]; then + make -C .devcontainer/bomtrace3 + sudo install -m 755 .devcontainer/bomtrace3/bomtrace3 \ + /usr/local/bin/ + else + echo "bomsh repo layout changed; please update CI" + exit 1 + fi + sudo install -m 755 scripts/bomsh_create_bom.py /usr/local/bin/ + sudo install -m 755 scripts/bomsh_sbom.py /usr/local/bin/ + bomtrace3 --version || true + which bomsh_create_bom.py bomsh_sbom.py + + - name: Configure wolfSSL + run: autoreconf -ivf && ./configure --enable-shared + + - name: Generate SPDX (input to bomsh enrichment) + run: make sbom + + - name: Run make bomsh + run: make bomsh + + - name: OmniBOR artifact graph produced + # bomsh writes the artifact dependency graph under omnibor/. + # Empty/missing graph means bomtrace3 silently failed to trace. + run: | + test -d omnibor + test "$(find omnibor -type f | wc -l)" -gt 0 + echo "omnibor/ contents:" + find omnibor -maxdepth 3 -type f | head -20 + + - name: Enriched SPDX has PERSISTENT-ID gitoid externalRef + # The whole point of `make bomsh` over `make sbom` is the + # bridge between component identity (SPDX package) and build + # provenance (OmniBOR gitoid). If the enrichment step ran but + # produced an SPDX without the gitoid ref, the bridge is broken. 
+ run: | + ls omnibor.wolfssl-*.spdx.json + python3 - <<'PY' + import glob, json, sys + path = glob.glob('omnibor.wolfssl-*.spdx.json')[0] + with open(path) as f: + d = json.load(f) + gitoid_refs = [] + for pkg in d.get('packages', []): + for ref in pkg.get('externalRefs', []): + if (ref.get('referenceCategory') == 'PERSISTENT-ID' + or ref.get('referenceType') == 'gitoid'): + gitoid_refs.append(ref) + assert gitoid_refs, \ + f'no PERSISTENT-ID gitoid externalRef in {path}' + print(f'bomsh enrichment ok: {len(gitoid_refs)} gitoid refs') + PY + + - name: make clean removes all bomsh + sbom artefacts + # Regression guard: if a future change adds an output to either + # recipe but forgets CLEANFILES, this will catch it. + run: | + make clean + if ls wolfssl-*.spdx.json wolfssl-*.cdx.json \ + omnibor.wolfssl-*.spdx.json 2>/dev/null; then + echo "make clean did not remove SBOM/bomsh artefacts" + exit 1 + fi + test ! -d omnibor || (echo "omnibor/ not cleaned"; exit 1) diff --git a/scripts/include.am b/scripts/include.am index 508fec90cea..4f7d9df2192 100644 --- a/scripts/include.am +++ b/scripts/include.am @@ -160,3 +160,7 @@ EXTRA_DIST += scripts/user_settings_asm.sh # Must be in the dist tarball, otherwise `make dist && cd && # ./configure && make sbom` fails for downstream consumers. EXTRA_DIST += scripts/gen-sbom + +# SBOM generator unit tests. Shipped so downstream consumers building +# from a release tarball can re-run the regression suite. +EXTRA_DIST += scripts/test_gen_sbom.py diff --git a/scripts/test_gen_sbom.py b/scripts/test_gen_sbom.py index c4f408c4df0..8de58a91694 100644 --- a/scripts/test_gen_sbom.py +++ b/scripts/test_gen_sbom.py @@ -336,5 +336,84 @@ def test_dedup_keeps_last_assignment(self): self.assertEqual(pairs['HAVE_X'], '2') +class TestDepMetaShape(unittest.TestCase): + """Lock down the dep-tracking surface so renames/removals don't + silently regress vulnerability-scanner identifiers in the SBOM. 
+ + These guard against: + * an external dep being added without a CVE-resolvable identifier + * a future PR re-introducing the `falcon`/`libxmss`/`liblms` + keys after they were intentionally removed.""" + + def test_only_libz_and_liboqs_are_tracked(self): + self.assertEqual(set(gs.DEP_META.keys()), {'libz', 'liboqs'}) + + def test_liboqs_entry_describes_the_linked_artefact(self): + liboqs = gs.DEP_META['liboqs'] + self.assertEqual(liboqs['name'], 'liboqs') + self.assertEqual(liboqs['supplier'], 'Open Quantum Safe') + self.assertEqual(liboqs['pkgconfig'], 'liboqs') + self.assertEqual( + liboqs['purl']('0.10.0'), + 'pkg:github/open-quantum-safe/liboqs@0.10.0') + + def test_no_stale_dep_keys(self): + # `falcon` is an algorithm, not a linked package; it must not + # appear as a dep entry (algorithm enablement lives in + # build_props parsed from options.h). `libxmss` and `liblms` + # were removed upstream; their re-appearance here would + # silently emit unresolvable identifiers in the SBOM. + for stale in ('falcon', 'libxmss', 'liblms', 'xmss', 'lms'): + self.assertNotIn(stale, gs.DEP_META) + + +class TestEnabledDepsCli(unittest.TestCase): + """End-to-end test of the argparse plumbing for --dep-* flags. + + Runs gen-sbom in a child process so we exercise the real argparse + config rather than a re-imported module.""" + + def _run(self, *argv): + import subprocess + here = pathlib.Path(__file__).resolve().parent + script = here / 'gen-sbom' + return subprocess.run( + ['python3', str(script), *argv], + capture_output=True, text=True + ) + + def test_dep_liboqs_is_accepted(self): + result = self._run('--help') + self.assertEqual(result.returncode, 0, result.stderr) + self.assertIn('--dep-liboqs', result.stdout) + self.assertIn('--dep-libz', result.stdout) + + def test_removed_flags_are_rejected(self): + # Each of these was either renamed (--dep-falcon -> --dep-liboqs) + # or removed entirely (--dep-libxmss/--dep-liblms with upstream + # removal of the libraries). 
argparse should reject them as + # unrecognised, not silently accept them. We pass the full set + # of required args (against /dev/null sentinels) so argparse + # progresses to the unknown-flag check; we never want + # gen-sbom to actually generate anything in this test. + required = [ + '--name', 'wolfssl', + '--version', '0.0.0-test', + '--lib', '/dev/null', + '--license-file', '/dev/null', + '--options-h', '/dev/null', + '--cdx-out', '/dev/null', + '--spdx-out', '/dev/null', + ] + for stale_flag in ('--dep-falcon', '--dep-libxmss', '--dep-liblms', + '--dep-libxmss-root', '--dep-liblms-root', + '--git'): + result = self._run(*required, stale_flag, 'no') + self.assertNotEqual(result.returncode, 0, + f"{stale_flag!r} unexpectedly accepted") + self.assertIn('unrecognized arguments', result.stderr, + f"{stale_flag!r}: {result.stderr!r}") + + if __name__ == '__main__': unittest.main(verbosity=2) From 011a65ac582aa5354f99e2ab475f4aa15233d447 Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Tue, 5 May 2026 12:23:12 +0300 Subject: [PATCH 13/16] feat(sbom): standalone gen-sbom for embedded / RTOS builds Adds --user-settings (pcpp), --srcs (Merkle/OmniBOR), --dep-version, noise filter with NO_/USE_ carve-out, and pcpp #error fail-fast so embedded customers can produce a CRA SBOM without autoconf or .a. 
Signed-off-by: Sameeh Jubran --- .github/workflows/sbom.yml | 302 +++++++++++++++++ INSTALL | 44 ++- README.md | 12 +- doc/CRA.md | 40 ++- doc/SBOM.md | 413 +++++++++++++++++++++-- scripts/gen-sbom | 460 +++++++++++++++++++++++-- scripts/test_gen_sbom.py | 665 +++++++++++++++++++++++++++++++++++++ 7 files changed, 1877 insertions(+), 59 deletions(-) diff --git a/.github/workflows/sbom.yml b/.github/workflows/sbom.yml index 84aed2a2179..a862cdc9faa 100644 --- a/.github/workflows/sbom.yml +++ b/.github/workflows/sbom.yml @@ -30,6 +30,308 @@ jobs: - name: Unit tests run: python3 -W error::ResourceWarning -m unittest scripts/test_gen_sbom.py -v + # Tier 2 (standalone) - the embedded entry point: gen-sbom invoked + # directly without autotools, against a real wolfSSL user_settings.h + # plus a representative source set. Mirrors how a Keil / IAR / + # STM32CubeIDE / ESP-IDF / Zephyr customer would call it from a + # post-build step. Without this job the standalone path would only + # be exercised by hand and a regression in --user-settings, + # --srcs, or --dep-version handling would silently land. + standalone: + name: SBOM standalone (no autotools) + if: github.repository_owner == 'wolfssl' + runs-on: ubuntu-24.04 + needs: unit + timeout-minutes: 10 + steps: + - uses: actions/checkout@v4 + + - name: Install standalone-path deps + # pcpp is the in-Python C preprocessor that lets gen-sbom walk + # settings.h + user_settings.h with no compiler invocation. + # spdx-tools is for the post-generation validation step. + run: | + python3 -m pip install --user --upgrade pip + python3 -m pip install --user pcpp 'spdx-tools==0.8.*' + echo "$HOME/.local/bin" >> "$GITHUB_PATH" + + - name: Generate SBOM via standalone Python entry point + # Uses IDE/GCC-ARM/Header/user_settings.h as the fixture - it + # is a real, comprehensive embedded user_settings.h shipping in + # the tree, so any CI failure here represents a regression a + # real customer would hit. 
Source set is a small but + # representative slice of wolfcrypt that does not depend on + # any pre-build code generation. + # + # The two `NO_*_H` predefines exercise the noise-filter + # `_CONFIG_H_TOKENS` carve-out end-to-end: a NETOS / Telit / + # similar RTOS profile sets these in user_settings.h to disable + # stdlib header inclusion, and an over-aggressive header-guard + # filter would silently drop them from the SBOM (see the + # corresponding row in `required` below). + run: | + mkdir -p /tmp/standalone + SOURCE_DATE_EPOCH=1700000000 \ + python3 scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file LICENSING \ + --user-settings wolfssl/wolfcrypt/settings.h \ + --user-settings-include . \ + --user-settings-include IDE/GCC-ARM/Header \ + --user-settings-define WOLFSSL_USER_SETTINGS \ + --user-settings-define NO_STDINT_H \ + --user-settings-define WOLFSSL_NO_ASSERT_H \ + --srcs wolfcrypt/src/aes.c \ + wolfcrypt/src/sha.c \ + wolfcrypt/src/sha256.c \ + wolfcrypt/src/dh.c \ + --cdx-out /tmp/standalone/wolfssl.cdx.json \ + --spdx-out /tmp/standalone/wolfssl.spdx.json + + - name: Standalone SPDX validates per pyspdxtools + # Same validator the autotools `make sbom` recipe runs. If + # the embedded path produces an SBOM autotools' validator + # rejects, our portability claim is false. + run: pyspdxtools --infile /tmp/standalone/wolfssl.spdx.json + + - name: Standalone SBOM advertises source-merkle hash semantics + # The auditor-facing contract: the standalone SBOM must say + # "this checksum is over a source set, not a library binary", + # and must list which sources fed the hash. Without these + # properties the SHA-256 in `hashes` is ambiguous to anyone + # reviewing the SBOM. + # + # The build-property assertion is a pinned set rather than a + # `len > N` smoke for two reasons: + # 1. 
A noise-filter regression that drops 80% of the wolfSSL + # config flags but keeps 21 unrelated names would still + # pass a length check, and silently ship an SBOM that + # misrepresents the build to a CRA reviewer. + # 2. The pinned set covers two distinct filter paths: + # - regular config (no `_H` suffix): SINGLE_THREADED, + # WOLFSSL_USER_SETTINGS, USE_FAST_MATH, NO_FILESYSTEM, + # WOLFSSL_SMALL_STACK, NO_DEV_RANDOM + # - `_H`-suffix carve-out via `--user-settings-define`: + # NO_STDINT_H, WOLFSSL_NO_ASSERT_H. These are the + # regression sentinels for the bug fixed in this PR + # (header-guard filter dropping NETOS/Telit-style + # stdlib disablement flags); a regression of + # `_CONFIG_H_TOKENS` in gen-sbom would surface here. + run: | + python3 - <<'PY' + import json + with open('/tmp/standalone/wolfssl.cdx.json') as f: + cdx = json.load(f) + props = {p['name']: p['value'] + for p in cdx['metadata']['component']['properties']} + assert props.get('wolfssl:sbom:hash-kind') == 'source-merkle-omnibor', \ + props + srcs = props['wolfssl:sbom:source-set'].split(',') + assert sorted(srcs) == ['aes.c', 'dh.c', 'sha.c', 'sha256.c'], srcs + + build_prop_names = { + k.split(':', 2)[-1] + for k in props if k.startswith('wolfssl:build:') + } + # Regular wolfSSL config flags (no `_H` suffix). Each is set + # by the GCC-ARM/Header user_settings.h fixture or by the + # --user-settings-define WOLFSSL_USER_SETTINGS predefine. + required_regular = { + 'WOLFSSL_USER_SETTINGS', # the gate the customer set in CFLAGS + 'SINGLE_THREADED', # IDE/GCC-ARM/Header/user_settings.h + 'USE_FAST_MATH', # IDE/GCC-ARM/Header/user_settings.h + 'NO_FILESYSTEM', # IDE/GCC-ARM/Header/user_settings.h + 'WOLFSSL_SMALL_STACK', # IDE/GCC-ARM/Header/user_settings.h + 'NO_DEV_RANDOM', # IDE/GCC-ARM/Header/user_settings.h + } + # `_H`-suffix carve-out sentinels - injected via + # --user-settings-define above so a regression of + # `_CONFIG_H_TOKENS` in gen-sbom (i.e. 
the bug that this PR + # fixes) blows up CI rather than only the unit tests. + required_h_carveout = { + 'NO_STDINT_H', # NETOS / Telit stdlib disablement + 'WOLFSSL_NO_ASSERT_H', # gates types.h:2132 + } + required = required_regular | required_h_carveout + missing = required - build_prop_names + assert not missing, ( + f'pinned wolfSSL config flags missing from SBOM ' + f'(pcpp + noise filter regression?): {missing}\n' + f' - regular missing : {required_regular - build_prop_names}\n' + f' - _H carve-out missing: {required_h_carveout - build_prop_names}\n' + f'present subset: {build_prop_names & required}' + ) + # No host-leak / Apple TargetConditionals / __* internals. + import re + forbidden = [n for n in build_prop_names + if re.match(r'(?:__|_[A-Z]|TARGET_OS_|TARGET_IPHONE_)', + n)] + assert not forbidden, ( + f'host-leak macros present in SBOM (noise filter ' + f'regression?): {forbidden[:10]}' + ) + print(f'standalone SBOM ok: {len(build_prop_names)} build props ' + f'(all {len(required_regular)} regular + ' + f'{len(required_h_carveout)} carve-out flags present, ' + f'no host-leak names), {len(srcs)} source files') + PY + + - name: Reproducibility - two standalone runs are byte-identical + # The deterministic UUID + SOURCE_DATE_EPOCH machinery applies + # to both entry points; this guards against a future change + # accidentally introducing wallclock or random data into the + # standalone path. Predefines must match the generate step + # exactly; any drift here would diff against the original run. + run: | + mkdir -p /tmp/standalone-r2 + SOURCE_DATE_EPOCH=1700000000 \ + python3 scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file LICENSING \ + --user-settings wolfssl/wolfcrypt/settings.h \ + --user-settings-include . 
\ + --user-settings-include IDE/GCC-ARM/Header \ + --user-settings-define WOLFSSL_USER_SETTINGS \ + --user-settings-define NO_STDINT_H \ + --user-settings-define WOLFSSL_NO_ASSERT_H \ + --srcs wolfcrypt/src/aes.c \ + wolfcrypt/src/sha.c \ + wolfcrypt/src/sha256.c \ + wolfcrypt/src/dh.c \ + --cdx-out /tmp/standalone-r2/wolfssl.cdx.json \ + --spdx-out /tmp/standalone-r2/wolfssl.spdx.json + diff /tmp/standalone/wolfssl.cdx.json \ + /tmp/standalone-r2/wolfssl.cdx.json + diff /tmp/standalone/wolfssl.spdx.json \ + /tmp/standalone-r2/wolfssl.spdx.json + + - name: --dep-version override (no pkg-config needed) + # The whole point of --dep-version is to let cross-compile / + # baremetal hosts emit a dep version when pkg-config is + # unavailable for the target. Asserts the value lands in the + # SBOM dep entry instead of NOASSERTION. + run: | + mkdir -p /tmp/standalone-deps + SOURCE_DATE_EPOCH=1700000000 \ + python3 scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file LICENSING \ + --user-settings wolfssl/wolfcrypt/settings.h \ + --user-settings-include . 
\ + --user-settings-include IDE/GCC-ARM/Header \ + --user-settings-define WOLFSSL_USER_SETTINGS \ + --srcs wolfcrypt/src/aes.c wolfcrypt/src/sha.c \ + --dep-libz yes \ + --dep-version libz=1.3.1 \ + --cdx-out /tmp/standalone-deps/wolfssl.cdx.json \ + --spdx-out /tmp/standalone-deps/wolfssl.spdx.json + python3 - <<'PY' + import json + with open('/tmp/standalone-deps/wolfssl.spdx.json') as f: + d = json.load(f) + deps = {p['name']: p for p in d['packages'] if p['name'] != 'wolfssl'} + assert 'zlib' in deps, list(deps) + assert deps['zlib']['versionInfo'] == '1.3.1', deps['zlib'] + print('--dep-version override ok: zlib@1.3.1') + PY + + - name: --options-h escape hatch ($CC -dM -E, no pcpp) + # The doc/SBOM.md § 1.5 escape hatch for toolchains that cannot + # install pcpp (older Keil / IAR sites with restricted pip + # access): pre-process settings.h with the system compiler's + # `-dM -E` macro-dump mode and feed the resulting flat #define + # list to gen-sbom via --options-h. This step proves the path + # actually works end-to-end and that the noise filter scrubs + # the host-leak macros (__VERSION__, __SSE2__, TARGET_OS_*, + # ...) that `gcc -dM -E` always emits alongside the wolfSSL + # config. + run: | + mkdir -p /tmp/standalone-dme + # Same effective build the pcpp step covered above; the only + # difference is the macro-extraction mechanism. The two + # `-D NO_*_H` predefines mirror the pcpp step and pin the + # `_CONFIG_H_TOKENS` carve-out on the no-pcpp path too. + gcc -dM -E \ + -I . -I IDE/GCC-ARM/Header \ + -DWOLFSSL_USER_SETTINGS \ + -DNO_STDINT_H \ + -DWOLFSSL_NO_ASSERT_H \ + -include wolfssl/wolfcrypt/settings.h \ + -x c /dev/null > /tmp/standalone-dme/options.h + + # Defensive: the value of this whole step is that the noise + # filter scrubs `gcc -dM -E`'s host-leak macros. If a future + # GCC / runner image happened to emit no `__*` defines, the + # `forbidden` assertion below would pass vacuously even with + # the noise filter disabled. 
Confirm the raw dump actually + # contains plenty of host-leak names, otherwise this step + # is not actually testing what it claims to test. + raw_underscores=$(grep -cE '^#define[[:space:]]+(__|_[A-Z])' \ + /tmp/standalone-dme/options.h || true) + echo "raw -dM -E dump has $raw_underscores compiler-reserved defines" + test "$raw_underscores" -ge 50 || { + echo "ERROR: --options-h CI step is not exercising the noise" + echo " filter (raw dump has only $raw_underscores" + echo " compiler-reserved defines; expected >= 50)." + exit 1 + } + + SOURCE_DATE_EPOCH=1700000000 \ + python3 scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file LICENSING \ + --options-h /tmp/standalone-dme/options.h \ + --srcs wolfcrypt/src/aes.c \ + wolfcrypt/src/sha.c \ + wolfcrypt/src/sha256.c \ + wolfcrypt/src/dh.c \ + --cdx-out /tmp/standalone-dme/wolfssl.cdx.json \ + --spdx-out /tmp/standalone-dme/wolfssl.spdx.json + + # Validate + assert the same wolfSSL config flags reach the + # SBOM via the no-pcpp path that the pcpp path produced + # above. If the noise filter regresses, this step is what + # surfaces it (the raw `gcc -dM -E` dump contains hundreds + # of host-leak macros and only a handful of wolfSSL ones). 
+ pyspdxtools --infile /tmp/standalone-dme/wolfssl.spdx.json + python3 - <<'PY' + import json, re + with open('/tmp/standalone-dme/wolfssl.cdx.json') as f: + cdx = json.load(f) + props = {p['name']: p['value'] + for p in cdx['metadata']['component']['properties']} + build_prop_names = { + k.split(':', 2)[-1] + for k in props if k.startswith('wolfssl:build:') + } + required_regular = { + 'WOLFSSL_USER_SETTINGS', 'SINGLE_THREADED', 'USE_FAST_MATH', + 'NO_FILESYSTEM', 'WOLFSSL_SMALL_STACK', 'NO_DEV_RANDOM', + } + required_h_carveout = { + 'NO_STDINT_H', 'WOLFSSL_NO_ASSERT_H', + } + required = required_regular | required_h_carveout + missing = required - build_prop_names + assert not missing, ( + f'--options-h path lost wolfSSL config flags: {missing}\n' + f' - regular missing : {required_regular - build_prop_names}\n' + f' - _H carve-out missing: {required_h_carveout - build_prop_names}\n' + f'present subset: {build_prop_names & required}' + ) + forbidden = [n for n in build_prop_names + if re.match(r'(?:__|_[A-Z]|TARGET_OS_|TARGET_IPHONE_)', + n)] + assert not forbidden, ( + f'host-leak macros from `gcc -dM -E` dump survived the ' + f'noise filter: {forbidden[:10]}' + ) + print(f'--options-h path ok: {len(build_prop_names)} build ' + f'props (all {len(required_regular)} regular + ' + f'{len(required_h_carveout)} carve-out flags present, ' + f'host-leak macros filtered)') + PY + # Tier 2 - integration: build wolfSSL, generate the SBOMs, and assert # everything an external auditor or vulnerability scanner relies on. integration: diff --git a/INSTALL b/INSTALL index 17e37f56db7..c82ffa13e3b 100644 --- a/INSTALL +++ b/INSTALL @@ -317,7 +317,45 @@ We also have vcpkg ports for wolftpm, wolfmqtt and curl. 19. Generating an SBOM (Software Bill of Materials) wolfSSL can generate a Software Bill of Materials for EU Cyber Resilience - Act (CRA) compliance after a normal build and install. + Act (CRA) compliance. 
Two entry points are supported, depending on how + you build wolfSSL. + + --- 19a. Embedded / RTOS / IDE-based builds (no autotools) ---------- + + For customers building wolfSSL from a hand-edited user_settings.h with + their own Makefile, Keil MDK, IAR EWARM, STM32CubeIDE, ESP-IDF, + Zephyr, or plain CMake, invoke scripts/gen-sbom directly. No + ./configure, no autotools. + + Prerequisites: + - python3 + - pcpp (pip install pcpp) # required for --user-settings + - spdx-tools (pip install spdx-tools) # optional; for SPDX validation + + Usage: + + $ python3 wolfssl/scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file wolfssl/LICENSING \ + --user-settings wolfssl/wolfssl/wolfcrypt/settings.h \ + --user-settings-include wolfssl \ + --user-settings-include path/to/your/user_settings_dir \ + --user-settings-define WOLFSSL_USER_SETTINGS \ + --srcs wolfssl/wolfcrypt/src/aes.c [...your wolfssl source list] \ + --cdx-out wolfssl-5.9.1.cdx.json \ + --spdx-out wolfssl-5.9.1.spdx.json + + The component checksum is a deterministic OmniBOR-compatible Merkle + hash over the source files you compile into your firmware, so you do + not need to synthesize a separate libwolfssl.a just for SBOM purposes. + + See doc/SBOM.md section 1 for per-toolchain recipes (Keil, IAR, + STM32CubeIDE, ESP-IDF, Zephyr, CMake) and the full flag reference. + + --- 19b. Linux / autotools builds ---------------------------------- + + For Debian, RPM, Yocto, FIPS-Ready, and other builds that already use + ./configure && make: Prerequisites: - python3 (detected automatically by configure) @@ -338,6 +376,10 @@ We also have vcpkg ports for wolftpm, wolfmqtt and curl. The SPDX JSON is validated by pyspdxtools before the tag-value file is written; make sbom fails if validation fails. + `make sbom` is a thin convenience wrapper around the same + scripts/gen-sbom Python entry point that section 19a uses, with all + paths resolved automatically from the autotools build tree. 
+ To install the SBOM files to $(datadir)/doc/wolfssl/: $ make install-sbom diff --git a/README.md b/README.md index a51a2d0b8e8..6135cf99640 100644 --- a/README.md +++ b/README.md @@ -37,7 +37,17 @@ of the wolfSSL manual. ## SBOM / CRA Compliance wolfSSL provides a Software Bill of Materials (SBOM) for EU Cyber Resilience -Act (CRA) compliance via `make sbom`. See `doc/SBOM.md` for details. +Act (CRA) compliance via two entry points: + +- `python3 scripts/gen-sbom …` for embedded / RTOS / IDE-based builds + (Keil, IAR, STM32CubeIDE, ESP-IDF, Zephyr, plain CMake, custom Makefile) + configured through a hand-edited `user_settings.h`. No autotools required. +- `make sbom` for Linux server / Debian / RPM / Yocto / FIPS-Ready + builds that already use `./configure && make`. + +Both produce SPDX 2.3 + CycloneDX 1.6 JSON validated against NTIA +minimum elements. See `doc/SBOM.md` for per-toolchain recipes and the +full flag reference. ## OmniBOR / Bomsh diff --git a/doc/CRA.md b/doc/CRA.md index d477d3a1f09..d3cefcf3b8d 100644 --- a/doc/CRA.md +++ b/doc/CRA.md @@ -25,6 +25,40 @@ provides a deeper audit trail if your compliance posture requires it. ## Quick Start +wolfSSL exposes two SBOM entry points, depending on how you build the +library. + +### Embedded / RTOS / IDE-based builds (no `./configure`) + +If your product builds wolfSSL with a hand-edited `user_settings.h` and a +custom Makefile, Keil / IAR / STM32CubeIDE project, ESP-IDF / Zephyr +component, or plain CMake, invoke `scripts/gen-sbom` directly. 
No +autotools required: + +```sh +python3 wolfssl/scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file wolfssl/LICENSING \ + --user-settings wolfssl/wolfssl/wolfcrypt/settings.h \ + --user-settings-include wolfssl \ + --user-settings-include path/to/your/user_settings_dir \ + --user-settings-define WOLFSSL_USER_SETTINGS \ + --srcs wolfssl/wolfcrypt/src/aes.c [and the rest of your wolfssl source list] \ + --cdx-out wolfssl-5.9.1.cdx.json \ + --spdx-out wolfssl-5.9.1.spdx.json +``` + +Requires Python 3 + `pip install pcpp` (used to walk +`user_settings.h` the same way the C compiler does). + +See `doc/SBOM.md` § 1 for per-toolchain recipes (Keil, IAR, +STM32CubeIDE, ESP-IDF, Zephyr, plain CMake) and the full flag reference. + +### Linux / autotools builds + +For Debian / RPM / Yocto / FIPS-Ready / cloud builds that already use +`./configure && make`: + ```sh ./configure make @@ -32,7 +66,11 @@ make sbom # produces wolfssl-.spdx.json, .cdx.json, .spdx make bomsh # optional: produces omnibor/ + OmniBOR-enriched SPDX ``` -See `doc/SBOM.md` for prerequisites and full details on both targets. +`make sbom` is a convenience wrapper around the same `scripts/gen-sbom` +script the embedded path uses. + +See `doc/SBOM.md` for prerequisites and full details on both entry +points. 
## What wolfSSL Provides diff --git a/doc/SBOM.md b/doc/SBOM.md index 90c343e2f53..9666792d966 100644 --- a/doc/SBOM.md +++ b/doc/SBOM.md @@ -5,7 +5,7 @@ transparency: | Artefact | Target | Answers | |---|---|---| -| SBOM (SPDX 2.3 + CycloneDX 1.6) | `make sbom` | *What* wolfSSL is: component identity, license, checksums, CPE, PURL | +| SBOM (SPDX 2.3 + CycloneDX 1.6) | `scripts/gen-sbom` / `make sbom` | *What* wolfSSL is: component identity, license, checksums, CPE, PURL | | OmniBOR artifact graph | `make bomsh` | *How* wolfSSL was built: cryptographic source-to-binary traceability | Together they provide full coverage for the EU Cyber Resilience Act (CRA) @@ -13,9 +13,355 @@ and similar supply chain transparency requirements. Each target is independently useful; running both produces an enriched SPDX document that bridges the two artefacts with a single `PERSISTENT-ID gitoid` reference. -## Quick Start +The SBOM generator has two entry points so both customer segments are +covered: -### Component identity only +| Entry point | Who it is for | Build system | +|---|---|---| +| `python3 scripts/gen-sbom …` (standalone) | Embedded / RTOS customers building with their own Makefile, Keil, IAR, STM32CubeIDE, ESP-IDF, Zephyr, plain CMake, etc. | Any | +| `make sbom` (autotools wrapper) | Linux server / Debian / RPM / Yocto / FIPS-Ready customers running `./configure && make` | Autotools | + +Both call the same Python core and produce SBOMs that pass the same SPDX +2.3 / CycloneDX 1.6 / NTIA validators. Pick whichever matches your build +flow. + +--- + +## 1. Standalone Python tool (recommended for embedded / IDE builds) + +`scripts/gen-sbom` is pure Python 3 stdlib (plus an optional `pcpp` dep, +see below). Customers who configure wolfSSL via a hand-edited +`user_settings.h` and link wolfSSL source files directly into firmware +invoke it directly, without running `./configure` or producing a +standalone `libwolfssl.a`. 
+ +### 1.1 Quick start + +```sh +python3 scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file LICENSING \ + --user-settings wolfssl/wolfcrypt/settings.h \ + --user-settings-include . \ + --user-settings-include path/to/your/user_settings_dir \ + --user-settings-define WOLFSSL_USER_SETTINGS \ + --srcs wolfcrypt/src/aes.c wolfcrypt/src/sha.c \ + wolfcrypt/src/sha256.c wolfcrypt/src/dh.c \ + wolfcrypt/src/random.c \ + --cdx-out wolfssl-5.9.1.cdx.json \ + --spdx-out wolfssl-5.9.1.spdx.json +``` + +That command produces the same two SBOM JSON files (CycloneDX 1.6 and SPDX +2.3) that `make sbom` produces, with no autotools involvement. + +### 1.2 What you provide + +| Flag | What | Where it comes from | +|---|---|---| +| `--name wolfssl` | Component name | Hard-coded; always `wolfssl` | +| `--version 5.9.1` | Component version | Whatever wolfSSL release you pulled | +| `--license-file LICENSING` | wolfSSL's `LICENSING` file | Already in your wolfSSL source tree | +| `--user-settings wolfssl/wolfcrypt/settings.h` | wolfSSL's master settings header | Already in your wolfSSL source tree | +| `--user-settings-include DIR` (repeatable) | Include path containing your `user_settings.h` and the wolfSSL tree | Same as the `-I` flags in your build | +| `--user-settings-define NAME[=VALUE]` (repeatable) | Macros to predefine for preprocessing | Same as the `-D` flags in your build (at minimum: `-D WOLFSSL_USER_SETTINGS`) | +| `--srcs PATH …` | wolfSSL source files compiled into your firmware | The same source list you pass to your compiler | +| `--cdx-out / --spdx-out` | Output paths for the SBOM JSON files | Anywhere you want | + +Optional flags: + +| Flag | When to use it | +|---|---| +| `--supplier "Acme Inc."` | Override the default `wolfSSL Inc.` (rare) | +| `--dep-libz yes` | If your build links `libz` | +| `--dep-liboqs yes` | If your build links `liboqs` | +| `--dep-version libz=1.3.1` | Explicit dep version when `pkg-config` is unavailable (typical 
cross-compile) | +| `--license-override LicenseRef-wolfSSL-Commercial` | If you are a commercial licensee, not GPL | +| `--license-text /path/to/commercial-license.txt` | Required when `--license-override` is a `LicenseRef-*` | + +### 1.3 Dependencies + +- **Python 3**. Required. Stdlib only when using `--options-h`. +- **`pcpp`** (`pip install pcpp`). Required only when using + `--user-settings`. pcpp is a pure-Python C preprocessor that walks + `settings.h` and your `user_settings.h` the same way the C compiler + does, so the SBOM build properties reflect the actual compiled + configuration rather than just the literal text of `user_settings.h`. + If pcpp is not available you can pre-process externally with the + compiler and pass the result via `--options-h` (see § 1.5). +- **`pyspdxtools`** (`pip install spdx-tools`). Optional. Needed only + if you want to validate the produced SPDX or convert it to tag-value + form. + +### 1.4 What the SBOM checksum represents + +In a `make sbom` build the `hashes` / `checksums` field in the SBOM is +the SHA-256 of `libwolfssl.so` / `libwolfssl.a` / `libwolfssl.dylib`. + +In an embedded build there is typically no separate library archive: +wolfSSL `.c` files are compiled directly into your firmware. Asking the +customer to synthesize a `.a` purely for SBOM purposes would be artificial +and would make the build harder. Instead, the standalone path computes +an OmniBOR-compatible Merkle hash over the wolfSSL source files you list +in `--srcs`: + +1. For each source file, compute its OmniBOR `gitoid` (SHA-256 over + `"blob <size>\0" + filecontents`, byte-identical to + `git hash-object --object-format=sha256`). +2. Sort by basename. +3. Hash the concatenated `(basename, gitoid)` pairs. 
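The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the shipped implementation: it follows the record layout used by `gitoid_blob_sha256` / `srcs_merkle_hash` in `scripts/gen-sbom` from this patch (`basename`, NUL, gitoid, newline), but the helper names `gitoid_sha256` and `source_set_hash` and the in-memory `files` mapping are assumptions made so the example is self-contained.

```python
import hashlib
import os


def gitoid_sha256(data):
    """Git-style blob id: SHA-256 over b"blob <size>\\0" followed by the contents."""
    header = f"blob {len(data)}\x00".encode()
    return hashlib.sha256(header + data).hexdigest()


def source_set_hash(files):
    """Hash the sorted (basename, gitoid) pairs.

    `files` is a hypothetical mapping of path -> file contents, used here
    instead of reading from disk so the sketch runs anywhere.
    """
    entries = sorted(
        (os.path.basename(path), gitoid_sha256(data))
        for path, data in files.items()
    )
    h = hashlib.sha256()
    for name, oid in entries:
        # One record per source file: basename, NUL, gitoid, newline.
        h.update(f"{name}\x00{oid}\n".encode())
    return h.hexdigest()


# Same basenames and contents, different directories and listing order.
a = source_set_hash({"/src/wolfcrypt/src/aes.c": b"aes", "/src/wolfcrypt/src/sha.c": b"sha"})
b = source_set_hash({"/fw/third_party/sha.c": b"sha", "/fw/third_party/aes.c": b"aes"})
print(a == b)
```

Because only basenames and content ids enter the final digest, moving the tree or reordering `--srcs` leaves the result unchanged, while changing a single byte of any listed file changes it.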
+ +The resulting hash: + +- represents *"the wolfSSL source code that is in this firmware"*, which + is what an auditor actually wants to see; +- is independent of the order you pass `--srcs`, the absolute paths on + the build host, or the build host's filesystem; +- changes if any compiled-in source byte changes (catches tampering and + back-ported patches); +- interoperates with bomsh / OmniBOR tooling, which key off the same + gitoid format. + +Each standalone SBOM is annotated with two extra properties so the +checksum's semantics are unambiguous to downstream consumers: + +```json +{ "name": "wolfssl:sbom:hash-kind", "value": "source-merkle-omnibor" }, +{ "name": "wolfssl:sbom:source-set", "value": "aes.c,dh.c,sha.c,sha256.c,..." } +``` + +The autotools `make sbom` path keeps `wolfssl:sbom:hash-kind` implicit +(equal to `library-binary`) so its output stays byte-identical to +previous releases. + +### 1.5 Pre-processed defines (no pcpp needed) + +If `pcpp` is unavailable on your build host or you prefer to use the +compiler that actually builds wolfSSL, dump its post-preprocessor `#define` +table and pass that to `--options-h`: + +```sh +$CC $CFLAGS -dM -E \ + -DWOLFSSL_USER_SETTINGS \ + -include wolfssl/wolfcrypt/settings.h \ + -x c /dev/null > build/wolfssl-defines.h + +python3 scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file LICENSING \ + --options-h build/wolfssl-defines.h \ + --srcs wolfcrypt/src/aes.c wolfcrypt/src/sha.c ... \ + --cdx-out wolfssl-5.9.1.cdx.json \ + --spdx-out wolfssl-5.9.1.spdx.json +``` + +`--options-h` reads any flat C header containing `#define NAME VALUE` +lines, so the GCC / Clang / `armclang` `-dM -E` output drops in directly. +For IAR (`--predef_macros`) or legacy Keil `armcc` (`--list_macros`) the +output may need a one-line `sed` to match the `#define NAME VALUE` +shape. 
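As a rough illustration of the `#define NAME VALUE` shape that `--options-h` expects, the sketch below parses such a header with the same regular expression and comment-stripping rule that `parse_options_h` uses in this patch. The function name `parse_defines` and the sample header text are made up for the example, and the noise filtering described later is deliberately left out.

```python
import re

# Matches object-like "#define NAME [VALUE]" lines; function-like macros
# such as "#define F(x) x" do not match and are skipped.
DEFINE_RE = re.compile(r'^#define[ \t]+(\w+)(?:[ \t]+(.*))?$', re.MULTILINE)


def parse_defines(text):
    """Return a sorted, deduplicated list of (name, value) pairs."""
    defines = {}
    for m in DEFINE_RE.finditer(text):
        # Drop a trailing /* ... */ or // comment from the value.
        value = re.split(r'/\*|//', m.group(2) or '', maxsplit=1)[0]
        defines[m.group(1)] = value.strip()
    return sorted(defines.items())


sample = (
    "#define HAVE_AESGCM\n"
    "#define WOLFSSL_MAX_STRENGTH 1 /* note */\n"
    "#define NO_DES3 // off\n"
)
print(parse_defines(sample))
# [('HAVE_AESGCM', ''), ('NO_DES3', ''), ('WOLFSSL_MAX_STRENGTH', '1')]
```

If a toolchain's macro listing does not already look like this, reshaping it into one `#define NAME VALUE` per line before passing it to `--options-h` should be enough.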
+ +**Noise filtering.** A raw `$CC -dM -E` dump contains hundreds of +compiler / preprocessor reserved identifiers (`__VERSION__`, `__SSE2__`, +`__INT_FAST32_MAX__`, `_LP64`, …) and on macOS also Apple's entire +`<TargetConditionals.h>` family (`TARGET_OS_MAC=1`, `TARGET_IPHONE_*`, +…) which leak in via system header inclusion. These describe the +*build host*, not wolfSSL, and would otherwise drown out the wolfSSL +configuration in the SBOM and break reproducibility across hosts. +`gen-sbom` filters them automatically; the SBOM ends up with only the +`HAVE_*` / `WOLFSSL_*` / `NO_*` / etc. macros that actually describe +the wolfSSL build. See `_is_noise_macro` in `scripts/gen-sbom` for the +exact policy and the test cases in `scripts/test_gen_sbom.py` +(`TestIsNoiseMacro`) for the pinned coverage. + +Header-suffix carve-out: the filter also drops include-guard names +(`*_H` like `WOLF_CRYPT_SETTINGS_H`, `WOLFSSL_OPTIONS_H`) but +preserves real wolfSSL configuration that happens to end in `_H`. +The carve-out tokens are `HAVE_`, `NO_`, and `USE_`, which between +them cover every `_H`-suffixed configuration flag in the wolfSSL +tree: + +* autoconf header probes: `HAVE_STDINT_H`, `WOLFSSL_HAVE_ATOMIC_H`, + `WOLFSSL_HAVE_ASSERT_H`, … +* stdlib disablement (NETOS / Telit / similar RTOS profiles): + `NO_STDINT_H`, `NO_STDLIB_H`, `NO_LIMITS_H`, `NO_CTYPE_H`, + `NO_STRING_H`, `NO_STDDEF_H`, `WOLFSSL_NO_ASSERT_H` +* build-mode toggles: `USE_FLAT_TEST_H`, `USE_FLAT_BENCHMARK_H` + +Customers using a `_H`-suffixed feature flag that does not carry +one of these tokens (e.g. a debug-only opt-in) should rename it +to drop the `_H` suffix, or open an issue to extend the carve-out. + +The `--user-settings` (pcpp) path applies the same filter, so both +entry points produce semantically equivalent build-property sets for +the same effective configuration. 
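To make the filter policy concrete, here is a small sketch that reproduces the rules described above: the reserved-identifier and Apple patterns, plus the `_H` carve-out. It is modelled on `_is_noise_macro` from this patch, but the names `NOISE_RE`, `CONFIG_H_TOKENS`, and `is_noise_macro` and the sample macro list are local to the example, and `MY_ARDUINO_H` is a hypothetical guard name.

```python
import re

# Build-host noise: compiler-reserved names and Apple target macros.
NOISE_RE = re.compile(
    r'^(?:__\w+|_[A-Z][A-Z0-9_]*|TARGET_OS_\w+|TARGET_IPHONE_\w+)$'
)
# A *_H name containing any of these tokens is treated as configuration.
CONFIG_H_TOKENS = ('HAVE_', 'NO_', 'USE_')


def is_noise_macro(name):
    """True if `name` should be dropped from the SBOM build properties."""
    if NOISE_RE.match(name):
        return True
    # Include guards end in _H and carry none of the carve-out tokens.
    return name.endswith('_H') and not any(t in name for t in CONFIG_H_TOKENS)


names = ['__VERSION__', '_LP64', 'TARGET_OS_MAC', 'WOLFSSL_OPTIONS_H',
         'HAVE_STDINT_H', 'NO_STDLIB_H', 'USE_FLAT_TEST_H', 'HAVE_AESGCM']
kept = [n for n in names if not is_noise_macro(n)]
print(kept)
# ['HAVE_STDINT_H', 'NO_STDLIB_H', 'USE_FLAT_TEST_H', 'HAVE_AESGCM']
```

One consequence of matching the tokens as substrings is worth knowing when reviewing output: a guard-like name such as the hypothetical `MY_ARDUINO_H` contains `NO_` and would therefore be kept rather than filtered.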
+ +**Hard-fail on preprocessing errors.** When the `--user-settings` +path encounters an `#error` directive, an unbalanced `#if`, or a +missing `#include` while walking `settings.h`, `gen-sbom` exits +non-zero rather than emitting a partial SBOM. pcpp would otherwise +print a diagnostic and continue, producing an artefact that silently +omits whatever configuration came after the failure — exactly the +kind of silent drift a CRA reviewer cannot detect. Fix the upstream +issue (or supply the missing `--user-settings-include` / +`--user-settings-define` arguments) and rerun. + +### 1.6 Per-IDE / per-toolchain recipes + +#### 1.6.1 Custom Makefile (most embedded projects) + +Drop these rules into your project Makefile. The `WOLFCRYPT_OBJS` / +`WOLFCRYPT_SRCS` variables almost certainly already exist in your build +since they list the wolfSSL files you compile. + +```makefile +WOLFSSL_DIR ?= ../wolfssl + +build/libwolfssl-sbom.a: $(WOLFCRYPT_OBJS) + $(AR) rcs $@ $^ + +sbom: build/libwolfssl-sbom.a + python3 $(WOLFSSL_DIR)/scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file $(WOLFSSL_DIR)/LICENSING \ + --user-settings $(WOLFSSL_DIR)/wolfssl/wolfcrypt/settings.h \ + --user-settings-include $(WOLFSSL_DIR) \ + --user-settings-include $(USER_SETTINGS_DIR) \ + --user-settings-define WOLFSSL_USER_SETTINGS \ + --srcs $(WOLFCRYPT_SRCS) \ + --cdx-out wolfssl-5.9.1.cdx.json \ + --spdx-out wolfssl-5.9.1.spdx.json +``` + +Note: the `.a` here is optional. If you prefer to skip it and rely on +`--srcs` for the checksum (the recommended embedded mode), drop the +`build/libwolfssl-sbom.a` rule and remove its dependency from `sbom:`. + +#### 1.6.2 ESP-IDF (Espressif) + +ESP-IDF builds with CMake/Ninja and exposes a `CMakeLists.txt` per +component. 
Add a custom target to `components/wolfssl/CMakeLists.txt`: + +```cmake +add_custom_target(wolfssl-sbom + COMMAND python3 ${CMAKE_CURRENT_SOURCE_DIR}/scripts/gen-sbom + --name wolfssl --version 5.9.1 + --license-file ${CMAKE_CURRENT_SOURCE_DIR}/LICENSING + --user-settings ${CMAKE_CURRENT_SOURCE_DIR}/wolfssl/wolfcrypt/settings.h + --user-settings-include ${CMAKE_CURRENT_SOURCE_DIR} + --user-settings-include ${WOLFSSL_USER_SETTINGS_DIR} + --user-settings-define WOLFSSL_USER_SETTINGS + --srcs ${WOLFSSL_SRCS} + --cdx-out ${CMAKE_BINARY_DIR}/wolfssl-5.9.1.cdx.json + --spdx-out ${CMAKE_BINARY_DIR}/wolfssl-5.9.1.spdx.json + VERBATIM) +``` + +Then `idf.py wolfssl-sbom` produces both SBOM files in the build +directory. + +#### 1.6.3 Zephyr + +```cmake +# In your application CMakeLists.txt or a Zephyr module CMake file: +add_custom_target(wolfssl-sbom + COMMAND ${PYTHON_EXECUTABLE} ${ZEPHYR_WOLFSSL_MODULE_DIR}/scripts/gen-sbom + --name wolfssl --version 5.9.1 + --license-file ${ZEPHYR_WOLFSSL_MODULE_DIR}/LICENSING + --user-settings ${ZEPHYR_WOLFSSL_MODULE_DIR}/wolfssl/wolfcrypt/settings.h + --user-settings-include ${ZEPHYR_WOLFSSL_MODULE_DIR} + --user-settings-define WOLFSSL_USER_SETTINGS + --srcs ${WOLFSSL_SOURCES} + --cdx-out ${CMAKE_BINARY_DIR}/wolfssl.cdx.json + --spdx-out ${CMAKE_BINARY_DIR}/wolfssl.spdx.json) +``` + +Run with `west build -t wolfssl-sbom`. + +#### 1.6.4 STM32CubeIDE + +STM32CubeIDE generates Eclipse CDT-managed Makefiles. 
Add the SBOM +recipe as a *post-build step*: **Project → Properties → C/C++ Build → +Settings → Build Steps → Post-build steps**: + +```sh +python3 ${ProjDirPath}/Drivers/wolfssl/scripts/gen-sbom \ + --name wolfssl --version 5.9.1 \ + --license-file ${ProjDirPath}/Drivers/wolfssl/LICENSING \ + --user-settings ${ProjDirPath}/Drivers/wolfssl/wolfssl/wolfcrypt/settings.h \ + --user-settings-include ${ProjDirPath}/Drivers/wolfssl \ + --user-settings-include ${ProjDirPath}/Core/Inc \ + --user-settings-define WOLFSSL_USER_SETTINGS \ + --srcs ${ProjDirPath}/Drivers/wolfssl/wolfcrypt/src/aes.c [...] \ + --cdx-out ${ProjDirPath}/wolfssl.cdx.json \ + --spdx-out ${ProjDirPath}/wolfssl.spdx.json +``` + +#### 1.6.5 Keil μVision (MDK-ARM) + +Use **Options for Target → User → Run #1 (After Build)**: + +``` +python3 .\Drivers\wolfssl\scripts\gen-sbom --name wolfssl --version 5.9.1 ^ + --license-file .\Drivers\wolfssl\LICENSING ^ + --user-settings .\Drivers\wolfssl\wolfssl\wolfcrypt\settings.h ^ + --user-settings-include .\Drivers\wolfssl ^ + --user-settings-define WOLFSSL_USER_SETTINGS ^ + --srcs .\Drivers\wolfssl\wolfcrypt\src\aes.c [...] ^ + --cdx-out .\wolfssl.cdx.json --spdx-out .\wolfssl.spdx.json +``` + +For legacy `armcc` 5.x toolchains where `-dM -E` is not available, use +the modern `armclang` (Keil v6) which is GCC-flag-compatible. + +#### 1.6.6 IAR EWARM + +Use **Project → Options → Build Actions → Post-build command line** (one +line, all on one logical line in EWARM): + +``` +python3 $PROJ_DIR$\..\wolfssl\scripts\gen-sbom --name wolfssl --version 5.9.1 + --license-file $PROJ_DIR$\..\wolfssl\LICENSING + --user-settings $PROJ_DIR$\..\wolfssl\wolfssl\wolfcrypt\settings.h + --user-settings-include $PROJ_DIR$\..\wolfssl + --user-settings-define WOLFSSL_USER_SETTINGS + --srcs $PROJ_DIR$\..\wolfssl\wolfcrypt\src\aes.c [...] 
+ --cdx-out $PROJ_DIR$\wolfssl.cdx.json + --spdx-out $PROJ_DIR$\wolfssl.spdx.json +``` + +#### 1.6.7 Plain CMake (any project) + +```cmake +add_custom_target(wolfssl-sbom + COMMAND ${Python3_EXECUTABLE} ${CMAKE_SOURCE_DIR}/wolfssl/scripts/gen-sbom + --name wolfssl --version 5.9.1 + --license-file ${CMAKE_SOURCE_DIR}/wolfssl/LICENSING + --user-settings ${CMAKE_SOURCE_DIR}/wolfssl/wolfssl/wolfcrypt/settings.h + --user-settings-include ${CMAKE_SOURCE_DIR}/wolfssl + --user-settings-include ${WOLFSSL_USER_SETTINGS_DIR} + --user-settings-define WOLFSSL_USER_SETTINGS + --srcs ${WOLFSSL_C_SOURCES} + --cdx-out ${CMAKE_BINARY_DIR}/wolfssl-${WOLFSSL_VERSION}.cdx.json + --spdx-out ${CMAKE_BINARY_DIR}/wolfssl-${WOLFSSL_VERSION}.spdx.json) +``` + +### 1.7 Reproducibility + +The standalone path honors `SOURCE_DATE_EPOCH` exactly the same way +`make sbom` does. Two runs against the same source tree, settings, and +source set with the same `SOURCE_DATE_EPOCH` produce byte-identical +`.spdx.json` and `.cdx.json` files. This is regression-tested in CI. + +--- + +## 2. Autotools convenience wrapper (`make sbom`) + +For Linux server / Debian / RPM / Yocto / FIPS-Ready customers who +already run `./configure && make`, `make sbom` is a one-line shortcut +that wraps `scripts/gen-sbom` with all paths resolved automatically. + +### 2.1 Quick start ```sh ./configure @@ -25,7 +371,7 @@ make sbom Requires `python3` and `pyspdxtools` (`pip install spdx-tools`). -### Full coverage: component identity + build provenance +### 2.2 Full coverage: component identity + build provenance ```sh ./configure @@ -35,16 +381,12 @@ make bomsh ``` Additionally requires `bomtrace3` and `bomsh_create_bom.py` in `PATH`. -See [Prerequisites for make bomsh](#prerequisites-for-make-bomsh) below. +See [Prerequisites for make bomsh](#3-make-bomsh) below. All tools are detected by `configure`; either target fails with a clear error message if a required tool is missing. 
---- - -## make sbom - -### Output files +### 2.3 Output files `make sbom` produces three files in the build directory: @@ -58,7 +400,7 @@ The `.spdx` tag-value file is produced by `pyspdxtools` converting the `.spdx.json`. If the JSON fails SPDX validation, `make sbom` stops with a non-zero exit and the tag-value file is not written. -### SBOM contents +### 2.4 SBOM contents Both formats contain the same information: @@ -134,7 +476,10 @@ its `.pc` file is missing): - CycloneDX omits the `version` and `purl` fields entirely and the generator prints a warning to stderr. -### Validating the SBOM manually +For embedded / cross-compile builds without `pkg-config`, the standalone +entry point exposes a `--dep-version libz=1.3.1` override (see § 1.2). + +### 2.5 Validating the SBOM manually ```sh # Validate SPDX JSON @@ -145,7 +490,7 @@ pyspdxtools --infile wolfssl-.spdx.json \ --outfile wolfssl-.spdx.rdf ``` -### Installing the SBOM +### 2.6 Installing the SBOM ```sh make install-sbom # installs to $(datadir)/doc/wolfssl/ @@ -154,25 +499,31 @@ make uninstall-sbom # removes the installed files The generated files are removed by `make clean`. -### Implementation notes +### 2.7 Implementation notes -SBOM generation is implemented in `scripts/gen-sbom` (Python 3, stdlib only) -and hooked into the autotools build via `Makefile.am` and `configure.ac`. -The script stages a `make install` into a temporary directory, hashes the -installed library, generates both SBOM formats, then removes the staging -directory. The `pyspdxtools` validation and conversion step runs after -generation and gates the build on SPDX conformance. +SBOM generation is implemented in `scripts/gen-sbom` (Python 3, stdlib +only for the autotools path) and hooked into the autotools build via +`Makefile.am` and `configure.ac`. The script stages a `make install` +into a temporary directory, hashes the installed library, generates both +SBOM formats, then removes the staging directory. 
The `pyspdxtools` +validation and conversion step runs after generation and gates the build +on SPDX conformance. + +The standalone embedded entry point (§ 1) calls the same script with +different flags; the autotools target is essentially a path-resolver +wrapper that finds the installed library, the autotools-generated +`options.h`, and the `pkg-config` versions of any linked deps. --- -## make bomsh +## 3. make bomsh `make bomsh` uses the [Bomsh](https://github.com/omnibor/bomsh) project to trace the wolfSSL build under `bomtrace3` (a patched `strace`) and produce an OmniBOR artifact dependency graph: a content-addressed Merkle DAG mapping every built binary back to the exact set of source files that produced it. -### Prerequisites for make bomsh +### 3.1 Prerequisites for make bomsh | Tool | Required | Where to get it | |---|---|---| @@ -186,6 +537,12 @@ any stock Linux kernel. The only environments where it may be unavailable are containers running with a hardened seccomp profile or systems with `kernel.yama.ptrace_scope=3`. +`make bomsh` is **Linux-host-only by design**. For non-Linux build hosts +(macOS, Windows), use a Linux CI runner / WSL2 / a Linux container. The +target running the produced wolfSSL binary can be anything — bomsh traces +the cross-compiler invocation on Linux regardless of what platform the +binary will eventually run on. + #### Building bomtrace3 ```sh @@ -201,7 +558,7 @@ cp src/strace ~/.local/bin/bomtrace3 Place `bomsh_create_bom.py` (and optionally `bomsh_sbom.py`) from the bomsh `scripts/` directory somewhere in `PATH`. -### What make bomsh does +### 3.2 What make bomsh does 1. 
Writes a build-local `_bomsh.conf` redirecting the raw logfile out of `/tmp/` to the build directory (avoids collisions between concurrent @@ -217,7 +574,7 @@ Place `bomsh_create_bom.py` (and optionally `bomsh_sbom.py`) from the bomsh exists (from `make sbom`), annotates that SPDX document with OmniBOR `ExternalRef` identifiers, producing `omnibor.wolfssl-.spdx.json`. -### Output files +### 3.3 Output files | Path | Description | |---|---| @@ -238,7 +595,7 @@ The `PERSISTENT-ID gitoid` entry added to the enriched SPDX looks like: This sits alongside the existing CPE and PURL `externalRefs` on the wolfSSL package entry and is the key into the OmniBOR Merkle DAG in `omnibor/`. -### Installing +### 3.4 Installing ```sh make install-bomsh # installs omnibor/ and enriched SPDX to $(datadir)/doc/wolfssl/ @@ -247,7 +604,7 @@ make uninstall-bomsh # removes installed files The generated files are removed by `make clean`. -### Implementation notes +### 3.5 Implementation notes `make bomsh` runs a full clean rebuild under `bomtrace3` on every invocation. The ~20% runtime overhead of `bomtrace3` means the rebuild takes roughly @@ -259,7 +616,7 @@ are written to the build directory and removed by `make clean`. The --- -## Combined workflow +## 4. Combined workflow Running both targets produces the complete set of supply chain transparency artefacts. `make bomsh` automatically enriches the SPDX document from @@ -288,7 +645,7 @@ file. --- -## Using wolfSSL's artefacts in a product +## 5. 
Using wolfSSL's artefacts in a product If you are shipping a product that includes wolfSSL and need to satisfy CRA obligations, see `doc/CRA.md` for guidance on integrating these artefacts diff --git a/scripts/gen-sbom b/scripts/gen-sbom index 95380252138..192ccb9fd40 100755 --- a/scripts/gen-sbom +++ b/scripts/gen-sbom @@ -3,6 +3,7 @@ import argparse import hashlib +import io import json import os import re @@ -221,23 +222,135 @@ def pkgconfig_version(pkgname): return None -def dep_version(key): +def dep_version(key, overrides=None): """Resolve the runtime version of a DEP_META entry. - Every live entry exposes a `pkgconfig` package name; if pkg-config - cannot answer (package missing or `.pc` not on PKG_CONFIG_PATH) we - return None and the caller emits NOASSERTION (SPDX) / omits the - version (CycloneDX). A previous source-tree fallback that used - `git describe` against `git_root` was removed once libxmss/liblms - were dropped upstream; if a future PQ dep returns to a source-only - integration, restore the fallback here together with a `git_root` - field on the DEP_META entry.""" + Resolution order: + 1. Explicit override from `overrides[key]` (set via the + --dep-version CLI flag). This is the only path that works + for embedded / cross-compile builds where pkg-config is not + available on the host that runs gen-sbom. + 2. `pkg-config --modversion `. Used by the autotools + path on a typical Linux server where the linked dep was + installed via the system package manager. + 3. None. Caller emits NOASSERTION (SPDX) / omits the version + (CycloneDX). 
+ + A previous source-tree fallback that used `git describe` against + `git_root` was removed once libxmss/liblms were dropped upstream; + if a future PQ dep returns to a source-only integration, restore + the fallback here together with a `git_root` field on the DEP_META + entry.""" + if overrides and key in overrides: + return overrides[key] return pkgconfig_version(DEP_META[key]['pkgconfig']) +# Patterns for #define names that pollute the SBOM with build-environment +# noise rather than wolfSSL configuration. Applied identically to +# parse_options_h (no-pcpp / autotools path) and parse_user_settings +# (pcpp embedded path) so both entry points produce semantically +# equivalent build-property sets for the same effective configuration. +# +# Three families are filtered: +# +# 1. Compiler / preprocessor reserved identifiers (`__*`, `_[A-Z]*`). +# ISO C 7.1.3 reserves these for the implementation; clang, gcc, and +# pcpp emit dozens of them (`__VERSION__`, `__SSE2__`, `_LP64`, ...). +# They describe the build *host*, not wolfSSL, and break SBOM +# reproducibility across hosts (same wolfSSL config built on macOS +# clang vs. arm-none-eabi-gcc otherwise produces different SBOMs). +# +# 2. Apple <TargetConditionals.h> macros (`TARGET_OS_*`, +# `TARGET_IPHONE_*`). The no-pcpp escape hatch +# (`$CC -dM -E -include settings.h`) on macOS transitively pulls in +# macOS system headers and emits this entire family; without the +# filter, a wolfSSL SBOM for an STM32 firmware would falsely +# advertise TARGET_OS_MAC=1 if generated on a Mac. +# +# 3. Header include guards (`*_H` whose token does NOT carry an +# autoconf / wolfSSL configuration prefix). +# wolfssl/options.h itself and many internal wolfSSL headers define +# guards like WOLFSSL_OPTIONS_H, WOLF_CRYPT_SETTINGS_H, and +# WOLFCRYPT_TEST_*_H to prevent double inclusion. Those describe +# *which file was parsed*, not configuration choices. 
+# +# The carve-out tokens (`HAVE_`, `NO_`, `USE_`) are critical: real +# wolfSSL configuration flags also end in `_H` and would otherwise +# be silently filtered out, falsifying the SBOM for the customers +# who rely on them most: +# +# * `HAVE_*_H` / `WOLFSSL_HAVE_*_H` - autoconf AC_CHECK_HEADER +# results (HAVE_STDINT_H, WOLFSSL_HAVE_ATOMIC_H, +# WOLFSSL_HAVE_ASSERT_H, ...). Gates `#if defined(...)` +# branches in wc_port.h / types.h. +# * `NO_*_H` / `WOLFSSL_NO_*_H` - explicit stdlib / feature +# suppression (NO_STDINT_H, NO_STDLIB_H, NO_LIMITS_H, +# NO_CTYPE_H, NO_STRING_H, NO_STDDEF_H, WOLFSSL_NO_ASSERT_H). +# Set by NETOS / Telit / other RTOS profiles in settings.h to +# replace stdlib headers with vendor headers; gates branches +# in types.h:398 / settings.h:3850 / sp.h:42. +# * `USE_*_H` - build-mode toggles (USE_FLAT_TEST_H, +# USE_FLAT_BENCHMARK_H). Gates which test/benchmark layout +# is compiled in test.c:165 / benchmark.c:219 / server.c:70. +# +# Heuristic limitation: a stray feature flag that ends in `_H` +# without one of those tokens (e.g. WOLFSSL_DEBUG_TRACE_ERROR_CODES_H, +# a debug-only opt-in) would still be filtered. Customers who +# depend on such a flag can either move it to a non-`_H`-suffixed +# name in their user_settings.h, or feed gen-sbom the full +# `$CC -dM -E` dump via --options-h together with a hand-edited +# add-back file. None of the embedded customer profiles in the +# tree (NETOS, Telit, Zephyr, ESP-IDF, GCC-ARM, MDK, IAR, NUTTX) +# use such flags, which is why we accept the heuristic. +_NOISE_MACRO_RE = re.compile( + r'^(?:' + r'__\w+' # compiler/preprocessor reserved + r'|_[A-Z][A-Z0-9_]*' # ISO C reserved (e.g. _LP64) + r'|TARGET_OS_\w+' # Apple TargetConditionals leak + r'|TARGET_IPHONE_\w+' # Apple TargetConditionals leak + r')$' +) + +# Tokens that, when present anywhere in a `*_H` macro name, mark it as +# real wolfSSL / autoconf configuration rather than a header include +# guard. Kept tight on purpose - widening (e.g. 
adding `DEBUG_` or +# `WOLFSSL_`) would let through real guards like WOLFSSL_OPTIONS_H. +_CONFIG_H_TOKENS = ('HAVE_', 'NO_', 'USE_') + + +def _is_noise_macro(name): + """True if `name` is a build-environment artefact rather than wolfSSL + configuration, and therefore must not appear as an SBOM + `wolfssl:build:*` property. + + Drops three families (see the module-level comment block on + `_NOISE_MACRO_RE` for full rationale): + 1. Compiler / preprocessor reserved (`__*`, `_[A-Z]*`). + 2. Apple <TargetConditionals.h> (`TARGET_OS_*`, `TARGET_IPHONE_*`). + 3. Header include guards (`*_H` not carrying any of + `_CONFIG_H_TOKENS`). + """ + if _NOISE_MACRO_RE.match(name): + return True + if name.endswith('_H') and not any(t in name for t in _CONFIG_H_TOKENS): + return True + return False + + def parse_options_h(path): - """Parse wolfssl/options.h and return sorted deduplicated list of - (name, value) pairs for every #define found. + """Parse a flat `#define` header and return a sorted deduplicated + list of (name, value) pairs for every wolfSSL-relevant macro. + + Accepts both autotools-generated `wolfssl/options.h` (curated by + ./configure, contains only wolfSSL macros plus its own header guard) + and raw compiler output from `$CC -dM -E -include settings.h ...` + (the no-pcpp escape hatch documented in doc/SBOM.md § 1.5). The + latter case motivates the `_is_noise_macro` filter: a `clang -dM -E` + dump contains hundreds of compiler internals (`__VERSION__`, + `__SSE2__`, `__INT_FAST32_MAX__`) and Apple system header leaks + (`TARGET_OS_MAC`) that would otherwise drown out the wolfSSL + configuration in the SBOM and break reproducibility across hosts. 
Trailing C/C++ comments on a #define line (`#define HAVE_FOO 42 /* x */` or `// y`) are stripped; otherwise they would land verbatim in the @@ -251,17 +364,179 @@ def parse_options_h(path): defines = {} for m in re.finditer(r'^#define[ \t]+(\w+)(?:[ \t]+(.*))?$', text, re.MULTILINE): + name = m.group(1) + if _is_noise_macro(name): + continue raw = (m.group(2) or '') raw = re.split(r'/\*|//', raw, maxsplit=1)[0] - defines[m.group(1)] = raw.strip() + defines[name] = raw.strip() + return sorted(defines.items()) + + +def parse_user_settings(settings_h_path, include_dirs, predefines): + """Walk wolfssl/wolfcrypt/settings.h through pcpp and return the same + sorted [(name, value), ...] list shape that parse_options_h() returns. + + The customer's user_settings.h is included transitively via the + standard `#ifdef WOLFSSL_USER_SETTINGS` gate inside settings.h, so the + caller predefines `WOLFSSL_USER_SETTINGS` and adds the directory of + user_settings.h to `include_dirs`. This mirrors the way the C compiler + actually sees the wolfSSL build, so the SBOM build properties reflect + the real compiled configuration rather than just the literal text of + user_settings.h. + + Filters (see `_is_noise_macro` for the shared family list used by + both this function and parse_options_h): + * compiler/preprocessor reserved names (`__*`, `_[A-Z]*`). pcpp's + own internals (__DATE__/__TIME__/__PCPP__/__FILE__) and any host + compiler defines transitively leaking through pcpp's preprocess + would otherwise break reproducibility across build hosts. + * Apple <TargetConditionals.h> macros (`TARGET_OS_*`, + `TARGET_IPHONE_*`). Defensive: pcpp does not auto-include + system headers, but a customer's user_settings.h may. + * header guards (`*_H` whose token does not carry an autoconf / + wolfSSL config prefix - see _CONFIG_H_TOKENS). 
wolfSSL's own + settings.h / visibility.h emit guards like + WOLF_CRYPT_SETTINGS_H that describe inclusion, not + configuration; real `_H` configuration flags (NO_STDINT_H, + USE_FLAT_TEST_H, WOLFSSL_NO_ASSERT_H) are preserved. + * function-like macros are dropped (they are API surface, not + build configuration; including their post-expansion body would + also break reproducibility under whitespace/token-render drift). + + pcpp is imported lazily so the autotools path (which uses + parse_options_h) does not require the dependency. + """ + try: + from pcpp import Preprocessor + except ImportError: + sys.exit( + "ERROR: --user-settings requires the 'pcpp' Python preprocessor.\n" + " Install: pip install pcpp\n" + " Or pre-process externally and pass the result via " + "--options-h instead\n" + " (e.g. $CC -dM -E -include wolfssl/wolfcrypt/settings.h " + "-DWOLFSSL_USER_SETTINGS - < /dev/null)." + ) + + pp = Preprocessor() + pp.line_directive = None + for d in include_dirs: + pp.add_path(d) + for predefine in predefines: + # Compiler-style `-D KEY=VALUE` is the universal CLI shape; + # translate to the `"KEY VALUE"` form pcpp.define() expects. + # Bare `-D KEY` (no value) maps to `"KEY"`, also accepted. + spec = predefine.replace('=', ' ', 1) if '=' in predefine else predefine + pp.define(spec) + + try: + with open(settings_h_path) as f: + text = f.read() + except OSError as e: + sys.exit(f"ERROR: cannot read settings.h {settings_h_path}: {e}") + + pp.parse(text, source=settings_h_path) + # pcpp.write() is what actually drives the preprocessor through #if / + # #ifdef resolution and populates pp.macros with the surviving + # defines. The output stream is intentionally discarded - we only + # care about pp.macros - but this call is NOT optional. + sink = io.StringIO() + pp.write(sink) + + # pcpp signals fatal preprocessing problems (an `#error` directive + # firing, an unbalanced `#if`, a missing #include, etc.) 
by setting + # pp.return_code to non-zero and printing to stderr; it does NOT + # raise. For an SBOM tool whose contract is "this artefact + # faithfully describes the build", a partial macro table produced + # before the failure is the worst possible output - the SBOM would + # silently omit configuration the customer set. Hard-fail instead + # so the build pipeline notices. + if pp.return_code != 0: + sys.exit( + f"ERROR: pcpp failed to preprocess {settings_h_path} " + f"(return_code={pp.return_code}); the resulting SBOM would " + f"be incomplete. Check the pcpp diagnostics printed above " + f"for the offending #error / #include / #if directive." + ) + + defines = {} + for name, macro in pp.macros.items(): + if _is_noise_macro(name): + continue + if macro.arglist is not None: + continue + tokens = macro.value or [] + defines[name] = ' '.join(t.value for t in tokens).strip() return sorted(defines.items()) -def cdx_dep_component(name, pkg_version, key): +def gitoid_blob_sha256(path): + """Compute the OmniBOR / git SHA-256 gitoid for a single file. + + The format is `sha256("blob " + filesize + "\\0" + filecontents)` + which is byte-identical to `git hash-object --object-format=sha256`. + Using the gitoid (rather than a plain SHA-256) lets the source-set + Merkle hash interoperate with bomsh/OmniBOR tooling: a customer can + cross-reference the wolfSSL SBOM's component hash with the entries + in an OmniBOR artifact dependency graph and confirm the same files + on both sides. + + The well-known empty-blob gitoid sha256 is + 473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749120a303721813 + (regression-tested in scripts/test_gen_sbom.py). 
+ """ + h = hashlib.sha256() + try: + size = os.path.getsize(path) + h.update(f'blob {size}\x00'.encode()) + with open(path, 'rb') as f: + for chunk in iter(lambda: f.read(65536), b''): + h.update(chunk) + except OSError as e: + sys.exit(f"ERROR: cannot read source for hashing: {e}") + return h.hexdigest() + + +def srcs_merkle_hash(src_paths): + """Deterministic SHA-256 over a sorted list of (basename, gitoid) + pairs for the given source files. + + Two customers compiling the same wolfSSL release with the same set + of source files get identical hashes regardless of where their + wolfSSL tree lives on disk, the order they passed --srcs, or the + filesystem they built on. Sorting on basename only (not full path) + is what makes this true; collisions across basenames would matter + in theory but wolfSSL's source layout has unique basenames per file + by construction. + + A one-byte change in any compiled-in source produces a different + hash, which is the property that makes this useful as the SBOM + component checksum for embedded builds with no separate library + archive.""" + seen = set() + entries = [] + for path in src_paths: + name = os.path.basename(path) + if name in seen: + sys.exit( + f"ERROR: duplicate basename in --srcs: {name!r}\n" + f" Source files must have unique basenames so the " + f"Merkle hash is order-independent.") + seen.add(name) + entries.append((name, gitoid_blob_sha256(path))) + entries.sort() + h = hashlib.sha256() + for name, oid in entries: + h.update(f'{name}\x00{oid}\n'.encode()) + return h.hexdigest() + + +def cdx_dep_component(name, pkg_version, key, dep_version_overrides=None): """Return (bom_ref, component_dict) for a CDX dependency component. 
bom_ref is deterministic for reproducibility.""" meta = DEP_META[key] - version = dep_version(key) + version = dep_version(key, dep_version_overrides) bom_ref = derived_uuid(name, pkg_version, 'dep', key) comp = { 'bom-ref': bom_ref, @@ -280,10 +555,10 @@ def cdx_dep_component(name, pkg_version, key): return bom_ref, comp -def spdx_dep_package(key): +def spdx_dep_package(key, dep_version_overrides=None): """Return (spdx_id, package_dict) for an SPDX dependency package.""" meta = DEP_META[key] - version = dep_version(key) + version = dep_version(key, dep_version_overrides) spdx_id = 'SPDXRef-Package-' + re.sub(r'[^A-Za-z0-9.]', '', meta['name']) pkg = { 'SPDXID': spdx_id, @@ -306,13 +581,15 @@ def spdx_dep_package(key): def generate_cdx(name, version, supplier, license_id, license_text, lib_hash, - timestamp, year, serial, enabled_deps, build_props): + timestamp, year, serial, enabled_deps, build_props, + dep_version_overrides=None, hash_kind='library-binary', + srcs_basenames=None): bom_ref = derived_uuid(name, version, 'package') dep_bom_refs = [] components = [] for key in enabled_deps: - ref, comp = cdx_dep_component(name, version, key) + ref, comp = cdx_dep_component(name, version, key, dep_version_overrides) dep_bom_refs.append(ref) components.append(comp) @@ -320,6 +597,20 @@ def generate_cdx(name, version, supplier, license_id, license_text, lib_hash, {'name': f'wolfssl:build:{k}', 'value': v if v else '1'} for k, v in build_props ] + # Document what the SHA-256 in `hashes` represents, but only for + # the source-merkle entry point. The autotools / library-binary + # path keeps its existing output shape byte-identical so CI's + # reproducibility diff does not regress. Auditors looking at a + # source-merkle SBOM need this annotation to interpret the + # checksum correctly (vs. a library-artefact checksum). 
+ if hash_kind != 'library-binary': + properties.append( + {'name': 'wolfssl:sbom:hash-kind', 'value': hash_kind}) + if srcs_basenames: + properties.append({ + 'name': 'wolfssl:sbom:source-set', + 'value': ','.join(srcs_basenames), + }) return { '$schema': 'http://cyclonedx.org/schema/bom-1.6.schema.json', @@ -364,8 +655,19 @@ def generate_cdx(name, version, supplier, license_id, license_text, lib_hash, def generate_spdx(name, version, supplier, license_id, license_text, lib_hash, - timestamp, year, doc_ns_uuid, enabled_deps, build_props): + timestamp, year, doc_ns_uuid, enabled_deps, build_props, + dep_version_overrides=None, hash_kind='library-binary', + srcs_basenames=None): build_defines = ', '.join(k for k, _ in build_props) + # Only annotate the comment when running the source-merkle entry + # point. The autotools / library-binary path keeps its existing + # output shape byte-identical so reproducibility CI does not + # regress. + if hash_kind != 'library-binary': + build_defines += f' | hash-kind={hash_kind}' + if srcs_basenames: + build_defines += ( + ' | source-set=' + ','.join(srcs_basenames)) wolfssl_pkg = { 'SPDXID': 'SPDXRef-Package-wolfssl', 'name': name, @@ -402,7 +704,7 @@ def generate_spdx(name, version, supplier, license_id, license_text, lib_hash, }] for key in enabled_deps: - spdx_id, pkg = spdx_dep_package(key) + spdx_id, pkg = spdx_dep_package(key, dep_version_overrides) packages.append(pkg) relationships.append({ 'spdxElementId': 'SPDXRef-Package-wolfssl', @@ -436,17 +738,38 @@ def generate_spdx(name, version, supplier, license_id, license_text, lib_hash, return doc +def _parse_dep_version_overrides(spec_list): + """Parse repeated --dep-version KEY=VERSION flags into a dict. + Rejects unknown keys early so a typo (e.g. 
--dep-version libssl=…) + does not silently produce an SBOM that omits the dep version.""" + overrides = {} + for spec in spec_list: + if '=' not in spec: + sys.exit( + f"ERROR: --dep-version expects KEY=VERSION, got {spec!r}") + key, _, value = spec.partition('=') + if key not in DEP_META: + sys.exit( + f"ERROR: --dep-version key {key!r} is not a known wolfSSL " + f"dependency. Known keys: {', '.join(sorted(DEP_META))}.") + overrides[key] = value + return overrides + + def main(): parser = argparse.ArgumentParser( - description='Generate CycloneDX and SPDX SBOMs for wolfssl' + description='Generate CycloneDX and SPDX SBOMs for wolfssl. ' + 'Supports two entry-point shapes: the autotools / ' + 'library-binary form (--options-h + --lib) used by ' + '`make sbom`, and the standalone embedded form ' + '(--user-settings + --srcs) used by customers who ' + 'build with their own Makefile / IDE and never run ' + './configure.' ) parser.add_argument('--name', required=True, help='Package name') parser.add_argument('--version', required=True, help='Package version') parser.add_argument('--supplier', default='wolfSSL Inc.', help='Supplier name (default: wolfSSL Inc.)') - parser.add_argument('--lib', required=True, - help='Path to the wolfSSL library artifact ' - '(shared or static) for SHA-256 hashing') parser.add_argument('--license-file', required=True, help='Path to LICENSING file for SPDX ID detection') parser.add_argument('--license-override', default='', @@ -461,20 +784,80 @@ def main(): '`--license-override`. Required by SPDX 2.3 ' 'validators (e.g. pyspdxtools) for any custom ' 'licence reference.') - parser.add_argument('--options-h', required=True, - help='Path to wolfssl/options.h for build config') + # Build-configuration source: pick exactly one. + parser.add_argument('--options-h', + help='Path to wolfssl/options.h for build config ' + '(autotools entry point). 
The file is read ' + 'as a flat list of #define directives; pre-' + 'processed `$CC -dM -E -include settings.h` ' + 'output works equivalently.') + parser.add_argument('--user-settings', + help='Path to wolfssl/wolfcrypt/settings.h to walk ' + 'through pcpp (embedded entry point). Combine ' + 'with --user-settings-include to point at the ' + 'directory containing user_settings.h, and ' + '`--user-settings-define WOLFSSL_USER_SETTINGS` ' + 'to enable the user_settings.h inclusion gate.') + parser.add_argument('--user-settings-include', action='append', default=[], + metavar='DIR', + help='Add an include path for --user-settings ' + 'preprocessing (repeatable). Equivalent to -I ' + 'on the compiler command line.') + parser.add_argument('--user-settings-define', action='append', default=[], + metavar='NAME[=VALUE]', + help='Predefine a macro for --user-settings ' + 'preprocessing (repeatable). Equivalent to -D ' + 'on the compiler command line. At minimum ' + 'pass `WOLFSSL_USER_SETTINGS` so settings.h ' + 'pulls in user_settings.h.') + # Component checksum source: pick exactly one. + parser.add_argument('--lib', + help='Path to the wolfSSL library artifact ' + '(shared or static) for SHA-256 hashing ' + '(autotools entry point).') + parser.add_argument('--srcs', nargs='+', default=None, + help='wolfSSL source files compiled into the ' + 'firmware (embedded entry point). Their ' + 'OmniBOR-compatible gitoid Merkle hash is ' + 'used as the SBOM component checksum ' + 'instead of --lib.') parser.add_argument('--dep-libz', default='no', help='yes if built with --with-libz') parser.add_argument('--dep-liboqs', default='no', help='yes if built with --with-liboqs (the package ' 'wolfSSL links against; --enable-falcon implies ' 'this in any legal configuration)') + parser.add_argument('--dep-version', action='append', default=[], + metavar='KEY=VERSION', + help='Override pkg-config version detection for a ' + 'dependency (repeatable). 
KEY is one of: ' + + ', '.join(sorted(DEP_META)) + '. Required ' + 'on hosts without pkg-config (typical embedded ' + 'cross-compile setups).') parser.add_argument('--cdx-out', required=True, help='Output path for CycloneDX JSON') parser.add_argument('--spdx-out', required=True, help='Output path for SPDX JSON') args = parser.parse_args() + # Mutual exclusion + at-least-one validation for the two entry-point + # shapes. Surfacing this here keeps argparse's --required machinery + # simple and produces a friendlier error than argparse's auto-text. + if bool(args.options_h) == bool(args.user_settings): + sys.exit( + "ERROR: pass exactly one of --options-h or --user-settings.\n" + " --options-h: autotools entry point (a flat #define file " + "such as wolfssl/options.h).\n" + " --user-settings: embedded entry point (path to " + "wolfssl/wolfcrypt/settings.h, with --user-settings-include " + "pointing at the directory containing user_settings.h).") + if bool(args.lib) == bool(args.srcs): + sys.exit( + "ERROR: pass exactly one of --lib or --srcs.\n" + " --lib: hash a built library artefact (.so/.a/.dylib).\n" + " --srcs: hash the wolfSSL source files compiled into " + "your firmware (OmniBOR gitoid Merkle hash).") + enabled_deps = [ key for key, flag in [ ('libz', args.dep_libz), @@ -482,6 +865,7 @@ def main(): ] if flag.lower() == 'yes' ] + dep_version_overrides = _parse_dep_version_overrides(args.dep_version) if args.license_override: license_id = args.license_override @@ -504,8 +888,24 @@ def main(): "`make sbom SBOM_LICENSE_TEXT=PATH`)." 
) - build_props = parse_options_h(args.options_h) - lib_hash = sha256_file(args.lib) + if args.options_h: + build_props = parse_options_h(args.options_h) + else: + build_props = parse_user_settings( + args.user_settings, + args.user_settings_include, + args.user_settings_define, + ) + + if args.lib: + lib_hash = sha256_file(args.lib) + hash_kind = 'library-binary' + srcs_basenames = None + else: + lib_hash = srcs_merkle_hash(args.srcs) + hash_kind = 'source-merkle-omnibor' + srcs_basenames = sorted({os.path.basename(p) for p in args.srcs}) + dt, timestamp = build_timestamp() year = dt.year serial = derived_uuid(args.name, args.version, 'serial') @@ -515,11 +915,15 @@ def main(): args.name, args.version, args.supplier, license_id, license_text, lib_hash, timestamp, year, serial, enabled_deps, build_props, + dep_version_overrides=dep_version_overrides, + hash_kind=hash_kind, srcs_basenames=srcs_basenames, ) spdx = generate_spdx( args.name, args.version, args.supplier, license_id, license_text, lib_hash, timestamp, year, doc_ns_uuid, enabled_deps, build_props, + dep_version_overrides=dep_version_overrides, + hash_kind=hash_kind, srcs_basenames=srcs_basenames, ) try: diff --git a/scripts/test_gen_sbom.py b/scripts/test_gen_sbom.py index 8de58a91694..52c5a868d9d 100644 --- a/scripts/test_gen_sbom.py +++ b/scripts/test_gen_sbom.py @@ -15,6 +15,7 @@ import importlib.util import os import pathlib +import re import tempfile import unittest import uuid @@ -335,6 +336,217 @@ def test_dedup_keeps_last_assignment(self): )) self.assertEqual(pairs['HAVE_X'], '2') + def test_filters_compiler_internals_from_dm_e_dump(self): + # The no-pcpp escape hatch (`$CC -dM -E -include settings.h ...`) + # produces a defines file containing hundreds of host/compiler + # macros - on macOS it includes the entire Apple + # TargetConditionals family, on Linux it includes __GLIBC_*, + # everywhere it includes the C compiler's __INT_*_MAX__ / + # __SSE*__ / __VERSION__ family. 
parse_options_h must drop them + # so the SBOM reflects wolfSSL configuration, not the build + # host, and is reproducible across hosts. + pairs = dict(self._parse( + "/* simulated `clang -dM -E` dump on macOS */\n" + "#define __VERSION__ \"Homebrew Clang 21.1.4\"\n" + "#define __APPLE__ 1\n" + "#define __MACH__ 1\n" + "#define __SSE2__ 1\n" + "#define __INT_FAST32_MAX__ 2147483647\n" + "#define __clang_major__ 21\n" + "#define _LP64 1\n" + "#define TARGET_OS_MAC 1\n" + "#define TARGET_OS_OSX 1\n" + "#define TARGET_OS_LINUX 0\n" + "#define TARGET_IPHONE_SIMULATOR 0\n" + "#define WOLFSSL_OPTIONS_H\n" + "#define WOLF_CRYPT_SETTINGS_H 1\n" + "#define HAVE_AESGCM 1\n" + "#define NO_DES3 1\n" + "#define WOLFSSL_AES_256 1\n" + )) + self.assertEqual( + set(pairs), {'HAVE_AESGCM', 'NO_DES3', 'WOLFSSL_AES_256'}, + 'noise filter let host/compiler macros leak into SBOM') + + def test_real_options_h_template_is_only_a_header_guard(self): + # Sanity-check that the noise filter handles wolfSSL's own + # autotools options.h.in: today the template defines exactly + # one macro - the WOLFSSL_OPTIONS_H header guard - which the + # filter must drop. If a future change adds a non-guard macro + # to options.h.in, this test makes the filter audit explicit. 
+ here = pathlib.Path(__file__).resolve().parent.parent + template = here / 'wolfssl' / 'options.h.in' + if not template.is_file(): + self.skipTest(f'options.h.in fixture not found at {template}') + body = template.read_text() + names = re.findall(r'^#define[ \t]+(\w+)', body, re.MULTILINE) + self.assertIn('WOLFSSL_OPTIONS_H', names, + 'options.h.in unexpectedly missing its header guard') + for name in names: + self.assertTrue( + gs._is_noise_macro(name), + f'options.h.in defines {name!r} but the noise filter does ' + 'not drop it; either the filter needs widening or ' + 'options.h.in now contains a real config macro') + + def test_real_options_h_preserves_autoconf_have_probes(self): + # An autotools-generated wolfssl/options.h (post-./configure) + # contains both the WOLFSSL_OPTIONS_H header guard (filtered) + # and AC_CHECK_HEADER probe results like WOLFSSL_HAVE_ATOMIC_H + # / WOLFSSL_HAVE_ASSERT_H (must be preserved - they gate + # `#if defined(...)` branches in wc_port.h and types.h). + here = pathlib.Path(__file__).resolve().parent.parent + options_h = here / 'wolfssl' / 'options.h' + if not options_h.is_file(): + self.skipTest( + f'no built options.h at {options_h}; run ./configure first') + names = {k for k, _ in gs.parse_options_h(str(options_h))} + # WOLFSSL_OPTIONS_H is the header guard for options.h itself + # and must be filtered out. + self.assertNotIn( + 'WOLFSSL_OPTIONS_H', names, + 'header guard leaked through into SBOM build properties') + # The autoconf-detected header-availability flags must survive + # the filter (regression guard - see + # TestIsNoiseMacro.test_autoconf_have_header_probes_preserved). 
+ for cflag in ('WOLFSSL_HAVE_ATOMIC_H', 'WOLFSSL_HAVE_ASSERT_H'): + if cflag in re.findall(r'^#define[ \t]+(\w+)', + options_h.read_text(), re.MULTILINE): + self.assertIn( + cflag, names, + f'{cflag!r} (AC_CHECK_HEADER probe result) was ' + 'incorrectly dropped by the noise filter') + + +class TestIsNoiseMacro(unittest.TestCase): + """The shared filter that keeps build-environment artefacts out of + the SBOM `wolfssl:build:*` properties. Drives both parse_options_h + (no-pcpp / autotools) and parse_user_settings (pcpp embedded) to + the same wolfSSL-only build-property set so the no-pcpp + `$CC -dM -E` shortcut does not produce host-leaking, non- + reproducible-across-hosts SBOMs. + + The three macro families this guards against (compiler-reserved, + Apple TargetConditionals, header guards) are documented in + `_NOISE_MACRO_RE` in gen-sbom; the assertions below pin each one, + plus the `_CONFIG_H_TOKENS` carve-out that keeps `*_H`-suffixed + real configuration flags out of the header-guard branch.""" + + def test_compiler_reserved_double_underscore(self): + # `__*` is reserved-for-implementation per ISO C 7.1.3 and is + # the bulk of what `clang -dM -E` emits. Dropping these is + # what stops `__VERSION__: "Homebrew Clang 21.1.4"` from + # leaking the developer's laptop into the public SBOM. + for name in ('__VERSION__', '__SSE2__', '__INT_FAST32_MAX__', + '__APPLE__', '__MACH__', '__amd64__', + '__GCC_ATOMIC_BOOL_LOCK_FREE', + '__clang_major__', '__BLOCKS__', + '__OBJC_BOOL_IS_BOOL', '__SIZEOF_LONG__', + '__LDBL_DIG__', '__FLT_RADIX__'): + self.assertTrue(gs._is_noise_macro(name), + f'{name!r} should be filtered') + + def test_compiler_reserved_single_underscore_uppercase(self): + # ISO C 7.1.3 also reserves `_` + uppercase for the + # implementation; e.g. macOS clang emits `_LP64`, glibc emits + # `_FORTIFY_SOURCE`. Same rationale as `__*`. 
+ for name in ('_LP64', '_FORTIFY_SOURCE', '_LARGEFILE_SOURCE', + '_GNU_SOURCE'): + self.assertTrue(gs._is_noise_macro(name), + f'{name!r} should be filtered') + + def test_apple_target_conditionals_filtered(self): + # `clang -include settings.h -x c /dev/null` on macOS pulls in + # ; without this filter a wolfSSL SBOM + # for an STM32 firmware would falsely show TARGET_OS_MAC=1 + # when generated on a Mac, mis-identifying the target platform + # to a CRA reviewer. + for name in ('TARGET_OS_MAC', 'TARGET_OS_OSX', 'TARGET_OS_LINUX', + 'TARGET_OS_IOS', 'TARGET_OS_EMBEDDED', + 'TARGET_OS_WIN32', 'TARGET_OS_WINDOWS', + 'TARGET_IPHONE_SIMULATOR'): + self.assertTrue(gs._is_noise_macro(name), + f'{name!r} (Apple TargetConditionals) should be ' + 'filtered') + + def test_header_guards_filtered(self): + # Both wolfssl/options.h itself and several internal wolfSSL + # headers define `WOLFSSL_*_H` / `WOLF_CRYPT_*_H` guards to + # prevent double inclusion. These describe "which file was + # parsed", not configuration choices. + for name in ('WOLF_CRYPT_SETTINGS_H', 'WOLFSSL_OPTIONS_H', + 'WOLF_CRYPT_VISIBILITY_H', 'WOLFSSL_USER_SETTINGS_H'): + self.assertTrue(gs._is_noise_macro(name), + f'{name!r} (header guard) should be filtered') + + def test_autoconf_have_header_probes_preserved(self): + # Regression guard: the `_H$` filter must NOT swallow + # AC_CHECK_HEADER results from configure.ac. These live on the + # wolfSSL CFLAGS as `-DWOLFSSL_HAVE_ATOMIC_H` / + # `-DWOLFSSL_HAVE_ASSERT_H`, gate `#if defined(...)` branches in + # wc_port.h / types.h, and so are real configuration flags an + # auditor or vulnerability scanner needs to see in the SBOM. 
+ for name in ('WOLFSSL_HAVE_ATOMIC_H', 'WOLFSSL_HAVE_ASSERT_H', + 'WOLFSSL_HAVE_MLKEM_H', 'HAVE_STDINT_H', + 'HAVE_SYS_TYPES_H'): + self.assertFalse( + gs._is_noise_macro(name), + f'{name!r} (autoconf AC_CHECK_HEADER probe) must NOT be ' + 'filtered - it is real configuration that gates source ' + 'code branches') + + def test_no_h_suffixed_disablement_flags_preserved(self): + # Regression guard for the carve-out specifically. These flags + # are set by NETOS / Telit / WOLFSSL_TELIT_M2MB / similar RTOS + # profiles in wolfssl/wolfcrypt/settings.h to suppress stdlib + # header inclusion (the firmware ships with vendor stdlib + # replacements). They gate real `#if defined(...)` branches: + # + # types.h:398 `#ifndef NO_STDINT_H` + # settings.h:3850 `#ifndef NO_STDINT_H` + # sp.h:42 `#elif !defined(NO_STDINT_H)` + # types.h:2132 `#if !defined(WOLFSSL_NO_ASSERT_H) && ...` + # + # An embedded customer who builds against one of these profiles + # would otherwise get an SBOM that silently omits their + # stdlib-disablement choices - the exact evidence a CRA reviewer + # expects to see. + for name in ('NO_STDINT_H', 'NO_STDLIB_H', 'NO_LIMITS_H', + 'NO_CTYPE_H', 'NO_STRING_H', 'NO_STDDEF_H', + 'WOLFSSL_NO_ASSERT_H'): + self.assertFalse( + gs._is_noise_macro(name), + f'{name!r} (NO_*_H disablement flag) must NOT be ' + 'filtered - it gates real wolfSSL source branches') + + def test_use_h_suffixed_build_mode_flags_preserved(self): + # Regression guard for the `USE_` carve-out token. Gates the + # flat-vs-tree test/benchmark layout in test.c:165 / + # benchmark.c:219 / examples/server/server.c:70. Customers who + # vendor these example sources select the layout via a `_H`- + # suffixed flag, so it must survive the filter. 
+ for name in ('USE_FLAT_TEST_H', 'USE_FLAT_BENCHMARK_H'): + self.assertFalse( + gs._is_noise_macro(name), + f'{name!r} (USE_*_H build-mode toggle) must NOT be ' + 'filtered - it gates real wolfSSL source branches') + + def test_real_wolfssl_macros_pass_through(self): + # The whole point of filtering is to NOT touch real wolfSSL + # configuration. If any of these get filtered the SBOM loses + # auditor-visible build properties that distinguish one + # wolfSSL configuration from another. + for name in ('HAVE_AESGCM', 'NO_DES3', 'WOLFSSL_AES_256', + 'WOLFSSL_USER_SETTINGS', 'WC_RSA_BLINDING', + 'TFM_ECC256', 'OPENSSL_EXTRA', 'USE_FAST_MATH', + 'XTIME', 'CUSTOM_RAND_GENERATE', 'FP_MAX_BITS', + 'BENCH_EMBEDDED', 'SIZEOF_LONG_LONG', + 'WOLFSSL_SP_NO_DYN_STACK', 'WOLFSSL_SHA512', + 'NO_FILESYSTEM', 'SINGLE_THREADED'): + self.assertFalse(gs._is_noise_macro(name), + f'{name!r} (real wolfSSL config) should NOT be ' + 'filtered') + class TestDepMetaShape(unittest.TestCase): """Lock down the dep-tracking surface so renames/removals don't @@ -415,5 +627,458 @@ def test_removed_flags_are_rejected(self): f"{stale_flag!r}: {result.stderr!r}") +class TestGitoidBlobSha256(unittest.TestCase): + """The OmniBOR / git gitoid is content-addressed, well-specified, and + independently verifiable. These vectors anchor our implementation + against the canonical values so a future refactor (e.g. switching + chunked I/O strategy) cannot silently drift.""" + + EMPTY_OID = ('473a0f4c3be8a93681a267e3b1e9a7dcda1185436fe141f7749' + '120a303721813') + HELLO_OID = ('8aec4e4876f854f688d0ebfc8f37598f38e5fd6903cccc850ca' + '36591175aeb60') + + def test_empty_blob_matches_canonical_oid(self): + # The well-known SHA-256 gitoid for an empty blob - matches + # `git hash-object --object-format=sha256 /dev/null`. 
+ with tempfile.NamedTemporaryFile('wb', delete=False) as f: + path = f.name + try: + self.assertEqual(gs.gitoid_blob_sha256(path), self.EMPTY_OID) + finally: + os.unlink(path) + + def test_hello_matches_canonical_oid(self): + # `git hash-object --object-format=sha256` on a 5-byte 'hello' + # blob; equivalently sha256(b'blob 5\x00hello'). + with tempfile.NamedTemporaryFile('wb', delete=False) as f: + f.write(b'hello') + path = f.name + try: + self.assertEqual(gs.gitoid_blob_sha256(path), self.HELLO_OID) + finally: + os.unlink(path) + + def test_chunked_read_path_matches_one_shot(self): + # The chunked iter-read path is what hashes large source files + # in real builds; this guards against any off-by-one in the + # 65536-byte chunk handling. + import hashlib + body = (b'A' * 70000) + (b'B' * 1000) + b'tail' + with tempfile.NamedTemporaryFile('wb', delete=False) as f: + f.write(body) + path = f.name + try: + expected = hashlib.sha256( + f'blob {len(body)}\x00'.encode() + body).hexdigest() + self.assertEqual(gs.gitoid_blob_sha256(path), expected) + finally: + os.unlink(path) + + def test_missing_file_exits_cleanly(self): + with self.assertRaises(SystemExit): + gs.gitoid_blob_sha256('/no/such/source/please.c') + + +class TestSrcsMerkleHash(unittest.TestCase): + """Source-set Merkle hash is the embedded entry point's component + checksum. Two contracts matter here: + + 1. Order independence: two customers compiling the same files in + any order get the same hash. Without this, the SBOM would + not be portable across build systems with non-deterministic + file ordering. + 2. Content sensitivity: a one-byte change in any source file + must change the hash. Without this, the checksum would + not detect a tampered build.""" + + def _make_files(self, files): + """files: dict of basename -> bytes contents. 
+ Returns (tmpdir, list_of_paths).""" + import tempfile + tmpdir = tempfile.mkdtemp() + paths = [] + for name, contents in files.items(): + p = os.path.join(tmpdir, name) + with open(p, 'wb') as f: + f.write(contents) + paths.append(p) + return tmpdir, paths + + def test_order_independent(self): + import shutil + tmpdir, paths = self._make_files({ + 'aes.c': b'aes-body', + 'sha.c': b'sha-body', + 'dh.c': b'dh-body', + }) + try: + h1 = gs.srcs_merkle_hash(paths) + h2 = gs.srcs_merkle_hash(list(reversed(paths))) + h3 = gs.srcs_merkle_hash(sorted(paths)) + self.assertEqual(h1, h2) + self.assertEqual(h1, h3) + finally: + shutil.rmtree(tmpdir) + + def test_content_change_changes_hash(self): + import shutil + tmpdir, paths = self._make_files({ + 'aes.c': b'aes-body', + 'sha.c': b'sha-body', + }) + try: + h_before = gs.srcs_merkle_hash(paths) + with open(paths[0], 'ab') as f: + f.write(b'X') + h_after = gs.srcs_merkle_hash(paths) + self.assertNotEqual(h_before, h_after) + finally: + shutil.rmtree(tmpdir) + + def test_basename_only_means_path_independent(self): + """The Merkle hash deliberately uses basename only, not full + path, so two customers whose wolfSSL trees live at different + absolute paths get the same hash for the same release.""" + import shutil + td_a, paths_a = self._make_files({'aes.c': b'aes', 'sha.c': b'sha'}) + td_b, paths_b = self._make_files({'aes.c': b'aes', 'sha.c': b'sha'}) + try: + self.assertEqual( + gs.srcs_merkle_hash(paths_a), + gs.srcs_merkle_hash(paths_b)) + finally: + shutil.rmtree(td_a) + shutil.rmtree(td_b) + + def test_missing_file_exits_cleanly(self): + # Mirrors TestGitoidBlobSha256.test_missing_file_exits_cleanly: + # silently emitting an SBOM with a stale or zero hash for a + # missing source would falsify the artefact, so srcs_merkle_hash + # must propagate the underlying gitoid_blob_sha256 SystemExit. 
+ with self.assertRaises(SystemExit): + gs.srcs_merkle_hash(['/no/such/source/please.c']) + + def test_duplicate_basenames_rejected(self): + # Order independence relies on unique basenames - if two source + # files in the input collided on basename, sorting on basename + # would suppress one of them and we would silently lose data. + # gen-sbom must reject the configuration rather than emit a + # misleading hash. + import shutil, tempfile + td_a = tempfile.mkdtemp() + td_b = tempfile.mkdtemp() + try: + with open(os.path.join(td_a, 'aes.c'), 'wb') as f: + f.write(b'a') + with open(os.path.join(td_b, 'aes.c'), 'wb') as f: + f.write(b'b') + with self.assertRaises(SystemExit): + gs.srcs_merkle_hash([ + os.path.join(td_a, 'aes.c'), + os.path.join(td_b, 'aes.c'), + ]) + finally: + shutil.rmtree(td_a) + shutil.rmtree(td_b) + + +class TestParseUserSettings(unittest.TestCase): + """Walks a synthetic settings.h + user_settings.h pair through + parse_user_settings() to confirm: + * the conditional logic in settings.h is honoured (only the + taken branch's defines reach the SBOM); + * pcpp-internal macros (__DATE__/__TIME__/__FILE__/__PCPP__) are + filtered out (otherwise reproducibility would break); + * function-like macros are filtered out (they are API surface, + not build configuration); + * --user-settings-define KEY=VALUE predefines reach the parser. 
+ + Skipped when pcpp is not installed; CI installs it explicitly in + the standalone job.""" + + def setUp(self): + try: + import pcpp # noqa: F401 + except ImportError: + self.skipTest('pcpp not installed; embedded path not exercised') + + def _run(self, settings_body, user_body, predefines=()): + import shutil, tempfile + tmpdir = tempfile.mkdtemp() + try: + settings_h = os.path.join(tmpdir, 'settings.h') + user_h = os.path.join(tmpdir, 'user_settings.h') + with open(settings_h, 'w') as f: + f.write(settings_body) + with open(user_h, 'w') as f: + f.write(user_body) + return gs.parse_user_settings( + settings_h, [tmpdir], list(predefines)) + finally: + shutil.rmtree(tmpdir) + + def test_conditional_branches_honoured(self): + # Customer's user_settings.h enables HAVE_X; settings.h then + # gates HAVE_DEPENDENT on HAVE_X. Disabled-branch defines + # must NOT appear. + settings = ( + '#ifdef WOLFSSL_USER_SETTINGS\n' + '#include "user_settings.h"\n' + '#endif\n' + '#ifdef HAVE_X\n' + '#define HAVE_DEPENDENT 1\n' + '#else\n' + '#define HAVE_DISABLED_BRANCH 1\n' + '#endif\n' + ) + user = '#define HAVE_X 1\n' + pairs = self._run(settings, user, ['WOLFSSL_USER_SETTINGS']) + names = {k for k, _ in pairs} + self.assertIn('HAVE_X', names) + self.assertIn('HAVE_DEPENDENT', names) + self.assertNotIn('HAVE_DISABLED_BRANCH', names) + self.assertIn('WOLFSSL_USER_SETTINGS', names) + + def test_pcpp_internal_macros_filtered(self): + # __DATE__ and __TIME__ are non-deterministic; if they leak + # into the SBOM, two runs of `make sbom` produce different + # output and reproducibility CI fails. __PCPP__ and __FILE__ + # are pcpp implementation detail. 
+ pairs = self._run('#define HAVE_X 1\n', '', []) + names = {k for k, _ in pairs} + for forbidden in ('__DATE__', '__TIME__', '__FILE__', '__PCPP__'): + self.assertNotIn(forbidden, names, + f'{forbidden} leaked into SBOM properties') + self.assertIn('HAVE_X', names) + + def test_apple_target_conditionals_filtered(self): + # Defensive: if a customer's user_settings.h transitively + # includes a macOS system header, the Apple TargetConditionals + # leak must still be filtered to keep the SBOM target-platform- + # honest. pcpp does not auto-include system headers, so this + # path is uncommon, but the contract with parse_options_h is + # that the same noise filter applies to both entry points. + pairs = self._run( + '#define HAVE_X 1\n' + '#define TARGET_OS_MAC 1\n' + '#define TARGET_OS_LINUX 0\n' + '#define TARGET_IPHONE_SIMULATOR 0\n', + '', []) + names = {k for k, _ in pairs} + self.assertIn('HAVE_X', names) + for forbidden in ('TARGET_OS_MAC', 'TARGET_OS_LINUX', + 'TARGET_IPHONE_SIMULATOR'): + self.assertNotIn(forbidden, names) + + def test_header_guards_filtered(self): + # wolfSSL's settings.h, visibility.h, etc. all define + # WOLF_CRYPT_*_H guards; they describe which file was parsed, + # not configuration choices, and so are filtered out of the + # SBOM `wolfssl:build:*` property set. + pairs = self._run( + '#define WOLF_CRYPT_SETTINGS_H 1\n' + '#define WOLFSSL_USER_SETTINGS_H 1\n' + '#define HAVE_X 1\n', + '', []) + names = {k for k, _ in pairs} + self.assertIn('HAVE_X', names) + self.assertNotIn('WOLF_CRYPT_SETTINGS_H', names) + self.assertNotIn('WOLFSSL_USER_SETTINGS_H', names) + + def test_no_h_and_use_h_config_flags_preserved(self): + # End-to-end pcpp regression for the `_CONFIG_H_TOKENS` carve- + # out: an embedded customer's user_settings.h that disables + # stdint/stdlib (NETOS / Telit / similar profile) must produce + # an SBOM that records the disablements. 
Mirrors the + # equivalent unit assertion in TestIsNoiseMacro but exercises + # the full pcpp + filter pipeline customers actually use. + user = ( + '#define HAVE_X 1\n' + '#define NO_STDINT_H 1\n' + '#define NO_STDLIB_H 1\n' + '#define WOLFSSL_NO_ASSERT_H 1\n' + '#define USE_FLAT_TEST_H 1\n' + '#define USE_FLAT_BENCHMARK_H 1\n' + ) + settings = ( + '#ifdef WOLFSSL_USER_SETTINGS\n' + '#include "user_settings.h"\n' + '#endif\n' + ) + pairs = self._run(settings, user, ['WOLFSSL_USER_SETTINGS']) + names = {k for k, _ in pairs} + for required in ('HAVE_X', 'NO_STDINT_H', 'NO_STDLIB_H', + 'WOLFSSL_NO_ASSERT_H', 'USE_FLAT_TEST_H', + 'USE_FLAT_BENCHMARK_H'): + self.assertIn( + required, names, + f'{required!r} (real wolfSSL config) was filtered out ' + 'of the SBOM - the noise filter is over-aggressive') + + def test_pcpp_error_directive_is_fatal(self): + # An `#error` firing inside settings.h or a transitively + # included header is a hard build failure for the C compiler; + # gen-sbom must mirror those semantics. pcpp signals this via + # pp.return_code (it does NOT raise), which is easy to swallow + # silently and emit a partial SBOM if not checked. This test + # pins the fail-fast contract. + settings = ( + '#ifdef WOLFSSL_USER_SETTINGS\n' + '#include "user_settings.h"\n' + '#endif\n' + '#define HAVE_X 1\n' + ) + user = '#error "this configuration is unsupported"\n' + with self.assertRaises(SystemExit) as ctx: + self._run(settings, user, ['WOLFSSL_USER_SETTINGS']) + msg = str(ctx.exception) + self.assertIn('pcpp', msg) + self.assertIn('return_code', msg) + + def test_function_like_macros_filtered(self): + # Function-like macros are API surface, not build + # configuration; their post-expansion body would also break + # reproducibility under pcpp token-render whitespace drift. 
+ pairs = self._run( + '#define HAVE_X 1\n' + '#define WC_BITS_TO_BYTES(x) (((x) + 7) >> 3)\n', + '', []) + names = {k for k, _ in pairs} + self.assertIn('HAVE_X', names) + self.assertNotIn('WC_BITS_TO_BYTES', names) + + def test_predefine_with_value(self): + pairs = self._run( + '#if VERSION_MAJOR >= 5\n#define ONLY_NEW 1\n#endif\n', + '', ['VERSION_MAJOR=5']) + names = {k for k, _ in pairs} + self.assertIn('ONLY_NEW', names) + self.assertIn('VERSION_MAJOR', names) + + def test_returns_sorted_pairs_like_parse_options_h(self): + # The downstream code path is shared between options.h and + # user_settings.h; both producers must return the exact same + # shape (sorted list of (name, value) tuples). A drift here + # would surface as a mystery diff between the two paths. + pairs = self._run( + '#define HAVE_Z 1\n#define HAVE_A 1\n#define HAVE_M 1\n', + '', []) + names = [k for k, _ in pairs] + self.assertEqual(names, sorted(names)) + + +class TestDepVersionOverride(unittest.TestCase): + """--dep-version is the embedded path's substitute for pkg-config: + cross-compile hosts have no pkg-config for the target, so the + customer must supply the linked dep version explicitly. 
Without + this flag a baremetal SBOM that reports `--dep-libz yes` would + silently emit `versionInfo: NOASSERTION` and lose CVE-tracking + fidelity for libz.""" + + def test_explicit_override_wins_over_pkgconfig(self): + original = gs.pkgconfig_version + try: + gs.pkgconfig_version = lambda *_a, **_k: '99.99.99' + self.assertEqual( + gs.dep_version('libz', {'libz': '1.3.1'}), + '1.3.1') + finally: + gs.pkgconfig_version = original + + def test_no_override_falls_back_to_pkgconfig(self): + original = gs.pkgconfig_version + try: + gs.pkgconfig_version = lambda *_a, **_k: '1.0.0' + self.assertEqual(gs.dep_version('libz'), '1.0.0') + self.assertEqual(gs.dep_version('libz', {}), '1.0.0') + self.assertEqual( + gs.dep_version('libz', {'liboqs': '0.0'}), '1.0.0') + finally: + gs.pkgconfig_version = original + + def test_parse_overrides_rejects_unknown_keys(self): + with self.assertRaises(SystemExit): + gs._parse_dep_version_overrides(['libssl=3.0.0']) + + def test_parse_overrides_rejects_malformed(self): + with self.assertRaises(SystemExit): + gs._parse_dep_version_overrides(['libz']) + + def test_parse_overrides_accepts_known_keys(self): + out = gs._parse_dep_version_overrides([ + 'libz=1.3.1', 'liboqs=0.10.0', + ]) + self.assertEqual(out, {'libz': '1.3.1', 'liboqs': '0.10.0'}) + + +class TestCliMutualExclusion(unittest.TestCase): + """The two entry-point shapes (autotools / standalone) must be + mutually exclusive. Mixing them would produce a hash whose + semantics nobody can interpret (library bytes? source merkle? 
+ both?), so gen-sbom refuses the combination upfront with a + clear error.""" + + def _run(self, *argv): + import subprocess + here = pathlib.Path(__file__).resolve().parent + script = here / 'gen-sbom' + return subprocess.run( + ['python3', str(script), *argv], + capture_output=True, text=True + ) + + BASE = [ + '--name', 'wolfssl', + '--version', '0.0.0-test', + '--license-file', '/dev/null', + '--cdx-out', '/dev/null', + '--spdx-out', '/dev/null', + ] + + def test_options_and_user_settings_together_fail(self): + result = self._run( + *self.BASE, + '--options-h', '/dev/null', + '--user-settings', '/dev/null', + '--lib', '/dev/null') + self.assertNotEqual(result.returncode, 0) + self.assertIn('--options-h or --user-settings', result.stderr) + + def test_neither_options_nor_user_settings_fails(self): + result = self._run( + *self.BASE, + '--lib', '/dev/null') + self.assertNotEqual(result.returncode, 0) + self.assertIn('--options-h or --user-settings', result.stderr) + + def test_lib_and_srcs_together_fail(self): + result = self._run( + *self.BASE, + '--options-h', '/dev/null', + '--lib', '/dev/null', + '--srcs', '/dev/null') + self.assertNotEqual(result.returncode, 0) + self.assertIn('--lib or --srcs', result.stderr) + + def test_neither_lib_nor_srcs_fails(self): + result = self._run( + *self.BASE, + '--options-h', '/dev/null') + self.assertNotEqual(result.returncode, 0) + self.assertIn('--lib or --srcs', result.stderr) + + def test_user_settings_path_in_help(self): + # Discoverability regression guard - if the standalone entry + # point is invisible to `--help`, embedded customers will not + # know it exists. 
+ result = self._run('--help') + self.assertEqual(result.returncode, 0, result.stderr) + for token in ('--user-settings', '--user-settings-include', + '--user-settings-define', '--srcs', + '--dep-version'): + self.assertIn(token, result.stdout, f'{token!r} missing from --help') + + if __name__ == '__main__': unittest.main(verbosity=2) From 88a0ac73ebc5258debd917582556ba33591913ce Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Wed, 6 May 2026 11:08:47 +0300 Subject: [PATCH 14/16] fix(sbom,ci): build liboqs from source; rewire bomsh for new layout Noble lacks liboqs-dev (build 0.12.0 from source); upstream removed .devcontainer/bomtrace3 (mirror Dockerfile, pin bomsh+strace, mpers). Signed-off-by: Sameeh Jubran --- .github/workflows/sbom.yml | 103 ++++++++++++++++++++++++++++++------- 1 file changed, 85 insertions(+), 18 deletions(-) diff --git a/.github/workflows/sbom.yml b/.github/workflows/sbom.yml index a862cdc9faa..bb364c1e8ad 100644 --- a/.github/workflows/sbom.yml +++ b/.github/workflows/sbom.yml @@ -568,7 +568,40 @@ jobs: # break in DEP_META['liboqs'] would silently land. - name: Install liboqs (provides liboqs.pc for --with-liboqs) - run: sudo apt-get update && sudo apt-get install -y liboqs-dev + # Ubuntu noble (24.04) does not ship liboqs-dev in its archive + # (Debian sid has 0.7.x; Ubuntu only has unsupported PPAs). Build + # from a pinned upstream tag so this job stays deterministic across + # runs - any future liboqs API/ABI break shows up here, not in + # production builds. Pinning matters: SBOM correctness assertions + # below check purl shape, and an unpinned 'main' would silently + # change what pkg-config reports as the version string. 
+ run: | + sudo apt-get update + sudo apt-get install -y --no-install-recommends \ + cmake ninja-build libssl-dev + git clone --depth=1 --branch 0.12.0 \ + https://github.com/open-quantum-safe/liboqs /tmp/liboqs + cmake -S /tmp/liboqs -B /tmp/liboqs/build -GNinja \ + -DCMAKE_BUILD_TYPE=Release \ + -DCMAKE_INSTALL_PREFIX=/usr/local \ + -DBUILD_SHARED_LIBS=ON \ + -DOQS_BUILD_ONLY_LIB=ON \ + -DOQS_DIST_BUILD=OFF + cmake --build /tmp/liboqs/build --parallel "$(nproc)" + sudo cmake --install /tmp/liboqs/build + sudo ldconfig + # /usr/local/lib/pkgconfig is on pkg-config's compiled-in path + # on Ubuntu, but export via $GITHUB_ENV so a future image change + # cannot silently break --with-liboqs autodetection. ${VAR:+:$VAR} + # avoids a trailing colon when PKG_CONFIG_PATH is unset. + echo "PKG_CONFIG_PATH=/usr/local/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}" \ + >> "$GITHUB_ENV" + + - name: Verify liboqs.pc visible to pkg-config + # Separate step so the $GITHUB_ENV write above has taken effect + # for this shell; an in-step call would only be exercising the + # compiled-in default path, not the export. + run: pkg-config --modversion liboqs - name: Configure with --with-liboqs --enable-falcon run: | @@ -734,33 +767,67 @@ jobs: - name: Install build deps + SBOM validators run: | sudo apt-get update + # bison + autotools-dev are required by strace's ./bootstrap. + # gcc-multilib + g++-multilib give strace's --enable-mpers=check + # the 32-bit/x32 compilers it needs - without them mpers is + # silently downgraded and bomtrace3 traces only native-arch + # syscalls, diverging from what bomsh's devcontainer produces. + # The rest mirror bomsh's .devcontainer/Dockerfile bomtrace3 + # stage. 
sudo apt-get install -y build-essential autoconf automake libtool \ + bison autotools-dev gcc-multilib g++-multilib \ python3 python3-pip git python3 -m pip install --user --upgrade pip python3 -m pip install --user 'spdx-tools==0.8.*' echo "$HOME/.local/bin" >> "$GITHUB_PATH" - name: Install bomsh toolchain (bomtrace3 + helper scripts) - # Bomsh is not packaged; build bomtrace3 (patched strace) from - # source and install the python helpers system-wide so configure's - # AC_PATH_PROG can find them. + # Bomsh is not packaged. Reproduce its `.devcontainer/Dockerfile` + # bomtrace3 stage: clone strace, apply bomtrace3.patch, drop in + # the bomsh source overlay, then bootstrap+configure+make. Both + # bomsh and strace are pinned (env: below) so a strace `master` + # commit that touches the lines bomtrace3.patch rewrites cannot + # break this CI for reasons unrelated to wolfSSL. Bump them + # together by re-validating `patch -p1` against the new SHAs. + env: + # bomsh has no releases; pin to last commit on main as of + # 2024-10-31. The patch itself last changed 2024-02-06. + BOMSH_SHA: 5823f7db7e5bd958e4ff868ae6ea79a7d871bb07 + # v6.7 (2024-01-29) is the strace release current when the + # patch was last touched; later releases tend to drift from + # the patch's context lines in src/strace.c. + STRACE_TAG: v6.7 run: | - git clone --depth=1 https://github.com/omnibor/bomsh /tmp/bomsh - # bomtrace3 build: docker/devcontainer-only Makefile in upstream; - # use the embedded build script if present, else fall back to - # the strace patch path. 
- cd /tmp/bomsh - if [ -d .devcontainer/bomtrace3 ]; then - make -C .devcontainer/bomtrace3 - sudo install -m 755 .devcontainer/bomtrace3/bomtrace3 \ - /usr/local/bin/ - else - echo "bomsh repo layout changed; please update CI" + git clone https://github.com/omnibor/bomsh /tmp/bomsh + git -C /tmp/bomsh checkout "$BOMSH_SHA" + # Even with a pinned SHA, keep the layout-drift guard so the + # next maintainer who bumps BOMSH_SHA gets a clear error if + # upstream restructured rather than a confusing patch failure. + if [ ! -f /tmp/bomsh/.devcontainer/patches/bomtrace3.patch ] \ + || [ ! -d /tmp/bomsh/.devcontainer/src ]; then + echo "bomsh repo layout changed; please update CI" >&2 + ls -la /tmp/bomsh/.devcontainer/ >&2 || true exit 1 fi - sudo install -m 755 scripts/bomsh_create_bom.py /usr/local/bin/ - sudo install -m 755 scripts/bomsh_sbom.py /usr/local/bin/ - bomtrace3 --version || true + git clone --depth=1 --branch "$STRACE_TAG" \ + https://github.com/strace/strace.git /tmp/strace + cp /tmp/bomsh/.devcontainer/patches/bomtrace3.patch /tmp/strace/ + cp /tmp/bomsh/.devcontainer/src/*.[ch] /tmp/strace/src/ + ( + cd /tmp/strace + patch -p1 < bomtrace3.patch + ./bootstrap + ./configure --enable-mpers=check + make -j"$(nproc)" + ) + sudo install -m 755 /tmp/strace/src/strace /usr/local/bin/bomtrace3 + sudo install -m 755 /tmp/bomsh/scripts/bomsh_create_bom.py \ + /usr/local/bin/ + sudo install -m 755 /tmp/bomsh/scripts/bomsh_sbom.py \ + /usr/local/bin/ + # bomtrace3 is patched strace; a `--version` invocation under + # ptrace requires no target so it must succeed cleanly. 
+ bomtrace3 --version which bomsh_create_bom.py bomsh_sbom.py - name: Configure wolfSSL From 39aefe77aa2c87de37106733ccd4042f5b25fa8a Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Thu, 7 May 2026 07:22:06 +0300 Subject: [PATCH 15/16] fix(sbom,ci): unbreak bomsh + liboqs(falcon) jobs; archive SBOMs bomtrace3 has no --version (use `-h | grep` smoke check); liboqs default OQS_USE_OPENSSL=ON drags into wolfSSL TUs via falcon.h and collides with the compat layer (-DOQS_USE_OPENSSL=OFF). Add upload-artifact@v4 to all four jobs so the OmniBOR graph + enriched SPDX ship from PR runs. Signed-off-by: Sameeh Jubran --- .github/workflows/sbom.yml | 124 +++++++++++++++++++++++++++++++++++-- 1 file changed, 120 insertions(+), 4 deletions(-) diff --git a/.github/workflows/sbom.yml b/.github/workflows/sbom.yml index bb364c1e8ad..09ce420a3b9 100644 --- a/.github/workflows/sbom.yml +++ b/.github/workflows/sbom.yml @@ -332,6 +332,30 @@ jobs: f'host-leak macros filtered)') PY + # Upload the SBOMs produced by every standalone path (pcpp, pcpp+deps, + # --options-h escape hatch, and the second pcpp run used for + # reproducibility diffing) so a reviewer can inspect them - or hand + # them to a downstream consumer / CRA reviewer - without re-running + # the job. `if: always()` ensures triage artefacts ship even when an + # assertion above fails (which is precisely when the bytes matter). 
+ - name: Upload standalone SBOMs + if: always() + uses: actions/upload-artifact@v4 + with: + name: sbom-standalone-${{ github.sha }} + path: | + /tmp/standalone/wolfssl.cdx.json + /tmp/standalone/wolfssl.spdx.json + /tmp/standalone-r2/wolfssl.cdx.json + /tmp/standalone-r2/wolfssl.spdx.json + /tmp/standalone-deps/wolfssl.cdx.json + /tmp/standalone-deps/wolfssl.spdx.json + /tmp/standalone-dme/wolfssl.cdx.json + /tmp/standalone-dme/wolfssl.spdx.json + /tmp/standalone-dme/options.h + if-no-files-found: warn + retention-days: 90 + # Tier 2 - integration: build wolfSSL, generate the SBOMs, and assert # everything an external auditor or vulnerability scanner relies on. integration: @@ -581,12 +605,22 @@ jobs: cmake ninja-build libssl-dev git clone --depth=1 --branch 0.12.0 \ https://github.com/open-quantum-safe/liboqs /tmp/liboqs + # -DOQS_USE_OPENSSL=OFF is load-bearing: without it, liboqs's + # installed common.h pulls system OpenSSL headers into every + # TU that includes the liboqs headers. wolfssl/wolfcrypt/falcon.h + # includes those liboqs headers, so once --enable-falcon is on, every + # wolfSSL TU that pulls falcon.h also pulls system OpenSSL, + # which collides with wolfssl/openssl/ssl.h under -Werror + # (CRYPTO_UNLOCK, sk_num, OPENSSL_malloc_init, ... all redefined). + # OFF makes liboqs use its bundled SHA/randombytes (the #else + # branches in oqs/common.h), keeping the build hermetic. cmake -S /tmp/liboqs -B /tmp/liboqs/build -GNinja \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=/usr/local \ -DBUILD_SHARED_LIBS=ON \ -DOQS_BUILD_ONLY_LIB=ON \ - -DOQS_DIST_BUILD=OFF + -DOQS_DIST_BUILD=OFF \ + -DOQS_USE_OPENSSL=OFF cmake --build /tmp/liboqs/build --parallel "$(nproc)" sudo cmake --install /tmp/liboqs/build sudo ldconfig @@ -706,6 +740,28 @@ jobs: exit 1 fi + # Persist the SBOMs the integration matrix produces so a CRA reviewer, + # a downstream packager, or the next maintainer triaging a regression + # can download them straight from the run summary instead of replaying + # the full job locally.
`if: always()` so a failed assertion above + # (license matrix, NTIA, CDX schema, liboqs dep entry, ...) still ships + # the bytes it failed on. The last `make sbom` invocation in this job + # is the simple SPDX override step, but the path matches every wolfssl + # SPDX/CDX in $PWD - if any are present at job end they will be picked + # up. if-no-files-found:warn keeps the upload soft so reordering the + # steps later cannot regress this into a hard failure. + - name: Upload SBOM artefacts (linux) + if: always() + uses: actions/upload-artifact@v4 + with: + name: sbom-integration-linux-${{ github.sha }} + path: | + wolfssl-*.spdx.json + wolfssl-*.cdx.json + wolfssl-*.spdx + if-no-files-found: warn + retention-days: 90 + # Tier 2 (macOS) - smoke test that gen-sbom finds .dylib artefacts and # that the autotools target works on Mach-O. Linux already exercises # the heavy validation matrix; this job is intentionally minimal so the @@ -750,6 +806,21 @@ jobs: print('macOS SBOM checksum well-formed:', checksum) PY + # Persist the Mach-O variant SBOMs so the .dylib-flavoured outputs are + # downloadable for cross-platform diffing against the linux artefacts. + # Same `if: always()` rationale as the linux upload above. + - name: Upload SBOM artefacts (macos) + if: always() + uses: actions/upload-artifact@v4 + with: + name: sbom-integration-macos-${{ github.sha }} + path: | + wolfssl-*.spdx.json + wolfssl-*.cdx.json + wolfssl-*.spdx + if-no-files-found: warn + retention-days: 90 + # Tier 2 (bomsh) - exercises the `make bomsh` target which traces a # full clean rebuild under bomtrace3 (patched strace, Linux-only) and # produces an OmniBOR artifact dependency graph. Without this job @@ -825,9 +896,17 @@ jobs: /usr/local/bin/ sudo install -m 755 /tmp/bomsh/scripts/bomsh_sbom.py \ /usr/local/bin/ - # bomtrace3 is patched strace; a `--version` invocation under - # ptrace requires no target so it must succeed cleanly. 
- bomtrace3 --version + # bomtrace3 replaces strace's argv parsing in bomsh_init() (see + # bomsh_config.c); its accepted long options are exactly + # --help/--config/--output/--verbose/--watch. `--version` is + # NOT a real flag and would exit non-zero. `-h` is the only + # no-target invocation that returns 0 cleanly (bomsh_usage() + # calls exit(0)). The grep doubles as a check that the binary + # on PATH is genuinely bomsh-patched and not a vanilla strace + # shadowing it ("Usage: bomtrace3 " only appears in + # bomsh_usage()), guarding against a future BOMSH_SHA bump that + # silently regresses the patch. + bomtrace3 -h | grep -q '^Usage: bomtrace3 ' which bomsh_create_bom.py bomsh_sbom.py - name: Configure wolfSSL @@ -871,6 +950,43 @@ jobs: print(f'bomsh enrichment ok: {len(gitoid_refs)} gitoid refs') PY + # The full provenance bundle - the high-value artefact of the whole + # PR, the one a CRA reviewer or downstream packager wants to download. + # MUST be uploaded BEFORE the `make clean` step below, which deletes + # everything by design. `if: always()` so even when the assertion + # above fails (which is when triage matters most), the bundle ships. + # + # Contents: + # omnibor/ - OmniBOR Artifact Dependency Graph + # (objects/ + metadata/bomsh/*), + # content-addressed by gitoid; the + # verifiable build-provenance proof. + # omnibor.wolfssl-*.spdx.json - SPDX with PERSISTENT-ID gitoid + # externalRef bridging SBOM <-> ADG. + # wolfssl-*.spdx.json - the un-enriched SPDX (for diffing + # against omnibor.* to confirm only + # the externalRef was added). + # wolfssl-*.cdx.json - CycloneDX equivalent. + # bomsh_raw_logfile.sha1 - raw bomtrace3 syscall trace, for + # debugging trace gaps (e.g. a build + # step that escaped ptrace). + # _bomsh.conf - 1-line config passed to bomtrace3 + # -c at trace time. 
+ - name: Upload OmniBOR graph + bomsh-enriched SBOMs + if: always() + uses: actions/upload-artifact@v4 + with: + name: bomsh-omnibor-${{ github.sha }} + path: | + omnibor/ + omnibor.wolfssl-*.spdx.json + wolfssl-*.spdx.json + wolfssl-*.cdx.json + bomsh_raw_logfile.sha1 + _bomsh.conf + if-no-files-found: warn + retention-days: 90 + - name: make clean removes all bomsh + sbom artefacts # Regression guard: if a future change adds an output to either # recipe but forgets CLEANFILES, this will catch it. From 1d9b787f712cd4c46a827abc67e2535caab7253c Mon Sep 17 00:00:00 2001 From: Sameeh Jubran Date: Thu, 7 May 2026 15:52:24 +0300 Subject: [PATCH 16/16] =?UTF-8?q?ci(sbom):=20drop=20non-ASCII=20'=C2=A7'?= =?UTF-8?q?=20from=20sbom.yml=20comment=20(fix=20check-source-text)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Sameeh Jubran --- .github/workflows/sbom.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/sbom.yml b/.github/workflows/sbom.yml index 09ce420a3b9..9e42d88f907 100644 --- a/.github/workflows/sbom.yml +++ b/.github/workflows/sbom.yml @@ -236,7 +236,7 @@ jobs: PY - name: --options-h escape hatch ($CC -dM -E, no pcpp) - # The doc/SBOM.md § 1.5 escape hatch for toolchains that cannot + # The doc/SBOM.md section 1.5 escape hatch for toolchains that cannot # install pcpp (older Keil / IAR sites with restricted pip # access): pre-process settings.h with the system compiler's # `-dM -E` macro-dump mode and feed the resulting flat #define