Publish a JSON Schema for coverage.json#1184

Open
sferik wants to merge 11 commits into main from schema

Conversation

@sferik
Collaborator

@sferik sferik commented May 14, 2026

Since we want third-party tools to use the coverage.json (and not .resultset.json), we should encourage that by versioning the document and publishing a JSON Schema. This was suggested by @keithrbennett in #1143 (comment) and I agree that it’s a good idea.

This PR adds:

  • schemas/coverage.schema.json, which covers meta, total, per-file coverage, groups, and all four errors shapes. The gemspec has also been modified to package the schemas directory.
  • A meta.schema_version field (currently "1.0"), which is intentionally independent of the gem version, so we can bump the gem version without bumping the schema version.
  • A spec that validates JSONFormatter output against the schema.
  • json_schemer as a development dependency to validate the schema.
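
As an illustration of how a third-party tool might use the version field, here is a minimal sketch. It assumes the version is a "MAJOR.MINOR" string and that a consumer treats a different major version as incompatible; the `compatible?` helper and the `SUPPORTED_MAJOR` constant are hypothetical and not part of SimpleCov.

```ruby
require "json"

SUPPORTED_MAJOR = 1 # hypothetical: the major schema version this consumer understands

# Returns true when the document declares a schema version whose major
# component equals SUPPORTED_MAJOR. Assumes a "MAJOR.MINOR" string.
def compatible?(document)
  version = document.dig("meta", "schema_version")
  return false unless version.is_a?(String)

  major = version.split(".").first.to_s
  major.match?(/\A\d+\z/) && major.to_i == SUPPORTED_MAJOR
end

document = JSON.parse('{"meta": {"schema_version": "1.0"}}')
puts compatible?(document)
```

Because the schema version is independent of the gem version, a check like this would not need to change when only the gem is bumped.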

This comment was marked as resolved.

@keithrbennett

keithrbennett commented May 17, 2026

@sferik Thanks for doing this. Regarding versioned and unversioned schema files, I think the versioned files need to be primary/canonical and the 'latest' or 'current' version a convenience. I think the generated result set data file needs to contain a schema spec that includes the schema version so that it will always be usable when future schema versions are released. I may be misunderstanding though.

Here is a ChatGPT discussion of their recommended version strategy and implementation:

https://chatgpt.com/share/6a095730-57d0-8322-aac5-38f90035c2ac

(References in that chat to 'cov-loupe' are unintentional and should be 'simplecov' instead.)

Comment thread README.md
"groups": { "<group name>": { /* per-group stats + files */ } },
"errors": { /* minimum_coverage, minimum_coverage_by_file, minimum_coverage_by_group, maximum_coverage_drop violations */ }
}
```

It would be nice to get a full example for some minimal 'Hello World' script (one that covers all features).

Comment thread README.md Outdated
Comment thread schemas/coverage.schema.json Outdated
Comment thread README.md Outdated
Comment thread simplecov.gemspec
gem.required_ruby_version = ">= 3.1"

- gem.files = Dir["lib/**/*.*", "exe/*", "LICENSE", "CHANGELOG.md", "README.md", "doc/*"]
+ gem.files = Dir["{lib,schemas}/**/*.*", "exe/*", "LICENSE", "CHANGELOG.md", "README.md", "doc/*"]

While we are at it, maybe it is possible to generate human readable documentation with real world examples to close #1030 (comment) ?

There are a couple of documentation generators listed at https://json-schema.org/tools#documentation, but I haven't found a nice one yet.

@sferik
Collaborator Author

sferik commented May 27, 2026

@keithrbennett I’ve incorporated your feedback in 590da07. Please have another look and let me know if you have any more feedback before I merge this in.

@sferik
Collaborator Author

sferik commented May 27, 2026

@abitrolly I’ve incorporated your feedback in dcf5e59. Please have another look and let me know if you have any more feedback before I merge this in.

@abitrolly

@sferik Looks good. Copying the standard when a new version is released seems like a good way to preserve backward compatibility.


Copilot AI left a comment


Pull request overview

Copilot reviewed 20 out of 21 changed files in this pull request and generated 2 comments.

Comment thread schemas/coverage-v1.0.schema.json Outdated
Comment on lines +104 to +116
"totals": {
  "type": "object",
  "required": ["lines"],
  "additionalProperties": false,
  "properties": {
    "lines": {"$ref": "#/definitions/line_statistic"},
    "branches": {"$ref": "#/definitions/coverage_statistic"},
    "methods": {"$ref": "#/definitions/coverage_statistic"}
  }
},
"source_file": {
  "type": "object",
  "required": ["lines", "lines_covered_percent", "covered_lines", "missed_lines", "omitted_lines", "total_lines"],
Comment thread schemas/coverage.schema.json Outdated
Comment on lines +104 to +116
"totals": {
  "type": "object",
  "required": ["lines"],
  "additionalProperties": false,
  "properties": {
    "lines": {"$ref": "#/definitions/line_statistic"},
    "branches": {"$ref": "#/definitions/coverage_statistic"},
    "methods": {"$ref": "#/definitions/coverage_statistic"}
  }
},
"source_file": {
  "type": "object",
  "required": ["lines", "lines_covered_percent", "covered_lines", "missed_lines", "omitted_lines", "total_lines"],
sferik added a commit that referenced this pull request May 29, 2026
When a project runs with `disable_coverage :line` (turning off the
line criterion entirely), the JSON formatter drops every line key
from `total`, per-file payloads, and group totals. The schema
previously required those fields unconditionally, so a valid
SimpleCov payload produced in that configuration failed validation
against its own published schema. Caught by Copilot in PR #1184
review.

* Add a `line_coverage` boolean to the meta block (mirroring
  `branch_coverage` and `method_coverage`) so the line-enabled
  status is discoverable at the document level instead of
  inferred from the absence of fields.
* Drop `lines` from the hard-required list in `totals`,
  `source_file`, and `group`. Add `anyOf` clauses requiring at
  least one of (lines, branches, methods) to be present, since
  SimpleCov refuses to start with all criteria disabled.
* Extend `source_file`'s `dependentRequired` to cover the line
  group (lines + the five line stat fields), matching the same
  "all or nothing" guarantee already in place for branches and
  methods.
* Update the `lines` array description to note the
  `meta.line_coverage` invariant and the co-required group,
  mirroring the wording added for `branches` and `methods`.
* Add a schema-validation spec exercising the line-disabled
  emit path so this drift can't slip past CI again.

Fixtures and the README structural overview pick up `line_coverage`.
sferik added 11 commits May 29, 2026 20:15
…yload

Move the canonical schema to schemas/coverage-v1.0.schema.json with a
versioned $id, so each version is immutable. Keep
schemas/coverage.schema.json as a convenience alias for "the latest"
that mirrors the canonical except for $id, title, and description. A new
spec asserts the two stay structurally identical so the alias cannot
drift.
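
The drift check described above could look roughly like the following. This is a minimal sketch and not the spec from the PR: it assumes the three differing keys only need to be ignored at the top level, and the `structurally_identical?` helper and the example.com URLs are placeholders.

```ruby
require "json"

# Top-level keys the alias may differ on, per the commit message.
ALIAS_ONLY_KEYS = %w[$id title description].freeze

# True when both schemas match once the alias-only keys are set aside.
def structurally_identical?(canonical, alias_schema)
  canonical.except(*ALIAS_ONLY_KEYS) == alias_schema.except(*ALIAS_ONLY_KEYS)
end

# Placeholder documents; the example.com URLs are not the real $id values.
canonical = JSON.parse(<<~JSON)
  {"$id": "https://example.com/coverage-v1.0.schema.json", "title": "v1.0", "type": "object"}
JSON
latest = canonical.merge("$id" => "https://example.com/coverage.schema.json", "title" => "latest")

puts structurally_identical?(canonical, latest)
puts structurally_identical?(canonical, latest.merge("type" => "array"))
```

The first comparison passes because only `$id` and `title` differ; the second fails because a structural key changed.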

Add a top-level $schema field to every coverage.json holding the
versioned canonical URL, so each emitted document is self-describing and
consumers can resolve the exact contract without out-of-band knowledge.
meta.schema_version stays as the human-readable companion.
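
A consumer could resolve the contract version from the `$schema` field alone. The sketch below assumes the URL ends in the `coverage-vX.Y.schema.json` file name used in this PR; the host in the example is a placeholder, since the real canonical URL is not quoted here, and `schema_version_from` is a hypothetical helper.

```ruby
require "json"
require "uri"

# Extracts "X.Y" from a $schema URL ending in coverage-vX.Y.schema.json.
# Returns nil when the field is missing or the file name does not match.
def schema_version_from(document)
  url = document["$schema"]
  return nil unless url.is_a?(String)

  file_name = File.basename(URI.parse(url).path.to_s)
  file_name[/\Acoverage-v(\d+\.\d+)\.schema\.json\z/, 1]
end

# Placeholder host; only the file name pattern comes from the PR.
document = JSON.parse(<<~JSON)
  {
    "$schema": "https://example.com/schemas/coverage-v1.0.schema.json",
    "meta": {"schema_version": "1.0"}
  }
JSON

from_url = schema_version_from(document)
puts from_url == document.dig("meta", "schema_version")
```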

Draft-07 is from 2018. 2020-12 is the current published draft, supported
by every modern validator (including json_schemer in the dev Gemfile)
and by every IDE that auto-resolves $schema URLs. There is no
compatibility reason to ship a 2018-era contract as the public schema.

Update the meta-schema URI in both schema files (the versioned canonical
and the unversioned alias) and switch the spec assertion from
JSONSchemer.draft7 to JSONSchemer.draft202012. Retitle the README
section to "JSON Schema for coverage.json" so it is searchable for
someone looking for "JSON schema simplecov" rather than for the internal
filename, and tighten the opening paragraph (the second bullet was
redundant with the prose that followed).

`coverage.json` always carried a full copy of every file's source
text alongside the per-line coverage data. Self-contained, but on
larger projects the source dominates the payload and downstream tools
that read source from disk (cov-loupe and similar) carry the cost
without benefit. Add `SimpleCov.source_in_json` (default true, so
existing consumers see no change), and have `JSONFormatter.build_hash`
take an `include_source:` kwarg that defaults to the config.

The HTML report's `coverage_data.js` always passes `include_source: true`
because the client-side viewer renders source from the embedded array
and would break without it. Only the side-file `coverage.json` written
alongside the HTML report honors the new setting, so users can keep
the rich HTML view while shrinking the JSON consumed by their own
tooling. When the setting is at its default, both files share a single
serialization.
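
The opt-out can be pictured with a toy model. The sketch below is not SimpleCov's implementation: `ToyFile`, `DEFAULT_SOURCE_IN_JSON`, the `"source"` key, and this standalone `build_hash` are stand-ins chosen for illustration, mirroring only the `include_source:` keyword described above.

```ruby
require "json"

# Toy stand-in for a source file; not SimpleCov's SourceFile class.
ToyFile = Struct.new(:filename, :source, :lines)

# Hypothetical default, standing in for the SimpleCov.source_in_json setting.
DEFAULT_SOURCE_IN_JSON = true

# Builds a per-file hash, embedding the source text only when asked to.
def build_hash(files, include_source: DEFAULT_SOURCE_IN_JSON)
  files.to_h do |file|
    entry = {"lines" => file.lines}
    entry["source"] = file.source if include_source
    [file.filename, entry]
  end
end

file = ToyFile.new("lib/hello.rb", ["puts 'hi'", ""], [1, nil])
with_source = JSON.generate(build_hash([file]))
without_source = JSON.generate(build_hash([file], include_source: false))
puts with_source.bytesize > without_source.bytesize
```

In this model the HTML viewer's data would always call `build_hash(files, include_source: true)`, while the side file would rely on the default.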

JSONFormatter's ErrorsFormatter emits a `maximum_coverage` key into the
errors object whenever `SimpleCov.maximum_coverage` is configured, but
the schema's errors block used `additionalProperties: false` with only
four permitted keys (minimum_coverage, minimum_coverage_by_file,
minimum_coverage_by_group, maximum_coverage_drop). Any project using
maximum_coverage was producing a coverage.json that failed its own
schema validation.

Add maximum_coverage as a fifth allowed key, reusing the expected_actual
shape the formatter already writes. Update the errors block description
and the README structural overview to list it.
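
The failure mode is easy to reproduce without a validator. The following sketch mimics what `additionalProperties: false` does for the errors block by listing any key outside an allowed set; `unexpected_error_keys` is a hypothetical helper, and the empty hashes stand in for the real violation payloads.

```ruby
# The four keys the schema originally permitted, plus the fifth added here.
ORIGINAL_ERROR_KEYS = %w[
  minimum_coverage
  minimum_coverage_by_file
  minimum_coverage_by_group
  maximum_coverage_drop
].freeze
ALLOWED_ERROR_KEYS = (ORIGINAL_ERROR_KEYS + ["maximum_coverage"]).freeze

# Returns the keys that additionalProperties: false would reject.
def unexpected_error_keys(errors, allowed: ALLOWED_ERROR_KEYS)
  errors.keys - allowed
end

# Empty hashes are placeholders for the real violation payloads.
errors = {"minimum_coverage" => {}, "maximum_coverage" => {}}

p unexpected_error_keys(errors, allowed: ORIGINAL_ERROR_KEYS)
p unexpected_error_keys(errors)
```

Against the original four keys the document is rejected because of `maximum_coverage`; against the extended list nothing is flagged.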
…rion

ErrorsFormatter wrote violations into minimum_coverage_by_file nested
as criterion -> filename, while minimum_coverage_by_group nested in the
opposite order (group -> criterion). Consumers looking up violations
for a specific file or group had to know which top-level bucket they
were in to know which key came first.

Flip minimum_coverage_by_file to filename -> criterion so it matches
the by_group convention. Pre-1.0 is the cheap window to fix this
asymmetry. The schema description and the affected spec assertions
are updated to match.
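
For a consumer that still holds data in the old criterion -> filename order, the flip amounts to transposing a two-level hash. This is a minimal sketch; `flip_nesting`, the criterion names, and the numeric payloads are illustrative and not taken from the formatter.

```ruby
# Swaps the two nesting levels of a hash of hashes:
# {outer => {inner => value}} becomes {inner => {outer => value}}.
def flip_nesting(nested)
  nested.each_with_object({}) do |(outer, inner), flipped|
    inner.each do |inner_key, value|
      (flipped[inner_key] ||= {})[outer] = value
    end
  end
end

# Illustrative payloads only; the real violation values have their own shape.
old_shape = {
  "line" => {"lib/a.rb" => 80, "lib/b.rb" => 70},
  "branch" => {"lib/a.rb" => 50}
}

p flip_nesting(old_shape)
```

After the flip, every violation for `lib/a.rb` sits under a single key, which matches how by_group lookups already work.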

Per-file payloads carried covered_lines, missed_lines, and total_lines,
where total_lines was covered + missed (executable lines only). The
project-wide total.lines block has had a separate omitted field for
non-executable (blank/comment) lines since the schema was introduced,
but the per-file shape did not, so consumers wanting per-file omitted
counts had to walk the `lines` array themselves and count nulls.

A natural reading of total_lines is also lines.length, which it is
not. Renaming total_lines to executable_lines was considered but
rejected: total_branches and total_methods follow the same
covered + missed convention, and renaming just one would create a
fresh asymmetry across the three criteria.

Emit per-file omitted_lines via SourceFile#never_lines, add descriptions
to the four line-count fields making the executable-only semantics
explicit, and require omitted_lines in source_file alongside the
others. Fixtures regenerated.
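
The relationship between the four counts and the `lines` array can be sketched as follows. The helper is hypothetical and recomputes the counts from the array instead of calling SimpleCov; treating `"ignored"` entries as belonging to none of the counts is an assumption made for this example.

```ruby
# Derives the four per-file line counts from a `lines` array in which each
# entry is a hit count (Integer), nil (non-executable), or "ignored".
def line_counts(lines)
  covered = lines.count { |hits| hits.is_a?(Integer) && hits.positive? }
  missed = lines.count { |hits| hits == 0 }
  omitted = lines.count(&:nil?)

  {
    "covered_lines" => covered,
    "missed_lines" => missed,
    "omitted_lines" => omitted,
    "total_lines" => covered + missed # executable lines only
  }
end

lines = [1, 0, nil, 3, nil, "ignored"]
counts = line_counts(lines)
p counts
puts counts["total_lines"] == lines.length
```

The last line prints false, which is the point made above: `total_lines` counts executable lines and is not the length of the array.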

Each of the five branch stat fields (branches array,
branches_covered_percent, covered_branches, missed_branches,
total_branches) was an independently optional property in source_file,
likewise for the methods group. The formatter always emits all five
together (or none), but the schema did not encode that. A document
with branches_covered_percent and no branches array, or any other
partial combination, would pass validation even though it is
meaningless. Downstream consumers had to defensively probe each
field rather than rely on "all five present or none present" once
they saw any one of them.

Use JSON Schema 2020-12 `dependentRequired` so each of the five
branch fields makes the other four required, ditto for methods.
Also add descriptions to `branches` and `methods` noting the
co-required-group invariant and tying their presence to the
corresponding `meta.branch_coverage` / `meta.method_coverage` flag,
plus descriptions on total_branches / total_methods clarifying the
covered + missed semantics.
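
To make the "all five or none" rule concrete, the sketch below builds a `dependentRequired` mapping in which each field requires the other four, and checks a document against the same invariant by hand. The helper names are hypothetical, and the generated mapping is one way to express the rule, not necessarily the exact text in the schema file.

```ruby
require "json"

BRANCH_FIELDS = %w[
  branches branches_covered_percent covered_branches missed_branches total_branches
].freeze

# Maps each field to the list of its siblings, so that the presence of any
# one field makes the rest of the group required.
def dependent_required(fields)
  fields.to_h { |field| [field, fields - [field]] }
end

# True when the entry carries every field in the group or none of them.
def all_or_none?(entry, fields)
  present = fields.count { |field| entry.key?(field) }
  present.zero? || present == fields.size
end

puts JSON.pretty_generate("dependentRequired" => dependent_required(BRANCH_FIELDS))
puts all_or_none?({"branches_covered_percent" => 100.0}, BRANCH_FIELDS)
```

The partial document in the last line fails the check, which is the case the schema previously let through.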

A handful of fields in the schema had no description, even though
their meaning was not obvious from the name:

* `report_line` (inside a branch object) is the line of the
  conditional that owns the branch (the if, case, or && line), not
  the start of the branch body. Renderers want this for annotating
  the decision point.
* `strength` on line_statistic and coverage_statistic is the
  average number of executions across covered items (hits per
  covered line / branch / method).
* `total` on both statistic shapes is `covered + missed` and
  excludes `omitted`. Spell it out so it isn't conflated with
  "everything".
* `line_coverage` was described as a per-line sentinel, but the
  same three-way shape (integer / null / "ignored") is also reused
  for branch and method `coverage` fields. Note the reuse, and that
  `"ignored"` marks code inside a simplecov:disable / :nocov:
  region (which can happen for branches and methods too, not just
  lines).

When a project runs with `disable_coverage :line` (turning off the
line criterion entirely), the JSON formatter drops every line key
from `total`, per-file payloads, and group totals. The schema
previously required those fields unconditionally, so a valid
SimpleCov payload produced in that configuration failed validation
against its own published schema. Caught by Copilot in PR #1184
review.

* Add a `line_coverage` boolean to the meta block (mirroring
  `branch_coverage` and `method_coverage`) so the line-enabled
  status is discoverable at the document level instead of
  inferred from the absence of fields.
* Drop `lines` from the hard-required list in `totals`,
  `source_file`, and `group`. Add `anyOf` clauses requiring at
  least one of (lines, branches, methods) to be present, since
  SimpleCov refuses to start with all criteria disabled.
* Extend `source_file`'s `dependentRequired` to cover the line
  group (lines + the five line stat fields), matching the same
  "all or nothing" guarantee already in place for branches and
  methods.
* Update the `lines` array description to note the
  `meta.line_coverage` invariant and the co-required group,
  mirroring the wording added for `branches` and `methods`.
* Add a schema-validation spec exercising the line-disabled
  emit path so this drift can't slip past CI again.

Fixtures and the README structural overview pick up `line_coverage`.
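
The "at least one criterion" rule can be written as an `anyOf` over single-key `required` lists. The sketch below generates such a clause and applies the same rule by hand; it shows one common way to phrase the constraint and is not copied from the schema file, and `any_criterion?` is a hypothetical helper.

```ruby
require "json"

CRITERIA = %w[lines branches methods].freeze

# One anyOf branch per criterion, each requiring only that criterion's key.
def any_of_clause(criteria)
  {"anyOf" => criteria.map { |criterion| {"required" => [criterion]} }}
end

# True when the totals object carries at least one criterion.
def any_criterion?(totals)
  CRITERIA.any? { |criterion| totals.key?(criterion) }
end

puts JSON.generate(any_of_clause(CRITERIA))
puts any_criterion?({"branches" => {}}) # line coverage disabled, branches enabled
puts any_criterion?({})                 # nothing enabled, which SimpleCov refuses to run with
```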

The schema branch's README additions were written against the pre-restructure
README. Reintegrate them into the restructured layout: the source_in_json
opt-out lands in the JSON formatter section, and the coverage.json JSON Schema
overview becomes its own subsection under Formatters. Content is unchanged
from the schema work, only rewrapped to match the surrounding prose.

Development

Successfully merging this pull request may close these issues.

MCP / CLI / library access to coverage data
Missing SimpleCov JSON specification (with comparison to Cobertura XML format)

4 participants