
Stabilize polylabel placement #899

Open
Symmetricity wants to merge 2 commits into systemed:master from Symmetricity:fix/deterministic-polylabel-placement

@Symmetricity

This PR is AI generated.

Summary

This makes polylabel point placement deterministic across the tested native
builds.

Two small changes are included:

  • Compute the polygon centroid seed relative to a local origin from the outer
    ring before translating it back to map coordinates.
  • Break exact std::priority_queue ties with existing cell geometry values
    instead of leaving equal-priority cells to implementation-specific heap
    ordering.

The intended effect is stable generated point placement for
LayerAsCentroid() users such as the OpenMapTiles-compatible housenumber and
poi layers.

Background

The generated-tile CI work found repeatable semantic differences between
native runners. macOS 14 and macOS latest matched each other, while Ubuntu
CMake, Ubuntu Makefile, and Windows CMake matched each other, but the two
runner groups placed some generated points at different MVT coordinates.

Focused diagnostics on one affected building showed that the source polygon
coordinates, envelope, and initial grid were identical across platforms. The
first divergence was the centroid seed calculated by getCentroidCell():

  • macOS calculated a seed near
    (9.5072016807699118, 53.476525214138611) with a lower distance score, so
    the bbox seed won.
  • Ubuntu and Windows calculated a seed near
    (9.5072058467781453, 53.476548647062621) with a higher distance score, so
    the centroid seed won.

Those slightly different seed choices were enough to move some generated
points by one or two MVT coordinate units.
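
For a sense of scale, the sketch below converts a longitude to a global MVT column. It is illustrative only: the helper names are hypothetical, and zoom 14 with a tile extent of 4096 is an assumption, since the description does not say at which zoom the drift was observed. Latitude is omitted because it needs the Web Mercator projection.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not tilemaker's API: map a longitude in degrees to a
// global integer column at the given zoom, with `extent` units per tile.
std::int64_t lonToGlobalMvtX(double lonDeg, int zoom, int extent = 4096) {
    assert(zoom >= 0 && zoom <= 24);
    // extent * 2^zoom units span the full 360 degrees of longitude.
    const double unitsAcrossWorld = std::ldexp(static_cast<double>(extent), zoom);
    return static_cast<std::int64_t>(
        std::floor((lonDeg + 180.0) / 360.0 * unitsAcrossWorld));
}

// Width of one MVT unit in degrees of longitude at the given zoom.
double mvtUnitWidthDeg(int zoom, int extent = 4096) {
    return 360.0 / std::ldexp(static_cast<double>(extent), zoom);
}
```

Under those assumptions one unit is about 5.4e-6 degrees of longitude. The two seed longitudes quoted above differ by about 4.2e-6 degrees, so the seeds themselves sit within one unit of each other. The seeds are only starting candidates for the search, not the final label points, but the comparison shows why a small change in seed choice can surface as a shift of a unit or two.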

The centroid calculation was using absolute lon/lat-style coordinates directly
in shoelace sums for very small polygons. Shifting the arithmetic to a local
origin reduces cancellation in those sums. The queue tie-break then handles the
remaining case where cells have exactly equal priority.
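
As a rough illustration of the cancellation, here is the standard shoelace centroid, assuming getCentroidCell() follows this form, together with its behaviour under a shift by the first ring point. The symbols u_i, v_i and the example polygon size are introduced here for illustration and do not come from the patch.

```latex
\begin{aligned}
A   &= \tfrac{1}{2}\sum_{i}\left(x_i y_{i+1} - x_{i+1} y_i\right),\\
C_x &= \frac{1}{6A}\sum_{i}\left(x_i + x_{i+1}\right)\left(x_i y_{i+1} - x_{i+1} y_i\right),\\
C_y &= \frac{1}{6A}\sum_{i}\left(y_i + y_{i+1}\right)\left(x_i y_{i+1} - x_{i+1} y_i\right),\\[4pt]
u_i &= x_i - x_0,\quad v_i = y_i - y_0
\;\Longrightarrow\;
C_x = x_0 + C_u,\quad C_y = y_0 + C_v .
\end{aligned}
```

In exact arithmetic both forms give the same point. In double precision, near (9.5, 53.5) each product x_i y_{i+1} is about 508 and carries a rounding error of roughly 508 × 2^-53 ≈ 5.6e-14, while for a polygon about 1e-4 degrees across (an assumed size) the differences being summed are only about 1e-8. In the shifted form the products u_i v_{i+1} are themselves about 1e-8 or smaller, so their rounding errors are many orders of magnitude below the result.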

Implementation

The patch is intentionally limited to include/polylabel.h.

getCentroidCell() now uses the first outer-ring point as a local arithmetic
origin:

  1. subtract the origin from each ring point before the shoelace sum;
  2. calculate the centroid in that local coordinate space;
  3. add the origin back to the result.
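
The three steps can be sketched as follows. This is a minimal standalone version and not the code from the patch: the Point type and the function name are hypothetical, and the real getCentroidCell() works on the project's own geometry types and returns a Cell.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { double x; double y; };

// Sketch only: centroid of an outer ring, computed relative to ring[0] so the
// shoelace products stay small. Works for rings with or without a repeated
// closing point.
Point centroidLocalOrigin(const std::vector<Point>& ring) {
    assert(!ring.empty());
    const Point origin = ring[0];
    double areaTimes6 = 0.0;  // accumulates 3 * cross, i.e. 6 * signed area
    double cx = 0.0, cy = 0.0;
    for (std::size_t i = 0, n = ring.size(), j = n - 1; i < n; j = i++) {
        // Step 1: subtract the origin from each ring point.
        const double ax = ring[i].x - origin.x, ay = ring[i].y - origin.y;
        const double bx = ring[j].x - origin.x, by = ring[j].y - origin.y;
        // Step 2: accumulate the shoelace terms in local coordinates.
        const double cross = ax * by - bx * ay;
        cx += (ax + bx) * cross;
        cy += (ay + by) * cross;
        areaTimes6 += 3.0 * cross;
    }
    // Degenerate ring (zero area): fall back to the first point.
    if (areaTimes6 == 0.0) return origin;
    // Step 3: translate the local centroid back to map coordinates.
    return Point{origin.x + cx / areaTimes6, origin.y + cy / areaTimes6};
}
```

For a small axis-aligned square, such as one with its corner at (9.5072, 53.4765) and a side of 1e-5 degrees, this returns the square's midpoint to within a few units in the last place of the absolute coordinates.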

The polylabel() priority queue comparator still orders primarily by max,
the existing "potential" value. It now adds deterministic tie-breaks using
values already stored on each Cell: d, h, x, and y.
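
A comparator of this kind might look like the sketch below. The Cell struct is a hypothetical stand-in carrying only the fields named above, and the order and direction of the tie-breaks (d, then h, then x, then y, larger values popped first) are assumptions; the patch may choose differently. The sketch also assumes no field is NaN.

```cpp
#include <cassert>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the Cell fields the description names.
struct Cell {
    double x, y;  // cell center
    double h;     // half the cell size
    double d;     // distance from the cell center to the polygon outline
    double max;   // upper bound on distance within the cell ("potential")
};

// Orders by max first; exact ties fall through to d, h, x, y so the result
// never depends on how the heap happens to arrange equal-priority cells.
struct CellLess {
    bool operator()(const Cell& a, const Cell& b) const {
        return std::tie(a.max, a.d, a.h, a.x, a.y) <
               std::tie(b.max, b.d, b.h, b.x, b.y);
    }
};

using CellQueue = std::priority_queue<Cell, std::vector<Cell>, CellLess>;

// Push every cell, then pop them all; returns the pop order.
std::vector<Cell> drain(const std::vector<Cell>& cells) {
    CellQueue queue;
    for (const Cell& c : cells) queue.push(c);
    std::vector<Cell> order;
    while (!queue.empty()) {
        order.push_back(queue.top());
        queue.pop();
    }
    assert(order.size() == cells.size());
    return order;
}
```

With a comparator on max alone, cells with equal max compare as equivalent and their pop order depends on the standard library's heap implementation and on insertion order. With the full lexicographic comparison, drain() returns the same sequence for any permutation of the same distinct cells.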

This does not change the public polylabel() API, requested precision, or the
geometry algorithm used by callers.

Evidence

The candidate passed generated-tile verification in the native CI matrix used
for this investigation. Decoded MBTiles content matched across:

  • macOS latest Makefile
  • macOS 14 Makefile
  • Ubuntu 22.04 CMake
  • Ubuntu 22.04 Makefile
  • Windows CMake

A 30-run repeat check with the OpenMapTiles profile and the Liechtenstein
Geofabrik extract also produced semantically identical output for every run
against repeat 01 (changed_tiles: 0).

The failed control case was the local-origin centroid seed without the queue
tie-break. That version still showed a semantic difference in native CI: the
macOS runners matched each other and the Ubuntu and Windows runners matched
each other, but one housenumber point differed between the two runner groups.

Performance

Performance was measured with matched RelWithDebInfo builds, once without this
patch and once with this patch.

Configuration:

  • input: Austria Geofabrik extract, about 763 MiB
  • profile: resources/config-openmaptiles.json
  • process: resources/process-openmaptiles.lua
  • output: direct PMTiles
  • threads: --threads 4
  • source PBF warmed before measurement
  • baseline: same code stack without this patch
  • candidate: same code stack plus this patch

I used alternating /usr/bin/time -v runs to check wall time and peak RSS with
both builds running under the same conditions:

Build                 Wall time   Max RSS
without this patch    98.34s      2,863,668 KB
with this patch       99.05s      2,929,420 KB

This paired sample suggests the memory footprint is effectively unchanged
(peak RSS about 2.3% higher) and the patch was about 0.7% slower on this
fixture.

The likely reason is that polylabel priority queues can contain many cells
with equal max values. The new comparator only does extra work for exact ties,
but those ties appear common enough on this fixture to be measurable.

Related Issues And PRs

Related centroid/polylabel context:

I did not find an existing issue or PR that directly reports this exact
cross-runner generated-point determinism bug.

Possible Regressions

This can change generated centroid/label point coordinates for polygonal
features that use polylabel, especially tiny polygons where absolute
coordinate cancellation affected the centroid seed or where queue cells had
equal priority.

Expected behavior:

  • callers still receive a point inside the polygon from the same polylabel
    algorithm;
  • output layers and attributes are unchanged;
  • archive bytes may still differ because tile/archive encoding is not made
    byte-for-byte deterministic by this patch;
  • decoded tile content should be stable across the tested native runner groups.

The queue comparator performs a few additional comparisons when max values
are exactly equal. On the Austria fixture, that was measurable as a slowdown in
timing tests, so this should be treated as a determinism/correctness tradeoff
rather than a performance-neutral change.

Testing

git diff --check origin/master..fix/deterministic-polylabel-placement
cmake -S . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build --target tilemaker -j2

Generated-tile checks:

tilemaker liechtenstein.osm.pbf \
  --config=resources/config-openmaptiles.json \
  --process=resources/process-openmaptiles.lua \
  --output=liechtenstein.mbtiles \
  --store=store \
  --quiet --threads 4

Compute the polygon centroid seed relative to a local ring origin before translating it back to map coordinates.

This isolates the centroid-origin change from the distance-key ordering candidate so it can be tested independently against the submitted PR stack.

Break exact priority-queue ties with existing cell geometry values so equivalent cells are not left to implementation-specific heap ordering.

This keeps the local-origin centroid candidate isolated from the broader distance-key ordering change while testing whether the remaining macOS versus Linux/Windows point drift is caused by queue ordering alone.