Add grouped-linemerge generalization #2482
From a first quick glimpse this looks really interesting. I'll need a while to digest this though. Can you look at the build failures and fix them? |
That's great to hear that it's interesting!!! I've fixed the clang-tidy, and I've fixed a performance mistake. Here are some numbers for what this does, and how long it takes, over downtown Los Angeles. Note that in practice, we would never expire an entire tile at Z13. More likely Z18 or Z19. I'm just showing it for clarity. Note that the most relevant column is probably As we get to higher zooms, the reason why
Summary
This PR adds a new generalization strategy, `grouped-linemerge`, to `osm2pgsql-gen`. It maintains a table of merged linestrings, equivalent to the result of a single grouped `ST_LineMerge` query over the source ways.
The key is that this table is global, rather than local to your tile. It is kept up to date incrementally as the source data changes (and it never re-scans the entire planet to do so).
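As a rough mental model of what a grouped line merge computes, here is a small pure-Python sketch. It is not the SQL or C++ from this PR: the function names `line_merge` and `grouped_line_merge` and the sample ways are hypothetical, and the merge rule (join lines only where exactly two line ends meet) is a simplified stand-in for `ST_LineMerge`.

```python
from collections import defaultdict


def line_merge(lines):
    """Join polylines end to end wherever exactly two line ends meet."""
    lines = [list(line) for line in lines]
    while True:
        ends = defaultdict(list)  # endpoint -> indexes of lines ending there
        for i, line in enumerate(lines):
            ends[line[0]].append(i)
            ends[line[-1]].append(i)
        # A joint is a point shared by the ends of exactly two distinct lines.
        joint = next(
            ((pt, ids) for pt, ids in ends.items()
             if len(ids) == 2 and ids[0] != ids[1]),
            None,
        )
        if joint is None:
            return lines
        pt, (a, b) = joint
        # Orient the two lines so the first ends at pt and the second starts there.
        first = lines[a] if lines[a][-1] == pt else lines[a][::-1]
        second = lines[b] if lines[b][0] == pt else lines[b][::-1]
        lines = [l for i, l in enumerate(lines) if i not in (a, b)]
        lines.append(first + second[1:])


def grouped_line_merge(ways):
    """ways: iterable of (key, coords). Returns {key: [merged coords, ...]}."""
    groups = defaultdict(list)
    for key, coords in ways:
        groups[key].append(coords)
    return {key: line_merge(lines) for key, lines in groups.items()}


# Hypothetical sample data: the key stands in for the grouping columns.
ways = [
    (("Main Street", "residential"), [(0, 0), (1, 0)]),
    (("Main Street", "residential"), [(1, 0), (2, 0)]),
    (("Oak Avenue", "residential"), [(2, 0), (2, 1)]),
]
merged = grouped_line_merge(ways)
```

In this sample the two "Main Street" fragments become one linestring, while "Oak Avenue" stays separate even though it touches the same point, because its grouping key differs.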
It works similarly to the existing generalizations `rivers` and `vector-union`. You configure it from your flex style like `osm2pgsql.run_gen('grouped-linemerge', {...})`, and you run `osm2pgsql-gen` the same way (after import, and after update).
Motivation
The motivation is merging ways in OSM-Carto with the same rendering. Here's the relevant issue: openstreetmap-carto/openstreetmap-carto#951
OSM ways are often extremely fragmented, only going a block or two at a time in urban areas, due to relations and tagging changes. This significantly impairs rendering (see images near the bottom of that issue). We could LineMerge these ways within each tile as it's rendered, but this makes it really challenging to have consistent geometries for long ways that span metatile boundaries (see openstreetmap-carto/openstreetmap-carto#5229 and openstreetmap-carto/openstreetmap-carto@master...leijurv:openstreetmap-carto:recursive-cte).
This PR allows us to have a global LineMerge for all ways, grouped only by the columns that actually affect rendering (like `name`, `ref`, `highway`, `layer`, etc.). There are no tile-boundary artifacts, and it can be maintained incrementally quite efficiently.
How it works
Initial build
The initial build is very simple: we just run that exact query to do a grouped `ST_LineMerge` across the whole world. This takes only a few minutes, because the number of ways in each group is often not so large. We also make an index on the start and end points of each way. This is not strictly required, but it significantly helps the performance of the incremental update.
Incremental updates
Generalizations can receive the expire list of tiles whose geometry has changed. The expire list includes tiles where something was deleted, changed, or added.
One possible approach here would be to select all roads on the entire planet that match any grouping key that was touched, and re-LineMerge them all. I ruled this out because some road names are very common. Imagine if moving a node on "Main Street" in some town resulted in re-scanning every road named "Main Street" in the entire world. I assumed this was not an option from a performance perspective.
Instead, I used a recursive CTE to spider through the geometry from the expired tile. Each changed area becomes a seed from which it walks outwards. To bound this walk, a way is only followed if it matches an exact endpoint and matches every one of the grouping columns. This finds the full connected component, which replaces the old connected component geometry.
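The bounded walk can be pictured with a small Python sketch. This is only an illustration of the idea, not the recursive CTE from the PR: the names `build_endpoint_index` and `connected_component`, the dictionary layout, and the sample ways are all assumptions made for the example.

```python
from collections import defaultdict, deque


def build_endpoint_index(ways):
    """Map each start/end point to the ids of the ways that end there."""
    index = defaultdict(set)
    for way_id, (_key, coords) in ways.items():
        index[coords[0]].add(way_id)
        index[coords[-1]].add(way_id)
    return index


def connected_component(ways, index, seed_ids):
    """Walk outwards from the seed ways.

    A neighbour is followed only if it shares an exact endpoint with the
    current way and has an identical grouping key.
    """
    seen = set(seed_ids)
    queue = deque(seed_ids)
    while queue:
        way_id = queue.popleft()
        key, coords = ways[way_id]
        for endpoint in (coords[0], coords[-1]):
            for other in index.get(endpoint, ()):
                if other not in seen and ways[other][0] == key:
                    seen.add(other)
                    queue.append(other)
    return seen


# Hypothetical data: way id -> (grouping key, coordinates).
ways = {
    1: (("Main Street",), [(0, 0), (1, 0)]),
    2: (("Main Street",), [(1, 0), (2, 0)]),
    3: (("Oak Avenue",), [(2, 0), (2, 1)]),      # touches, but different key
    4: (("Main Street",), [(2, 0), (3, 0)]),
    5: (("Main Street",), [(10, 10), (11, 10)]),  # same name, another town
}
index = build_endpoint_index(ways)
component = connected_component(ways, index, {1})
```

Starting from way 1, the walk reaches ways 2 and 4 but never visits the unrelated "Main Street" in another town, which is the property that keeps the update local.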
Usage
Then:
Details
I believe this works in general for incremental updates, but the recursive CTE was challenging to get correct and performant. The reason for the new index is so that we can look up ways by their exact endpoint match. Without the new index, we'd use the existing GiST index to look up the way, which means that even if we ask PostGIS for `ST_Intersects` or some such, it will actually mechanically do `&&` on the way's entire bbox. This works, to be clear, but it means that if I want to find a way that ends on an exact point, GiST will actually give me all the ways whose entire bbox covers that point, which means less than one percent of the ways that the index gives me are actually eligible. See openstreetmap-carto/openstreetmap-carto#951 (comment) for discussion about that.
Tests
I put a differential fuzz test in `tests/test-gen-grouped-linemerge.cpp`. It performs hundreds of random connect/disconnects in a small grid of points. I intentionally set it up so the degree of each node could vary from 0 to 4, which stresses how `ST_LineMerge` does not merge lines at nodes with a degree above 2. At all times, the incrementally updated geometry must exactly match what `ST_LineMerge` would have done.
I have run this locally and it works beautifully. Here's the commit to run it: leijurv/openstreetmap-carto@b79c0fa. It does add a bunch of labels (looks like this: openstreetmap-carto/openstreetmap-carto#951 (comment)).
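The shape of such a differential fuzz test can be sketched in Python. This is not the C++ test from the PR, and every name here (`chains`, `component`, `apply_toggle`, `run_fuzz`) is hypothetical. As a simplifying assumption, a merged row is modelled as a set of grid edges joined through degree-2 nodes, rather than as real geometry produced by `ST_LineMerge`.

```python
import random
from collections import defaultdict


def edges_by_node(edges):
    at = defaultdict(list)
    for edge in edges:
        for node in edge:
            at[node].append(edge)
    return at


def chains(edges):
    """Reference result: group edges that are joined through degree-2 nodes."""
    at = edges_by_node(edges)
    seen, result = set(), set()
    for start in edges:
        if start in seen:
            continue
        seen.add(start)
        chain, stack = set(), [start]
        while stack:
            edge = stack.pop()
            chain.add(edge)
            for node in edge:
                if len(at[node]) == 2:  # merging happens only at degree 2
                    for other in at[node]:
                        if other not in seen:
                            seen.add(other)
                            stack.append(other)
        result.add(frozenset(chain))
    return result


def component(edges, seeds):
    """All nodes and edges reachable from the seed nodes."""
    at = edges_by_node(edges)
    nodes, stack, found = set(seeds), list(seeds), set()
    while stack:
        node = stack.pop()
        for edge in at[node]:
            found.add(edge)
            for other in edge:
                if other not in nodes:
                    nodes.add(other)
                    stack.append(other)
    return nodes, found


def apply_toggle(edges, table, edge):
    """Connect or disconnect one edge, then rebuild only what it touches."""
    edges.symmetric_difference_update({edge})
    nodes, found = component(edges, edge)
    kept = {row for row in table
            if not any(node in nodes for e in row for node in e)}
    return kept | chains(found)


def run_fuzz(size=4, steps=300, seed=0):
    rng = random.Random(seed)
    candidates = []
    for x in range(size):
        for y in range(size):
            if x + 1 < size:
                candidates.append(frozenset({(x, y), (x + 1, y)}))
            if y + 1 < size:
                candidates.append(frozenset({(x, y), (x, y + 1)}))
    edges, table = set(), set()
    for step in range(steps):
        table = apply_toggle(edges, table, rng.choice(candidates))
        # Differential check: incremental state must equal a full rebuild.
        assert table == chains(edges), f"mismatch after step {step}"
    return steps
```

Calling `run_fuzz()` toggles random grid edges and raises an `AssertionError` at the first step where the incrementally maintained table differs from a full rebuild. Interior grid nodes can reach degree 4, so the degree-2 rule is exercised on both sides of the comparison.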