Always use topological sort for published pages#22896
Open
oguzkocer wants to merge 3 commits into
Open
Conversation
PR #10476 introduced `MAX_TOPOLOGICAL_PAGE_COUNT = 100` to cap the topological (hierarchical) sort and fall back to sorting by publish date for sites with 100+ published pages. This was done to match Calypso's behavior at the time. The date-based fallback produced a completely different ordering than what users see on the web — pages lost their parent/child hierarchy and indentation, and edited pages could jump to the top of the list due to transient local state during save. This commit: - Removes `MAX_TOPOLOGICAL_PAGE_COUNT` and the date-based fallback so all published pages are sorted hierarchically regardless of count. - Rewrites `topologicalSort` to build a `parentId → children` lookup map in a single `groupBy` pass then walk the tree once, instead of re-scanning the full page list on every recursive call.
Collaborator
Contributor
|
|
Contributor
|
|
Contributor
🤖 Build Failure AnalysisThis build has failures. Claude has analyzed them - check the build annotations for details. |
The `sorts 100 or more pages by date DESC` test validated the date-based fallback that no longer exists. The remaining topological sort test is renamed since it now applies to all page counts.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## trunk #22896 +/- ##
==========================================
- Coverage 37.32% 37.32% -0.01%
==========================================
Files 2320 2320
Lines 124578 124576 -2
Branches 16926 16925 -1
==========================================
- Hits 46498 46495 -3
Misses 74319 74319
- Partials 3761 3762 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.


Description
#10476 introduced
MAX_TOPOLOGICAL_PAGE_COUNT = 100to cap the topological (hierarchical) sort and fall back to sorting by publish date for sites with 100+ published pages. This was done to match Calypso's behavior at the time.The date-based fallback produced a completely different ordering than what users see on the web — pages lost their parent/child hierarchy and indentation, and edited pages could jump to the top of the list due to transient local state during save.
This commit:
MAX_TOPOLOGICAL_PAGE_COUNTand the date-based fallback so all published pages are sorted hierarchically regardless of count.topologicalSortto build aparentId → childrenlookup map in a singlegroupBypass then walk the tree once, instead of re-scanning the full page list on every recursive call.To Test
MAX_TOPOLOGICAL_PAGE_COUNTto something like2, you can verify that the ordering is completely different from the web.Although I don't know why Calypso had the 100 pages limit, maybe due to network request page sizing, I don't believe it makes sense in our case because we fetch all the pages before we show it to the user.
I couldn't find a compelling reason to keep this behavior. The only consideration I could think of was the performance cost, because the current implementation is
O(n^2)but this PR improves that toO(n).Fixes CMM-2065