Skip to content

[Repo Assist] perf: HtmlNode.serialize avoids per-indent string alloc; isVoidElement as module-level constant#1768

Merged
dsyme merged 7 commits into
mainfrom
repo-assist/perf-htmlnode-serialize-2026-04-29-3c3cf8d8a964a909
May 8, 2026
Merged

[Repo Assist] perf: HtmlNode.serialize avoids per-indent string alloc; isVoidElement as module-level constant#1768
dsyme merged 7 commits into
mainfrom
repo-assist/perf-htmlnode-serialize-2026-04-29-3c3cf8d8a964a909

Conversation

@github-actions
Copy link
Copy Markdown
Contributor

🤖 This is an automated pull request from Repo Assist, an AI assistant for this repository.

Summary

Three focused performance improvements to HtmlNode / HtmlDocument serialization.

Changes

1. isVoidElement as a module-level constant

Before: HtmlNode.ToString() reconstructed a Set<string> of 16 void element names on every call, then wrapped it in a closure.

After: The set (and the isVoidElement predicate) are computed once at module initialisation time, stored in a [<AutoOpen>] module private HtmlNodeHelpers. This eliminates one Set.ofArray + 16-element insertion per ToString() call.

2. newLine uses StringBuilder.Append(char, int) directly

Before:

let newLine plus =
    sb.AppendLine() |> ignore
    String(' ', indentation + plus) |> append   // allocates a temporary string

After:

let newLine plus =
    sb.AppendLine() |> ignore
    sb.Append(' ', indentation + plus) |> ignore  // writes spaces directly, no string alloc

StringBuilder.Append(char, repeatCount) writes the character into the internal buffer directly without creating an intermediate string. Every pretty-printed element that starts a new line previously paid one allocation; now it pays zero.

3. HtmlDocument.ToString() uses a single StringBuilder

Before:

+ (elements |> List.map (fun x -> x.ToString()) |> String.Concat)
```
Each `HtmlNode.ToString()` created its own `StringBuilder`, serialised to it, and returned a `string`. The outer code then built an `FSharpList` of those strings before concatenating.

**After**: A single `StringBuilder` is created; `element.ToString()` is appended directly. The intermediate list allocation is gone.

## Test Status

```
Passed! - Failed: 0, Passed: 2980, Skipped: 0, Total: 2980 (net8.0, -p:NuGetAudit=false)

Pre-existing infrastructure warning (OpenTelemetry.Api 1.15.0 vulnerability via NuGet audit) suppressed with -p:NuGetAudit=false; unrelated to this change, tracked in PR #1762.

Generated by 🌈 Repo Assist, see workflow run. Learn more.

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@96b9d4c39aa22359c0b38265927eadb31dcf4e2a

…t as module-level constant; HtmlDocument uses single StringBuilder

- Move void-element set to a module-level [<AutoOpen>] private module so it is
  computed once at startup instead of being re-created on every HtmlNode.ToString() call.
- Replace `String(' ', indentation + plus) |> append` in the serialize `newLine` helper
  with `sb.Append(' ', indentation + plus)` which writes spaces directly into the
  StringBuilder without allocating an intermediate string.
- Rewrite HtmlDocument.ToString() to use a single StringBuilder rather than
  `List.map … |> String.Concat`, avoiding the intermediate list allocation.

2980 tests pass (net8.0, -p:NuGetAudit=false).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…3 to resolve GHSA-g94r-2vxg-569j

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor Author

github-actions Bot commented May 1, 2026

Commit pushed: 7f959f0

Generated by 🌈 Repo Assist, see workflow run. Learn more.

@dsyme dsyme marked this pull request as ready for review May 1, 2026 11:41
@dsyme
Copy link
Copy Markdown
Contributor

dsyme commented May 8, 2026

/repo-assist Work out why CI is failing on windows sometimes and create a separate PR for the fix

@github-actions
Copy link
Copy Markdown
Contributor Author

github-actions Bot commented May 8, 2026

✓ Repo Assist completed successfully, see workflow run.

@github-actions
Copy link
Copy Markdown
Contributor Author

github-actions Bot commented May 8, 2026

Pull request created: #1774

Generated by 🌈 Repo Assist, see workflow run. Learn more.

@github-actions
Copy link
Copy Markdown
Contributor Author

github-actions Bot commented May 8, 2026

🤖 Repo Assist here — investigating the Windows CI failures.

Root cause found. In tests/FSharp.Data.Core.Tests/XmlExtensions.fs, the startXmlHttpLocalServer() helper selects a port using Random.Next + GetActiveTcpListeners(). On Windows, Hyper-V and Docker reserve port ranges that are not returned by GetActiveTcpListeners() — so the code picks a "free" port that's actually in an excluded range, causing Kestrel to fail at bind time with:

SocketException (10013): An attempt was made to access a socket in a way forbidden by its access permissions.

The Http.fs test helper already uses the correct TcpListener(0) approach (ask the OS for a free port, guaranteed not excluded).

Fix submitted on branch repo-assist/fix-windows-ci-port-binding-2026-05-08:

  • Replace GetActiveTcpListeners random scan with TcpListener(0) (same as Http.fs)
  • Use app.StartAsync() instead of app.RunAsync() + Thread.Sleep(100), so server readiness is deterministic rather than a timed guess

All 13 XmlExtensions tests pass locally after the fix.

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@fc4ab36dedc44e2a1cdc195cecce262f06c81230

@dsyme dsyme merged commit cf73a9c into main May 8, 2026
3 checks passed
@dsyme dsyme deleted the repo-assist/perf-htmlnode-serialize-2026-04-29-3c3cf8d8a964a909 branch May 8, 2026 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant