fix(proxy): use Host header for noindex check (production was being noindex'd) #47

Merged
JohnRDOrazio merged 1 commit into main from fix/proxy-host-header on Apr 30, 2026

JohnRDOrazio commented Apr 30, 2026

Summary

Production was being noindex'd after the PR #44 deploy. The deploy
itself succeeded (run 25186464464) and proxy.ts made it to the VPS,
but the host check inside it was failing for production:

$ curl -sI https://catholicdigitalcommons.org/
HTTP/2 200
x-robots-tag: noindex, nofollow    ← WRONG

Root cause

Next.js runs with trustHostHeader: false (visible in the
standalone server.js config). With that setting, request.nextUrl
falls back to the bind address rather than the actual incoming
hostname, so nextUrl.hostname resolves to localhost behind
Plesk's reverse proxy. localhost is never in PRODUCTION_HOSTS,
so noindex was applied to every request — including production.

The earlier CodeRabbit suggestion to use nextUrl.hostname is
correct in principle but wrong for our deployment topology. The
Host header (forwarded by nginx via proxy_set_header Host $host)
is the only reliable source of the client-facing hostname here.
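To make the difference concrete, here is a simplified illustration (not Next.js internals) of what happens when a request URL is resolved against the bind address versus the forwarded Host header. The bind origin `http://localhost:3000` and the contents of `PRODUCTION_HOSTS` are assumptions for the sake of the example; neither is shown in this PR.

```typescript
// Assumed values for illustration only; not taken from the real server config.
const bindOrigin = 'http://localhost:3000' // hypothetical address the Node server listens on
const forwardedHost = 'catholicdigitalcommons.org' // what nginx passes along in the Host header

// Assumed allowlist; the real PRODUCTION_HOSTS lives in proxy.ts.
const ASSUMED_PRODUCTION_HOSTS = new Set([
  'catholicdigitalcommons.org',
  'www.catholicdigitalcommons.org',
])

// A URL resolved against the bind origin reports the bind hostname...
const urlFromBind = new URL('/some/page', bindOrigin)
// ...while one resolved against the forwarded Host reports the public hostname.
const urlFromHost = new URL('/some/page', `https://${forwardedHost}`)

// The first lookup misses the allowlist, which is what triggered noindex in production.
const bindLooksLikeProduction = ASSUMED_PRODUCTION_HOSTS.has(urlFromBind.hostname)
const hostLooksLikeProduction = ASSUMED_PRODUCTION_HOSTS.has(urlFromHost.hostname)

console.log({ bindLooksLikeProduction, hostLooksLikeProduction })
```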

Fix

const host = (request.headers.get('host') ?? '').toLowerCase().split(':')[0]
if (host && !PRODUCTION_HOSTS.has(host)) {
  response.headers.set('X-Robots-Tag', 'noindex, nofollow')
}

Two changes from the PR #44 version:

  1. Read request.headers.get('host') instead of nextUrl.hostname
  2. Defensive host && ... guard so a missing/empty Host header
    fails open (doesn't noindex). Better to under-noindex one
    malformed request than to noindex production.
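The decision logic can be pulled out into a pure function to see how each kind of Host value is handled. This is a minimal sketch: `shouldNoindex` is a hypothetical helper name, and the allowlist contents are assumed from the test plan rather than copied from proxy.ts.

```typescript
// Assumed allowlist; the real PRODUCTION_HOSTS in proxy.ts is not shown in this PR.
const PRODUCTION_HOSTS: ReadonlySet<string> = new Set([
  'catholicdigitalcommons.org',
  'www.catholicdigitalcommons.org',
])

// Hypothetical helper mirroring the check in the fix: returns true when the
// response should carry X-Robots-Tag: noindex, nofollow.
function shouldNoindex(hostHeader: string | null | undefined): boolean {
  // Lowercase the header and drop any ":port" suffix, as the fix does.
  const [host = ''] = (hostHeader ?? '').toLowerCase().split(':')
  // Fail open: a missing or empty Host header never triggers noindex.
  return host !== '' && !PRODUCTION_HOSTS.has(host)
}

console.log(shouldNoindex('catholicdigitalcommons.org')) // production host
console.log(shouldNoindex('staging.catholicdigitalcommons.org')) // non-production host
console.log(shouldNoindex(null)) // missing header
```

Keeping the check in a function like this would also make it straightforward to unit test without a running server.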

Urgency

Production is currently telling Google not to index it. Needs to
merge and redeploy ASAP.

Test plan

  • npm run build clean locally
  • Merge → re-run Build and Deploy workflow_dispatch
  • Verify:
    • curl -I https://catholicdigitalcommons.org/ → no
      X-Robots-Tag
    • curl -I https://www.catholicdigitalcommons.org/ → no
      X-Robots-Tag
    • After staging redeploy: curl -I https://staging.catholicdigitalcommons.org/ →
      X-Robots-Tag: noindex, nofollow
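The curl checks above could be scripted by feeding the raw `curl -sI` output through a small parser. The sketch below is one possible approach, with `parseHeaderBlock` and `isNoindexed` as hypothetical helper names; it only inspects text that has already been captured and makes no network calls.

```typescript
// Hypothetical helper: parse the header lines of `curl -sI` output into an
// object keyed by lower-cased header name.
function parseHeaderBlock(raw: string): Record<string, string> {
  const headers: Record<string, string> = {}
  for (const line of raw.split(/\r?\n/)) {
    const idx = line.indexOf(':')
    // Lines without a "name: value" shape (status line, blank lines) are skipped.
    if (idx <= 0) continue
    headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim()
  }
  return headers
}

// Hypothetical helper: true when the response carries a noindex directive.
function isNoindexed(raw: string): boolean {
  const tag = parseHeaderBlock(raw)['x-robots-tag'] ?? ''
  const directives = tag.toLowerCase().split(',').map((d) => d.trim())
  return directives.indexOf('noindex') !== -1
}

// Sample input shaped like the broken production response from the summary.
const sampleResponse = 'HTTP/2 200\r\nx-robots-tag: noindex, nofollow\r\n\r\n'
console.log(isNoindexed(sampleResponse))
```

For the production hosts the expectation is that `isNoindexed` returns false after the redeploy, and true for staging.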

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Improved host detection in reverse-proxy configurations so the X-Robots-Tag header is applied only to non-production hosts, preventing production from being misdetected as non-production.

…heck

Production deploy after PR #44 merged showed X-Robots-Tag: noindex,
nofollow on catholicdigitalcommons.org — the OPPOSITE of intent. The
deploy succeeded (run 25186464464) and the new proxy.ts is on the
VPS, but the host check inside it was failing for production.

Root cause: Next.js runs with `trustHostHeader: false` (visible in
the server.js bundle config). With that setting, request.nextUrl
falls back to the bind address rather than the actual incoming
hostname, so nextUrl.hostname is effectively localhost behind
Plesk's reverse proxy. localhost is never in PRODUCTION_HOSTS, so
noindex was applied to every request.

CodeRabbit had earlier suggested switching from headers.get('host')
to nextUrl.hostname for cleanliness. That advice is correct in
principle but wrong for our deployment topology. Reverting to the
Host header — which Plesk's nginx forwards via
proxy_set_header Host $host — is the only reliable source of the
client-facing hostname in this setup.

Also added a defensive `host && ...` guard: if the header is
somehow missing, fail-open (do NOT noindex). Better to
under-noindex one suspicious request than to noindex production.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 30, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 2663d91 and e287e81.

📒 Files selected for processing (1)
  • proxy.ts

Walkthrough

The proxy's non-production detection logic now derives the request hostname from the incoming Host header (lowercased and port-stripped) instead of request.nextUrl.hostname. X-Robots-Tag: noindex, nofollow is set only when that hostname is non-empty and not in the production allowlist, which keeps the check correct behind a reverse proxy.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Proxy Hostname Detection (proxy.ts) | Changed hostname extraction from request.nextUrl.hostname to the incoming Host header (normalized and port-stripped); conditioned X-Robots-Tag: noindex, nofollow header assignment on both hostname existence and non-production status. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main fix: using the Host header for the noindex check and resolving the production noindex issue. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codacy-production

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

🟢 Metrics 0 complexity · 0 duplication

Metric Results
Complexity 0
Duplication 0

JohnRDOrazio merged commit 1b9a9f7 into main Apr 30, 2026
9 checks passed
JohnRDOrazio deleted the fix/proxy-host-header branch April 30, 2026 23:03