fix(proxy): use Host header for noindex check (production was being noindex'd)#47
fix(proxy): use Host header for noindex check (production was being noindex'd)#47JohnRDOrazio merged 1 commit intomainfrom
Conversation
…heck Production deploy after PR #44 merged showed X-Robots-Tag: noindex, nofollow on catholicdigitalcommons.org — the OPPOSITE of intent. The deploy succeeded (run 25186464464) and the new proxy.ts is on the VPS, but the host check inside it was failing for production. Root cause: Next.js runs with `trustHostHeader: false` (visible in the server.js bundle config). With that setting, request.nextUrl falls back to the bind address rather than the actual incoming hostname, so nextUrl.hostname is effectively localhost behind Plesk's reverse proxy. localhost is never in PRODUCTION_HOSTS, so noindex was applied to every request. CodeRabbit had earlier suggested switching from headers.get('host') to nextUrl.hostname for cleanliness. That advice is correct in principle but wrong for our deployment topology. Reverting to the Host header — which Plesk's nginx forwards via proxy_set_header Host $host — is the only reliable source of the client-facing hostname in this setup. Also added a defensive `host && ...` guard: if the header is somehow missing, fail-open (do NOT noindex). Better to under-noindex one suspicious request than to noindex production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe proxy's non-production detection logic now derives the request hostname from the incoming Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
| Duplication | 0 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
Summary
Production was being noindex'd after the PR #44 deploy. The deploy
itself succeeded (run 25186464464)
and
proxy.tsmade it to the VPS, but the host check inside it wasfailing for production:
Root cause
Next.js runs with
trustHostHeader: false(visible in thestandalone server.js config). With that setting,
request.nextUrlfalls back to the bind address rather than the actual incoming
hostname, so
nextUrl.hostnameresolves tolocalhostbehindPlesk's reverse proxy.
localhostis never inPRODUCTION_HOSTS,so noindex was applied to every request — including production.
The earlier CodeRabbit suggestion to use
nextUrl.hostnameiscorrect in principle but wrong for our deployment topology. The
Host header (forwarded by nginx via
proxy_set_header Host $host)is the only reliable source of the client-facing hostname here.
Fix
Two changes from the PR #44 version:
request.headers.get('host')instead ofnextUrl.hostnamehost && ...guard so a missing/empty Host headerfails open (doesn't noindex). Safer than the alternative.
Urgency
Production is currently telling Google not to index it. Needs to
merge and redeploy ASAP.
Test plan
npm run buildclean locallycurl -I https://catholicdigitalcommons.org/→ noX-Robots-Tagcurl -I https://www.catholicdigitalcommons.org/→ noX-Robots-Tagcurl -I https://staging.catholicdigitalcommons.org/→
X-Robots-Tag: noindex, nofollow🤖 Generated with Claude Code
Summary by CodeRabbit