feat(utils): enhance AEM Headless detection with SDK and GraphQL signals #1627

Draft
habansal wants to merge 1 commit into `main` from `feat/enhance-headless-detection`

Summary

Improves `detectAEMVersion()` in `spacecat-shared-utils` to detect AEM Headless sites more reliably.

Problem

  1. `/content/dam/` is not headless-specific: CS and AMS sites reference DAM assets routinely, so a CS site with images could be misclassified as headless
  2. Single-signal detection was impossible: with only 2 patterns and `MIN_THRESHOLD=2`, both had to match, so a site with just an `aem-headless` meta tag would fall through to `OTHER` and get waitlisted in PLG onboarding as "not an AEM site"
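
To make the second problem concrete, here is a minimal sketch of the previous behaviour. It assumes one point per matching pattern and a `>=` comparison against `MIN_THRESHOLD`; the function and variable names are hypothetical and are not taken from `spacecat-shared-utils`.

```javascript
// Hypothetical reconstruction of the old headless check (names are illustrative).
const MIN_THRESHOLD = 2; // value stated in the PR description
const OLD_HEADLESS_PATTERNS = [/aem-headless/i, /\/content\/dam\//i];

// Assumption: each matching pattern contributes exactly one point, with no boosts.
function oldHeadlessScore(html) {
  return OLD_HEADLESS_PATTERNS.filter((re) => re.test(html)).length;
}

const metaOnly = '<meta name="template" content="aem-headless">';
const metaAndDam = `${metaOnly}<img src="/content/dam/site/hero.png">`;

console.log(oldHeadlessScore(metaOnly) >= MIN_THRESHOLD); // false
console.log(oldHeadlessScore(metaAndDam) >= MIN_THRESHOLD); // true
```

Under these assumptions a page with only the meta tag scores 1 and never reaches the threshold, which matches the fall-through to `OTHER` described above.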

Changes

Replaced `/content/dam/` with purpose-built headless patterns:

| Pattern | What it detects |
| --- | --- |
| `@adobe/aem-headless-client` | Official AEM Headless JS SDK (in bundled scripts) |
| `aem-headless-client` | Broader SDK reference |
| `graphql/execute.json` | Persisted GraphQL query calls |
| `content/_cq_graphql/` | AEM GraphQL endpoint references |
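
As a rough illustration, the four patterns could be expressed as regular expressions and checked against page HTML as sketched here. The object keys, the helper name, and the exact regex forms are assumptions made for this example; the actual code may define them differently.

```javascript
// Illustrative regexes for the four patterns listed above (hypothetical names).
const HEADLESS_PATTERNS = {
  sdkScoped: /@adobe\/aem-headless-client/i,
  sdkBroad: /aem-headless-client/i,
  persistedQuery: /graphql\/execute\.json/i,
  graphqlEndpoint: /content\/_cq_graphql\//i,
};

// Returns the names of all patterns found in the given HTML string.
function matchedHeadlessPatterns(html) {
  return Object.entries(HEADLESS_PATTERNS)
    .filter(([, re]) => re.test(html))
    .map(([name]) => name);
}

const page = '<script>fetch("/graphql/execute.json/site/articles")</script>';
console.log(matchedHeadlessPatterns(page)); // [ 'persistedQuery' ]

const damOnly = '<img src="/content/dam/site/hero.png">';
console.log(matchedHeadlessPatterns(damOnly)); // []
```

In this sketch a string containing `@adobe/aem-headless-client` also matches the broader `aem-headless-client` pattern, since one is a substring of the other.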

Added weight boosting so each strong signal alone clears the detection threshold (consistent with how EDS/AMS/CS already use boosting):

  • `aem-headless` string → +2 boost (3 pts total, clears threshold alone)
  • `@adobe/aem-headless-client` → +3 boost (4 pts total)
  • `graphql/execute.json` → +2 boost (3 pts total)
  • `_cq_graphql` endpoint → +2 boost (3 pts total)
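
The point totals above can be reproduced with a small scoring sketch. It assumes each matching signal contributes one base point plus its boost and that a score clears the threshold when it is `>= MIN_THRESHOLD`; the signal names, regexes, and `headlessScore` helper are hypothetical and only the boost values come from the list above.

```javascript
const MIN_THRESHOLD = 2; // value stated in the PR description
const BASE_POINTS = 1; // assumption: every matching pattern is worth one point

// Boost values follow the list above; names and regexes are illustrative.
const HEADLESS_SIGNALS = [
  { name: 'aem-headless string', re: /aem-headless/i, boost: 2 },
  { name: 'SDK reference', re: /@adobe\/aem-headless-client/i, boost: 3 },
  { name: 'persisted query', re: /graphql\/execute\.json/i, boost: 2 },
  { name: 'GraphQL endpoint', re: /_cq_graphql/i, boost: 2 },
];

// Sums base points and boosts for every signal found in the HTML.
function headlessScore(html) {
  return HEADLESS_SIGNALS
    .filter(({ re }) => re.test(html))
    .reduce((sum, { boost }) => sum + BASE_POINTS + boost, 0);
}

const metaOnly = '<meta name="template" content="aem-headless">';
const damOnly = '<img src="/content/dam/site/hero.png">';

console.log(headlessScore(metaOnly), headlessScore(metaOnly) >= MIN_THRESHOLD); // 3 true
console.log(headlessScore(damOnly), headlessScore(damOnly) >= MIN_THRESHOLD); // 0 false
```

Because this sketch simply adds up every matching signal, overlapping strings (for example an SDK import, which also contains `aem-headless`) would score higher than the per-signal totals listed above; how the real implementation treats such overlap is not stated in this PR description.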

Tests

Replaced the two previous headless tests (both relied on having `aem-headless` AND `/content/dam/` together) with 5 targeted tests:

  • ✅ `aem-headless` string alone → `AEM_HEADLESS`
  • ✅ `@adobe/aem-headless-client-js` SDK reference → `AEM_HEADLESS`
  • ✅ `/graphql/execute.json` persisted query → `AEM_HEADLESS`
  • ✅ `/content/_cq_graphql/` endpoint → `AEM_HEADLESS`
  • ✅ CS site with `/content/dam/` images → still `AEM_CS` (regression guard)

1155 tests passing, 0 failing.

Test plan

  • Unit tests pass (`npm test -w packages/spacecat-shared-utils`)
  • Lint passes (`npm run lint -w packages/spacecat-shared-utils`)

🤖 Generated with Claude Code

- Replace /content/dam/ (too generic, present on all AEM types) with
  more reliable headless-specific patterns
- Add @adobe/aem-headless-client SDK reference detection
- Add GraphQL persisted query URL (/graphql/execute.json) detection
- Add AEM GraphQL endpoint reference (/content/_cq_graphql/) detection
- Add weight boosting so each strong signal alone exceeds the threshold:
  - aem-headless string: +2 boost (3 pts total)
  - @adobe/aem-headless-client SDK: +3 boost (4 pts total)
  - graphql/execute.json: +2 boost (3 pts total)
  - _cq_graphql endpoint: +2 boost (3 pts total)
- Add regression test: CS site with /content/dam/ images is NOT
  misclassified as headless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>