Add an RSS feed of new patterns, with ORCID author links#144

Open
RichardLitt wants to merge 4 commits into bulk-migrate-authors from rss-feed-with-orcid

RichardLitt commented May 25, 2026

Builds on top of #142.

What this changes

The site now publishes an RSS feed of new patterns at /feed_rss_created.xml. RSS is the open standard that lets people add a website to a "feed reader" (Feedly, NetNewsWire, Inoreader, even some email clients) and get a notification whenever something new is published — like a podcast subscription, but for articles.

Each item in the feed includes:

  • The pattern's title and link.
  • The publication date (first time the pattern was added to the repo).
  • A one-line summary, taken from the first paragraph of the pattern.
  • Every author, credited with a clickable link to their ORCID profile, so a subscriber can click a name and jump straight to that researcher's full publication history on ORCID.org.

How readers find it

Three ways:

  1. A short "Stay in the loop" section on the home page, right under the patterns listing, with a "Subscribe" link.
  2. A subscribe line at the bottom of every author page (for example on /authors/ciara-flanagan/).
  3. Browser autodiscovery — when someone opens any pattern page in a feed-reader-aware browser or in a reader app, the feed shows up automatically as a "Subscribe" option. This is the standard way most RSS subscribers find feeds in 2026.
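
For anyone curious how autodiscovery works under the hood: a reader app looks for a `<link rel="alternate" type="application/rss+xml">` tag in the page's head. The sketch below lists such links using only the Python standard library. The `find_feed_links` helper and the sample HTML are illustrative assumptions, not code or markup taken from this PR.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that advertise an RSS feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        is_rss = attributes.get("type") == "application/rss+xml"
        if "alternate" in rel_values and is_rss and attributes.get("href"):
            self.feeds.append(attributes["href"])


def find_feed_links(html):
    """Return the feed URLs advertised by an HTML document (hypothetical helper)."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page head, for illustration only; the real markup may differ.
SAMPLE_PAGE = (
    '<html><head><link rel="alternate" type="application/rss+xml" '
    'title="RSS feed" href="/feed_rss_created.xml"></head><body></body></html>'
)
print(find_feed_links(SAMPLE_PAGE))
```

Running the same helper over the HTML of a built pattern page would be one way to confirm the tag is present before testing in a reader app.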

What this doesn't change

  • No changes to how you write or edit patterns. The feed reads the same authors: frontmatter and authors.yml that #141 ("Give every author their own page", issue #89) introduced, so you don't need to do anything new.
  • No changes to which authors have ORCIDs displayed; authors without an ORCID in authors.yml simply appear in the feed without a link, the same way they do on pattern pages.
  • The home page, the navigation, and pattern pages all look almost the same — the only visible change is the small "Stay in the loop" section on the home page.

One thing worth flagging to contributors

CONTRIBUTING.md has been updated to say this explicitly, but: a contributor's ORCID, if they provide one, will now travel further than before — it appears in every RSS item for every pattern they've co-authored, not just on the pattern page itself. It's still the same ORCID iD they're already publishing on their profile, and it's still optional, but it's worth being upfront about. Anyone uncomfortable with that can simply leave the orcid: field out of their entry.

What to click on the preview site

  • / — scroll past the patterns listing to "Stay in the loop"; click the RSS feed link.
  • /feed_rss_created.xml directly — this is the raw feed (XML). It's not designed for humans, but a quick check that it loads and that author names appear is reassuring.
  • /authors/ciara-flanagan/ — scroll to the bottom; the subscribe line is below the patterns list.
  • Open any pattern page in Firefox or a feed-reader app — the autodiscovery link means the reader will find the feed without you having to copy a URL.
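
If reading raw XML by eye is tedious, the check that the feed loads and that author names appear can be scripted. This is a minimal sketch, assuming the feed is RSS 2.0 with authors carried in Dublin Core `<dc:creator>` elements. The `summarize_feed` helper and the sample feed are hypothetical and stand in for the real file.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, in ElementTree's {uri}tag notation.
DC = "{http://purl.org/dc/elements/1.1/}"


def summarize_feed(xml_text):
    """Return a (title, [creators]) tuple for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)  # raises ParseError if the feed is not well-formed
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        creators = [c.text or "" for c in item.findall(DC + "creator")]
        rows.append((title, creators))
    return rows


# Hypothetical feed content, for illustration only.
SAMPLE_FEED = """
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Patterns</title>
    <item>
      <title>Example pattern</title>
      <link>https://example.org/patterns/example/</link>
      <dc:creator>Example Author</dc:creator>
    </item>
  </channel>
</rss>
"""

for title, creators in summarize_feed(SAMPLE_FEED):
    print(title, "by", ", ".join(creators) or "(no authors listed)")
```

To run it against the preview site, fetch /feed_rss_created.xml (for example with `urllib.request.urlopen`) and pass the response body to `summarize_feed`.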

Stacking

This PR stacks on top of #142, which stacks on top of #141. Please merge #141 first, then #142, then this one — once each is in, the next rebases automatically onto main.

Follow-ups (already filed as separate issues)

A pre-merge review surfaced three useful improvements that are bigger than this PR and are tracked separately so they don't block this one merging:

🤖 Generated with Claude Code

Tighten .markdownlint.yaml to require `style: dash` for the
markdownlint MD004 (ul-style) rule, then run `npm run lint-fix`
to normalize all existing patterns to use `-` as the unordered
list marker (previously a mix of `-`, `*`, and `+`).

Pure auto-fix — no content changes, no whitespace changes beyond
the bullet character itself. Confirmed `npm run lint` passes
afterwards.

Prep for the upcoming bulk authors migration; consistent bullet
markers make the Contributors-section parser much simpler.

Hand-fix four patterns whose Contributors sections didn't fit the
standard "bullet list of authors followed by optional prose" shape,
so the upcoming bulk migration can parse them consistently.

- event-tabling.md, ospo-student-ambassador-program.md: contributor
  entries were paragraph-form rather than a bulleted list. Convert
  each line to a `-` bullet. Also strip the `my-orcid?orcid=` query
  prefix from the Nouha Elyazidi ORCID URL so it matches the
  canonical `https://orcid.org/<id>` form.
- source-industry-mentors-for-the-icorps-program.md: the Zach Chandler
  line had a stray `](<https://orcid.org/...>` fragment left over
  from a half-applied markdown link. Remove the trailing junk so
  only the canonical ORCID URL remains.
- template-for-1-1-campus-consultations.md: a bullet read
  "Duane O'Brien consulted on the design of the CMU OSPO project
  consultation template", which is acknowledgement prose rather than
  a structured contributor entry. Move it out of the bullet list and
  into a "Thanks to Duane O'Brien…" sentence below the list.
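
The two ORCID URL clean-ups above were done by hand, but the same normalization can be sketched in a few lines: find the iD wherever it sits in the string and rebuild the canonical URL around it. The `canonical_orcid_url` helper is hypothetical, and the iD used below is ORCID's documented sample iD, a placeholder rather than any contributor's real one.

```python
import re

# An ORCID iD is four hyphen-separated groups of four characters;
# the final character is a check digit and may be "X".
ORCID_ID = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def canonical_orcid_url(raw):
    """Return https://orcid.org/<id> for the first ORCID iD found in raw, else None."""
    match = ORCID_ID.search(raw)
    if match is None:
        return None
    return f"https://orcid.org/{match.group(0)}"


# Placeholder iD in the two malformed shapes described above.
print(canonical_orcid_url("https://orcid.org/my-orcid?orcid=0000-0002-1825-0097"))
print(canonical_orcid_url("https://orcid.org/0000-0002-1825-0097](<https://orcid.org/0000-0002-1825-0097>"))
```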

No semantic changes — same people credited, same ORCIDs, same prose.

Convert all remaining patterns' Contributors sections to the
authors.yml + frontmatter model introduced in #141. Adds 28 new
authors to authors.yml (the 9 seeded in #141 stay unchanged) and
populates the `authors:` frontmatter on 41 patterns.

Migration was driven by a one-off Python script (not committed)
that:

- Parses each pattern's `## Contributors & Acknowledgement` /
  `Acknowledgements` section, skipping any "in alphabetical order"
  preamble.
- Extracts (name, affiliation, orcid) from each bullet using
  regex-based ORCID detection and a name/affiliation parser that
  handles both `Name, Affiliation, <orcid>` and `Name (Affiliation),
  <orcid>` shapes.
- Strips academic titles (Dr./Prof./etc.) before slug generation
  and before name-based deduplication, so "Dr. Angela Newell" maps
  to the existing `angela-newell` slug instead of creating a
  duplicate.
- Folds diacritics for slug generation via unicodedata.normalize
  (NFKD + combining-character strip), so "David Pérez-Suárez"
  becomes `david-perez-suarez` rather than `david-p-rez-su-rez`.
- Dedupes by ORCID first, then by case-insensitive name; backfills
  missing ORCID or affiliation on existing records when a later
  pattern provides one.
- Preserves any trailing prose under the Contributors heading
  (e.g. AI-use disclaimers, "Special thanks to X for ..." notes).
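
Since the migration script itself is not committed, here is a rough
reconstruction of the title-stripping, slug-generation, and dedup steps
listed above. It is a sketch under assumptions, not the script that was
run: the function names, the title list, and the record layout
(`name`, `affiliation`, `orcid` keys) are all hypothetical.

```python
import re
import unicodedata

# Leading academic titles to drop; the exact list is an assumption.
TITLES = re.compile(r"^(?:(?:dr|prof|professor)\.?\s+)+", re.IGNORECASE)


def strip_titles(name):
    """Remove leading academic titles such as "Dr." or "Prof."."""
    return TITLES.sub("", name.strip())


def slugify(name):
    """Build a URL slug: strip titles, fold diacritics (NFKD), hyphenate."""
    decomposed = unicodedata.normalize("NFKD", strip_titles(name))
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", "-", folded.lower()).strip("-")


def merge_author(authors, name, affiliation=None, orcid=None):
    """Add or update a record in authors (slug -> dict) and return its slug.

    Matches by ORCID first, then by case-insensitive name, and backfills
    a missing ORCID or affiliation on the existing record.
    """
    clean = strip_titles(name)
    slug = None
    if orcid:
        slug = next((s for s, a in authors.items() if a["orcid"] == orcid), None)
    if slug is None:
        slug = next(
            (s for s, a in authors.items() if a["name"].casefold() == clean.casefold()),
            None,
        )
    if slug is None:
        slug = slugify(clean)
        authors[slug] = {"name": clean, "affiliation": affiliation, "orcid": orcid}
        return slug
    record = authors[slug]
    record["orcid"] = record["orcid"] or orcid
    record["affiliation"] = record["affiliation"] or affiliation
    return slug


print(slugify("David Pérez-Suárez"))  # david-perez-suarez
print(slugify("Dr. Angela Newell"))   # angela-newell

# Hypothetical person and placeholder ORCID iD, for illustration only.
authors = {}
merge_author(authors, "Example Person", affiliation="Example University")
merge_author(authors, "Dr. Example Person", orcid="0000-0002-1825-0097")
print(authors)
```

Note that the name fallback would conflate two different people who
share a name, which is one reason to match on ORCID first.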

Skipped (no Contributors section, intentionally left untouched):
framework-managing-university-oss.md, lunch-and-learn.md,
open-research-community-accelerator.md, open-source-catalog.md,
open-source-software-prize.md, open-source-survey.md,
oss-tutorials-using-authoring-tools.md, summer-internship-program.md,
integrating-oss-into-institutional-software-pathways.md.

Verified:
- `mkdocs build --strict --site-dir /tmp/patterns_site` clean.
- All 37 author pages generate; each lists the patterns the author
  contributed to.
- Preserved prose still renders on
  individual-consultations-office-hours.md (Duane O'Brien thanks),
  source-industry-mentors-for-the-icorps-program.md (Jeffrey Young
  thanks), project-rolodex.md (Megan Forbes thanks), and the
  several patterns with "A note on AI use" paragraphs.
- `npm run lint` passes (one MD012 stray blank line in
  project-rolodex.md auto-fixed by `lint-fix`).

The feed lists each pattern by first-publication date and credits every
author with a clickable link to their ORCID profile. Subscribers can
find it via the Material-emitted autodiscovery <link> tag, the new
"Stay in the loop" section on the home README, or the subscribe line
on every author page.

How it works:

- mkdocs-rss-plugin (pinned in .config/requirements.txt) generates the
  feed; filenames are left at the plugin defaults so Material's hard-
  coded autodiscovery URLs resolve correctly.
- .config/hooks/authors.py enriches each pattern's page metadata so the
  plugin sees real author names and an ORCID-linked attribution body.
  After build, the same hook post-processes the generated XML for spec
  compliance: <author> → <dc:creator>, item-level <description> wrapped
  in CDATA so the ORCID anchor tags render in feed readers, the self-
  referencing <source> element dropped, and unescaped categories
  (e.g. "Education & Skills") re-escaped to keep the feed well-formed.
- authors.yml entries are now validated at build time. Slugs, names,
  affiliations, and ORCIDs must match strict regexes; any HTML-tag
  character in a name or affiliation fails the build. This closes a
  stored-XSS hole that would otherwise let a malicious PR inject script
  via author metadata.
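
The build-time validation can be pictured roughly as below. This is an
illustrative sketch, not the code in .config/hooks/authors.py: the
regexes, the `validate_author` name, the treatment of `<` and `>` as the
"HTML-tag characters", and the assumption that `orcid` is stored as a
bare iD are all guesses made for the example.

```python
import re

# Assumed shapes; the hook's actual regexes may be stricter or looser.
SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")
HTML_TAG_CHARS = re.compile(r"[<>]")


def validate_author(slug, entry):
    """Return a list of problems for one authors.yml entry; empty means valid."""
    problems = []
    if not SLUG_RE.fullmatch(slug):
        problems.append(f"{slug!r}: slug must be lowercase words joined by hyphens")
    if not entry.get("name"):
        problems.append(f"{slug!r}: name is required")
    for field in ("name", "affiliation"):
        value = entry.get(field)
        if value and HTML_TAG_CHARS.search(value):
            problems.append(f"{slug!r}: {field} contains an HTML-tag character")
    orcid = entry.get("orcid")
    if orcid is not None and not ORCID_RE.fullmatch(orcid):
        problems.append(f"{slug!r}: orcid is not a well-formed ORCID iD")
    return problems


# Hypothetical entries, for illustration only.
good = {"name": "Example Author", "affiliation": "Example University",
        "orcid": "0000-0002-1825-0097"}
bad = {"name": "<script>alert(1)</script>"}
print(validate_author("example-author", good))
print(validate_author("example-author", bad))
```

In a real MkDocs hook, a non-empty problem list would be turned into an
exception so the build fails instead of publishing the entry.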

Follow-ups to file as separate GitHub issues:

1. Broader feed coverage — per-tag feeds, an Atom variant, and a JSON
   Feed mirror so modern readers (NetNewsWire, Feedbin) have a
   first-class option.

2. Scholarly identifiers — DOIs per pattern via Zenodo, ROR IDs for
   affiliations, and surfacing both in the feed metadata so scholarly
   graph crawlers (OpenAlex, Crossref) can ingest the catalogue.

3. Tests + UX iteration — unit tests for the authors hook (the
   pre-merge review surfaced concrete failure cases worth pinning),
   an RSS icon for the subscribe affordance, consistent wording across
   the three subscribe surfaces, and revisiting the per-author
   placement which currently sits below the patterns list.

Stacked on top of #141 (author pages) and #142 (ORCID iD icon).