gh-141444: Fix dead URLs in urllib documentation #148952

Closed
ZLeventer wants to merge 2 commits into python:main from ZLeventer:fix/dead-doc-urls

Conversation

@ZLeventer ZLeventer commented Apr 24, 2026

Summary

Replace dead/broken URLs in urllib documentation files:

  • Doc/library/urllib.request.rst: Replace musi-cal.com (returns 503) with python.org/search in the GET example, and replace requestb.in (returns 403) with httpbin.org/post in the POST example
  • Doc/library/urllib.robotparser.rst: Replace the bare http://www.robotstxt.org/orig.html URL with an :rfc:`9309` reference. RFC 9309 ("Robots Exclusion Protocol") is the authoritative IETF standard for robots.txt

Closes #141444
Closes #141412


📚 Documentation preview 📚: https://cpython-previews--148952.org.readthedocs.build/

Replace dead/broken URLs in urllib documentation:

- urllib.request.rst: Replace musi-cal.com (503) with python.org/search
  for GET example
- urllib.request.rst: Replace requestb.in (403) with httpbin.org/post
  for POST example
- urllib.robotparser.rst: Replace robotstxt.org/orig.html with RFC 9309
  reference, which is the authoritative IETF standard for robots.txt
@ZLeventer ZLeventer requested a review from berkerpeksag as a code owner April 24, 2026 11:13

python-cla-bot Bot commented Apr 24, 2026

All commit authors signed the Contributor License Agreement.


@bedevere-app bedevere-app Bot added the awaiting review, docs (Documentation in the Doc dir), and skip news labels Apr 24, 2026
@github-project-automation github-project-automation Bot moved this to Todo in Docs PRs Apr 24, 2026
@ZLeventer ZLeventer changed the title Fix dead URLs in urllib documentation gh-141444: Fix dead URLs in urllib documentation Apr 24, 2026
Comment thread Doc/library/urllib.robotparser.rst Outdated
 questions about whether or not a particular user agent can fetch a URL on the
 website that published the :file:`robots.txt` file. For more details on the
-structure of :file:`robots.txt` files, see http://www.robotstxt.org/orig.html.
+structure of :file:`robots.txt` files, see :rfc:`9309`.
Member

This is a separate issue, please revert.

Author

Reverted — I'll open a separate PR for that.

Member

No, don't. We don't support that RFC yet. See #138907.
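
For context on the paragraph under review, here is a minimal sketch of how urllib.robotparser answers "can this user agent fetch this URL?". The robots.txt content, the ExampleBot user agent, and the build_parser helper are assumptions made up for illustration; they do not come from this PR. The sketch sticks to basic directives and makes no claim about RFC 9309 behaviour.

```python
import urllib.robotparser

# Hypothetical robots.txt content, invented for this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(text):
    """Return a RobotFileParser loaded from a string instead of a URL."""
    parser = urllib.robotparser.RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
print(parser.can_fetch("ExampleBot", "https://www.example.com/index.html"))    # True
print(parser.can_fetch("ExampleBot", "https://www.example.com/private/data"))  # False
print(parser.crawl_delay("ExampleBot"))                                        # 5
```

Feeding the rules through parse() rather than set_url() and read() keeps the example independent of any live website, which is the same concern this PR raises about dead URLs.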

Comment thread Doc/library/urllib.request.rst Outdated
 >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
 >>> data = data.encode('ascii')
->>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
+>>> with urllib.request.urlopen("https://httpbin.org/post", data) as f:
Member

How trusted is that URL? Is it the URL that is used by testing in general? Is there some "example.net" alternative?

Author

Good call — switched to example.com which is IANA-reserved for documentation per RFC 2606.

Member

The question is whether example.com supports a POST endpoint. It doesn't make sense to use it if it's not the case. And please, don't use LLMs. We don't accept PRs generated by them. See https://devguide.python.org/getting-started/generative-ai/.
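
Whether a given host accepts POST is a property of that host, so one way to exercise the documented POST pattern without trusting any external service is to run it against a throwaway local server. The following is a minimal sketch under that assumption: EchoHandler, post_form, and run_demo are hypothetical names invented here, and none of this is part of the PR or of the CPython documentation.

```python
import json
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an echo endpoint; illustration only."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Decode the form fields and send them back as JSON.
        fields = urllib.parse.parse_qs(body.decode("ascii"))
        payload = json.dumps(fields).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep output quiet


def post_form(url, fields):
    """POST url-encoded fields, as in the urllib.request docs example."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    # An empty ProxyHandler bypasses any proxy configured in the environment,
    # so the request really goes to the loopback server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, data) as f:
        return f.status, json.loads(f.read().decode("utf-8"))


def run_demo():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return post_form(f"http://127.0.0.1:{port}/post",
                         {"spam": 1, "eggs": 2, "bacon": 0})
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Supplying a data argument is what turns the request into a POST; the local handler only confirms that the encoded form fields arrive intact.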

bedevere-app Bot commented Apr 25, 2026

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Revert the robotparser.rst change (will open a separate PR).
Switch urllib.request.rst examples to example.com, which is
IANA-reserved for documentation per RFC 2606.
@ZLeventer
Author

Thanks for the review @picnixz. You're right to flag the AI concern — I used Claude Code as an assistant while working on this, and the RFC 2606 mention was a giveaway that I leaned on it too heavily for the reply rather than checking CPython's own stance first. I appreciate the pointer to the devguide policy.

On the substance: I see #144863 already covers these same two URL replacements (and with better choices — python.org for GET, httpbin.org/post for POST). No reason to duplicate that work, so I'll close this in favor of that PR.

@ZLeventer ZLeventer closed this Apr 25, 2026

Labels

awaiting changes, docs (Documentation in the Doc dir), skip news

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

Dead example URL in urlib.robotparser documentation
URL for tutorial urlopen example ~is dead~ doesn't want to go on the cart

2 participants