Use the :abbr: role for BMP (Basic Multilingual Plane) #150673

Closed
serhiy-storchaka wants to merge 2 commits into python:main from serhiy-storchaka:docs-abbr-BMP

Conversation

@serhiy-storchaka
Member

The BMP abbreviation may not be well known to all users, so it is better to use the :abbr: role, which allows it to be explained.

@serhiy-storchaka serhiy-storchaka added the docs (Documentation in the Doc dir) label May 31, 2026
@serhiy-storchaka serhiy-storchaka added the skip news, needs backport to 3.13 (bugs and security fixes), needs backport to 3.14 (bugs and security fixes), and needs backport to 3.15 (pre-release feature fixes, bugs and security fixes) labels May 31, 2026
@github-project-automation github-project-automation Bot moved this to Todo in Docs PRs May 31, 2026
@read-the-docs-community
read-the-docs-community Bot commented May 31, 2026

Comment thread Doc/library/idle.rst Outdated

A Tk Text widget, and hence IDLE's Shell, displays characters (codepoints) in
- the BMP (Basic Multilingual Plane) subset of Unicode. Which characters are
+ the :abbr:`BMP (Basic Multilingual Plane)` subset of Unicode. Which characters are
Member

This is actually a regression for mobile users, since they can't easily access the now-hidden meaning. I suggest keeping it as it was.

Member

@StanFromIreland StanFromIreland left a comment

LGTM

Member

@hugovk hugovk left a comment

This is actually a regression for mobile users, since they can't easily access the now-hidden meaning. I suggest keeping it as it was.

Not only mobile users, but also users of other touch devices and of assistive technology such as screen readers.

If we're introducing the full form of an abbreviation for the benefit of all users who don't know it, we should do it in a way that's useable for all of them.

This RST:

:abbr:`BMP (Basic Multilingual Plane)`

gives this HTML:

<abbr title="Basic Multilingual Plane">BMP</abbr>

As we can see on this page, "Basic Multilingual Plane" is only visible in a tooltip when hovering with the mouse:

https://cpython-previews--150673.org.readthedocs.build/en/150673/whatsnew/3.16.html#xml

That's no use for mobile and screen readers.
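
A minimal Python sketch of why the expansion gets missed, assuming a reader or tool that only consumes an element's text content and ignores attributes. The `VisibleText` class and `split_visible_and_title` helper are hypothetical names for illustration, not part of Sphinx or CPython; only the standard-library `html.parser` module is used:

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text content separately from <abbr title="..."> values."""

    def __init__(self):
        super().__init__()
        self.text = []    # pieces of text that appear in the page body
        self.titles = []  # expansions that live only in title attributes

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":
            title = dict(attrs).get("title")
            if title is not None:
                self.titles.append(title)

    def handle_data(self, data):
        self.text.append(data)


def split_visible_and_title(html):
    """Return (text content, list of abbr title values) for an HTML snippet."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.text), parser.titles


snippet = '<abbr title="Basic Multilingual Plane">BMP</abbr>'
print(split_visible_and_title(snippet))
```

For the snippet above, the text content is just "BMP"; the expansion exists only as an attribute value, so anything that does not surface `title` never shows it.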

The MDN advice:

Accessibility

Spelling out the acronym or abbreviation in full the first time it is used on a page is beneficial for helping people understand it, especially if the content is technical or industry jargon.

Only include a title if expanding the abbreviation or acronym in the text is not possible. Having a difference between the announced word or phrase and what is displayed on the screen, especially if it's technical jargon the reader may not be familiar with, can be jarring.

They give this example:

<p>
  JavaScript Object Notation (<abbr>JSON</abbr>) is a lightweight
  data-interchange format.
</p>

Which renders like:

JavaScript Object Notation (JSON) is a lightweight data-interchange format.

See also https://adrianroselli.com/2024/01/using-abbr-element-with-title-attribute.html

Don’t use <abbr>. Also don’t use it with title. Exposure continues to be inconsistent across browsers and assistive technologies. Some set of users will always miss some piece of information.

Explain abbreviations, acronyms, initialisms, numeronyms, etc. on first use and then feel free to fall back to the shortened form.

The page also lists how many browsers fail to display title across different devices.

So I think we should ditch the inaccessible title. In RST:

Basic Multilingual Plane (:abbr:`BMP`)

gives this HTML:

Basic Multilingual Plane (<abbr>BMP</abbr>)

The browser gives "BMP" a dotted underline, and a question mark on hover but no "Basic Multilingual Plane" tooltip, which is a little unusual. But the plaintext "Basic Multilingual Plane" is there for everyone.

Or even better, leave out abbr altogether:

Basic Multilingual Plane (BMP)
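
The rewrite suggested above can be sketched in Python as a simple text substitution. This is a rough illustration under stated assumptions: the `ABBR_WITH_TITLE` pattern and `spell_out_abbr` helper are hypothetical names, the regex is a naive match and not how Sphinx parses roles, and it does not handle nested parentheses or roles split across lines:

```python
import re

# Matches :abbr:`SHORT (Long form)` and captures the short and long parts.
# Naive pattern for illustration only; Sphinx's own role parsing is not used.
ABBR_WITH_TITLE = re.compile(r":abbr:`([^`(]+?)\s*\(([^`)]+)\)`")


def spell_out_abbr(rst: str) -> str:
    """Rewrite :abbr:`X (Y)` as plain text 'Y (X)'; leave other text alone."""
    return ABBR_WITH_TITLE.sub(lambda m: f"{m.group(2)} ({m.group(1)})", rst)


line = "in the :abbr:`BMP (Basic Multilingual Plane)`."
print(spell_out_abbr(line))  # prints: in the Basic Multilingual Plane (BMP).
```

A title-less role such as :abbr:`BMP` does not match the pattern and is left unchanged, so any real clean-up would still need a manual pass.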

Comment thread Doc/library/pyexpat.rst
falls back to Python.
It supports most of 8-bit encodings and many multi-byte encodings
- like Shift_JIS, although only BMP characters (``U+0000-U+FFFF``)
+ like Shift_JIS, although only the :abbr:`BMP (Basic Multilingual Plane)`
Member

Suggested change
- like Shift_JIS, although only the :abbr:`BMP (Basic Multilingual Plane)`
+ like Shift_JIS, although only the Basic Multilingual Plane (BMP)

Comment thread Doc/whatsnew/3.16.rst
<xml.parsers.expat>`: "cp932", "cp949", "cp950", "Big5","EUC-JP",
"GB2312", "GBK", "johab", and "Shift_JIS".
- Add partial support (only BMP characters) for multi-byte encodings
+ Add partial support (only the :abbr:`BMP (Basic Multilingual Plane)`
Member

Suggested change
- Add partial support (only the :abbr:`BMP (Basic Multilingual Plane)`
+ Add partial support (only the Basic Multilingual Plane (BMP)

Comment thread Doc/whatsnew/3.3.rst
* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point;

- * BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point;
+ * :abbr:`BMP (Basic Multilingual Plane)` strings (``U+0000-U+FFFF``) use
Member

Suggested change
- * :abbr:`BMP (Basic Multilingual Plane)` strings (``U+0000-U+FFFF``) use
+ * Basic Multilingual Plane (BMP) strings (``U+0000-U+FFFF``) use

Comment thread Doc/whatsnew/3.8.rst
OS native encoding is now used for converting between Python strings and Tcl
- objects. This allows IDLE to work with emoji and other non-BMP characters.
+ objects. This allows IDLE to work with emoji and other characters that are not
+ in the :abbr:`BMP (Basic Multilingual Plane)`.
Member

Suggested change
- in the :abbr:`BMP (Basic Multilingual Plane)`.
+ in the Basic Multilingual Plane (BMP).
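
For context on what "characters that are not in the BMP" means in these passages, a minimal Python sketch, assuming the BMP is the code point range U+0000 through U+FFFF quoted in the diffs above. `is_bmp` is a hypothetical helper name, not a CPython API:

```python
def is_bmp(text: str) -> bool:
    """Return True if every code point in text lies in U+0000..U+FFFF."""
    return all(ord(ch) <= 0xFFFF for ch in text)


print(is_bmp("IDLE"))        # ASCII letters, inside the BMP
print(is_bmp("\u20ac"))      # U+20AC EURO SIGN, inside the BMP
print(is_bmp("\U0001F600"))  # U+1F600, an emoji above U+FFFF
```

The first two calls report code points inside the range; the emoji at U+1F600 falls outside it, which is the kind of character the 3.8 entry describes.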

@bedevere-app
bedevere-app Bot commented Jun 4, 2026

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@StanFromIreland
Member

@hugovk, I encourage you to put that write-up in the devguide for future reference; we don't have any docs for it currently.

@hugovk
Member

hugovk commented Jun 4, 2026

@hugovk, I encourage you to put that write-up in the devguide for future reference; we don't have any docs for it currently.

Issue for now: python/devguide#1824.

@serhiy-storchaka
Member Author

Well, then I withdraw my PR.

@serhiy-storchaka serhiy-storchaka deleted the docs-abbr-BMP branch June 4, 2026 16:02
Labels

awaiting changes, docs (Documentation in the Doc dir), needs backport to 3.13 (bugs and security fixes), needs backport to 3.14 (bugs and security fixes), needs backport to 3.15 (pre-release feature fixes, bugs and security fixes), skip issue, skip news

Projects

Status: Todo

3 participants