Use the :abbr: role for BMP (Basic Multilingual Plane) #150673
serhiy-storchaka wants to merge 2 commits into
| A Tk Text widget, and hence IDLE's Shell, displays characters (codepoints) in | ||
| the BMP (Basic Multilingual Plane) subset of Unicode. Which characters are | ||
| the :abbr:`BMP (Basic Multilingual Plane)` subset of Unicode. Which characters are |
This is actually a regression for mobile users, since they can't easily access the now-hidden meaning. I suggest keeping it as it was.
hugovk left a comment
This is actually a regression for mobile users, since they can't easily access the now-hidden meaning. I suggest keeping it as it was.
Not only mobile users, but users of assistive technology such as screen readers and other touch devices.
If we're introducing the full form of an abbreviation for the benefit of all users who don't know it, we should do it in a way that's useable for all of them.
This RST:
:abbr:`BMP (Basic Multilingual Plane)`
gives this HTML:
<abbr title="Basic Multilingual Plane">BMP</abbr>
As we can see on this page, the "Basic Multilingual Plane" is only visible in a tooltip when hovering with the mouse:
https://cpython-previews--150673.org.readthedocs.build/en/150673/whatsnew/3.16.html#xml
That's no use for mobile and screen readers.
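To make the mapping above concrete, here is a minimal Python sketch of how an "ABBR (explanation)" role body could be turned into that HTML. The names `render_abbr` and `visible_text` are hypothetical helpers written for this illustration; this is not Sphinx's actual implementation, only an approximation of the behaviour described in this comment.

```python
import html
import re

# A trailing "(...)" is treated as the explanation, as in ABBR (explanation).
_EXPLANATION = re.compile(r"\s*\((.*)\)$", re.DOTALL)


def render_abbr(role_text: str) -> str:
    """Render the body of an :abbr: role as an <abbr> element (illustrative only)."""
    match = _EXPLANATION.search(role_text)
    if match is None:
        # No explanation given: emit a bare <abbr> without a title attribute.
        return f"<abbr>{html.escape(role_text)}</abbr>"
    abbr = role_text[:match.start()].strip()
    title = html.escape(match.group(1), quote=True)
    return f'<abbr title="{title}">{html.escape(abbr)}</abbr>'


def visible_text(markup: str) -> str:
    """Strip tags to approximate what is shown when no tooltip is available."""
    return re.sub(r"<[^>]+>", "", markup)


rendered = render_abbr("BMP (Basic Multilingual Plane)")
print(rendered)
print(visible_text(rendered))
```

The second print shows the concern: once the expansion lives only in the title attribute, the visible text is just the abbreviation.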
The MDN advice:
Accessibility
Spelling out the acronym or abbreviation in full the first time it is used on a page is beneficial for helping people understand it, especially if the content is technical or industry jargon.
Only include a title if expanding the abbreviation or acronym in the text is not possible. Having a difference between the announced word or phrase and what is displayed on the screen, especially if it's technical jargon the reader may not be familiar with, can be jarring.
They give this example:
<p>
JavaScript Object Notation (<abbr>JSON</abbr>) is a lightweight
data-interchange format.
</p>
Which renders like:
JavaScript Object Notation (JSON) is a lightweight data-interchange format.
See also https://adrianroselli.com/2024/01/using-abbr-element-with-title-attribute.html
Don’t use <abbr>. Also don’t use it with title. Exposure continues to be inconsistent across browsers and assistive technologies. Some set of users will always miss some piece of information.
Explain abbreviations, acronyms, initialisms, numeronyms, etc. on first use and then feel free to fall back to the shortened form.
The page also lists how many browsers fail to display title across different devices.
So I think we should ditch the inaccessible title. In RST:
Basic Multilingual Plane (:abbr:`BMP`)
gives this HTML:
Basic Multilingual Plane (<abbr>BMP</abbr>)
The browser gives "BMP" a dotted underline, and a question mark on hover but no "Basic Multilingual Plane" tooltip, which is a little unusual. But the plaintext "Basic Multilingual Plane" is there for everyone.
Or even better, leave out abbr altogether:
Basic Multilingual Plane (BMP)
| falls back to Python. | ||
| It supports most of 8-bit encodings and many multi-byte encodings | ||
| like Shift_JIS, although only BMP characters (``U+0000-U+FFFF``) | ||
| like Shift_JIS, although only the :abbr:`BMP (Basic Multilingual Plane)` |
Suggested change:
| like Shift_JIS, although only the :abbr:`BMP (Basic Multilingual Plane)` | |
| like Shift_JIS, although only the Basic Multilingual Plane (BMP) |
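As a small illustration of the ``U+0000-U+FFFF`` range mentioned in this hunk, the following Python sketch checks whether every code point of a string lies in the Basic Multilingual Plane. `BMP_MAX`, `is_bmp` and `utf16_code_units` are hypothetical names chosen for this example; they are not part of the changed documentation or of any library.

```python
BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def is_bmp(text: str) -> bool:
    """Return True if every code point in text is within U+0000-U+FFFF."""
    return all(ord(ch) <= BMP_MAX for ch in text)


def utf16_code_units(text: str) -> int:
    """Count UTF-16 code units; code points outside the BMP need two each."""
    return len(text.encode("utf-16-le")) // 2


for sample in ("Shift_JIS", "\u3042", "\U0001F600"):
    print(repr(sample), is_bmp(sample), utf16_code_units(sample))
```

Code points outside the BMP are the ones that need a surrogate pair in UTF-16, which is why the second function returns 2 for a single emoji.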
| <xml.parsers.expat>`: "cp932", "cp949", "cp950", "Big5","EUC-JP", | ||
| "GB2312", "GBK", "johab", and "Shift_JIS". | ||
| Add partial support (only BMP characters) for multi-byte encodings | ||
| Add partial support (only the :abbr:`BMP (Basic Multilingual Plane)` |
Suggested change:
| Add partial support (only the :abbr:`BMP (Basic Multilingual Plane)` | |
| Add partial support (only the Basic Multilingual Plane (BMP) |
| * pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point; | ||
| * BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point; | ||
| * :abbr:`BMP (Basic Multilingual Plane)` strings (``U+0000-U+FFFF``) use |
Suggested change:
| * :abbr:`BMP (Basic Multilingual Plane)` strings (``U+0000-U+FFFF``) use | |
| * Basic Multilingual Plane (BMP) strings (``U+0000-U+FFFF``) use |
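The per-code-point sizes listed in this hunk can be observed from Python itself. The sketch below is a rough measurement that assumes CPython, where `sys.getsizeof` reflects the compact string storage; `bytes_per_code_point` is a hypothetical helper, and other implementations may report different numbers.

```python
import sys


def bytes_per_code_point(ch: str, n: int = 1000) -> int:
    """Estimate storage per code point by comparing two strings of different length.

    The fixed object header cancels out in the subtraction, leaving only
    the cost of the n extra code points.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


# Latin-1, BMP and non-BMP samples respectively.
for label, ch in (("Latin-1", "\u00e9"), ("BMP", "\u20ac"), ("non-BMP", "\U0001F600")):
    print(label, bytes_per_code_point(ch))
```

On CPython this is expected to report 1, 2 and 4 bytes, matching the 1-byte and 2-byte cases in the bullets above plus the 4-byte case for code points outside the BMP.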
| OS native encoding is now used for converting between Python strings and Tcl | ||
| objects. This allows IDLE to work with emoji and other non-BMP characters. | ||
| objects. This allows IDLE to work with emoji and other characters that are not | ||
| in the :abbr:`BMP (Basic Multilingual Plane)`. |
Suggested change:
| in the :abbr:`BMP (Basic Multilingual Plane)`. | |
| in the Basic Multilingual Plane (BMP). |
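The suggestions in this review all follow the same pattern, so they could in principle be applied mechanically. The following Python sketch rewrites an :abbr: role that carries an explanation into the spelled-out form; `ABBR_ROLE` and `spell_out` are hypothetical names, and the regular expression only covers the simple single-line case seen in this PR.

```python
import re

# Matches :abbr:`ABBR (explanation)`; roles without an explanation are left alone.
ABBR_ROLE = re.compile(r":abbr:`([^`(]+?)\s*\(([^`]+)\)`")


def spell_out(rst: str) -> str:
    """Rewrite :abbr:`X (Y)` as the plain text "Y (X)"."""
    return ABBR_ROLE.sub(lambda m: f"{m.group(2)} ({m.group(1)})", rst)


line = "objects in the :abbr:`BMP (Basic Multilingual Plane)`."
print(spell_out(line))
```

Any automated rewrite like this would still need a manual pass, since the surrounding sentence sometimes has to change as well (for example, adding or dropping "the").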
@hugovk, I encourage you to put that write-up in the devguide for future reference; we don't have any docs for it currently.
Issue for now: python/devguide#1824.
Well, then I withdraw my PR.
The BMP abbreviation may not be well known to all users, so it is better to use the :abbr: role, which allows explaining it.