python · serhiy-storchaka · May 31, 2026 · StanFromIreland · May 31, 2026
@@ -864,7 +864,7 @@ A Windows console, for instance, keeps a user-settable 1 to 9999 lines,
 with 300 the default.
 
 A Tk Text widget, and hence IDLE's Shell, displays characters (codepoints) in
-the BMP (Basic Multilingual Plane) subset of Unicode.  Which characters are
+the :abbr:`BMP (Basic Multilingual Plane)` subset of Unicode.  Which characters are
 displayed with a proper glyph and which with a replacement box depends on the
 operating system and installed fonts.  Tab characters cause the following text
 to begin after the next tab stop. (They occur every 8 'characters').  Newline

diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst
@@ -76,7 +76,8 @@ The :mod:`!xml.parsers.expat` module contains two functions:
       For other encodings (including aliases like Latin1 and ASCII) it
       falls back to Python.
       It supports most of 8-bit encodings and many multi-byte encodings
-      like Shift_JIS, although only BMP characters (``U+0000-U+FFFF``)
+      like Shift_JIS, although only the :abbr:`BMP (Basic Multilingual Plane)`
+      characters (``U+0000`` through ``U+FFFF``)
       are supported with non-native encodings (this restriction is also
       applied to aliases like UTF8).
       These restrictions only apply if *encoding* is not given.

@@ -115,7 +115,8 @@ xml
 * Add support for multiple multi-byte encodings in the :mod:`XML parser
   <xml.parsers.expat>`: "cp932", "cp949", "cp950", "Big5","EUC-JP",
   "GB2312", "GBK", "johab", and "Shift_JIS".
-  Add partial support (only BMP characters) for multi-byte encodings
+  Add partial support (only the :abbr:`BMP (Basic Multilingual Plane)`
+  characters) for multi-byte encodings
   "Big5-HKSCS", "EUC_JIS-2004", "EUC_JISX0213", "Shift_JIS-2004",
   "Shift_JISX0213", "utf-8-sig" and non-standard aliases like "UTF8"
   (without hyphen).

@@ -262,7 +262,8 @@ The storage of Unicode strings now depends on the highest code point in the stri
 
 * pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point;
 
-* BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point;
+* :abbr:`BMP (Basic Multilingual Plane)` strings (``U+0000-U+FFFF``) use
+  2 bytes per code point;
 
 * non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per code point.
 

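Review note (not part of the patch): the three storage classes listed in the hunk above can be illustrated with a small sketch. The helper name ``bytes_per_code_point`` is hypothetical. It only classifies a string by its highest code point, following the ranges quoted in the list, and does not inspect CPython's actual internal representation.

```python
def bytes_per_code_point(s: str) -> int:
    """Return the per-code-point width implied by the ranges in the list above.

    Hypothetical helper for illustration only: it mirrors the documented
    ranges and does not look at the interpreter's real string storage.
    """
    highest = max(map(ord, s), default=0)  # an empty string has no code points
    if highest <= 0xFF:      # pure ASCII and Latin1
        return 1
    if highest <= 0xFFFF:    # BMP (Basic Multilingual Plane)
        return 2
    return 4                 # non-BMP, U+10000 through U+10FFFF
```

For instance, a string containing the euro sign (``U+20AC``) falls into the 2-byte class, while one containing an emoji such as ``U+1F600`` falls into the 4-byte class.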
@@ -418,7 +418,8 @@ Some smaller changes made to the core Python language are:
 * All the UTF-\* codecs (except UTF-7) now reject surrogates during both
   encoding and decoding unless the ``surrogatepass`` error handler is used,
   with the exception of the UTF-16 decoder (which accepts valid surrogate pairs)
-  and the UTF-16 encoder (which produces them while encoding non-BMP characters).
+  and the UTF-16 encoder (which produces them while encoding characters that
+  are not in the :abbr:`BMP (Basic Multilingual Plane)`).
   (Contributed by Victor Stinner, Kang-Hao (Kenny) Lu and Serhiy Storchaka in
   :issue:`12892`.)
 

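Review note (not part of the patch): the behaviour described in the hunk above can be checked with a short sketch. The helper name ``utf16_code_units`` is hypothetical; the codec names and the ``surrogatepass`` error handler are the standard ones. It assumes Python 3.9 or later for the ``list[int]`` annotation.

```python
def utf16_code_units(text: str, errors: str = "strict") -> list[int]:
    """Encode *text* as big-endian UTF-16 and return its 16-bit code units.

    Hypothetical helper for illustration only.
    """
    data = text.encode("utf-16-be", errors)
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# A BMP character needs a single code unit.
bmp_units = utf16_code_units("\u20ac")

# A character outside the BMP is encoded as a surrogate pair.
emoji_units = utf16_code_units("\U0001F600")

# A lone surrogate is rejected unless ``surrogatepass`` is requested.
try:
    utf16_code_units("\ud800")
    lone_surrogate_rejected = False
except UnicodeEncodeError:
    lone_surrogate_rejected = True

passed_units = utf16_code_units("\ud800", errors="surrogatepass")
```

Here ``emoji_units`` holds the high and low surrogates ``0xD83D`` and ``0xDE00``, which is the pair the UTF-16 encoder produces for ``U+1F600``.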
@@ -868,7 +868,8 @@ window are shown and hidden in the Options menu.
 (Contributed by Tal Einat and Saimadhav Heblikar in :issue:`17535`.)
 
 OS native encoding is now used for converting between Python strings and Tcl
-objects. This allows IDLE to work with emoji and other non-BMP characters.
+objects. This allows IDLE to work with emoji and other characters that are not
+in the :abbr:`BMP (Basic Multilingual Plane)`.
 These characters can be displayed or copied and pasted to or from the
 clipboard.  Converting strings from Tcl to Python and back now never fails.
 (Many people worked on this for eight years but the problem was finally

diff --git a/Misc/NEWS.d/next/Library/2026-05-14-17-01-19.gh-issue-62259.ytlFD5.rst b/Misc/NEWS.d/next/Library/2026-05-14-17-01-19.gh-issue-62259.ytlFD5.rst
@@ -1,6 +1,6 @@
 Add support for multiple multi-byte encodings in the :mod:`XML parser
 <xml.parsers.expat>`: "cp932", "cp949", "cp950", "Big5","EUC-JP", "GB2312",
-"GBK", "johab", and "Shift_JIS". Add partial support (only BMP characters)
+"GBK", "johab", and "Shift_JIS". Add partial support (only the BMP characters)
 for multi-byte encodings "Big5-HKSCS", "EUC_JIS-2004", "EUC_JISX0213",
 "Shift_JIS-2004", "Shift_JISX0213", "utf-8-sig" and non-standard aliases
 like "UTF8" (without hyphen). The parser now raises :exc:`ValueError` for