2 changes: 1 addition & 1 deletion Doc/library/idle.rst
@@ -864,7 +864,7 @@ A Windows console, for instance, keeps a user-settable 1 to 9999 lines,
with 300 the default.

A Tk Text widget, and hence IDLE's Shell, displays characters (codepoints) in
the BMP (Basic Multilingual Plane) subset of Unicode. Which characters are
the :abbr:`BMP (Basic Multilingual Plane)` subset of Unicode. Which characters are
Member

This is actually a regression for mobile users, since they can't easily access the now-hidden meaning. I suggest keeping it as it was.

displayed with a proper glyph and which with a replacement box depends on the
operating system and installed fonts. Tab characters cause the following text
to begin after the next tab stop. (They occur every 8 'characters'). Newline
3 changes: 2 additions & 1 deletion Doc/library/pyexpat.rst
@@ -76,7 +76,8 @@ The :mod:`!xml.parsers.expat` module contains two functions:
For other encodings (including aliases like Latin1 and ASCII) it
falls back to Python.
It supports most of 8-bit encodings and many multi-byte encodings
like Shift_JIS, although only BMP characters (``U+0000-U+FFFF``)
like Shift_JIS, although only the :abbr:`BMP (Basic Multilingual Plane)`
characters (U+0000 through U+FFFF)
are supported with non-native encodings (this restriction is also
applied to aliases like UTF8).
These restrictions only apply if *encoding* is not given.
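
To illustrate the BMP restriction described in this hunk, here is a minimal sketch that lists the characters of a string lying outside U+0000 through U+FFFF. The helper name `non_bmp_characters` is hypothetical and is not part of `xml.parsers.expat`; it only shows which characters would fall under the restriction.

```python
def non_bmp_characters(text: str) -> list[str]:
    """Return the characters of text that lie outside the BMP (above U+FFFF)."""
    # Hypothetical helper for illustration only, not an expat API.
    return [ch for ch in text if ord(ch) > 0xFFFF]


# 'A' and EURO SIGN are in the BMP; GRINNING FACE (U+1F600) is not.
sample = "A\u20ac\U0001F600"
outside = non_bmp_characters(sample)
```

A check like this could be run on document text before choosing an encoding that is only partially supported.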
3 changes: 2 additions & 1 deletion Doc/whatsnew/3.16.rst
@@ -115,7 +115,8 @@ xml
* Add support for multiple multi-byte encodings in the :mod:`XML parser
<xml.parsers.expat>`: "cp932", "cp949", "cp950", "Big5","EUC-JP",
"GB2312", "GBK", "johab", and "Shift_JIS".
Add partial support (only BMP characters) for multi-byte encodings
Add partial support (only the :abbr:`BMP (Basic Multilingual Plane)`
characters) for multi-byte encodings
"Big5-HKSCS", "EUC_JIS-2004", "EUC_JISX0213", "Shift_JIS-2004",
"Shift_JISX0213", "utf-8-sig" and non-standard aliases like "UTF8"
(without hyphen).
3 changes: 2 additions & 1 deletion Doc/whatsnew/3.3.rst
@@ -262,7 +262,8 @@ The storage of Unicode strings now depends on the highest code point in the stri

* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point;

* BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point;
* :abbr:`BMP (Basic Multilingual Plane)` strings (``U+0000-U+FFFF``) use
2 bytes per code point;

* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per code point.
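
The storage rule in the list above can be observed on CPython with a rough sketch like the following. It assumes CPython 3.3 or later, where `sys.getsizeof` reflects the flexible string representation; the helper name `bytes_per_code_point` is hypothetical. The fixed object header cancels out because two strings of the same kind are compared.

```python
import sys


def bytes_per_code_point(ch: str) -> int:
    """Estimate per-code-point storage for strings whose widest character is ch."""
    # Hypothetical helper: compare two strings that differ by 1000 code points,
    # so the constant header size cancels out (CPython-specific behaviour).
    short, long = ch * 10, ch * 1010
    return (sys.getsizeof(long) - sys.getsizeof(short)) // 1000


widths = {
    "ASCII": bytes_per_code_point("a"),
    "Latin1": bytes_per_code_point("\u00e9"),      # LATIN SMALL LETTER E WITH ACUTE
    "BMP": bytes_per_code_point("\u20ac"),         # EURO SIGN
    "non-BMP": bytes_per_code_point("\U0001F600"), # GRINNING FACE
}
```

Other Python implementations may store strings differently, so the measured widths are specific to CPython.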

3 changes: 2 additions & 1 deletion Doc/whatsnew/3.4.rst
@@ -418,7 +418,8 @@ Some smaller changes made to the core Python language are:
* All the UTF-\* codecs (except UTF-7) now reject surrogates during both
encoding and decoding unless the ``surrogatepass`` error handler is used,
with the exception of the UTF-16 decoder (which accepts valid surrogate pairs)
and the UTF-16 encoder (which produces them while encoding non-BMP characters).
and the UTF-16 encoder (which produces them while encoding characters that
are not in the :abbr:`BMP (Basic Multilingual Plane)`).
(Contributed by Victor Stinner, Kang-Hao (Kenny) Lu and Serhiy Storchaka in
:issue:`12892`.)
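
The surrogate behaviour described in this entry can be sketched with the standard codecs as follows. The example uses only `str.encode` and `bytes.decode`; the variable names are illustrative.

```python
# A character outside the BMP: GRINNING FACE, U+1F600.
ch = "\U0001F600"

# The UTF-16 encoder produces a surrogate pair (D83D DE00) for it,
# and the UTF-16 decoder accepts that valid pair.
utf16 = ch.encode("utf-16-be")
round_trip = utf16.decode("utf-16-be")

# A lone surrogate is rejected by the UTF-8 codec by default...
lone = "\ud83d"
try:
    lone.encode("utf-8")
    rejected = False
except UnicodeEncodeError:
    rejected = True

# ...unless the surrogatepass error handler is requested explicitly.
passed = lone.encode("utf-8", "surrogatepass")
```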

3 changes: 2 additions & 1 deletion Doc/whatsnew/3.8.rst
@@ -868,7 +868,8 @@ window are shown and hidden in the Options menu.
(Contributed by Tal Einat and Saimadhav Heblikar in :issue:`17535`.)

OS native encoding is now used for converting between Python strings and Tcl
objects. This allows IDLE to work with emoji and other non-BMP characters.
objects. This allows IDLE to work with emoji and other characters that are not
in the :abbr:`BMP (Basic Multilingual Plane)`.
These characters can be displayed or copied and pasted to or from the
clipboard. Converting strings from Tcl to Python and back now never fails.
(Many people worked on this for eight years but the problem was finally
@@ -1,6 +1,6 @@
Add support for multiple multi-byte encodings in the :mod:`XML parser
<xml.parsers.expat>`: "cp932", "cp949", "cp950", "Big5","EUC-JP", "GB2312",
"GBK", "johab", and "Shift_JIS". Add partial support (only BMP characters)
"GBK", "johab", and "Shift_JIS". Add partial support (only the BMP characters)
for multi-byte encodings "Big5-HKSCS", "EUC_JIS-2004", "EUC_JISX0213",
"Shift_JIS-2004", "Shift_JISX0213", "utf-8-sig" and non-standard aliases
like "UTF8" (without hyphen). The parser now raises :exc:`ValueError` for