We intend PEPs to be the primary mechanisms for proposing new features, for collecting community input on an issue, and for documenting the design decisions that have gone into Python. The PEP author is responsible for building consensus within the community and documenting dissenting opinions.
Read the rest of PEP 1 for the details of the PEP editorial process, style, and format. PEPs are kept in the Python CVS tree on SourceForge, though they're not part of the Python 2.0 distribution, and are also available in HTML form from http://www.python.org/peps/. As of September 2000, there are 25 PEPs, ranging from PEP 201, "Lockstep Iteration", to PEP 225, "Elementwise/Objectwise Operators".
Unicode
=======
The largest new feature in Python 2.0 is a new fundamental data type: Unicode strings. Unicode strings use 16-bit numbers to represent characters instead of the 8-bit numbers used by regular strings, meaning that 65,536 distinct characters can be supported.
The final interface for Unicode support was arrived at through countless often-stormy discussions on the python-dev mailing list, and was mostly implemented by Marc-André Lemburg, based on a Unicode string type implementation by Fredrik Lundh. A detailed explanation of the interface was written up as :pep:`100`, "Python Unicode Integration". This article will simply cover the most significant points about the Unicode interfaces.
In Python source code, Unicode strings are written as ``u"string"``. Arbitrary Unicode characters can be written using a new escape sequence, ``\uHHHH``, where *HHHH* is a 4-digit hexadecimal number from 0000 to FFFF. The existing ``\xHH`` escape sequence can also be used, and octal escapes can be used for characters up to U+01FF, which is represented by ``\777``.
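As a brief illustration of these escapes, the sketch below uses a few sample characters chosen here, not taken from the text. It is written so that it also runs on Python 3, where the ``u`` prefix is accepted but optional. The octal case is checked through ``ord()`` rather than a literal ``\777``, on the assumption that recent interpreters warn about octal escapes above ``\377``.

```python
# Sample characters are illustrative choices, not taken from the text.
euro = u"\u20ac"          # \uHHHH: four hex digits, here U+20AC (EURO SIGN)
e_acute = u"\xe9"         # \xHH: two hex digits, here U+00E9
same_e_acute = u"\u00e9"  # the same character written with \uHHHH

assert len(euro) == 1
assert ord(euro) == 0x20AC
assert e_acute == same_e_acute

# The largest three-digit octal escape, \777, corresponds to U+01FF.
assert 0o777 == 0x01FF
assert ord(u"\u01ff") == 0o777
```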
Unicode strings, just like regular strings, are an immutable sequence type. They can be indexed and sliced, but not modified in place. Unicode strings have an ``encode( [encoding] )`` method that returns an 8-bit string in the desired encoding. Encodings are named by strings, such as ``'ascii'``, ``'utf-8'``, ``'iso-8859-1'``, or whatever. A codec API is defined for implementing and registering new encodings that are then available throughout a Python program. If an encoding isn't specified, the default encoding is usually 7-bit ASCII, though it can be changed for your Python installation by calling the :func:`sys.setdefaultencoding(encoding)` function in a customised version of :file:`site.py`.
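The behaviour described above can be sketched as follows. The sample string is an arbitrary choice, and the encoding is always passed explicitly because the default differs between Python versions (Python 3, for instance, defaults to UTF-8 rather than ASCII). The example is written to run on Python 3 as well, where ``encode()`` returns a ``bytes`` object.

```python
text = u"caf\xe9"  # sample string chosen for illustration

# Indexing and slicing work as for any sequence type.
first = text[0]
prefix = text[:3]

# In-place modification is rejected because strings are immutable.
try:
    text[0] = u"C"
    modified = True
except TypeError:
    modified = False

# encode() returns the 8-bit form in the named encoding.
as_utf8 = text.encode("utf-8")         # U+00E9 becomes two bytes
as_latin1 = text.encode("iso-8859-1")  # U+00E9 becomes one byte

# 7-bit ASCII cannot represent U+00E9, so this encoding fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

Choosing the encoding explicitly, as here, avoids depending on whatever default the installation happens to use.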
Combining 8-bit and Unicode strings always coerces to Unicode, using the default ASCII encoding; the result of ``'a' + u'bc'`` is ``u'abc'``.
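The coercion rule can be approximated by making the implicit decoding step explicit. The helper ``coerce_concat`` below is a hypothetical name introduced for illustration, not a Python function; it decodes the 8-bit operand with ASCII before concatenating. The explicit form is used so that the sketch also runs on Python 3, where mixing ``bytes`` and ``str`` directly raises ``TypeError``.

```python
def coerce_concat(byte_string, unicode_string, encoding="ascii"):
    """Hypothetical helper: decode the 8-bit operand, then concatenate."""
    return byte_string.decode(encoding) + unicode_string


result = coerce_concat(b"a", u"bc")

# A byte outside the 7-bit ASCII range cannot be decoded, so coercion fails.
try:
    coerce_concat(b"\xe9", u"bc")
    non_ascii_ok = True
except UnicodeDecodeError:
    non_ascii_ok = False
```

The second call shows the practical consequence of an ASCII default: 8-bit strings containing bytes above 127 cannot be combined with Unicode strings unless they are decoded with a suitable encoding first.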
New built-in functions have been added, and existing built-ins modified to support Unicode:
``unichr(ch)`` returns a Unicode string 1 character long, containing the character *ch*.