2626Introduction
2727============
2828
29- This API is mainly for Terminal Emulator implementors -- any python program
30- that attempts to determine the printable width of a string on a Terminal. It
31- is implemented in python (no C library calls) and has no 3rd-party dependencies.
32-
33- It is certainly possible to use your Operating System's ``wcwidth(3) `` and
34- ``wcswidth(3) `` calls if it is POSIX-conforming, but this would not be possible
35- on non-POSIX platforms, such as Windows, or for alternative Python
36- implementations, such as jython. It is also commonly many releases older
37- than the most current Unicode Standard release files, which this project
38- aims to track.
39-
40- The most current release of this API is based from Unicode Standard release
41- *7.0.0 *, dated *2014-02-28, 23:15:00 GMT [KW, LI] * for table generated by
42- file ``EastAsianWidth-7.0.0.txt `` and *2014-02-07, 18:42:08 GMT [MD] * for
43- ``DerivedCombiningClass-7.0.0.txt ``.
29+ This API is mainly for Terminal Emulator implementors, or those writing
30+ programs that expect to be interpreted by a terminal emulator and wish to
31+ determine the printable width of a string on a Terminal.
4432
45- Installation
46- ------------
33+ Usually, the length of the string is equivalent to the number of cells
34+ it occupies, except that there are also some categories of characters
35+ which occupy 2 or even 0 cells. POSIX-conforming systems provide
36+ ``wcwidth(3)`` and ``wcswidth(3)``, which this module's interface mirrors
37+ precisely.
4738
48- The stable version of this package is maintained on pypi, install using pip::
39+ This library aims to be forward-looking, portable, and most correct. The most
40+ current release of this API is based on Unicode Standard release files:
4941
50- pip install wcwidth
42+ ``EastAsianWidth-8.0.0.txt ``
43+ *2015-02-10, 21:00:00 GMT [KW, LI] *
5144
52- Problem
53- -------
54-
55- You may have noticed some characters especially Chinese, Japanese, and
56- Korean (collectively known as the *CJK Unified Ideographs *) consume more
57- than 1 terminal cell. If you ask for the length of the string, ``u'コンニチハ' ``
58- (Japanese: Hello), it is correctly determined to be a length of **5 ** using
59- the ``len() `` built-in.
60-
61- However, if you were to print this to a Terminal Emulator, such as xterm,
62- urxvt, Terminal.app, PuTTY, or iTerm2, it would consume **10 ** *cells * (columns).
63- This causes problems for many of the text-alignment functions, such as ``rjust() ``.
64- On an 80-wide terminal, the following would wrap along the margin, instead
65- of displaying it right-aligned as desired::
66-
67- >>> text = u'コンニチハ'
68- >>> print(text.rjust(80))
69- コン
70- ニチハ
45+ ``DerivedGeneralCategory-8.0.0.txt ``
46+ *2015-02-13, 13:47:11 GMT [MD] *
7147
72- Solution
73- --------
48+ Installation
49+ ------------
7450
75- This API allows one to determine the printable length of these strings,
76- that the length of ``wcwidth(u'コ') `` is reported as ``2 ``, and
77- ``wcswidth(u'コンニチハ') `` as ``10 ``.
51+ The stable version of this package is maintained on PyPI; install it using pip::
7852
79- This allows one to determine the printable effects of displaying *CJK *
80- characters on a terminal emulator.
53+ pip install wcwidth
8154
8255wcwidth, wcswidth
8356-----------------
@@ -89,39 +62,45 @@ To Display ``u'コンニチハ'`` right-adjusted on screen of 80 columns::
8962 >>> from wcwidth import wcswidth
9063 >>> text = u'コンニチハ'
9164 >>> print(u' ' * (80 - wcswidth(text)) + text)
92- コンニチハ
9365
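The padding arithmetic above generalizes to a small helper. The sketch below is illustrative only: ``rjust_cells`` and ``stand_in_width`` are hypothetical names, not part of this library. ``stand_in_width`` is a rough, stdlib-only substitute for ``wcswidth`` that only knows about East Asian Wide and Full-width characters, so the example runs without installing anything.

```python
import unicodedata


def stand_in_width(text):
    # Hypothetical stand-in for wcwidth.wcswidth: counts 2 cells for
    # East Asian Wide (W) and Full-width (F) characters, 1 for all others.
    # It ignores zero-width and non-printable characters entirely.
    return sum(2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
               for ch in text)


def rjust_cells(text, columns, measure=stand_in_width):
    # Pad on the left by terminal cells rather than by len(text).
    width = measure(text)
    if width < 0:
        # -1 means the width is indeterminate; return the text unpadded.
        return text
    return u' ' * max(0, columns - width) + text


text = u'コンニチハ'
line = rjust_cells(text, 80)
```

With the library installed, passing ``measure=wcswidth`` is the intended use; the stand-in exists only to keep the sketch self-contained.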
66+ Return Values
67+ -------------
68+
69+ ``-1 ``
70+ Indeterminate (not printable).
9471
95- Values
96- ------
72+ `` 0 ``
73+ Does not advance the cursor, such as NULL or Combining.
9774
98- A general overview of return values:
75+ ``2 ``
76+ Characters of category East Asian Wide (W) or East Asian
77+ Full-width (F) which are displayed using two terminal cells.
9978
100- - ``-1 ``: indeterminate (see Todo _).
101- - ``0 ``: do not advance the cursor, such as NULL.
102- - ``2 ``: East_Asian_Width property values W and F (Wide and Full-width).
103- - ``1 ``: all others.
79+ ``1 ``
80+ All others.
10481
10582``wcswidth() `` simply returns the sum of all values along a string, or
106- ``-1 `` if it has occurred for any value returned by ``wcwidth() ``. A more
107- exacting list of conditions and return values may be found in the docstring
108- for ``wcwidth() ``.
83+ ``-1 `` in total if any part of the string results in -1. A more exact
84+ list of conditions and return values may be found in the docstring::
85+
86+ $ pydoc wcwidth
10987
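The return values listed above, and the way ``wcswidth()`` combines them, can be approximated with the standard library alone. This is a minimal sketch under stated assumptions, not this library's implementation: ``approx_wcwidth`` and ``approx_wcswidth`` are hypothetical names, and the category checks are a simplification based on Python's bundled ``unicodedata`` tables, which may track a different Unicode release than this project.

```python
import unicodedata


def approx_wcwidth(ch):
    # Rough stdlib-only approximation of the return values listed above.
    if ch == u'\x00':
        return 0    # NULL does not advance the cursor
    category = unicodedata.category(ch)
    if category == 'Cc':
        return -1   # other control characters: indeterminate
    if category in ('Mn', 'Me'):
        return 0    # non-spacing and enclosing combining marks
    if unicodedata.east_asian_width(ch) in ('W', 'F'):
        return 2    # East Asian Wide or Full-width
    return 1        # all others


def approx_wcswidth(text):
    # Sum the per-character widths; any -1 makes the whole result -1.
    total = 0
    for ch in text:
        width = approx_wcwidth(ch)
        if width < 0:
            return -1
        total += width
    return total
```

The early return in ``approx_wcswidth`` mirrors the rule that a single indeterminate character makes the width of the whole string indeterminate.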
110- Discrepacies
111- ------------
11288
113- There may be discrepancies with the determined printable width of of characters
114- by *wcwidth * and the results of any given terminal emulator -- most commonly,
115- emulators are using your Operating System's ``wcwidth(3) `` implementation which
116- is often based on tables much older than the most current Unicode Specification.
117- Python's determination of non-zero combining _ characters may also be based on an
118- older specification.
89+ Discrepancies
90+ -------------
11991
120- You may determine an exacting list of these discrepancies using files
121- `wcwidth-libc-comparator.py `_ and `wcwidth-combining-comparator.py `_
92+ This library does its best to return the most appropriate return value for a
93+ very particular terminal user interface where a monospaced fixed-cell
94+ rendering is expected. As the POSIX Terminal programming interfaces do not
95+ provide any means to determine the unicode support level, we can only do our
96+ best to return the *correct* result for the given codepoint, and not what any
97+ particular terminal emulator does.
12298
123- .. _`wcwidth-libc-comparator.py` : https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-libc-comparator.py
124- .. _`wcwidth-combining-comparator.py` : https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-combining-comparator.py
99+ Python's determination of non-zero combining_ characters may also be based on
100+ an older specification.
101+
102+ You may determine an exacting list of these discrepancies using the project
103+ files `wcwidth-libc-comparator.py <https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-libc-comparator.py >`_ and `wcwidth-combining-comparator.py <https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-combining-comparator.py >`_.
125104
126105
127106==========
@@ -140,22 +119,20 @@ Updating Tables
140119The command ``python setup.py update `` will fetch the following resources:
141120
142121- http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
143- - http://www.unicode.org/Public/UNIDATA/extracted/DerivedCombiningClass .txt
122+ - http://www.unicode.org/Public/UNIDATA/extracted/DerivedGeneralCategory .txt
144123
145- And generate the table files ` wcwidth/table_wide.py `_ and ` wcwidth/table_comb.py `_.
124+ And generates the table files:
146125
147- .. _ `wcwidth/table_wide.py` : https://github.com/jquast/wcwidth/tree/master/wcwidth/table_wide.py
148- .. _ `wcwidth/table_comb .py` : https://github.com/jquast/wcwidth/tree/master/wcwidth/table_comb .py
126+ - `wcwidth/table_wide.py < https://github.com/jquast/wcwidth/tree/master/wcwidth/table_wide.py >`_
127+ - `wcwidth/table_zero .py < https://github.com/jquast/wcwidth/tree/master/wcwidth/table_zero .py >`_
149128
150129wcwidth.c
151130---------
152131
153132This code was originally derived directly from C code of the same name,
154- whose latest version is available at: `wcwidth.c `_ And is authored by
155- Markus Kuhn -- 2007-05-26 (Unicode 5.0)
156-
157- .. _`wcwidth.c` : http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
158-
133+ whose latest version is available at
134+ http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c and is authored by Markus Kuhn,
135+ 2007-05-26 (Unicode 5.0).
159136
160137Examples
161138--------
@@ -167,85 +144,49 @@ This library is used in:
167144- `jonathanslenders/python-prompt-toolkit `_, a Library for building powerful
168145 interactive command lines in Python.
169146
170- Additional tools for displaying and testing wcwidth is found in the `` bin/ ``
171- folder of this project (github link: ` wcwidth/bin `_) . They are not distributed
172- as a script or part of the module.
147+ Additional tools for displaying and testing wcwidth are found in the `bin/
148+ <https://github.com/jquast/wcwidth/tree/master/bin>`_ folder of this project. They are not
149+ distributed as a script or part of the module.
173150
174151.. _`jquast/blessed` : https://github.com/jquast/blessed
175152.. _`jonathanslenders/python-prompt-toolkit` : https://github.com/jonathanslenders/python-prompt-toolkit
176- .. _`wcwidth/bin` : https://github.com/jquast/wcwidth/tree/master/bin
177-
178- Todo
179- ----
180-
181- Though some of the most common ("zero-width") `combining `_ characters
182- are understood by wcswidth, there are still many edge cases that need
183- to be covered, especially certain kinds of sequences such as those
184- containing Control-Sequence-Inducer (CSI).
185-
186-
187- License
188- -------
189153
190- The original license is as follows::
191-
192- Permission to use, copy, modify, and distribute this software
193- for any purpose and without fee is hereby granted. The author
194- disclaims all warranties with regard to this software.
195-
196- No specific licensing is specified, and Mr. Kuhn resides in the UK which allows
197- some protection from Copyrighting. As this derivative is based on US Soil,
198- an OSI-approved license that appears most-alike has been chosen, the MIT license::
199-
200- The MIT License (MIT)
201-
202- Copyright (c) 2014 <[email protected]>
203-
204- Permission is hereby granted, free of charge, to any person obtaining a copy
205- of this software and associated documentation files (the "Software"), to deal
206- in the Software without restriction, including without limitation the rights
207- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
208- copies of the Software, and to permit persons to whom the Software is
209- furnished to do so, subject to the following conditions:
210-
211- The above copyright notice and this permission notice shall be included in
212- all copies or substantial portions of the Software.
213-
214- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
215- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
216- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
217- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
218- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
219- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
220- THE SOFTWARE.
221154
222155Changes
223156-------
224157
225- 0.1.4
158+ 0.1.5 *2015-09-13 Alpha *
159+ * **Bugfix **:
160+ Resolution of "combining character width", most especially
161+ those that previously returned -1 now often (correctly) return 0.
162+ Resolved by `Philip Craig`_ via `PR #11`_.
163+
164+ 0.1.4 *2014-11-20 Pre-Alpha *
226165 * **Feature **: ``wcswidth() `` now determines printable length
227166 for (most) combining characters. The developer's tool
228167 `bin/wcwidth-browser.py `_ is improved to display combining _
229168 characters when provided the ``--combining `` option
230169 (`Thomas Ballinger `_ and `Leta Montopoli `_ `PR #5 `_).
231170 * added static analysis (prospector _) to testing framework.
232171
233- 0.1.3
172+ 0.1.3 * 2014-10-29 Pre-Alpha *
234173 * **Bugfix **: 2nd parameter of wcswidth was not honored.
235- (`Thomas Ballinger `_, `PR #4 `).
174+ (`Thomas Ballinger `_, `PR #4 `_ ).
236175
237- 0.1.2
176+ 0.1.2 * 2014-10-28 Pre-Alpha *
238177 * **Updated ** tables to Unicode Specification 7.0.0.
239- (`Thomas Ballinger `_, `PR #3 `).
178+ (`Thomas Ballinger `_, `PR #3 `_ ).
240179
241- 0.1.1
180+ 0.1.1 * 2014-05-14 Pre-Alpha *
242181 * Initial release to pypi, Based on Unicode Specification 6.3.0
243182
244183.. _`prospector` : https://github.com/landscapeio/prospector
245184.. _`combining` : https://en.wikipedia.org/wiki/Combining_character
246185.. _`bin/wcwidth-browser.py` : https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-browser.py
247186.. _`Thomas Ballinger` : https://github.com/thomasballinger
248187.. _`Leta Montopoli` : https://github.com/lmontopo
188+ .. _`Philip Craig` : https://github.com/philipc
249189.. _`PR #3` : https://github.com/jquast/wcwidth/pull/3
250190.. _`PR #4` : https://github.com/jquast/wcwidth/pull/4
251191.. _`PR #5` : https://github.com/jquast/wcwidth/pull/5
192+ .. _`PR #11` : https://github.com/jquast/wcwidth/pull/11