This directory contains a mechanism for GCC to have its own internal
implementation of wcwidth functionality (cpp_wcwidth () in libcpp/charset.c).

The idea is to produce the necessary lookup table
(../../libcpp/generated_cpp_wcwidth.h) in a reproducible way, starting from the
following files that are distributed by the Unicode Consortium:

    ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
    ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt
    ftp://ftp.unicode.org/Public/UNIDATA/PropList.txt

These three files have been added to source control in this directory;
please see unicode-license.txt for the relevant copyright information.

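As an illustration of the kind of input involved, the following is a minimal
sketch of reading EastAsianWidth.txt. It is not the glibc code in from_glibc/;
the function name parse_east_asian_width is hypothetical, and the sketch only
assumes the documented line format of that file (a code point or a
"first..last" range, a semicolon, a width class, and an optional "#" comment).

```python
import re

# One data line: "XXXX;Class" or "XXXX..YYYY;Class", optionally with spaces
# around the semicolon.  Comments are stripped before matching.
_LINE_RE = re.compile(
    r"^\s*([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)")


def parse_east_asian_width(lines):
    """Yield (first, last, width_class) for each data line.

    Hypothetical helper for illustration only; blank lines and comment-only
    lines are skipped.
    """
    for line in lines:
        data = line.split("#", 1)[0]      # drop any trailing comment
        match = _LINE_RE.match(data)
        if match is None:
            continue
        first = int(match.group(1), 16)
        last = int(match.group(2), 16) if match.group(2) else first
        yield first, last, match.group(3)
```

For example, list(parse_east_asian_width(open("EastAsianWidth.txt"))) would
give one tuple per data line, with single code points reported as a range
whose first and last values are equal.
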
In order to keep in sync with glibc's wcwidth as much as possible, it is
desirable for the logic that processes the Unicode data to be the same as
glibc's. To that end, we also put in this directory, in the from_glibc/
directory, the glibc python code that implements their logic. This code was
copied verbatim from glibc, and it can be updated at any time from the glibc
source code repository. The files copied from that repository are:

    localedata/unicode-gen/unicode_utils.py
    localedata/unicode-gen/utf8_gen.py

And the most recent versions added to GCC are from glibc git commit:
    2a764c6ee848dfe92cb2921ed3b14085f15d9e79

Finally, the script gen_wcwidth.py found here contains the GCC-specific code to
map glibc's output to the lookup tables we require. This script should not need
to change, unless there are structural changes to the Unicode data files or to
the glibc code.

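To show how a compact range-based lookup table can answer width queries, here
is a minimal sketch. It assumes a table made of sorted inclusive range end
points with one width per range; the names RANGE_ENDS, WIDTHS and lookup_width
are hypothetical, the values are a small hand-written simplification, and the
actual layout is whatever gen_wcwidth.py emits.

```python
from bisect import bisect_left

# Hypothetical, heavily simplified table (NOT the generated data).
# RANGE_ENDS[i] is the inclusive last code point of range i; range i starts
# right after RANGE_ENDS[i - 1] (or at 0 for the first range).
RANGE_ENDS = [0x02FF, 0x036F, 0x10FF, 0x115F, 0x10FFFF]
WIDTHS = [1, 0, 1, 2, 1]


def lookup_width(code_point):
    """Return the display width recorded for code_point in the toy table."""
    if not 0 <= code_point <= RANGE_ENDS[-1]:
        raise ValueError("code point out of range: %r" % (code_point,))
    # First range whose end point is >= code_point contains it.
    index = bisect_left(RANGE_ENDS, code_point)
    return WIDTHS[index]
```

With this layout a query costs a binary search over the number of ranges
rather than a table entry per code point, which is why merging adjacent code
points of equal width into ranges keeps the table small.
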
The procedure to update GCC's wcwidth tables is the following:

1. Update the three Unicode data files from the above URLs.

2. Update the two glibc files in from_glibc/ from glibc's git. Update
   the commit hash above in this README.

3. Run ./gen_wcwidth.py X.Y > ../../libcpp/generated_cpp_wcwidth.h
   (where X.Y is the version of the Unicode standard corresponding to the
   Unicode data files being used; most recently, 12.1).

After that, GCC's wcwidth will match the most recent glibc.
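
Step 3 of the procedure above can be wrapped in a small driver that first
checks that the inputs from steps 1 and 2 are present. This is only a sketch:
the helper names missing_inputs and regenerate are hypothetical, it assumes the
two glibc files sit directly under from_glibc/, and it invokes the script
through the current Python interpreter rather than as ./gen_wcwidth.py.

```python
import subprocess
import sys
from pathlib import Path

UNICODE_FILES = ["UnicodeData.txt", "EastAsianWidth.txt", "PropList.txt"]
GLIBC_FILES = ["unicode_utils.py", "utf8_gen.py"]   # assumed flat in from_glibc/
OUTPUT = Path("../../libcpp/generated_cpp_wcwidth.h")


def missing_inputs(directory="."):
    """Return the relative names of required input files that are absent."""
    base = Path(directory)
    missing = [name for name in UNICODE_FILES if not (base / name).is_file()]
    missing += ["from_glibc/" + name for name in GLIBC_FILES
                if not (base / "from_glibc" / name).is_file()]
    return missing


def regenerate(unicode_version, directory="."):
    """Run gen_wcwidth.py and write its output to the generated header."""
    missing = missing_inputs(directory)
    if missing:
        raise FileNotFoundError("missing inputs: " + ", ".join(missing))
    # Capture the output first so a failing run does not truncate the header.
    result = subprocess.run(
        [sys.executable, "gen_wcwidth.py", unicode_version],
        cwd=directory, capture_output=True, text=True, check=True)
    (Path(directory) / OUTPUT).write_text(result.stdout)
```

Running regenerate("12.1") from this directory would then correspond to step 3
for the Unicode 12.1 data files.
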