2 Jun 17:27
Re: Last Call: Preparation of Internationalized Strings
Doug Ewell <dewell <at> adelphia.net>
2002-06-02 15:27:50 GMT
Simon,

There have been two corrections to normalization since Unicode 3.0. One involved a Chinese (Han) compatibility character that was mapped to the wrong "normal" character by error. The other involved a Yiddish (Hebrew) compatibility character that should have had a compatibility mapping, but did not, also by error.

Both corrections were made to characters that are supposedly "very rare" in actual use, so that the real-world impact would be minimal. Neither one has anything to do with transcoding tables.

I know you are very concerned that Unicode has "broken its promise" by making changes to the normalization tables after claiming they would not do so. I think if the corrections had not been made, there would have been an equal but opposite reaction that Unicode was too stubborn to correct its own mistakes, and that NFKC was rendered "useless" because of these two incorrect mappings.

The pages explaining the corrigenda include lengthy, detailed explanations of why the Technical Committee felt they were necessary and justified. As someone already mentioned, one of the justifications given for the Yiddish change was that no normative references existed *yet* for the Unicode normalization tables (i.e. from IDN). This implies that once such normative references *do* exist, a similar decision to correct an error might not be made.

I imagine these were very difficult decisions for the UTC, who knew that someone would jump on the changes immediately as evidence that normalization is inherently unstable and Unicode is therefore "not [...]