2 Nov 2012 09:28
Unicode case (in)stability and Haskell identifiers.
Richard O'Keefe <ok <at> cs.otago.ac.nz>
2012-11-02 08:28:06 GMT
I've been putting together a proposal for Unicode identifiers in Erlang (it's EEP 40 if anyone wants to look it up). In the course of this, it has turned out that there is a technical problem for languages with case-significant identifiers.

Haskell 2010 report, chapter 2.
http://www.haskell.org/onlinereport/haskell2010/haskellch2.html

    varid    → (small {small | large | digit | ' })\⟨reservedid⟩
    conid    → large {small | large | digit | ' }
    small    → ascSmall | uniSmall | _
    ascSmall → a | b | … | z
    uniSmall → any Unicode lowercase letter
    large    → ascLarge | uniLarge
    ascLarge → A | B | … | Z
    uniLarge → any uppercase or titlecase Unicode letter

This is actually ambiguous: any ascSmall is also a uniSmall and any ascLarge is also a uniLarge. I take it that this is intended to mean "any Unicode xxx letter other than an ASCII one" in each case.

That's not the problem.

The definition currently bans Hebrew, Arabic, Chinese, Japanese, all the Indic scripts, and basically only allows Latin, Greek, Coptic, Cyrillic, Glagolitic, Armenian, arguably Georgian, and Deseret (but not Shavian).

That's not the problem either.
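To make the script restriction concrete, here is a rough sketch of the report's "small" and "large" classes in terms of Unicode general categories, using Data.Char from base. The names isSmall, isLarge and canStartIdentifier are made up for illustration, and reading "lowercase letter" as category Ll and "uppercase or titlecase letter" as Lu or Lt is my interpretation of the report's wording, not necessarily what any particular compiler does.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical helper: the report's "small", read as category Ll plus '_'.
isSmall :: Char -> Bool
isSmall c = c == '_' || generalCategory c == LowercaseLetter

-- Hypothetical helper: the report's "large", read as category Lu or Lt.
isLarge :: Char -> Bool
isLarge c = generalCategory c `elem` [UppercaseLetter, TitlecaseLetter]

-- A varid must start with a small and a conid with a large, so a letter
-- that is neither cannot start any identifier at all.
canStartIdentifier :: Char -> Bool
canStartIdentifier c = isSmall c || isLarge c

-- A few sample letters, written as escapes to keep the source ASCII-only.
samples :: [(String, Char)]
samples =
  [ ("Latin small a",       'a')
  , ("Greek small lambda",  '\x03BB')
  , ("Cyrillic capital De", '\x0414')
  , ("Latin titlecase Dz",  '\x01C5')
  , ("Deseret capital",     '\x10400')
  , ("Shavian letter",      '\x10450')
  , ("Hebrew alef",         '\x05D0')
  , ("CJK ideograph",       '\x4E2D')
  ]

main :: IO ()
main = mapM_ report samples
  where
    report (name, c) =
      putStrLn (name ++ ": " ++ show (generalCategory c)
                ++ ", can start an identifier: " ++ show (canStartIdentifier c))
```

Letters from scripts without case distinctions, such as Hebrew, CJK and Shavian, have general category Lo (OtherLetter), so under this reading they fall into neither class.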