Richard O'Keefe | 2 Nov 09:28 2012

Unicode case (in)stability and Haskell identifiers.

I've been putting together a proposal for Unicode identifiers
in Erlang (it's EEP 40 if anyone wants to look it up).  In
the course of this, it has turned out that there is a technical
problem for languages with case-significant identifiers.

Haskell 2010 report, chapter 2.

varid → (small {small | large | digit | ' })\⟨reservedid⟩
conid → large {small | large | digit | ' }

small    → ascSmall | uniSmall | _
ascSmall → a | b | … | z
uniSmall → any Unicode lowercase letter

large    → ascLarge | uniLarge
ascLarge → A | B | … | Z
uniLarge → any uppercase or titlecase Unicode letter

This is actually ambiguous: any ascSmall is also a uniSmall
and any ascLarge is also a uniLarge.  I take it that this
is intended to mean "any Unicode xxx letter other than an ASCII one"
in each case.
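
A minimal sketch of that reading, using Data.Char from base. It assumes
"Unicode lowercase letter" means general category Ll and "uppercase or
titlecase" means Lu or Lt; the report does not say which Unicode property
is intended, so that is a guess. The names StartClass and startClass are
made up for illustration. Because the ASCII letters are themselves Ll or
Lu, one rule covers both the asc* and uni* alternatives, which is why the
overlap is harmless in practice.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- | Which identifier-start class a character falls into (hypothetical type).
data StartClass = Small | Large | Neither
  deriving (Eq, Show)

-- | Classify a character by the 'small' and 'large' productions,
--   ASSUMING the general categories Ll, Lu and Lt are what is meant.
startClass :: Char -> StartClass
startClass c
  | c == '_'  = Small                 -- the report lists '_' under 'small'
  | otherwise = case generalCategory c of
      LowercaseLetter -> Small        -- ascSmall and uniSmall
      UppercaseLetter -> Large        -- ascLarge and uniLarge
      TitlecaseLetter -> Large        -- e.g. U+01C5, the titlecase DZ-caron digraph
      _               -> Neither      -- digits, caseless letters, symbols, ...
```

For instance, map startClass "aZ_1" evaluates to [Small,Large,Small,Neither].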

That's not the problem.  The definition currently bans Hebrew,
Arabic, Chinese, Japanese, all the Indic scripts, and basically
only allows Latin, Greek, Coptic, Cyrillic, Glagolitic,
Armenian, arguably Georgian, and Deseret (but not Shavian).
That's not the problem either.
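
To see why those scripts are excluded, here is a rough check, again
assuming the grammar is read in terms of general categories (Ll, Lu, Lt).
Letters of unicameral scripts carry category Lo ("other letter"), so they
are neither small nor large and cannot start an identifier. The sample
list and the names isCasedLetter, samples and categories are invented for
this illustration.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- | True when the character has a cased letter category (Lu, Ll or Lt).
isCasedLetter :: Char -> Bool
isCasedLetter c =
  generalCategory c `elem` [UppercaseLetter, LowercaseLetter, TitlecaseLetter]

-- | A few sample letters, written as escapes to avoid encoding trouble.
samples :: [(String, Char)]
samples =
  [ ("Greek small lambda",     '\x03BB')
  , ("Cyrillic capital A",     '\x0410')
  , ("Deseret capital long I", '\x10400')
  , ("Hebrew alef",            '\x05D0')
  , ("Arabic alef",            '\x0627')
  , ("CJK ideograph U+4E2D",   '\x4E2D')
  , ("Devanagari ka",          '\x0915')
  , ("Shavian peep",           '\x10450')
  ]

-- | Each sample with its general category and whether it is cased.
categories :: [(String, GeneralCategory, Bool)]
categories = [ (name, generalCategory c, isCasedLetter c) | (name, c) <- samples ]
```

Evaluating categories in GHCi shows the first three samples as cased and
the remaining five as OtherLetter.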


Max Rabkin | 2 Nov 19:09 2012

Re: Unicode case (in)stability and Haskell identifiers.

I try to maintain some knowledge of Unicode issues, but this one never occurred to me.

On Fri, Nov 2, 2012 at 10:28 AM, Richard O'Keefe wrote:
Would anyone care to see and comment on the proposal
before I send it to  Anyone got any suggestions
before I begin to write it?

I don't have any suggestions, but I would certainly be interested in seeing your proposal.