Dave Crocker | 30 Aug 18:40

Re: Document Status?

At 09:36 AM 8/25/2002 +0200, Erik Nordmark wrote:
>The second set of last-called documents (IDNA, punycode, and nameprep) still
>have some IETF last call issues to resolve. The resolution will be in the form
>of an added applicability section in the IDNA document. There is
>still some word-smithing on that section, after which the documents will
>be ready to be discussed by the IESG as a whole.

Erik,

One last try... Having not yet seen any detailed response on the matters I 
put forward some time ago, I will raise them again.

Your note implies that the IESG does not agree that the current IDN 
specification suffers the following, basic deficiencies:

>1. IDNA makes a formal change to the DNS, by expanding the name space from 
>a subset of ASCII to a subset of Unicode. This change is not clearly 
>documented in the IDNA specification.

         We usually document such major changes to basic protocols rather 
more explicitly.

>2. The IDNA specification does not provide enough detail to permit its use 
>for the most common Domain names, which are those used in URLs and email 
>addresses.

         This means that someone registering an IDN domain name for use in 
email addresses and web addresses cannot know the exact set of valid IDN 
characters available for use.

Dave Crocker | 30 Aug 20:07

Re: Document Status?

At 10:38 AM 8/30/2002 -0700, Paul Hoffman / IMC wrote:
>The sky is not falling, Dave.
>A new draft that we believe meets the IESG's request for clarifications 
>has already been turned into the Internet Drafts repository but has not 
>been announced by the IETF Secretariat yet.

Paul, it would be far more helpful to see responses to the particulars, 
rather than this sort of dismissive, handwaving condescension.

It is absolutely fine for my concerns to be unwarranted.

What is not fine is the continuing failure to address those concerns, with 
explanations that make clear why they are not warranted.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Dave Crocker | 31 Aug 06:39

Re: Document Status?

James,

At 10:32 AM 8/31/2002 +0800, James Seng wrote:
>I concur with Paul. The authors and the co-chairs have been working with 
>the ADs to address these issues.

There seems to be a basic disparity of view about IETF process, 
here.  Comments in a working group forum are not simply one-way input for a 
design team to take in and privately decide whether it wishes to 
incorporate.  Comments are for public discussion, review, and acceptance or 
rejection.

>We have had several email discussions on the drafts.

These concerns were posted two months ago and no one, on that long list of 
folks working diligently on this specification, has publicly responded to 
the details of those concerns.  If I am incorrect please point to the place 
in the working group archive that shows otherwise.

>The feeling I get from these discussions is that IESG is trying its best to 
>make sure the document got things right, rather than finding faults with 
>the draft.

James, presumably you are not directing the "rather than" comment at 
me.  If you were directing it at me, I would be very confused, since I made 
a point of submitting extensive text that attempts to correct the concerns 
I have raised.

Please forgive my sensitivity on this matter, but the responses from chairs 
and authors have been problematic, since you and they are preferring to 

Dave Crocker | 31 Aug 08:45

Re: Document Status?

At 07:41 AM 8/31/2002 +0200, Patrik Fältström wrote:
>--On 2002-08-30 21.39 -0700 Dave Crocker <dhc <at> dcrocker.net> wrote:
>Were those concerns yours? I can not see you starting a thread on this
>mailing list since May 27, and that is three months ago.
>
>Can you send me the Message-ID with the list of issues?

         <http://www.imc.org/idn/mail-archive/msg06956.html>
         <http://www.imc.org/idn/mail-archive/msg06884.html>
         <http://www.imc.org/idn/mail-archive/msg06885.html>

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Dave Crocker | 31 Aug 08:05

Re: Document Status?

At 07:39 AM 8/31/2002 +0200, Patrik Fältström wrote:
>So, if you had issues during last call which you find are not implemented,
>you should raise the issue with the AD in question.

Patrik,

In that case it is a good thing that my original posting on this thread was 
in response to the AD's status report.

It is a pity that others did not wait for him to respond.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Patrik Fältström | 31 Aug 07:39

Re: Document Status?

--On 2002-08-30 21.39 -0700 Dave Crocker <dhc <at> dcrocker.net> wrote:

> At 10:32 AM 8/31/2002 +0800, James Seng wrote:
>> I concur with Paul. The authors and the co-chairs have been working with 
>> the ADs to address these issues.
> 
> There seems to be a basic disparity of view about IETF process, here.
> Comments in a working group forum are not simply one-way input for a
> design team to take in and privately decide whether it wishes to
> incorporate.  Comments are for public discussion, review, and acceptance
> or rejection.

Dave, not at this time in the process. All documents have been handed over
from the wg to the AD. We have also passed last call.

When the document is in the lap of the AD, i.e. the wg chairs have passed
the document to the IESG, the comments follow a different path than before.

Comments during and after last call are to go to the AD.

In this case, we as document editors get request for changes from the AD,
we implement them, and report back to the mailing list what the changes are.

So, if you had issues during last call which you find are not implemented,
you should raise the issue with the AD in question.

As AD myself (for Applications Area), I find this to be the _only_ way
possible to work. It can not happen that the IESG work on comments received
during last call. Normally about 50% of the (very few, unfortunately) comments
are sent only to the IESG, so the wg chair or document editors have absolutely

John C Klensin | 31 Aug 10:44

Re: Document Status?


--On Saturday, August 31, 2002 7:39 AM +0200 Patrik Fältström
<paf <at> cisco.com> wrote:

> --On 2002-08-30 21.39 -0700 Dave Crocker <dhc <at> dcrocker.net>
> wrote:
> 
>> At 10:32 AM 8/31/2002 +0800, James Seng wrote:
>>> I concur with Paul. The authors and the co-chairs have been
>>> working with  the ADs to address these issues.
>> 
>> There seems to be a basic disparity of view about IETF
>> process, here. Comments in a working group forum are not
>> simply one-way input for a design team to take in and
>> privately decide whether it wishes to incorporate.  Comments
>> are for public discussion, review, and acceptance or
>> rejection.
> 
> Dave, not at this time in the process. All documents have been
> handed over from the wg to the AD. We have also passed last
> call.
> 
> When the document is in the lap of the AD, i.e. the wg chairs
> have passed the document to the IESG, the comments follow a
> different path than before.
>...

> Comments during and after last call are to go to the AD.
> 
> In this case, we as document editors get request for changes

Patrik Fältström | 31 Aug 14:02

Re: Document Status?

--On 2002-08-31 04.44 -0400 John C Klensin <klensin <at> jck.com> wrote:

> I agree with you that this is a reasonable way to proceed.
> However, I trust that the results of the rewriting process will
> be circulated back to the WG and last-called again, and possibly
> run back through an additional IETF Last Call.

I see this being of the decisions an AD _always_ have to make after getting
comments, and a new document, back after issues have been raised.

I.e. not my call.

   paf

Patrik Fältström | 31 Aug 17:07

Re: Document Status?

--On 2002-08-31 14.02 +0200 Patrik Fältström <paf <at> cisco.com> wrote:

> --On 2002-08-31 04.44 -0400 John C Klensin <klensin <at> jck.com> wrote:
> 
>> I agree with you that this is a reasonable way to proceed.
>> However, I trust that the results of the rewriting process will
>> be circulated back to the WG and last-called again, and possibly
>> run back through an additional IETF Last Call.
> 
> I see this being of the decisions an AD _always_ have to make after
> getting comments, and a new document, back after issues have been raised.

s/being of the/being one of the/

  paf

Patrik Fältström | 31 Aug 07:41

Re: Document Status?

--On 2002-08-30 21.39 -0700 Dave Crocker <dhc <at> dcrocker.net> wrote:

>> We have several emails discussion on the drafts.
> 
> These concerns were posted two months ago and no one, on that long list
> of folks working diligently on this specification, has publicly responded
> to the details of those concerns.  If I am incorrect please point to the
> place in the working group archive that shows otherwise.

Were those concerns yours? I can not see you starting a thread on this
mailing list since May 27, and that is three months ago.

Can you send me the Message-ID with the list of issues?

   paf

James Seng | 31 Aug 04:32

Re: Document Status?

Dave,

I concur with Paul. The authors and the co-chairs have been working with the
ADs to address these issues. We have had several email discussions on the
drafts. Any non-editorial changes (e.g. the bidi & unicode 3.2) were and will
be brought to the group again.

The feeling I get from these discussions is that IESG is trying its best to
make sure the document got things right, rather than finding faults with the
draft.

-James Seng

----- Original Message -----
From: "Dave Crocker" <dhc <at> dcrocker.net>
To: "Paul Hoffman / IMC" <phoffman <at> imc.org>
Cc: <idn <at> ops.ietf.org>
Sent: Saturday, August 31, 2002 2:07 AM
Subject: Re: [idn] Document Status?

> At 10:38 AM 8/30/2002 -0700, Paul Hoffman / IMC wrote:
> >The sky is not falling, Dave.
> >A new draft that we believe meets the IESG's request for clarifications
> >has already been turned into the Internet Drafts repository but has not
> >been announced by the IETF Secretariat yet.
>
>
> Paul, it would be far more helpful to see responses to the particulars,
> rather than this sort of dismissive, handwaving condescension.
>

Dave Crocker | 31 Aug 19:13

Re: Document Status?

John's comments were so nicely done, the only reason I am responding is 
because he cited me a number of times and seemed to be seeking some 
confirmation about those citations.  (And, of course, I will fail to limit 
myself to those citations...)

At 05:06 AM 8/31/2002 -0400, John C Klensin wrote:
>It seems to me that Dave (and I) have raised two sorts of issues
>which are very different in character than, e.g., bidi and
>unicode 3.2.   One has to do with the _style_ of the documents,
>e.g., to paraphrase Dave (I hope accurately), whether they
>specify a protocol or outline an implementation.

correct.

>(i) If there is any question at all about how a given codepoint
>...
>(ii) If there is a substantive claim that the document cannot be
>implemented in an interoperable way without out-of-band
>profiling or oral tradition --and I think Dave has made exactly
>that claim, although in different language-- then either
>
>         * the document must be fixed to reflect that fact (and,
>...
>         * the document must be fixed to eliminate the ambiguity/
>...
>         * the documents should be published as Experimental, not
>...
>         * someone needs to come up with a persuasive case that
>         Dave (and I) have misread and misinterpreted the
>         document.  And, procedurally, I believe that case needs

Dave Crocker | 7 Sep 18:16

Re: Document Status?

At 10:17 AM 9/7/2002 +0200, Erik Nordmark wrote:
> > The current specification has not made at least one essential, difficult
> > decision.  As a consequence, someone registering an IDN for use with email
> > or the web cannot know the correct set of characters available to them.
>
>Which text in the specification are you referring to?
>I need specifics to understand your issue.

Erik,

A moment of process concern:

         You have been given the specifics, including a specific Last Call 
posting to the IESG, private discussion from me, and an independent 
internet draft on the subject.

         Your question implies rather strongly that the IESG did not 
consider this input.  The input was provided 2 months ago.  If that input 
was not understood, why were no clarifications requested?

         I will add that asking someone to point to the place in a document 
where something is *missing* is a rather curious assignment.  How can one 
point to something that does not exist?

To provide you with the specifics, again:

         The current DNS has usage for email and web addresses that limits 
the range of valid characters.  IDN does not define a parallel 
structure.  The implication is that IDN strings do not have different 
ranges of valid characters for email/web domain names and for, for example, 

Erik Nordmark | 9 Sep 11:03

Re: Document Status?


> A moment of process concern:
> 
>          You have been given the specifics, including a specific Last Call 
> posting to the IESG, private discussion from me, and an independent 
> internet draft on the subject.

Dave,

I have no vested interest in the IDNA specification but I'm merely
trying to help move things along in a productive fashion based on the
wishes of the WG rough consensus as gauged and expressed by the WG co-chairs
(which is to advance the set of specifications).

In my personal view the most productive way to get changes into the WG
rough consensus documents is to point out specific issues and suggested
text for clarifying things, since I have a hard time understanding how
the independent drafts can actually make the process move forward in
an effective manner. Of course, the WG is free to take a different
tack than my personal view e.g. to adopt your drafts and
move them forward instead of the current set of drafts - all I would expect
is discussions on the WG mailing list and a note from the co-chairs
to tell me to hold off moving the current IDNA draft through the process.

>          I will add that asking someone to point to the place in a document 
> where something is *missing* is a rather curious assignment.  How can one 
> point to something that does not exist?

Perhaps by suggesting specific limited text (a paragraph or so) which
fills in the missing piece?

Erik Nordmark | 7 Sep 10:17

Re: Document Status?

> The current specification has not made at least one essential, difficult 
> decision.  As a consequence, someone registering an IDN for use with email 
> or the web cannot know the correct set of characters available to them.

Which text in the specification are you referring to?
I need specifics to understand your issue.

The only text I recall which has some bearing on this is:
  It is expected that some name-handling bodies, such as large
  zone administrators and groups of affiliated zone administrators, will
  want to limit the characters allowed in IDNs further than what is
  specified in this document, such as to prohibit additional characters
  that they feel are unneeded or harmful in registered domain names.

I don't see how registries limiting the set of Unicode code points
that can be used in a part of the name space has any negative impact on 
interoperability; registries limit the types of names for various policy
reasons already (such as only allowing registered companies in some
parts of the name space) without any adverse effect on interoperability.

   Erik

Dave Crocker | 2 Sep 05:37

Re: The purpose of IDN

At 07:32 PM 9/1/2002 -0400, vinton g. cerf wrote:
>I think it is important for IETF to understand as well as it can the side 
>effects of using IDNA in applications

understanding is always a good thing.  in this case, however, concern over 
the user interface has been one of the tenacious reasons for taking 3 years 
to do work that should have taken 1.

That is why it is so striking to see such basic issues being raised again, 
during Last Call.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Patrik Fältström | 2 Sep 07:15

Re: The purpose of IDN

--On 2002-09-01 20.37 -0700 Dave Crocker <dhc <at> dcrocker.net> wrote:

> That is why it is so striking to see such basic issues being raised
> again, during Last Call.

Not surprising at all. People still don't understand how MIME works, and
that has only been defined since around 1993.

We will have this problem as long as we use multiple representations of
characters when moving them around as bits.

Which will be forever.

   paf

vinton g. cerf | 2 Sep 01:32

Re: The purpose of IDN

I think it is important for IETF to understand as well as it can the side effects of using IDNA in applications
- since I think user expectations will be high. This need not necessarily impinge on the IDNA
specifications but it might cast light on the effectiveness of the IDNA design in actual use.

vint

At 01:25 PM 9/1/2002 -0700, Dave Crocker wrote:
>Is there some reason that the IETF should pursue this matter any more deeply than it has done for MIME?

Vint Cerf
SVP Architecture & Technology
WorldCom
22001 Loudoun County Parkway, F2-4115
Ashburn, VA 20147
703 886 1690 (v806 1690)
703 886 0047 fax

Dave Crocker | 1 Sep 22:25

The purpose of IDN

At 12:28 AM 9/1/2002 -0400, John C Klensin wrote:
>I believe that IDNA (and the supporting documents) are... a reasonably 
>well-understood solution to _some_ problem. I'm not sure I know what that 
>problem is, who cares about it, and whether it is important enough to 
>justify changes to the way the DNS works and is interpreted.

We should all be *very* worried that this major IETF effort has gone on for 
nearly 3 years, yet such a question can be seriously raised.  (Or rather, 
that such a question legitimately reflects a concern among serious 
participants, as it clearly does.)

This suggests strongly that the charter and the working group process 
adequately characterized *neither* the problem to solve nor the beneficiaries 
of the solution.

Here is my own effort at doing both:

IDN Scope and Goals
-------------------

1.  The set of characters available for use in domain names is problematic 
for large portions of the global Internet user population.  Many users do 
not use Latin characters *ever*.  Current technology standards permit them 
to use their local set of non-Latin characters for all their Internet 
activities, EXCEPT domain names!  Hence, the set of characters that can be 
used for domain names needs to be increased.

2.  The most immediate need is for support of this increased set of 
characters in email and web domain names.

Eric A. Hall | 2 Sep 05:53

Re: The purpose of IDN


on 9/1/2002 3:25 PM Dave Crocker wrote:

> This suggests strongly that the charter and the working group process 
> adequately characterized *neither* the problem to solve nor the beneficiaries 
> of the solution.
> 
> Here is my own effort at doing both:

> 4.  A long history of making changes to Internet infrastructure highly 
> recommends finding a way to do this enhancement so that it a) does not 
> disturb the installed base, b) interworks with that installed base, and c) 
> permits incremental benefit when there is incremental adoption.

I would reiterate the factual point that IDNA requires a forklift upgrade
of the Internet's current application clients, and it therefore does not
satisfy the ground covered in points "a" and "b".

Call it what it is, not what we want it to be. It's a cute hack that
allows for the sale and registration of i18n domain names, and which may allow
for the visual representation of these domain names once compliant
applications are deployed.

--

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Stephen Dyer | 1 Sep 19:34

Re: Document Status?

At 00:28 01/09/2002 -0400, John C Klensin wrote:

>Well, I think there is a problem that may border on
>understanding, and it is tied up with my major personal
>objection to the general style of the work that has come out of
>the WG.

Dear John & All,

I have followed this group with interest and often mystification. It 
appears that members of the group have widely different understandings of 
the scope and objective. The multi-lingual and cross-cultural aspects have 
exacerbated this situation.

(I do not blame or criticise the IDN group - it's not their fault if the 
initial question is wrong, and they were asked to look at a narrow 
technical aspect of a much wider problem.)

My "view from the hill" is that overall we want to get to a position where 
the Internet can be comfortably and reliably used by the users of different 
character sets and languages.

That simply expressed goal is a long way off, but we have at our disposal a 
set of possible technologies and protocols. It seems to me that IDN has 
been a process of trying to rationalise these with the character sets and 
produce a single, fully inter-operable answer.

I think we need an additional process - a plan for evolving 
internationalization over an extended period. There is however a need for 
some visible progress here because the current hegemony of the English 
Stephen Dyer | 2 Sep 19:23

Re: Document Status?

At 14:19 02/09/2002 +0200, Stephane Bortzmeyer wrote:
>On Sun, Sep 01, 2002 at 06:34:34PM +0100,
>  Stephen Dyer <steve <at> uk.com> wrote
>  a message of 82 lines which said:
>
> > Firstly, I believe we should examine a process that deploys quickly the
> > fullest possible range of 8-bit ascii characters.
> > Many of these, especially accented characters, were sidelined by us Anglos
> > or unnecessarily hi-jacked by operating systems (especially by Unix). This
> > is a quick fix, but will give great benefits to the populations of Western
> > Europe, South America, much of Africa and ex-colonies of Western
> > "Imperialist" nations in general.
>
>There is not one 8-bits character set which can be used for all the
>European languages, unless you convince many countries to change their
>default script :-)
>snip..
>So, although "Let's make the simple thing first and we'll see later
>for the complicated one" is often reasonable, it cannot work here. We
>need Unicode from the beginning (handling Unicode only is simpler than
>handling Latin-1, Latin-2 and Greek, and waiting for Bulgaria to join with
>its Cyrillic alphabet).

If you deploy the fullest possible range of ASCII codes, in *practice* a 
huge range of usable domains will become available. It may be that by 
allowing "élève.com" you may eliminate a Bulgarian word that happens to use 
exactly the same code string but it's a very long shot and results in a 
happy Frenchman and an unhappy Bulgarian. At the moment they are both 
unhappy, and this sort of intersect can happen right now anyway.
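
The intersect described here can be made concrete. What follows is only a minimal sketch, assuming Python and its standard `latin-1` and `iso8859_5` codecs; neither character set is named in this thread, and they merely stand in for one Western European and one Cyrillic 8-bit set. The same byte string reads as different text depending on which 8-bit set the reader assumes.

```python
# Hypothetical illustration: one 8-bit byte string, two legacy character sets.
label = "élève"

# Encode the French label with an 8-bit Western European set (ISO 8859-1).
raw = label.encode("latin-1")

# Decode the very same bytes under two different 8-bit assumptions.
as_western = raw.decode("latin-1")     # the Western European reader's view
as_cyrillic = raw.decode("iso8859_5")  # the Cyrillic reader's view

print(raw)          # the bytes on the wire, identical for both readers
print(as_western)
print(as_cyrillic)

# The two readings differ at every position whose byte has the high bit set.
differing = [i for i, (a, b) in enumerate(zip(as_western, as_cyrillic)) if a != b]
print(differing)
```

The ASCII letters survive in both readings; only the accented positions change, which is why such a collision would need a real word on both sides to matter in practice.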

Dave Crocker | 3 Sep 19:49

Re: Document Status?

At 01:30 PM 9/3/2002 -0400, Edmon Chung wrote:
>Since this is a transitional strategy, it is not going to be the "standard"

Edmon,

ACE technology is the same as used for MIME.  Is MIME a transitional strategy?

MIME is 10 years old.  When are we going to see its replacement?

Why will ACE follow a path that is different?

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Dave Crocker | 3 Sep 23:16

Re: Document Status?

At 03:34 PM 9/3/2002 -0500, Eric A. Hall wrote:
>ACE is not the same as MIME. I have avoided making this argument because
>the comparison is useful for introductory purposes, but there are subtle
>yet strong differences at lower levels which keep them from being mirrors.
>
>Specifically, end-nodes typically extract data from MIME and use it within
>the local context of that application *ONLY*. The encoded data is RARELY
>used inside of other applications,

1.  you are wrong about the range and frequency of venues for MIME data.

2.  what difference does it make where the data are used, in terms of 
utility of IDNA?

>  On the other hand, domain names are
>frequently cross-populated among applications, and as the argument
>regarding clipboards shows, the encoded form will be the norm.

The discussion about user environment implementation and use is mostly 
doing a good job of showing that this is not a productive forum for 
discussing user environment implementation and use.  Given the skillset of 
this community, that should not come as a surprise to anyone.

>If all Internet applications had to support the MIME encoding formats for
>all of their data -- FTP offering quoted-printable as a transfer type --
>then it would be equal to IDNA.

Yeah.  I guess that having MIME in both email and the web does not mean much.

>Furthermore, MIME and its constituent application protocols are being

Dave Crocker | 4 Sep 04:27

Re: Document Status?

At 08:36 PM 9/3/2002 -0500, Eric A. Hall wrote:
>on 9/3/2002 4:16 PM Dave Crocker wrote:
> > 10 years, Eric.  10 years.  So far.  And still counting.
>
>What is this comment supposed to mean exactly?

It means that talking about such work for MIME does not mean much, since we 
still do not have a demonstration that it is particularly popular 
yet.  Until it is popular, it does not demonstrate real need or real solution.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Re: Document Status?

On 05:46 04/09/02, Dave Crocker said:
>You think otherwise, so the real question is the basis for YOUR 
>issues.  So far, you appear to be preferring to rely on possible futures, 
>and on claims that IDNA is somehow special.  Yet the implementation and 
>operational impacts of this specialness has not yet been made clear

Dear Dave,
I am afraid you base your comments on what you consider to be today's 
standard, common practice. For you, that practice is to use standard 
writing when sending mail or entering a domain name. Universally 
enlarging that capacity (such as by using 16 or 64 bytes character coding) 
is, for you, a future scenario of moderate interest.

Our practice today is to be blocked by a past scenario (8 bits) from 
using standard writing when sending mail or entering a domain name. 
We are only trying to remove that specialness asap, because we have suffered 
for years from that limited implementation and from its operational impact on 
all our other applications and everyday life. It is as if you were obliged 
to use computers only in Cyrillic uppercase.

As long as this has not been accepted, we obviously have a problem of 
understanding and of wording, which may result in disinterest and in 
alternative solutions being used there too.
jfc

Dave Crocker | 4 Sep 05:46

Re: Document Status?

At 10:20 PM 9/3/2002 -0500, Eric A. Hall wrote:

>on 9/3/2002 9:27 PM Dave Crocker wrote:
> > At 08:36 PM 9/3/2002 -0500, Eric A. Hall wrote:
> >
> >> on 9/3/2002 4:16 PM Dave Crocker wrote:
> >>
> >>> 10 years, Eric.  10 years.  So far.  And still counting.
> >>
> >> What is this comment supposed to mean exactly?
> >
> > It means that talking about such work for MIME does not mean much,
> > since we still do not have a demonstration that it is particularly
> > popular yet.
>
>Binary transfers aren't popular with HTTP?

They are not popular in email.  Please remember that IDNA tackles the 
problem of placing 8-bit data in a 7-bit field, the same as MIME for email.

>I still don't understand the "10 years" comment,

The comment is about how long it takes to make change and it is about being 
careful what we use for making decisions.  Making decisions based on a 
hypothetical future is usually a mistake.  Making a decision based on 
actual practise is always safer and usually makes for better decisions.

>  either. RFC 1341 defined
>a Content-Transfer-Encoding type of BINARY in 1992, while RFC 1830 defined
>a binary transfer mode for SMTP in 1995, while RFC 1945 did the same for

Dave Crocker | 4 Sep 08:05

Re: Document Status?

At 12:38 AM 9/4/2002 -0500, Eric A. Hall wrote:
>on 9/3/2002 10:46 PM Dave Crocker wrote:
> > You should worry about the fact that after 10 years, it still does not
> > have a particularly major position in Internet mail.
>
>Oh please, that's more of a reflection on deploying features into a
>hop-by-hop network than anything. Cripes, ~everything takes a decade to
>widely deploy with SMTP.  Meanwhile, you are blatantly ignoring the other
>side of the coin, which is that protocols like FTP and HTTP had a null
>deployment window simply because they did not require encoding (FTP could
>ignore it, and HTTP could go straight to binary).

1.  Long revision adoption latencies are primarily caused by installed base.

2.  To use your own perspective, DNS is a hop-by-hop system, when viewed 
properly.

> > the real question is the basis for YOUR issues.
>
>My position has been consistent. Regardless, this thread is about your
>claim that IDNA is "the same as MIME" which is patently false.

The comment was that the encoding approach of IDNA was the same as the MIME 
encoding approach for binary data in a 7-bit environment.  And that, I'm 
afraid, is patently true.

>  They have different usage models and different architectures.

they have the same architecture.  8-bit data translated onto a 7-bit 
environment.

Dave Crocker | 4 Sep 09:11

Re: Document Status?

At 01:41 AM 9/4/2002 -0500, Eric A. Hall wrote:
>on 9/4/2002 1:05 AM Dave Crocker wrote:
> > 2.  To use your own perspective, DNS is a hop-by-hop system, when
> > viewed properly.
>
>Indeed it is, ...
>I have also said that a *complete* migration of the DNS service would be a
>20 year effort, so I don't know what your statement hopes to disprove.

you were somehow implying that email's hop-by-hop nature was a relevant 
point of distinction.

i was merely trying to take that perspective and note that it applies 
equally to the DNS.

> >> They have different usage models and different architectures.
> >
> > they have the same architecture.  8-bit data translated onto a 7-bit
> > environment.
>
>MIME provides multiple data-types with multiple codecs, with applications
>being able to choose among the data-types and codecs that are best suited
>to that specific usage. IDNA provides a single [usable] data-type with a
>single codec. They are completely different architectures.

The comparison was not between their full architectures.

It was between the encoding methods, in particular the one MIME uses for 
binary data...  Does it help you if 'base64' is cited explicitly?
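
The encoding comparison can be sketched side by side. This is a minimal illustration only, assuming Python's standard `base64` module and its built-in `idna` codec (which implements the ToASCII and ToUnicode operations of RFC 3490); the name `élève.com` is borrowed from elsewhere in this thread purely as an example. Both encodings take data that does not fit a 7-bit field and emit plain ASCII that can be reversed at the far end.

```python
import base64

name = "élève.com"

# MIME-style: arbitrary 8-bit octets carried as 7-bit-safe base64 text.
octets = name.encode("utf-8")
mime_form = base64.b64encode(octets)

# IDNA-style: each non-ASCII label becomes an ASCII-compatible ("xn--") label.
ace_form = name.encode("idna")

print(mime_form)
print(ace_form)

# Both forms are plain 7-bit ASCII, and both are reversible.
print(all(b < 0x80 for b in mime_form), all(b < 0x80 for b in ace_form))
print(base64.b64decode(mime_form).decode("utf-8") == name)
print(ace_form.decode("idna") == name)
```

One difference the sketch makes visible: the ACE form leaves an already-ASCII label such as `com` untouched, while base64 re-encodes every octet it is given.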

Eric A. Hall | 4 Sep 08:41

Re: Document Status?


on 9/4/2002 1:05 AM Dave Crocker wrote:

> 2.  To use your own perspective, DNS is a hop-by-hop system, when
> viewed properly.

Indeed it is, with proxies and replication and everything. But we're not
just talking about DNS, we're also talking about the applications that
have to accept domain names and use them for in-band structured data. The
majority of the applications which will make use of domain names do not
suffer these considerations, and they can be enabled with a simple option
negotiation sequence.

I have also said that a *complete* migration of the DNS service would be a
20 year effort, so I don't know what your statement hopes to disprove.

>> They have different usage models and different architectures.
>
> they have the same architecture.  8-bit data translated onto a 7-bit
> environment.

MIME provides multiple data-types with multiple codecs, with applications
being able to choose among the data-types and codecs that are best suited
to that specific usage. IDNA provides a single [usable] data-type with a
single codec. They are completely different architectures.

> Eric, forgive me.  I forgot that you had extensive experience in
> developing and revising these protocols and in seeing them deployed.

Your mix of ignorance and paternalistic politics is what makes people

Eric A. Hall | 4 Sep 07:38

Re: Document Status?


on 9/3/2002 10:46 PM Dave Crocker wrote:

> You should worry about the fact that after 10 years, it still does not
> have a particularly major position in Internet mail.

Oh please, that's more of a reflection on deploying features into a
hop-by-hop network than anything. Cripes, ~everything takes a decade to
widely deploy with SMTP. Meanwhile, you are blatantly ignoring the other
side of the coin, which is that protocols like FTP and HTTP had a null
deployment window simply because they did not require encoding (FTP could
ignore it, and HTTP could go straight to binary).

> the real question is the basis for YOUR issues.

My position has been consistent. Regardless, this thread is about your
claim that IDNA is "the same as MIME" which is patently false. They have
different usage models and different architectures. Such a comparison is
useful to the hoi polloi but it is a dead-end as any sort of engineering
discussion point.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Eric A. Hall | 4 Sep 05:20

Re: Document Status?


on 9/3/2002 9:27 PM Dave Crocker wrote:
> At 08:36 PM 9/3/2002 -0500, Eric A. Hall wrote:
>
>> on 9/3/2002 4:16 PM Dave Crocker wrote:
>>
>>> 10 years, Eric.  10 years.  So far.  And still counting.
>>
>> What is this comment supposed to mean exactly?
>
> It means that talking about such work for MIME does not mean much,
> since we still do not have a demonstration that it is particularly
> popular yet.

Binary transfers aren't popular with HTTP?

I still don't understand the "10 years" comment, either. RFC 1341 defined
a Content-Transfer-Encoding type of BINARY in 1992, while RFC 1830 defined
a binary transfer mode for SMTP in 1995, while RFC 1945 did the same for
HTTP in 1996. That is ~four years between publication of the original spec
and the two most common protocols.

If you originally meant to say that i18n DNS *should* be just like MIME,
then I would agree with that statement for as far as it would carry us.
That would of course entail reducing IDNA down to a simple codec and
defining alternative codecs that protocols could deploy according to their
usage models. It seems that you agree with part of this but not the other
part, although I'm unable to figure out where the line is.

Can you state your issues without using this metaphor?

Eric A. Hall | 4 Sep 03:36

Re: Document Status?


on 9/3/2002 4:16 PM Dave Crocker wrote:

>>Furthermore, MIME and its constituent application protocols are being
>>extended so that binary data *CAN* be transferred without being encoded
>>first.
> 
> 10 years, Eric.  10 years.  So far.  And still counting.

What is this comment supposed to mean exactly?

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Eric A. Hall | 3 Sep 22:34

Re: Document Status?


on 9/3/2002 12:49 PM Dave Crocker wrote:

> ACE technology is the same as used for MIME.  Is MIME a transitional
> strategy?

ACE is not the same as MIME. I have avoided making this argument because
the comparison is useful for introductory purposes, but there are subtle
yet strong differences at lower levels which keep them from being mirrors.

Specifically, end-nodes typically extract data from MIME and use it within
the local context of that application *ONLY*. The encoded data is RARELY
used inside of other applications, and the extracted data is typically
used for this purpose, when needed. On the other hand, domain names are
frequently cross-populated among applications, and as the argument
regarding clipboards shows, the encoded form will be the norm.

If all Internet applications had to support the MIME encoding formats for
all of their data -- FTP offering quoted-printable as a transfer type --
then it would be equal to IDNA.

Furthermore, MIME and its constituent application protocols are being
extended so that binary data *CAN* be transferred without being encoded
first. Think about why this is so, and then ask yourself if the ability to
transfer i18n domain names in binary form would also not be desirable, for
the same kinds of reasons.

As far as that goes, an argument for MIME equality is an argument in favor
of an unencoded transfer mode.


Paul Hoffman / IMC | 4 Sep 01:51

Re: Document Status?

At 3:34 PM -0500 9/3/02, Eric A. Hall wrote:
>Specifically, end-nodes typically extract data from MIME and use it within
>the local context of that application *ONLY*. The encoded data is RARELY
>used inside of other applications, and the extracted data is typically
>used for this purpose, when needed. On the other hand, domain names are
>frequently cross-populated among applications, and as the argument
>regarding clipboards shows, the encoded form will be the norm.

Many people copy-and-paste MIME content all the time, just as they 
will copy-and-paste IDNs. Non-mail programs look at mail mailboxes 
for things like harvesting for address books or PIMs, and those 
programs interpret the MIME because they store the text in some 
internal format that is certainly not MIME.

>Furthermore, MIME and its constituent application protocols are being
>extended so that binary data *CAN* be transferred without being encoded
>first.

Which Internet Drafts are you speaking of here?

--Paul Hoffman, Director
--Internet Mail Consortium

Dave Crocker | 4 Sep 04:28

Re: Document Status?

At 10:57 AM 9/4/2002 +0900, Soobok Lee wrote:
>MIME (and even RFC 2047) encodes texts or descriptors, not critical 
>identifiers
>like domain names as far as i understand.

How does that difference matter?  What is it about that difference that 
makes experience with MIME not apply for IDNA?

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Dave Crocker | 4 Sep 05:34

Re: Document Status?

Mostly the problem with this thread is that people are trying to debate 
concepts and principles, but they are failing to provide any detailed 
scenarios that cause problems.

At 04:53 AM 9/4/2002 +0200, Simon Josefsson wrote:
>Dave Crocker <dhc <at> dcrocker.net> writes:
>If I press the PRINT button on an IDNA in my mail reader, it is
>important that people are able to put it back into the original IDNA

If you are printing it, you are not "putting it back" anywhere, so i have 
no idea what you mean.

In any event, as noted, user interfaces deal with multiple data types just 
fine.  There is nothing about an IDNA that makes it particularly distinctive 
from other, labeled, structured data types that UIs already deal with.

But mostly this line of discussion continues to fail to distinguish between 
over-the-wire encoding, versus encodings within the host.  After 30 years 
of networking protocols, it would be nice for people to keep this 
distinction clear.

>IDNA will be used to identify entities.  If there is any confusion as
>to which identity a certain IDNA really points at, there will be
>security consequences.

IDNA strings are unique domain name strings.  They are unambiguous 
identifiers.  Nothing that has been presented here alters that simple 
fact.  Hence there is no way of guessing what "confusion" you are referring to.

And, by the way, it has nothing to do with the encoding trick that makes 

Dave Crocker | 4 Sep 08:44

Re: Re: Document Status?

At 01:19 AM 9/4/2002 -0500, Eric A. Hall wrote:
> > If you are printing it, you are not "putting it back" anywhere, so i
> > have no idea what you mean.
>
>I would assume he means typing the printed output into an application,

the printed output is a valid, 7-bit domain name.  typing it in will work 
the same as all other 7-bit domain names.

> > In any event, as noted, user interfaces deal with multiple data types
> > just fine.  There is nothing about an IDNA that makes it particularly
> > distinctive from other, labeled, structured data types that UIs already
> > deal with.
>
>Few will care if the subject line isn't transcribed properly, but

1.  No one has said that UI issues are not important.

2.  The UIs handle these sorts of things quite well already.

>  This is patently false with the email
>address however, since the headers will break the protocol.

I don't know what "headers" you mean, but in any event, the ACE version of 
an IDN is valid to type in anywhere that a "regular" 7-bit ascii domain 
name is allowed.

So, if you are claiming that it won't work, you still need to provide a 
specific, concrete scenario that will fail.
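As a rough illustration of the claim that the ACE form is acceptable wherever a 7-bit name is, the sketch below checks a name against the classic letters-digits-hyphen label rule. It assumes Python's built-in `idna` codec, which implements IDNA as later published in RFC 3490 and may differ from the draft under discussion; the sample name and the helper `is_ldh_name` are hypothetical.

```python
import re

# Classic host-name label: letters, digits, hyphen; no leading or
# trailing hyphen; 1 to 63 octets.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_ldh_name(name: str) -> bool:
    """True if every dot-separated label satisfies the LDH rule."""
    labels = name.rstrip(".").split(".")
    return all(LDH_LABEL.fullmatch(label) is not None for label in labels)

unicode_name = "bücher.example"                       # hypothetical name
ace_name = unicode_name.encode("idna").decode("ascii")

print(ace_name, is_ldh_name(ace_name))   # ACE form passes the 7-bit check
print(is_ldh_name(unicode_name))         # the Unicode form does not
```

This only shows syntactic validity; it says nothing about whether a person can retype the ACE form reliably, which is the other half of the disagreement.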


Eric A. Hall | 4 Sep 08:19

Re: Re: Document Status?


on 9/3/2002 10:34 PM Dave Crocker wrote:

> At 04:53 AM 9/4/2002 +0200, Simon Josefsson wrote:
>
>> Dave Crocker <dhc <at> dcrocker.net> writes: If I press the PRINT button
>> on an IDNA in my mail reader, it is important that people are able to
>> put it back into the original IDNA
>
> If you are printing it, you are not "putting it back" anywhere, so i
> have no idea what you mean.

I would assume he means typing the printed output into an application,
since that's pretty much what he said.

> In any event, as noted, user interfaces deal with multiple data types
> just fine.  There is nothing about an IDNA that makes it particularly
> distinctive from other, labeled, structured data types that UIs already
> deal with.

Few will care if the subject line isn't transcribed properly, but
everybody will care if an email address or URL isn't transcribed properly.
To stick with the current example, this means that printed output will
also have to use the encoded form if it is to survive. Even with mail
bodies this isn't a problem since the body itself is still relatively
unstructured, and something like kanji text being exchanged between two
japanese-speaking users is likely to be understandable by those users even
if it is packaged in non-MIME form. This is patently false with the email
address however, since the headers will break the protocol.


Dave Crocker | 4 Sep 08:14

Re: Document Status?

At 07:57 AM 9/4/2002 +0200, Simon Josefsson wrote:
>+ The entire world doesn't use Unicode, which is where IDNA starts.

Protocol standards rarely cover 100% of all possible situations.

The result is a limitation, not a failure.  The difference is key.

>+ The choice of Unicode normalization KC has been questioned.

Are you claiming that a) the behavior is not well understood, or that b) 
the working group did not reach rough consensus on this matter?  If you are 
claiming anything else, then it is not a "failure".

>+ Any modifications to the Unicode code charts or normalizations
>   tables destroy stability of IDN.

Even I remember this issue being resolved.  Efforts like these always have 
an issue with outside work being incorporated, and that outside work 
getting revised.

The IETF approach is the usual one:  The specification refers to a specific 
version of Unicode.  If Unicode gets revised, the IETF may consider 
adopting it.  Just because there is a new version of Unicode, the old one 
does not stop working.
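One way to picture version pinning, as a sketch only: Python happens to ship a frozen Unicode 3.2 view (`unicodedata.ucd_3_2_0`) alongside its current tables, so the same code point can be queried under both. The choice of U+1E9E, a character added to Unicode after version 3.2, is an illustrative assumption and does not come from the thread.

```python
import unicodedata
from unicodedata import ucd_3_2_0  # frozen Unicode 3.2 view shipped with Python

CAPITAL_SHARP_S = "\u1e9e"  # assigned in a later Unicode version than 3.2

def category_in(db, ch: str) -> str:
    """General category of ch according to the given Unicode database."""
    return db.category(ch)

print("pinned version: ", ucd_3_2_0.unidata_version)
print("current version:", unicodedata.unidata_version)
print("pinned category: ", category_in(ucd_3_2_0, CAPITAL_SHARP_S))
print("current category:", category_in(unicodedata, CAPITAL_SHARP_S))
```

The pinned tables keep answering the same way regardless of what later versions add, which is the sense in which the old version "does not stop working".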

>+ Unicode normalization and bidi rules interact problematically.

Please refer to the "are you claiming" response, above.  It applies here, too.

>These are things I've discovered by participating here for a month or

Simon Josefsson | 6 Sep 04:15

Re: Document Status?

Dave Crocker <dhc <at> dcrocker.net> writes:

[snip]
> Are you claiming that a) the behavior is not well understood, or that
> b) the working group did not reach rough consensus on this matter?  If
> you are claiming anything else, then it is not a "failure".
[snip]

You said people in this thread argued without having detailed specific
scenarios that fail in mind.  I tried to provide some specific such
scenarios that I had in mind.  I don't claim any of the above, nor
that IDN is a failure.

Simon Josefsson | 4 Sep 07:57

Re: Document Status?

Dave Crocker <dhc <at> dcrocker.net> writes:

> Mostly the problem with this thread is that people are trying to
> debate concepts and principles, but they are failing to provide any
> detailed scenarios that cause problems.

The failure scenarios have been named, and as a result I think some
are now even discussed in the specifications.  From the top of my
tired head (I'm sure you'll correct my errors):

+ The entire world doesn't use Unicode, which is where IDNA starts.
  There are examples of characters in european charsets that may fail
  to translate into Unicode properly (e.g., greek beta and german ss
  in CP437).  I suspect this might be more common in non-western
  charsets.  If someone has looked into this area closer, I'd
  appreciate a pointer.  The IDN specifications surely don't deal
  with it.  Detailed scenario that fails: www.ßeta.com browsed from
  CP437 platform.

+ The choice of Unicode normalization KC has been questioned.  Again
  since I'm familiar with european charsets, I have the simple example
  of normalization of ß into ss.  There are supposedly distinct words
  where this normalization process removes the possibility of
  distinguishing between the words.  Non-western charsets probably have
  more cases like this. Detailed scenario that fails: www.masse.de
  (translation: mass, majority) and www.maße.de (translation: metrics,
  gauges) are indistinguishable.

+ Any modifications to the Unicode code charts or normalizations
  tables destroy stability of IDN.  This is handled by locking IDN to
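The www.masse.de / www.maße.de scenario above can be reproduced with a short sketch. It assumes Python's built-in `idna` codec, which follows IDNA as later published in RFC 3490 and so may not match the draft text being debated here. In that implementation the sharp s survives NFKC on its own; the fold to "ss" appears to come from the case-mapping step of Nameprep.

```python
import unicodedata

def to_ascii(name: str) -> str:
    """Apply the codec's per-label ToASCII and return the 7-bit form."""
    return name.encode("idna").decode("ascii")

a = to_ascii("www.masse.de")
b = to_ascii("www.maße.de")
print(a, b, a == b)   # both spellings end up as the same 7-bit name

# NFKC by itself does not rewrite the sharp s.
print(unicodedata.normalize("NFKC", "ß") == "ß")
```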

Doug Ewell | 4 Sep 17:56

Re: Re: Document Status?

Simon Josefsson <jas at extundo dot com> wrote:

> + The entire world doesn't use Unicode, which is where IDNA starts.
>   There are examples of characters in european charsets that may fail
>   to translate into Unicode properly (e.g., greek beta and german ss
>   in CP437).  I suspect this might be more common in non-western
>   charsets.  If someone has looked into this area closer, I'd
>   appreciate a pointer.  The IDN specifications surely don't deal
>   with it.  Detailed scenario that fails: www.ßeta.com browsed from
>   CP437 platform.

This is not a case of a character "failing to translate to Unicode
properly."  That would be true if we were talking about a Glagolitic
letter or Egyptian hieroglyph.  This is a case of ambiguity between
legacy character mappings.
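A small sketch of that ambiguity, using the codec tables bundled with Python (an assumption; other platforms may ship different CP437 tables): the byte 0xE1 means different things under different legacy charsets, and the bundled CP437 table offers the sharp s but no separate Greek beta.

```python
RAW = bytes([0xE1])  # the CP437 byte value at issue

def decode_as(raw: bytes, charset: str) -> str:
    """Interpret the same octets under a given legacy charset."""
    return raw.decode(charset)

def can_encode(text: str, charset: str) -> bool:
    """True if the charset has a slot for every character in text."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

for charset in ("cp437", "latin-1"):
    ch = decode_as(RAW, charset)
    print(f"{charset}: U+{ord(ch):04X}")

# Sharp s has a CP437 slot in this table; Greek small beta does not.
print(can_encode("\u00df", "cp437"), can_encode("\u03b2", "cp437"))
```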

The argument here is that Unicode should not be used for intersystem
communication because characters in legacy character sets have a history
of being overloaded.  Arguments of this type are gradually disappearing
as people begin to realize that the problem exists with or without
Unicode.

The alternative would be to stick with legacy character sets and tag all
IDN's with the character set.  This doesn't solve the overloading
problem, since non-CP437 applications still must decide whether to
convert CP437 0xE1 to German ss or Greek beta.  And you would
additionally run into the ISO 2022 problem, in which applications are
expected to implement the entire repertoire of character conversion
tables, but most don't because of the overhead.  Are you sure that every
Unix, Linux, Windows, and Mac client would provide a mapping table for

Dave Crocker | 4 Sep 05:40

Re: Document Status?

At 12:10 PM 9/4/2002 +0900, Soobok Lee wrote:
>From: "Dave Crocker" <dhc <at> dcrocker.net>
> > At 10:57 AM 9/4/2002 +0900, Soobok Lee wrote:
> > >MIME (and even RFC 2047) encodes texts or descriptors, not critical
> > >identifiers like domain names as far as i understand.
> >
> > How does that difference matter?  What is it about that difference that
> > makes experience with MIME not apply for IDNA?
>
>IDN penetrates into all the applications which use domain names as local or
>...
>I see huge groups of interconnected and interoperating 
>databases,applications and
>...
>domain names have different criteria for interoperability ,

All of your note is devoted to very nice description of concepts and 
principles.  However you do not describe any networking technical scenarios 
that will fail.

IDNA is a networking standard.

If there is a problem with it, please describe a precise scenario for which 
IDNA will fail, and explain how it will fail.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>

Dave Crocker | 4 Sep 06:34

Re: Document Status?

At 01:10 PM 9/4/2002 +0900, Soobok Lee wrote:
>IDNA does not extend the LDH namespace, but just redefines the xx--yyy subset of
>the space to have new i18n meanings.

Sorry, but IDNA very much does define an enhanced character space for 
domain names.

Yes, it also defines a mapping for those enhanced characters onto the 
existing 7-bit ASCII domain name space, but it is important to distinguish 
between the Unicode domain name, versus the ASCII encoding.
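The distinction between the name and its encoding can be sketched as follows, assuming Python's built-in `idna` codec (RFC 3490 as later published; the `xn--` prefix it emits was not yet fixed when this thread was written) and a made-up sample name. Two Unicode spellings that differ only in case are the same name and therefore share one wire form.

```python
def to_wire(name: str) -> str:
    """Unicode domain name -> ASCII-compatible wire form."""
    return name.encode("idna").decode("ascii")

def from_wire(ace: str) -> str:
    """ASCII-compatible wire form -> Unicode domain name."""
    return ace.encode("ascii").decode("idna")

spelling_1 = "Bücher.example"   # hypothetical name as typed by a user
spelling_2 = "bücher.example"   # same name, different case

wire = to_wire(spelling_1)
print(wire)                          # what travels in the protocol
print(wire == to_wire(spelling_2))   # one name, one wire form
print(from_wire(wire))               # what a UI would display
```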

>Moreover, all the mess around i18n which does not belong
>to "network standard" , was not addressed and just postponed or ignored.

Please provide descriptions of specific usage scenarios that demonstrate 
technical failings that are special to IDNA.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Dave Crocker | 4 Sep 06:56

Re: Document Status?

At 01:48 PM 9/4/2002 +0900, Soobok Lee wrote:
> > Sorry, but IDNA very much does define an enhanced character space for
> > domain names.
>
>true in display and input (that is what i meant),

you need to distinguish between the abstract, formal name space, versus the 
over-the-wire encoding of that name space.

the current specification does not make this distinction clear, but that is 
a problem with the documentation, not the formal change.

> > Please provide descriptions of specific usage scenarios that demonstrate
> > technical failings that are special to IDNA.
>
>If some company doesn't want to trust and invite IDN and doesn't want to 
>modify its applications,
>IDN is nonetheless encoded in trusted ASCII, penetrates into the applications, and
>gets trusted by the unmodified applications like other trusted ASCII domains.

You are concerned that an IDNA string somehow has a lower level of 
authentication than a regular, ASCII DNS string?

This cannot be true, since an IDNA string is a regular ASCII DNS string.

>  In the protocol world,
>Dynamic DNS update protocols are affected directly by those holes.

IDNA introduces no changes to the mechanism for Dynamic DNS updating.


Adam M. Costello | 4 Sep 07:41

Re: Document Status?

Dave Crocker <dhc <at> dcrocker.net> wrote:

> you need to distinguish between the abstract, formal name space,
> versus the over-the-wire encoding of that name space.

Just to clarify, this is NOT a picture of the IDNA model:

    +----------+            +---------+
    |          |            |         |
    | abstract |            | encoded |
    | domain   |   <---->   | domain  |
    | labels   |            | labels  |
    |          |            |         |
    +----------+            +---------+

This is a picture of the IDNA model:

    +----------------------------+
    |  internationalized labels  |
    |                            |
    |  +----------------+        |
    |  |  ASCII labels  |        |
    |  |                |        |
    |  |  +--------+    |        |
    |  |  | ACE    |    |        |
    |  |  | labels |    |        |
    |  |  +--------+    |        |
    |  +----------------+        |
    +----------------------------+
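The nesting in the second picture can be read as a membership test. The sketch below is illustrative only: the `xn--` prefix is the one chosen in the later published RFC 3490, assumed here because the thread does not name a prefix, and `classify` is a hypothetical helper.

```python
ACE_PREFIX = "xn--"  # assumption: prefix from the later published RFC 3490

def classify(label: str) -> list:
    """Return every set in the picture that the label belongs to."""
    sets = ["internationalized"]          # outermost box: every label
    if label.isascii():
        sets.append("ASCII")              # middle box
        if label.lower().startswith(ACE_PREFIX):
            sets.append("ACE")            # innermost box
    return sets

for label in ("bücher", "example", "xn--bcher-kva"):
    print(label, classify(label))
```

Membership is cumulative: an ACE label is also an ASCII label and also an internationalized label, which is the point of drawing nested boxes rather than two boxes joined by an arrow.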


Dave Crocker | 7 Sep 18:01

Re: Document Status?

At 10:37 AM 9/7/2002 +0200, Erik Nordmark wrote:
> > This is a picture of the IDNA model:
>
>Hmm - I wonder if putting that picture in the draft would result in
>less confusion.

I kept looking at that picture and stayed confused.  Until this morning 
when I finally realized that the picture is exactly correct.

In other words, the process of resolving the confusion is exactly what the 
picture is good for.

It does not show a connection between the outer-most box and the inner-most 
box, but a) it's not clear how to create one usefully, and b) it is not 
essential for showing the difference between the name space and the 
encoding space.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Erik Nordmark | 7 Sep 10:37

Re: Document Status?

> This is a picture of the IDNA model:
> 
>     +----------------------------+
>     |  internationalized labels  |
>     |                            |
>     |  +----------------+        |
>     |  |  ASCII labels  |        |
>     |  |                |        |
>     |  |  +--------+    |        |
>     |  |  | ACE    |    |        |
>     |  |  | labels |    |        |
>     |  |  +--------+    |        |
>     |  +----------------+        |
>     +----------------------------+
> 

Hmm - I wonder if putting that picture in the draft would result in
less confusion.

  Erik

Adam M. Costello | 8 Sep 03:58

Re: Document Status?

Erik Nordmark <Erik.Nordmark <at> sun.com> wrote:

> >     +----------------------------+
> >     |  internationalized labels  |
> >     |                            |
> >     |  +----------------+        |
> >     |  |  ASCII labels  |        |
> >     |  |                |        |
> >     |  |  +--------+    |        |
> >     |  |  | ACE    |    |        |
> >     |  |  | labels |    |        |
> >     |  |  +--------+    |        |
> >     |  +----------------+        |
> >     +----------------------------+
> > 
> 
> Hmm - I wonder if putting that picture in the draft would result in
> less confusion.

Maybe, although I must warn you that the above picture is simplified,
and contains a slight lie.  The box called "ASCII labels" is really
"labels X such that Nameprep(X) is ASCII".  Let's use the phrase
"effectively ASCII labels" for that.  An accurate and complete picture
would be:

    +-----------------------------------+
    | internationalized labels          |
    |                                   |
    |  +---------------------------+    |
    |  | effectively ASCII labels  |    |
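The definition of "effectively ASCII" given above (labels X such that Nameprep(X) is ASCII) can be tried out with a short sketch. It leans on `nameprep`, an undocumented helper in CPython's `encodings.idna` module, so treat its availability as an assumption; the fullwidth sample label is made up.

```python
from encodings.idna import nameprep  # undocumented CPython helper (assumption)

def is_effectively_ascii(label: str) -> bool:
    """True if Nameprep maps the label to a pure-ASCII string."""
    try:
        return nameprep(label).isascii()
    except UnicodeError:
        return False  # label contains characters Nameprep prohibits

fullwidth = "\uff45\uff58\uff41\uff4d\uff50\uff4c\uff45"  # "example" in fullwidth letters

print(is_effectively_ascii("example"))   # ASCII already
print(is_effectively_ascii(fullwidth))   # not ASCII, but Nameprep makes it so
print(is_effectively_ascii("bücher"))    # stays non-ASCII after Nameprep
```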

Simon Josefsson | 8 Sep 18:18

Re: Document Status?

"Adam M. Costello" <idn.amc+0 <at> nicemice.net.RemoveThisWord> writes:

>     +-----------------------------------+
>     | internationalized labels          |
>     |                                   |
>     |  +---------------------------+    |
>     |  | effectively ASCII labels  |    |
>     |  |                           |    |
>     |  |  +--------------+         |    |
>     |  |  | ASCII labels |         |    |
>     |  |  |              |         |    |
>     |  |  |  +-----------+----+    |    |
>     |  |  |  |    ACE labels  |    |    |
>     |  |  |  |           |    |    |    |
>     |  |  |  |           |    |    |    |
>     |  |  |  +-----------+----+    |    |
>     |  |  +--------------+         |    |
>     |  +---------------------------+    |
>     +-----------------------------------+
>
>
> Unfortunately, the picture is starting to look daunting.  The exact
> subset relationships between every pair of the various sets is not so
> important, as long as you know these two:
>
>   * The ASCII labels are a subset of the internationalized labels.
>   * The ACE labels are a subset of the internationalized labels.
>
> The other things you need to know to make sense of the model, which are
> not easily depicted, are:

Soobok Lee | 4 Sep 07:09

Re: Document Status?


----- Original Message ----- 
From: "Dave Crocker" <dhc <at> dcrocker.net>
> >
> >If some company doesn't want to trust and invite IDN and doesn't want to 
> >modify its applications,
> >IDN is nonetheless encoded in trusted ASCII, penetrates into the applications, and
> >gets trusted by the unmodified applications like other trusted ASCII domains.
> 
> You are concerned that an IDNA string somehow has a lower level of 
> authentication than a regular, ASCII DNS string?
> 
> This cannot be true, since an IDNA string is a regular ASCII DNS string.
> 
> >  In the protocol world,
> >Dynamic DNS update protocols are affected directly by those holes.
> 
> IDNA introduces no changes to the mechanism for Dynamic DNS updating.
> 

Many reasons have been suggested why some companies may not want to trust
and support IDN (IDNA + all other UTF-8 based proposals) in their applications.
IDN introduces ambiguity in layers higher than machine protocols.

By "trust" I mean that 7-bit applications have historically accepted 7-bit ASCII domains
as fairly "unambiguous" identifiers and have been putting trust in them.
I didn't mean trust mechanisms like DNSSEC or X.509, just the common trust in ASCII,
which is the greatest common divisor of all internationally used scripts.

IDNA strings do not deserve the same trust as ASCII domains have now in end 

Soobok Lee | 4 Sep 06:48

Re: Document Status?


----- Original Message ----- 
From: "Dave Crocker" <dhc <at> dcrocker.net>
To: "Soobok Lee" <lsb <at> postel.co.kr>
Cc: <idn <at> ops.ietf.org>
Sent: Wednesday, September 04, 2002 1:34 PM
Subject: Re: [idn] Document Status?

> At 01:10 PM 9/4/2002 +0900, Soobok Lee wrote:
> >IDNA does not extend the LDH namespace, but just redefines the xx--yyy subset of
> >the space to have new i18n meanings.
> 
> Sorry, but IDNA very much does define an enhanced character space for 
> domain names.

True in display and input (that is what I meant), but the norm is ACE in protocols.
Of course, some new application protocols may negotiate to use native encoding instead.

> 
> Yes, it also defines a mapping for those enhanced characters onto the 
> existing 7-bit ASCII domain name space, but it is important to distinguish 
> between the Unicode domain name, versus the ASCII encoding.
> 
> 
> >Moreover, all the mess around i18n which does not belong
> >to "network standard" , was not addressed and just postponed or ignored.
> 
> Please provide descriptions of specific usage scenarios that demonstrate 
> technical failings that are special to IDNA.


Soobok Lee | 4 Sep 06:10

Re: Document Status?


----- Original Message ----- 
From: "Dave Crocker" <dhc <at> dcrocker.net>
> All of your note is devoted to very nice description of concepts and 
> principles.  However you do not describe any networking technical scenarios 
> that will fail.
> 
> IDNA is a networking standard.
> 

IDNA does not extend the LDH namespace, but just redefines the xx--yyy subset of
the space to have new i18n meanings.  They are ASCII domain names that are declared
as displayable in other scripts. Networking scenarios around IDNA are the same as
those of ASCII domains, in principle. IDNA may work well with DNS and SMTP networking.
But with other protocols, or new ones to come, it would create further obstacles to
optimal i18n of protocols.

Moreover, all the mess around i18n which does not belong
to "network standard" was not addressed and was just postponed or ignored.
To be usable, security issues and interoperability issues in applications should be
addressed in IDN standards, because IDN will penetrate into all applications as soon
as we issue some drafts and a few vendors begin to support IDN.
Backwards compatibility with respect to networking alone seems not enough from
the angle of how to fulfill the need for which we try to provide i18n'ed protocols
to end users.

Networking is not enough. Applications' needs matter.

Soobok Lee


Soobok Lee | 4 Sep 05:10

Re: Document Status?


----- Original Message ----- 
From: "Dave Crocker" <dhc <at> dcrocker.net>
To: "Soobok Lee" <lsb <at> postel.co.kr>
Cc: <idn <at> ops.ietf.org>; "Paul Hoffman / IMC" <phoffman <at> imc.org>
Sent: Wednesday, September 04, 2002 11:28 AM
Subject: Re: [idn] Document Status?

> At 10:57 AM 9/4/2002 +0900, Soobok Lee wrote:
> >MIME (and even RFC 2047) encodes texts or descriptors, not critical 
> >identifiers
> >like domain names as far as i understand.
> 
> How does that difference matter?  What is it about that difference that 
> makes experience with MIME not apply for IDNA?
> 

IDN penetrates into all the applications which use domain names as local or
online identifiers, while MIME is used within Internet messaging applications.
For example, MIME is not used in customer enrollment databases, while domain
names are. Will future SQL statements support separate IDN comparison operators?
One database uses ACE in its primary key, while a joined database on another host uses UTF-8.
Is that allowed?
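To make the database question concrete, here is a hedged sketch of one possible answer: reduce both spellings to a single canonical key before comparing. The table contents, record names, and `canonical_key` helper are all hypothetical, and Python's built-in `idna` codec (RFC 3490 as later published) stands in for whatever conversion the applications would actually use.

```python
def canonical_key(name: str) -> str:
    """Reduce either spelling (ACE or Unicode) to one lowercase ASCII key."""
    return name.encode("idna").decode("ascii").lower()

# Hypothetical tables: one keyed by ACE, the other by the Unicode form.
customers = {"xn--bcher-kva.example": "customer-17"}
billing = {"bücher.example": "invoice-204"}

customers_by_key = {canonical_key(k): v for k, v in customers.items()}

# Join the two tables on the canonical key.
joined = {}
for name, invoice in billing.items():
    key = canonical_key(name)
    if key in customers_by_key:
        joined[key] = (customers_by_key[key], invoice)

print(joined)
```

Without some such step, a byte-wise comparison of the two primary keys would treat the same domain as two different customers, which seems to be the interoperability concern being raised.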

I see huge groups of interconnected and interoperating databases, applications and
directories which come from different sources and regions and have their own management
policies in the global Internet.

domain names have different criteria for interoperability ,
revision, and stability than messages and MIME have.  introducing IDN will

Simon Josefsson | 4 Sep 04:53

Re: Document Status?

Dave Crocker <dhc <at> dcrocker.net> writes:

> At 10:57 AM 9/4/2002 +0900, Soobok Lee wrote:
>> MIME (and even RFC 2047) encodes texts or descriptors, not critical
>> identifiers
>> like domain names as far as i understand.
>
> How does that difference matter?  What is it about that difference
> that makes experience with MIME not apply for IDNA?

If I press the PRINT button on a MIME part in my mail reader, no one
cares that the rendered MIME body, which is printed, is infeasible to
put back into the original MIME part.

If I press the PRINT button on an IDNA in my mail reader, it is
important that people are able to put it back into the original IDNA
[given some reasonable assumptions about the people].

IDNA will be used to identify entities.  If there is any confusion as
to which identity a certain IDNA really points at, there will be
security consequences.  MIME doesn't have this property.

Soobok Lee | 4 Sep 03:57

Re: Document Status?


----- Original Message ----- 
From: "Paul Hoffman / IMC" <phoffman <at> imc.org>
To: <idn <at> ops.ietf.org>
Sent: Wednesday, September 04, 2002 8:51 AM
Subject: Re: [idn] Document Status?

> At 3:34 PM -0500 9/3/02, Eric A. Hall wrote:
> >Specifically, end-nodes typically extract data from MIME and use it within
> >the local context of that application *ONLY*. The encoded data is RARELY
> >used inside of other applications, and the extracted data is typically
> >used for this purpose, when needed. On the other hand, domain names are
> >frequently cross-populated among applications, and as the argument
> >regarding clipboards shows, the encoded form will be the norm.
> 
> Many people copy-and-paste MIME content all the time, just as they 
> will copy-and-paste IDNs. Non-mail programs look at mail mailboxes 
> for things like harvesting for address books or PIMs, and those 
> programs interpret the MIME because they store the text in some 
> internal format that is certainly not MIME.

MIME (and even RFC 2047) encodes texts or descriptors, not critical identifiers
like domain names, as far as I understand.
So I can't accept any attempt to draw an analogy between MIME and IDN encodings.

Soobok Lee

Eric A. Hall | 4 Sep 03:25

Re: Document Status?


on 9/3/2002 6:51 PM Paul Hoffman / IMC wrote:
> At 3:34 PM -0500 9/3/02, Eric A. Hall wrote:

>>Furthermore, MIME and its constituent application protocols are being
>>extended so that binary data *CAN* be transferred without being encoded
>>first.
> 
> Which Internet Drafts are you speaking of here?

| Network Working Group                                       G. Vaudreuil
| Request for Comments: 3030                           Lucent Technologies
| Obsoletes: 1830                                            December 2000
| Category: Standards Track
|
|                         SMTP Service Extensions
|                        for Transmission of Large
|                         and Binary MIME Messages

--

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Dave Crocker | 4 Sep 00:47

Re: Document Status?

At 05:38 PM 9/3/2002 -0400, Edmon Chung wrote:
>I cannot agree more to Eric's analogy.  To me the key is that IDN has to be
>used in multiple apps and it is therefore important we consider that.

TCP is used in multiple apps.  I guess we had better not standardize it 
until we have considered that.

HTTP is used in multiple apps.  I guess we...

You are confusing host-specific implementation issues with network protocol 
issues.

> > Furthermore, MIME and its constituent application protocols are being
> > extended so that binary data *CAN* be transferred ...
>
>This is what IDNA has not provided us with and what we probably should
>consider.  I am not sure when... but now might be a good time to start.

Actually, now is a particularly bad time to start.

Starting now provides excellent distraction from the necessary focus on the 
immediate work, namely IDNA.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850


Dave Crocker | 4 Sep 01:54

Re: Document Status?

At 07:43 PM 9/3/2002 -0400, Edmon Chung wrote:
>fine.  aside from making the doc more readable to ppl who need to use it on
>development works, what else do we need to add to IDNA?

Very little is required, in my view.

I have twice listed my own concerns on this list, and issued an I-D with 
suggested changes to resolve those concerns.

What has been missing is any public discussion and resolution of those few 
points.

When IDNA is issued as a Draft Standard it will be time to consider 
enhancements.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Edmon Chung | 4 Sep 01:43

Re: Document Status?

fine.  aside from making the doc more readable to ppl who need to use it on
development works, what else do we need to add to IDNA?  Personally I don't
quite like the new bidi arrangement, but since I know nothing about the
languages, I have nothing much to offer.  Except perhaps that I do know that
Chinese also has a bidi issue!  I haven't seen anyone raise this, but sure
enough Chinese could be written from right to left with English words going
left to right.  Pretty much like the Arabic case.  Anyway, that was not my
point... Other than that, I think the IDNA framework is fairly stable now,
whether it is because people are just too tired to keep talking about it or
because we have successfully dealt with the multiple issues and found some
ground that we could live with.  Going forward, there are really just two
things I can see that need real thought:

1. A set of possible transition schemes that will get us from here to IDNA
2. Do we want anything beyond IDNA?  Perhaps a protocol that handles the
binary correctly...

what are your thoughts on the remaining issues anyway?

Edmon

----- Original Message -----
From: "Dave Crocker" <dhc <at> dcrocker.net>
To: "Edmon Chung" <edmon <at> neteka.com>
Cc: "Eric A. Hall" <ehall <at> ehsco.com>; <idn <at> ops.ietf.org>
Sent: Tuesday, September 03, 2002 6:47 PM
Subject: Re: [idn] Document Status?

> At 05:38 PM 9/3/2002 -0400, Edmon Chung wrote:
> >I cannot agree more to Eric's analogy.  To me the key is that IDN has to

Edmon Chung | 3 Sep 23:38

Re: Document Status?

I cannot agree more to Eric's analogy.  To me the key is that IDN has to be
used in multiple apps and it is therefore important we consider that.

> Furthermore, MIME and its constituent application protocols are being
> extended so that binary data *CAN* be transferred without being encoded
> first. Think about why this is so, and then ask yourself if the ability to
> transfer i18n domain names in binary form would also not be desirable, for
> the same kinds of reasons.

This is what IDNA has not provided us with and what we probably should
consider.  I am not sure when... but now might be a good time to start.

Edmon

Dave Crocker | 3 Sep 20:18

Re: Document Status?

At 01:59 PM 9/3/2002 -0400, Edmon Chung wrote:
>I am just saying that it is important to make clear what the "standard" is
>and what the "transition" might look like.

A standard is what we are specifying now and what will be used now.

Anything for the future is just that.  For the future.

Having a discussion about possible transition issues and targets is fine, 
but it is not fine to deprecate the current work as "merely" transitional.

Something that is going to be used for more than a decade is more than 
transitional.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Edmon Chung | 3 Sep 21:52

Re: Document Status?

ok, let me make it really clear.  In my opinion this is where I see us
heading:
1. the standard being IDNA (MIME like approach)
2. transition strategies that take into consideration immediate user
experience will be deployed by responsible registries for their users
    - these may be and are most likely 8bit+ solutions
    - and take the query sent by existing applications and resolve it
correctly

I am saying that my work on TRACE-01 is transitional.  Yes maybe it will be
used for a decade and more, but it is meant to be an informational document
on how to get from where we are now towards IDNA, however long it may take.

Edmon

----- Original Message -----
From: "Dave Crocker" <dhc <at> dcrocker.net>
To: "Edmon Chung" <edmon <at> neteka.com>
Cc: <idn <at> ops.ietf.org>
Sent: Tuesday, September 03, 2002 2:18 PM
Subject: Re: [idn] Document Status?

> At 01:59 PM 9/3/2002 -0400, Edmon Chung wrote:
> >I am just saying that it is important to make clear what the "standard" is
> >and what the "transition" might look like.
>
> A standard is what we are specifying now and what will be used now.
>
> Anything for the future is just that.  For the future.

Edmon Chung | 3 Sep 19:59

Re: Document Status?


----- Original Message -----
From: "Dave Crocker" <dhc <at> dcrocker.net>
> ACE technology is the same as used for MIME.  Is MIME a transitional strategy?
>
> MIME is 10 years old.  When are we going to see its replacement?
>
> Why will ACE follow a path that is different?

I am not saying it will ;-)

In fact, I believe it will take as long if not longer...

I am just saying that it is important to make clear what the "standard" is
and what the "transition" might look like.  They are two different
discussions.  We could say that the "transition" is to have 8bit clean, but
it sure doesn't look like a viable long term "standard" to me.

Edmon

Edmon Chung | 3 Sep 19:30

Re: Document Status?

Hi Steve,

I concur with your analogy, but I think it is a transition strategy for
registries you are talking about here.  How responsible registry operators
should roll out multilingual domain names to their users, and how
responsible software providers should allow passageway for resolutions to
happen immediately.

Since this is a transitional strategy, it is not going to be the "standard"
and I think this is a very important distinction to be made.  To that end, I
have put together an I-D that is intended to be "informational"
(http://www.ietf.org/internet-drafts/draft-ietf-idn-dnsii-trace-01.txt) for
registries, or shall I say industry operators who actually care about the
fact that names sold work.

It talks about how a registry/DNS operator could prepare its zone to respond
positively to IDN requests sent by existing applications, while being fully
prepared for the eventual "standard".  The reason this is very important, is
that we all know that users will attempt to access multilingual domain names
they see using their existing applications and that request will reach the
registry name servers.  It will be very irresponsible to ignore these
requests, and create a negative user experience for the technically less
sophisticated public.

Anyway, my point is that while I agree to your point, I think we should make
it clear whether they are discussions on the "standard" or the "transition".
To me, the "transition" must be to allow 8bit requests to be correctly
responded to in order to create a positive and transparent end-user
experience for multilingual domain names, regardless of the "standard", yet
driving towards it.


Re: Document Status?

On Sun, Sep 01, 2002 at 06:34:34PM +0100,
 Stephen Dyer <steve <at> uk.com> wrote 
 a message of 82 lines which said:

> Firstly, I believe we should examine a process that deploys quickly the 
> fullest possible range of 8-bit ascii characters.
> Many of these, especially accented characters, were sidelined by us Anglos 
> or unnecessarily hi-jacked by operating systems (especially by Unix). This 
> is a quick fix, but will give great benefits to the populations of Western 
> Europe, South America, much of Africa and ex-colonies of Western 
> "Imperialist" nations in general.

Since you work on the .eu project, I will limit myself to Europe, but
the problem is much broader. Even only for Europe, your suggestion
is not a good one.

There is no single 8-bit character set which can be used for all the
European languages, unless you convince many countries to change their
default script :-)

In the present European Union, Latin-1 and Greek are both required and
they already do not fit together in any one 8-bit charset. In 2004, Latin-2
countries like Poland will join the EU.

So, although "Let's do the simple thing first and we'll see later
about the complicated one" is often reasonable, it cannot work here. We
need Unicode from the beginning (handling Unicode only is simpler than
handling Latin-1, Latin-2 and Greek, and waiting for Bulgaria to join with
its Cyrillic alphabet).
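
The charset argument above can be checked directly. What follows is only a minimal sketch in Python, assuming that the standard `latin-1`, `iso8859_2` and `iso8859_7` codecs stand in for the Latin-1, Latin-2 and Greek 8-bit charsets mentioned; the sample words are illustrative and do not come from the thread.

```python
# Which 8-bit charsets can represent each sample word?
# The sample words are illustrative assumptions, not taken from the thread.
SAMPLES = {
    "French": "forêt",
    "Greek": "Ελλάδα",
    "Polish": "łódź",
}
CHARSETS = ["latin-1", "iso8859_2", "iso8859_7"]

def charsets_for(text):
    """Return the 8-bit charsets from CHARSETS that can encode text."""
    usable = []
    for charset in CHARSETS:
        try:
            text.encode(charset)
            usable.append(charset)
        except UnicodeEncodeError:
            pass
    return usable

coverage = {lang: charsets_for(word) for lang, word in SAMPLES.items()}
for lang, usable in coverage.items():
    print(lang, usable)

# A single Unicode encoding covers all three samples.
all_utf8 = [word.encode("utf-8") for word in SAMPLES.values()]
```

Each sample fits at least one of the three 8-bit charsets, but no one charset fits all of them, while UTF-8 encodes every sample without special cases.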


John C Klensin | 1 Sep 06:28

Re: Document Status?

The rarity of Dave's and my agreeing is again noted for the
record.  Lest we get too carried away by it, one observation
below.

--On Saturday, August 31, 2002 10:13 AM -0700 Dave Crocker
<dhc <at> dcrocker.net> wrote:

>...
> Personally, I believe that publishing as Experimental is
> unnecessary and that it would be disastrous.
> 
> Experimental makes sense when a technology is not well
> understood.  That is not the problem, here.  The problem,
> here, is about making difficult decisions, not about
> understanding them.
>...

Well, I think there is a problem that may border on
understanding, and it is tied up with my major personal
objection to the general style of the work that has come out of
the WG.  I believe that IDNA (and the supporting documents) are,
with the exceptions and qualifications Dave and I are pushing
on, a reasonably well-understood solution to _some_ problem.
I'm not sure I know what that problem is, who cares about it,
and whether it is important enough to justify changes to the way
the DNS works and is interpreted.  Re those changes, we can
debate how significant they will be, and there are differences
of opinion about who will be impacted and how much pain they
will feel, but I think it is relatively certain that the pain
level will be non-zero.

John C Klensin | 31 Aug 11:06

Re: Document Status?


--On Saturday, August 31, 2002 10:32 AM +0800 James Seng
<jseng <at> pobox.org.sg> wrote:

> I concur with Paul. The authors and the co-chairs have been
> working with the ADs to address these issues. We have several
> email discussions on the drafts. Any non-editorial changes
> (e.g. the bidi & unicode 3.2) were and will be brought to the
> group again.

James,

It seems to me that Dave (and I) have raised two sorts of issues
which are very different in character than, e.g., bidi and
unicode 3.2.   One has to do with the _style_ of the documents,
e.g., to paraphrase Dave (I hope accurately), whether they
specify a protocol or outline an implementation.   That is
somewhat a matter of taste, and you could legitimately argue
that it is an editorial matter, as long as the specification is
complete and unambiguous.

The other issues go directly to the questions of completeness
and ambiguity.  They are, by definition, substantive rather than
editorial unless there is very clear consensus within the IETF
about what the answers are to any questions that are unresolved
by  the text and the text merely needs to be clarified to
reflect that consensus.

To be specific about this,


JFC (Jefsey) Morfin | 31 Aug 15:10

Re: Document Status?

On 11:37 31/08/02, James Seng said:
>My note is to clarify to the group (not just to Dave), that we are in this 
>process with the ADs. Most of the stuff going on is minor, requests for
>additional paragraphs for clarification. But a few are substantial enough to
>warrant another partial wg last call (which is what we did).

May I offer a remark for this AD tuning? As a newcomer I read the proposed
text as any reader will. I am confused by two wordings and uncertain about
others. Dave proposes a lexicon: I think the idea is helpful (as long as it
respects the text).

1. Would it not be a good occasion to get rid of the odd phrase about
domain/host names and to introduce a stable wording such as "internet
name" and "international internet names" or "multilingual internet names",
which corresponds to the compromise we actually use? I am concerned about:

- the confusion it adds about what a domain name is, especially in this
complex context. We try to simplify and stabilize by discussing only
strings, not what international lawyers may do with them. By making clear
that we only talk about alphanumeric pointers to IP addresses, we might help
separate the legal and the technical aspects.

- I am concerned about using a concept (international) for another 
(multilingual) when the international concept may become another issue with 
national DNS views.

2.  I am confused about the implications of the proposed change of part 7. 
If I am right the target is to stick to the current common status of the 
DNS, whatever it may be. Could we not just define a "DNS character set" (as 
"0-9 a-Z -." today) and say that it can extend with DNS specifications. 

Martin Duerst | 2 Sep 07:48

Re: Document Status?

At 08:53 02/09/01 -0400, vinton g. cerf wrote:
>One working definition of internationalization is that the 
>encoding/expression is "understood" by speakers of all languages. There is 
>global agreement, I believe, that block Latin characters can be used by 
>anyone in any country to express the name of a destination country in a 
>postal address. So for example "UNITED STATES" or "FRANCE" or "AUSTRALIA", 
>"JAPAN", "VIETNAM" are all considered acceptable in every country. This 
>agreement allows, for example, that the destination address, except for 
>the name of the country, can be rendered in a language local to the target 
>country and does not have to be understood by the postal service in the 
>originating country. Consequently, someone sending a letter from the US to 
>a recipient in Vietnam can write the destination address in Vietnamese and 
>the US postal service need only understand the characters "VIETNAM" at the 
>bottom of the destination address.

Some comments:

- 'Internationalization', in the context of software, means the basic
   work that is needed to use different scripts, languages,...
   James and John already explained that.

- For postal addresses, strictly speaking, you are guaranteed delivery
   from any country if you use the French country name. English may also
   work pretty well in practice, but I'm not sure it's guaranteed.

>Concerns about how cut/paste will work are germane to the discussion about 
>the utility of IDNs because such actions may be the ONLY way in which 
>someone may be able to enter special character strings into text intended 
>to represent an IDN. Something like this happens to me regularly as I 
>compose email to friends whose names involve the use of characters with 

Re: Document Status?

On 14:53 01/09/02, vinton g. cerf said:
>One working definition of internationalization is that the 
>encoding/expression is "understood" by speakers of all languages. There is 
>global agreement, I believe, that block Latin characters can be used by 
>anyone in any country to express the name of a destination country in a 
>postal address. So for example "UNITED STATES" or "FRANCE" or "AUSTRALIA", 
>"JAPAN", "VIETNAM" are all considered acceptable in every country. This 
>agreement allows, for example, that the destination address, except for 
>the name of the country, can be rendered in a language local to the target 
>country and does not have to be understood by the postal service in the 
>originating country. Consequently, someone sending a letter from the US to 
>a recipient in Vietnam can write the destination address in Vietnamese and 
>the US postal service need only understand the characters "VIETNAM" at the 
>bottom of the destination address.

Absolutely correct. This is what is used as a default international set by 
common sense,  postal agreements and EDI. You may note that this is also 
the way international mnemonics work (JFK, CDG, LAX, ... and ISO 3166 2/3 
letters we use in ccTLDs, or as X.121 DNICs or telephone numbers, etc.).

They usually are organized in a way that O, I, 0 and 1 cannot be confused. As
you note, they are often printed in uppercase.

This means that we are using a 28 character set (0-9, A-Z, dot and dash).
In adding colon, slash, comma/crosshatch and star we may have a 32 touch
pad for future telephone sets?

That reasoning in line with EDI, common language, etc. makes the current 
domain names the international default.


vinton g. cerf | 1 Sep 14:53

Re: Document Status?

One working definition of internationalization is that the encoding/expression is "understood" by
speakers of all languages. There is global agreement, I believe, that block Latin characters can be used
by anyone in any country to express the name of a destination country in a postal address. So for example
"UNITED STATES" or "FRANCE" or "AUSTRALIA", "JAPAN", "VIETNAM" are all considered acceptable in every
country. This agreement allows, for example, that the destination address, except for the name of the
country, can be rendered in a language local to the target country and does not have to be understood by the
postal service in the originating country. Consequently, someone sending a letter from the US to a
recipient in Vietnam can write the destination address in Vietnamese and the US postal service need only
understand the characters "VIETNAM" at the bottom of the destination address.

Multilingualization is more focused on what is sometimes called "localization" - that is, the characters
used in rendering a local language can be used (e.g. for domain names or for filling out forms etc) and these
renderings need not be universally understood.

This definitional distinction helps (me anyway) to appreciate that the creation of multilingual domain
names may not necessarily contribute to universal ability to use the resulting strings because it may be
difficult to impossible to render or enter arbitrary character sets at the user interface to a local
service. We have collectively probably created some confusion for ourselves by using the term
"internationalized domain names" to cover both concepts. It strikes me that the IDNA documents are more
aimed at localization/multilingualization than internationalization, using the "definition" in the
first paragraph above. 

Concerns about how cut/paste will work are germane to the discussion about the utility of IDNs because such
actions may be the ONLY way in which someone may be able to enter special character strings into text
intended to represent an IDN. Something like this happens to me regularly as I compose email to friends
whose names involve the use of characters with various accent markings. Since I don't know how to enter
these from my simple ASCII keyboard, I usually end up cutting and pasting the characters. This works
because the text of email is permitted to be pretty general in its encoding. I don't know how that would work
out if I were dealing with non-Latin character sets. I know I would need special software to render Hangul
or Kanji, for instance, but I assume that the rendering packages also serve to make highlighting and

vinton g. cerf | 2 Sep 04:28

Re: Document Status?

part of the problem is that scripts lead to languages and that's likely how most people will think of the IDN names.

vint

At 01:40 AM 9/2/2002 +0000, Adam M. Costello wrote:
>What we really want is a word that means multiple scripts,
>rather than multiple languages, but I don't know of such a word.

Vint Cerf
SVP Architecture & Technology
WorldCom
22001 Loudoun County Parkway, F2-4115
Ashburn, VA 20147
703 886 1690 (v806 1690)
703 886 0047 fax

Adam M. Costello | 2 Sep 03:40

Re: Document Status?

"vinton g. cerf" <vinton.g.cerf <at> wcom.com> wrote:

> It strikes me that the IDNA documents are more aimed at
> localization/multilingualization than internationalization

"JFC (Jefsey) Morfin" <jefsey <at> jefsey.com> wrote:

> This is why we should correct the wording now, while we still can do
> it.

> IMHO multilingual Internet names support is what we talk about.

I personally wouldn't mind seeing IDN renamed to MDN, but I suppose it's
the area directors that would need to make that decision.

What we really want is a word that means multiple scripts,
rather than multiple languages, but I don't know of such a word.

AMC

Erik Nordmark | 7 Sep 10:31

Re: Document Status?

> I personally wouldn't mind seeing IDN renamed to MDN, but I suppose it's
> the area directors that would need to make that decision.

Perhaps I should give my views on this then ...

> What we really want is a word that means multiple scripts,
> rather than multiple languages, but I don't know of such a word.

I think such a change at this point in time would be unwise.
If nothing else, it would probably cause us to replay 6 months
worth of discussion around languages vs. scripts vs. codepoints vs ...

But even having this discussion seems to be a distraction from
getting IDNA finished.

  Erik

James Seng | 2 Sep 04:40

Re: Document Status?

This is deja vu, a discussion we have had before. I tried to restrain myself
from kicking this dead horse again but...*sigh*

There is a difference between "script" vs "language".

Languages are written in scripts. Some languages use more than one script
(e.g. Japanese). And some languages share and use the same script (e.g.
Arabic script, Han Ideograph). So language != script.

When we say "localization", we deal with "local language, local expression,
local phrases" etc.

When we say "multilingual", we deal with multiple languages.

When we say "internationalization", we design systems that handle multiple
scripts.

Domain names deal with scripts. They have no capability to deal with language.
When I write a domain name on a napkin (aka "the napkin test"), say
"現代.com", and you give it to someone else, you have no way of knowing whether
this is Chinese, Japanese or Korean without me telling you (out-of-band
communication).
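
The napkin test can be illustrated with a short sketch. It assumes Python's built-in `idna` codec, which implements IDNA as it was later published (RFC 3490), so the exact ACE form may differ from the drafts under discussion here.

```python
# Encode a name with Python's built-in "idna" codec (IDNA as later
# published in RFC 3490) and show what survives the trip.
name = "現代.com"

ace = name.encode("idna")          # ASCII-compatible encoding (ACE)
round_trip = ace.decode("idna")    # back to Unicode

print(ace)          # an ASCII byte string
print(round_trip)   # the original characters

# The ACE form is built only from the code points of the characters.
# Nothing in it records whether the writer meant Chinese, Japanese or Korean.
code_points = [f"U+{ord(ch):04X}" for ch in "現代"]
print(code_points)
```

The encoding round-trips the characters exactly, and that is all it carries: any language information has to travel out of band, as described above.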

So in domain names, we can't do "multilingual". We do "internationalization".
If you want "multilingual", you are not looking at domain names but something
else.

These definitions are a bit different from the layman's definition of
"international" and "local" in Jefsey's email. But we engineers have to be
more precise.

Martin Duerst | 2 Sep 07:31

Re: Document Status?

At 22:58 02/09/01 -0400, John C Klensin wrote:
>--On Monday, September 02, 2002 10:40 AM +0800 James Seng
><jseng <at> pobox.org.sg> wrote:

> > So in domain name, we cant do "multilingual". We do
> > "internationalization". If you want "multilingual", you are
> > not looking at domain name but something else.
>
>Yes.  But, again, no restrictions to scripts either.

Please note that the recent changes for bidi introduced
some restrictions on how you can combine scripts in
a single label.
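
For readers who want to see what such a restriction looks like in practice, here is a rough sketch. It assumes the bidi rule in the form later published in RFC 3454, section 6, which may not match the draft text under discussion, and it uses the tables in Python's standard `stringprep` module. The function name `bidi_ok` is hypothetical, and the separate list of prohibited characters is not checked.

```python
import stringprep

def bidi_ok(label):
    """Check the bidi rule as published in RFC 3454, section 6 (sketch)."""
    has_r = any(stringprep.in_table_d1(ch) for ch in label)  # RandALCat
    has_l = any(stringprep.in_table_d2(ch) for ch in label)  # LCat
    if not has_r:
        return True               # no right-to-left characters: no constraint
    if has_l:
        return False              # mixing R/AL with L characters is rejected
    # A right-to-left label must start and end with a RandALCat character.
    return (stringprep.in_table_d1(label[0])
            and stringprep.in_table_d1(label[-1]))

print(bidi_ok("abc"))            # Latin only
print(bidi_ok("\u05d0\u05d1"))   # Hebrew only
print(bidi_ok("a\u05d0"))        # Latin mixed with Hebrew
```

Under this rule a label that mixes left-to-right letters with Hebrew or Arabic letters is rejected, which is one concrete way in which scripts can no longer be combined freely within a label.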

Regards,    Martin.

John C Klensin | 2 Sep 04:58

Re: Document Status?

--On Monday, September 02, 2002 10:40 AM +0800 James Seng
<jseng <at> pobox.org.sg> wrote:

>...
> Domain names deal with scripts. They have no capability to deal
> with language. When I write a domain name on a napkin (aka
> "the napkin test"), say "現代.com", and you give it to
> someone else, you have no way of knowing whether this is Chinese,
> Japanese or Korean without me telling you (out-of-band
> communication).

Actually, James, domain names don't deal with scripts, either.
They deal with characters, chosen without restrictions from a
repertoire.  As long as that repertoire is, as the IDN WG has
specified so far, all of Unicode less some prohibited characters
(or, more precisely, code points) then any of the non-prohibited
Unicode characters can appear in a DNS name, in any order, with
no restriction to, e.g., script homogeneity within a label.
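
A small sketch of what "no script homogeneity" means. Python's standard library has no script property, so the first word of each character's Unicode name is used here as a rough stand-in for its script. The label and the helper name `rough_scripts` are hypothetical, and the `idna` codec is Python's built-in one (IDNA as later published in RFC 3490).

```python
import unicodedata

def rough_scripts(label):
    """Approximate each character's script by the first word of its Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label}

# Hypothetical label: Latin letters with one Cyrillic letter (U+0430).
mixed = "ex\u0430mple"
print(rough_scripts(mixed))

# The codec accepts the label even though it mixes two scripts.
ace = mixed.encode("idna")
print(ace)
```

The label is a sequence of permitted code points, so it encodes without complaint even though two scripts are present in it.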

> So in domain name, we cant do "multilingual". We do
> "internationalization". If you want "multilingual", you are
> not looking at domain name but something else.

Yes.  But, again, no restrictions to scripts either.

     john

James Seng | 2 Sep 05:13

Re: Document Status?

This is really deja vu :)

To be even more precise, domain names don't deal with characters either. They
deal with bits that represent codepoints, which may be graphemes that form
characters.

-James Seng

----- Original Message -----
From: "John C Klensin" <klensin <at> jck.com>
To: "James Seng" <jseng <at> pobox.org.sg>; "IETF idn working group"
<idn <at> ops.ietf.org>
Sent: Monday, September 02, 2002 10:58 AM
Subject: Re: [idn] Document Status?

> --On Monday, September 02, 2002 10:40 AM +0800 James Seng
> <jseng <at> pobox.org.sg> wrote:
>
> >...
> > Domain names deal with scripts. They have no capability to deal
> > with language. When I write a domain name on a napkin (aka
> > "the napkin test"), say "現代.com", and you give it to
> > someone else, you have no way of knowing whether this is Chinese,
> > Japanese or Korean without me telling you (out-of-band
> > communication).
>
> Actually, James, domain names don't deal with scripts, either.
> They deal with characters, chosen without restrictions from a
> repertoire.  As long as that repertoire is, as the IDN WG has
> specified so far, all of Unicode less some prohibited characters

Dave Crocker | 2 Sep 06:23

Re: Document Status?

At 11:31 PM 9/1/2002 -0400, John C Klensin wrote:
>the IDN WG's output constitutes a (or several) significant change(s) in 
>how the DNS is used and interpreted,

1.  IDNA makes no changes to DNS "semantics" and no changes to basic DNS use.

2.  IDNA increases the set of valid domain name characters from a subset of 
ASCII to a subset of Unicode.

3.  IDNA uses an encoding trick of the type used in MIME, to provide an 
upgrade path that has minimal impact on the installed base, yet permits 
interoperability.
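
The parallel in point 3 can be seen side by side in a short sketch. It assumes Python's standard `email.header` module for the RFC 2047 encoded-word and Python's built-in `idna` codec (IDNA as later published in RFC 3490) for the domain label; the sample strings are illustrative.

```python
from email.header import Header

# RFC 2047 encoded-word: non-ASCII header text carried in ASCII.
mime_form = Header("Bücher", charset="utf-8").encode()

# IDNA ACE label: non-ASCII domain label carried in ASCII.
ace_form = "bücher.example".encode("idna")

print(mime_form)
print(ace_form)

# An IDNA-aware receiver reverses the ACE; an unaware one still sees
# a syntactically ordinary ASCII host name.
print(ace_form.decode("idna"))
```

In both cases the installed base keeps handling plain ASCII, and only upgraded software needs to know how to decode it.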

>If one of them is that we have moved from "characters" to "bits that 
>represent...",

It is NOT one of the changes.

At 10:53 PM 9/1/2002 -0500, Eric A. Hall wrote:
>I would reiterate the factual point that IDNA requires a forklift upgrade 
>of the Internet's current application clients, and it therefore does not 
>satisfy the ground covered in points "a" and "b".

Hmmmm.  Let's see...

Applications that currently speak only ASCII, for domain names, need to be 
changed to support Unicode for domain names.  This sort of change is 
certain to be significant for any application making the enhancement, no 
matter what the details are.


Adam M. Costello | 2 Sep 05:55

Re: Document Status?

James Seng <jseng <at> pobox.org.sg> wrote:

> To be even more precise, domain names don't deal with characters
> either.  It deals with bits that represent codepoints, that may be
> grapheme that forms characters.

In Unicode terminology, no code point represents multiple characters,
and no code point represents a part of a character.  Each code point
represents one character (or no characters).

A case can be made that domain names do (or should) deal with
characters.  The domain label "nicemice" is a sequence of eight
characters.  They happen to be represented by ASCII codes whenever they
are sent via DNS (or most other protocols), but the label can also
be written on paper, or represented in EBCDIC.  Its essence is the
characters, not the bits.
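
A quick sketch of the point about "nicemice", assuming Python's `cp037` codec as a representative EBCDIC code page (EBCDIC has several variants):

```python
label = "nicemice"

ascii_bytes = label.encode("ascii")
ebcdic_bytes = label.encode("cp037")   # cp037 is one EBCDIC code page

print(len(label))            # 8
print(ascii_bytes.hex())
print(ebcdic_bytes.hex())

# Different bits, same characters.
same_characters = ascii_bytes.decode("ascii") == ebcdic_bytes.decode("cp037")
print(same_characters)       # True
```

The two byte sequences differ, yet both decode to the same eight characters, which is the sense in which the label's essence is the characters.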

At least, that's the view taken by URIs.  A URI is explicitly defined
as a sequence of characters, not bytes, and one field of most URIs is a
domain name.

AMC

John C Klensin | 2 Sep 05:31

Re: Document Status?


--On Monday, September 02, 2002 11:13 AM +0800 James Seng
<jseng <at> pobox.org.sg> wrote:

> This is really deja vu :)
> 
> To be even more precise, domain names don't deal with
> characters either. It deals with bits that represent
> codepoints, that may be grapheme that forms characters.

James, I was about to respond by saying "yes".   Then I realized
that this is actually an area of controversy, and one of the
sources of my "and what problem did you say the IDN WG solved?"
question.  Let me try to explain, in the unlikely event that it
is helpful...

The defining DNS documents, at least as I read them, really do talk
about "characters".  That was, I think, largely because in ASCII
there are no composed characters (combining of two code points
to produce one character); there are no composite sequences to
make up phonemes or words (as in Hangul, at least as I
understand it); there are no diacriticals, no optional accent
marks or vowels, no peculiar spaces or breaks, and no alternate
(or multiple) codings for the same glyph.
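
The "multiple codings for the same glyph" case, which ASCII never had to face, can be shown with a minimal sketch using Python's standard `unicodedata` module. NFC is used for illustration only; nameprep as later published specifies NFKC.

```python
import unicodedata

precomposed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"     # "e" followed by COMBINING ACUTE ACCENT

print(precomposed == decomposed)                                 # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True
print(len(precomposed), len(decomposed))                         # 1 2
```

Both strings render as the same glyph, but they compare as different until they are normalized, so "character" stops being a self-evident unit once the repertoire goes beyond ASCII.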

I think your "bits that represent codepoints, that may be
grapheme that forms characters" definition may be an accurate
characterization of where we have ended up.  But I believe we
have ended up here more or less as an accidental sequence of
events, driven by the design work of a small number of people,

Patrik Fältström | 1 Sep 18:03

Re: Document Status?

--On 2002-09-01 08.53 -0400 "vinton g. cerf" <vinton.g.cerf <at> wcom.com> wrote:

> I know I would need special software to render Hangul or Kanji, for
> instance, but I assume that the rendering packages also serve to make
> highlighting and cut/paste work.

The copy and paste problem is difficult, but not so hard as people believe
(I think).

I know how copy and paste works on the Apple Macintosh platform, and as that
has been around and worked that way for decades(!) I take for granted it
works the same way in for example Windows.

When doing "copy", the software "sending" the copied information identifies
the selection and calls a routine which notifies the operating system that
data exists in the paste buffer. The information passed includes things
like what type(s) the data can be fetched as, the size(s) etc. Note that
several alternatives can be stored there.

It looks like the content-type mechanism in email. Very precise tagging of
the data.

Now, some other application has a menu which is to be drawn. The menu
includes an item called "paste". Before doing the actual drawing, it calls
a routine to check (a) if there is something in the paste buffer, and (b)
if the data is of a type which it can interpret. If both are true, the menu
item "paste" is _not_ shadowed.
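
[Editorial illustration, not part of the original message.] The mechanism described above (a paste buffer holding several tagged alternatives, checked by the receiver before "paste" is enabled) could be modelled roughly as follows. The PasteBuffer class and the type tags are hypothetical, invented for this sketch; it is not the API of MacOS or any other real system. The ACE string shown uses the "xn--" prefix that was only fixed later, in RFC 3490.

```python
from typing import Dict, Optional, Sequence, Tuple


class PasteBuffer:
    """Toy model of a clipboard that holds several tagged alternatives."""

    def __init__(self) -> None:
        self._flavors: Dict[str, bytes] = {}

    def offer(self, type_tag: str, data: bytes) -> None:
        # The copying application registers one representation per type tag.
        self._flavors[type_tag] = data

    def can_paste(self, accepted: Sequence[str]) -> bool:
        # What a receiving application checks before enabling its "paste" item.
        return any(tag in self._flavors for tag in accepted)

    def fetch(self, accepted: Sequence[str]) -> Optional[Tuple[str, bytes]]:
        # Return the first representation the receiver understands,
        # in the receiver's own order of preference.
        for tag in accepted:
            if tag in self._flavors:
                return tag, self._flavors[tag]
        return None


# Hypothetical type tags, invented for this sketch.
buffer = PasteBuffer()
buffer.offer("text/plain; charset=us-ascii", b"user@xn--bcher-kva.example")
buffer.offer("text/x-idn; charset=utf-8", "user@bücher.example".encode("utf-8"))

# An IDNA-unaware address book only asks for plain ASCII text.
print(buffer.fetch(["text/plain; charset=us-ascii"]))
```

A receiver that lists the Unicode tag first would get the Unicode form instead; one that knows neither tag would leave "paste" disabled.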

The paste operation happens, and it can either grab data which is already
generated by the sender application, or the sender application is called

Soobok Lee | 1 Sep 20:59

Re: Document Status?

On Sun, Sep 01, 2002 at 06:03:00PM +0200, Patrik Fältström wrote:
> 
> (a) I get an email with IDNA encoded sender address. I want to add that to
> some address book software. That imply copy and paste from email program to
> address book program. The email address have ACE encoded labels in them.
> 
> (a1) The email program understand IDNA, but not the address book program.
> As it understands IDNA, it will display (if the script and font exists) the
> correct Unicode characters, and not the ACE encoded string. Now, the copy
> operation happens, and I would if I were the email programmer put two (2)
> things in the paste buffer: One "email address" which is the ACE encoded
> string. Same thing as what is passed in SMTP or POP. One which is the
> address in Unicode (or local script, which will be named as part of the
> tag). The addressbook which fetches data from the paste buffer gets the
> string, and notice it is ace encoded, and can choose to decode that if it
> can/know etc.
>

I often run xterm and then launch MUTT (or PINE).
Even if MUTT became IDNA-aware in the future, copy & paste operations
grab the IDN-like strings directly from the xterm, not from MUTT.
So MUTT has no opportunity to pass the ACE-encoded IDN to the
receiving application or the clipboard. A text-based MUA does not have
any copy & paste support of its own. Xterm does the whole job.

Consistent IDNA-specific and IDNA-aware copy&paste operations, if we make any,
should be implementable and meaningful also in xterm which has been regarded
as a purely textual application.

Soobok Lee

Eric A. Hall | 2 Sep 05:45

Re: Document Status?


on 9/1/2002 1:59 PM Soobok Lee wrote:

> I often run xterm and then launch MUTT (or PINE).

<snip>

This problem goes beyond xterm. For all practical purposes, all apps will
be required to use the text form for all exchanges, except in those cases
where the operating environment provides an "i18n domain name" data-type
as part of the clipboard protocol. Unless an application is willing to
explicitly claim support for ACE encoded domain names, there can be no
guarantee that the recipient application will be able to make sense of the
domain name.

Separately, the canonical problem here is managing multiple
representations and trying to negotiate over which representation should
be used for some specific function. This problem won't go away until the
i18n form is available in all protocols and applications directly. In this
regard, IDNA is a patch that introduces a new problem, not a cure to the
existing problem. Having said that, some degree of transition and
therefore negotiation is of course necessary. It should not be the end,
however. I would hate to think that we consider the problem resolved after
only having made it worse.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/


Eric A. Hall | 2 Sep 06:03

Re: Document Status?


[oops]

on 9/1/2002 10:45 PM Eric A. Hall wrote:

> Unless an application is willing to explicitly claim support for
> ACE encoded domain names,
  ^^^^^^^^^^^
 internationalized

> there can be no guarantee that the recipient
> application will be able to make sense of the domain name.

...so the default will always be ASCII.

Proof: dozens if not hundreds of common applications that accept i18n data
as input and don't error out until the back-end protocol(s) fail.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Simon Josefsson | 1 Sep 21:34

Re: Document Status?

Soobok Lee <lsb <at> postel.co.kr> writes:

> On Sun, Sep 01, 2002 at 06:03:00PM +0200, Patrik Fältström wrote:
>> 
>> (a) I get an email with IDNA encoded sender address. I want to add that to
>> some address book software. That imply copy and paste from email program to
>> address book program. The email address have ACE encoded labels in them.
>> 
>> (a1) The email program understand IDNA, but not the address book program.
>> As it understands IDNA, it will display (if the script and font exists) the
>> correct Unicode characters, and not the ACE encoded string. Now, the copy
>> operation happens, and I would if I were the email programmer put two (2)
>> things in the paste buffer: One "email address" which is the ACE encoded
>> string. Same thing as what is passed in SMTP or POP. One which is the
>> address in Unicode (or local script, which will be named as part of the
>> tag). The addressbook which fetches data from the paste buffer gets the
>> string, and notice it is ace encoded, and can choose to decode that if it
>> can/know etc.
>
> I often run xterm and then launch MUTT (or PINE).
> Even if MUTT would become IDNA-aware in the future, copy & paste operations 
> grab the IDN-like strings directly from the xterm, not from the MUTT.
> So, the MUTT cannot have any opportunity to toss ACE-encod the IDN into the 
> receiving applications or the clip board area. Text-based MUA does not have
> any copy&paste support to/from it. Xterm does all the job.

The specifications seem quite clear on what should happen here -- if
there is no negotiation, ACE should be used.  TTY MUAs therefore must
display ACE strings, as there is no negotiation between xterm and the
MUA that an IDNA string is being displayed.

Patrik Fältström | 2 Sep 07:05

Re: Document Status?

--On 2002-09-01 21.34 +0200 Simon Josefsson <jas <at> extundo.com> wrote:

> The specifications seems quite clear on what should happen here -- if
> there is no negotiation, ACE should be used.  TTY MUAs therefor must
> display ACE strings as there is no negotiation between xterm and the
> MUA that an IDNA string is being displayed.

It is not more strange than what we have for Subject-lines today.

Should the tty client display subject lines encoded according to RFC 2047,
or decoded? If you cut something decoded from the xterm and
paste it somewhere else, it is still up to the email client and the other
applications involved "to do the right thing". The right thing is to see
that non-IDNA applications get the ACE encoded version of domain names.

   paf
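
[Editorial illustration, not part of the original message.] For readers unfamiliar with the RFC 2047 comparison, the two forms of a Subject line look like this. The sketch uses Python's standard email.header module; the sample header value and the helper name decode_subject are made up for the example.

```python
from email.header import decode_header, make_header


def decode_subject(raw: str) -> str:
    # decode_header splits the value into (text, charset) chunks;
    # make_header reassembles them into a single Unicode string.
    return str(make_header(decode_header(raw)))


encoded = "=?iso-8859-1?q?F=E4ltstr=F6m?="
print(encoded)                  # what a client without RFC 2047 support shows
print(decode_subject(encoded))  # what a client with RFC 2047 support shows
```

The analogy in the thread is that the encoded form plays the role of ACE: it is safe to hand to software that knows nothing about the encoding.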

Simon Josefsson | 2 Sep 08:25

Re: Document Status?

Patrik Fältström <paf <at> cisco.com> writes:

> --On 2002-09-01 21.34 +0200 Simon Josefsson <jas <at> extundo.com> wrote:
>
>> The specifications seems quite clear on what should happen here -- if
>> there is no negotiation, ACE should be used.  TTY MUAs therefor must
>> display ACE strings as there is no negotiation between xterm and the
>> MUA that an IDNA string is being displayed.
>
> It is not more strange than what we have for Subject-lines today.

There is an important difference; subject lines aren't used to route
mail or identify persons.

If I cut'n'paste a subject line and it is garbled, the only harm will
be a garbled subject line.

If I cut'n'paste an IDN email address and it is garbled, I will send
mail to the wrong person, potentially even encrypted mail embedding
sensitive information, as security systems like OpenPGP and S/MIME
use email addresses to identify people.

Just making IDN "work", as in displaying fancy glyphs, isn't good
enough, it also shouldn't generate new security problems.  MIME only
dealt with the data, it did not modify interpretation of the
addressing system, and still managed to generate security problems.

Dave Crocker | 2 Sep 10:28

Re: Re: Document Status?

At 09:16 AM 9/2/2002 +0200, Patrik Fältström wrote:
>--On 2002-09-02 08.25 +0200 Simon Josefsson <jas <at> extundo.com> wrote:
> >> It is not more strange than what we have for Subject-lines today.
> > There is a important difference; subject lines aren't used to route
> > mail or identify persons.
>
>I was not thinking of what we use it for.

Folks,

Let's assume that these various "differences" do exist.

So what?

If someone is going to claim that IDNA will not work as it is designed to 
work, they need to explain how it will fail.

d/

----------
Dave Crocker <mailto:dave <at> tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850

Patrik Fältström | 2 Sep 09:16

Re: Document Status?

--On 2002-09-02 08.25 +0200 Simon Josefsson <jas <at> extundo.com> wrote:

>> It is not more strange than what we have for Subject-lines today.
> 
> There is a important difference; subject lines aren't used to route
> mail or identify persons.

I was not thinking of what we use it for. 

I was thinking of the errors one can get.

   paf

Eric A. Hall | 2 Sep 08:12

Re: Re: Document Status?


on 9/2/2002 12:05 AM Patrik Fältström wrote:

> It is not more strange than what we have for Subject-lines today.

Except that Subject: is unstructured data which is not processed as
protocol data. And even where parts of Subject: are interpreted as
data (such as "Re:"), they are required to be in ASCII for them to work.

Domain names are structured and used as protocol data everywhere. Any
strangeness with a domain name is an opportunity for total failure, even
outside the scope of the local application. It won't take long for folks
to figure out that they have to use ASCII for everything, except where
some function provides an explicit IDN data-type and can guarantee that it
will handle any necessary conversion.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

YangWoo Ko | 2 Sep 04:45

Re: Re: Document Status?

On Sun, Sep 01, 2002 at 09:34:26PM +0200, Simon Josefsson wrote:
> Soobok Lee <lsb <at> postel.co.kr> writes:
> >
> > I often run xterm and then launch MUTT (or PINE).
> > Even if MUTT would become IDNA-aware in the future, copy & paste operations
> > grab the IDN-like strings directly from the xterm, not from the MUTT.
> > So, the MUTT cannot have any opportunity to toss ACE-encod the IDN into the
> > receiving applications or the clip board area. Text-based MUA does not have
> > any copy&paste support to/from it. Xterm does all the job.
>
> The specifications seems quite clear on what should happen here -- if
> there is no negotiation, ACE should be used.  TTY MUAs therefor must
> display ACE strings as there is no negotiation between xterm and the
> MUA that an IDNA string is being displayed.

Dear Simon,

Oops. Then I will encounter a very ugly environment in the near future.

Dear idn-ers,

Please tell me that Simon's understanding is wrong. Negotiation in the IDNA
draft seems either poorly documented or poorly understood.

Korean mutt user.


Simon Josefsson | 2 Sep 05:16

Re: Document Status?

YangWoo Ko <yw <at> mrko.pe.kr> writes:

>> The specifications seems quite clear on what should happen here -- if
>> there is no negotiation, ACE should be used.  TTY MUAs therefor must
>> display ACE strings as there is no negotiation between xterm and the
>> MUA that an IDNA string is being displayed.
>
> Dear Simon,
>
> Oops. Then I will encounter very ugly environment in the near future.
>
> Dear idn-ers,
>
> Please tell me that simon's understanding is wrong. Negotiation in IDNA
> draft seems either poorly documented or poorly understood.

I agree the specifications aren't clear, but at least Patrik Fältström
answered clearly in another part of this thread, unless I
misunderstood him again -- which isn't unlikely, as I fail to locate
the text in the IDNA specification that backs the claims made.  From
<35817839.1030912013 <at> localhost>:

Patrik Fältström wrote:

> Simon Josefsson wrote:
>
> If the
> strings are to be ACE encoded or raw encoded is not specified anywhere
> as far as I can tell, and different implementations will chose
> different strategies.

Soobok Lee | 2 Sep 04:43

Re: Re: Document Status?

On Mon, Sep 02, 2002 at 11:09:29AM +0900, YangWoo Ko wrote:
> On Sun, Sep 01, 2002 at 09:34:26PM +0200, Simon Josefsson wrote:
> > Soobok Lee <lsb <at> postel.co.kr> writes:
> > >
> > > I often run xterm and then launch MUTT (or PINE).
> > > Even if MUTT would become IDNA-aware in the future, copy & paste operations 
> > > grab the IDN-like strings directly from the xterm, not from the MUTT.
> > > So, the MUTT cannot have any opportunity to toss ACE-encod the IDN into the 
> > > receiving applications or the clip board area. Text-based MUA does not have
> > > any copy&paste support to/from it. Xterm does all the job.
> > 
> > The specifications seems quite clear on what should happen here -- if
> > there is no negotiation, ACE should be used.  TTY MUAs therefor must
> > display ACE strings as there is no negotiation between xterm and the
> > MUA that an IDNA string is being displayed.
> 
> Dear Simon,
> 
> Oops. Then I will encounter very ugly environment in the near future.

Yes, if and only if we enforce that the copy & paste buffer should *always* be filled
with the ACE-encoded IDN. Otherwise, if we also allow other local-charset-encoded
IDNs in the paste buffer, various interoperability problems
will arise. Both situations seem ugly... :-)

Soobok Lee

Patrik Fältström | 2 Sep 07:09

Re: Re: Document Status?

--On 2002-09-02 11.43 +0900 Soobok Lee <lsb <at> postel.co.kr> wrote:

> Yes, if and only if we enforce that the copy & paste buffer should
> *always* be filled  with the ACE-encoded IDN .

Not needed if the copy-paste buffer can include tagged data. You talk about
transcoding tables, but for many non-Unicode charsets the mapping table is
a 1:1 mapping, especially for Unicode strings after Nameprep has been applied.
And there is no problem with using tagged data to say that "for this type,
this transcoding table is to be used".

You talk about X11, which only has charset information in the tagging, and
yes, there it might be problematic. On other operating systems, like MacOS,
there will be no problem.

But, as you point out, if an application is conservative and only gives out
ACE to other applications, you will be safe, as all IDNA-aware applications
can detect and decode ACE.

Be conservative with what you send, and generous with what you receive.

    paf
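
[Editorial illustration, not part of the original message.] The receiving side of "detect and decode ACE" might look like the sketch below. It assumes Python's built-in "idna" codec, which implements the later-published RFC 3490 with its "xn--" prefix rather than the draft under discussion in this thread. The function name accept_pasted_domain is hypothetical.

```python
ACE_PREFIX = "xn--"  # assumption: the prefix later fixed by RFC 3490


def accept_pasted_domain(domain: str) -> str:
    # Be generous in what you receive: if any label carries the ACE
    # prefix, decode the whole name to Unicode for display.
    labels = domain.split(".")
    if any(label.lower().startswith(ACE_PREFIX) for label in labels):
        return domain.encode("ascii").decode("idna")
    # Plain ASCII names pass through unchanged.
    return domain


print(accept_pasted_domain("xn--bcher-kva.example"))
print(accept_pasted_domain("example.com"))
```

An IDNA-unaware receiver simply skips the check and treats the ACE string as an ordinary ASCII domain name, which is why sending ACE is the conservative choice.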

Simon Josefsson | 1 Sep 19:42

Re: Document Status?

Patrik Fältström <paf <at> cisco.com> writes:

> (a1) The email program understand IDNA, but not the address book program.
> As it understands IDNA, it will display (if the script and font exists) the
> correct Unicode characters, and not the ACE encoded string. Now, the copy
> operation happens, and I would if I were the email programmer put two (2)
> things in the paste buffer: One "email address" which is the ACE encoded
> string. Same thing as what is passed in SMTP or POP. One which is the
> address in Unicode (or local script, which will be named as part of the
> tag). The addressbook which fetches data from the paste buffer gets the
> string, and notice it is ace encoded, and can choose to decode that if it
> can/know etc.

At least in X11 cut'n'paste works by transferring charset tagged but
otherwise opaque character arrays.  What you are proposing seems to
require a cut'n'paste protocol to be implemented in both the MUA and
the address book application.  The protocol must specify how the
structure containing the raw string and the ACE encoded string is
encoded and identified by both applications.  Will IDNA define this
protocol for X11, MacOS, Windows etc?

Assuming IDNA will limit itself to not requiring modifications to
cut'n'paste operations in various operating systems, you will only be
able to cut'n'paste charset tagged but opaque text strings.  Whether the
strings are to be ACE encoded or raw encoded is not specified anywhere
as far as I can tell, and different implementations will choose
different strategies.  If the application is running in a Unicode
environment, it might (only might!) make sense to transfer the raw
Unicode encoding, but if it is running in a non-Unicode environment
the IDNA specification leaves you in the cold as for how to implement

Adam M. Costello | 2 Sep 05:16

Re: Re: Document Status?

Simon Josefsson <jas <at> extundo.com> wrote:

> At least in X11 cut'n'paste works by transfering charset tagged but
> otherwise opaque character arrays.

Cut & paste in X11 works fine when everything is ASCII.  Otherwise, in
my experience, it is quite broken already, even before IDNs enter the
picture.

For example, I often run a text-mode editor in one kterm, and a
text-mode browser in another.  I can cut and paste English and Japanese
text between them with no problems.  However, if I try to copy Japanese
text from Netscape 4 into a kterm, it does not work.  I figured out how
it was broken and implemented a hack to make it possible (a small tcl/tk
program that provides a button that grabs the selection, performs the
transcoding, and re-exports the selection).

Now when I try to copy text from Mozilla 1.0 into a kterm, it doesn't
work, but it's broken in some other way that I haven't investigated
yet, so then what I do is copy the URL, load the page into my text-mode
browser, and copy the text from there.

So I think it's true that cutting and pasting IDNs in X11 will fail in
many cases, but not because IDNs are so difficult, but only because
cutting and pasting anything other than ASCII text is already broken to
begin with.

Patrik Fältström <paf <at> cisco.com> wrote:

> IDNA says that if no negotiation exists between two entities which

Simon Josefsson | 2 Sep 06:07

Re: Document Status?

"Adam M. Costello" <idn.amc+0 <at> nicemice.net.RemoveThisWord> writes:

> Simon Josefsson <jas <at> extundo.com> wrote:
>
>> At least in X11 cut'n'paste works by transfering charset tagged but
>> otherwise opaque character arrays.
>
> Cut & paste in X11 works fine when everything is ASCII.  Otherwise, in
> my experience, it is quite broken already, even before IDNs enter the
> picture.

Recent versions of X11 and various utilities work better (e.g., in the
Unicode based RedHat Linux beta), but there are still applications
that don't work fully.  Latin-1 cut'n'paste has worked for me for
years.

>> > Even if MUTT would become IDNA-aware in the future, copy & paste
>> > operations grab the IDN-like strings directly from the xterm, not
>> > from the MUTT.  So, the MUTT cannot have any opportunity to toss
>> > ACE-encod the IDN into the receiving applications or the clip board
>> > area.  Text-based MUA does not have any copy&paste support to/from
>> > it.  Xterm does all the job.
>>
>> The specifications seems quite clear on what should happen here -- if
>> there is no negotiation, ACE should be used.  TTY MUAs therefore must
>> display ACE strings as there is no negotiation between xterm and the
>> MUA that an IDNA string is being displayed.
>
> That conclusion does not follow from the IDNA spec.  ASCII forms are
> required only in IDN-unaware domain name slots.  The tty is not a domain

Adam M. Costello | 3 Sep 04:56

Re: Re: Document Status?

Simon Josefsson <jas <at> extundo.com> wrote:

> In the particular example of a MUA running in XTERM (or a Unicode unix
> console, for that matter), it will likely not work out as I'm not
> aware of any API between a TTY application and the terminal to query
> which unicode characters it can display, and whether it supports bidi,
> and in that case it seems this paragraph from 6.4 would apply,
> suggesting that MUAs should use ACE anyway:
> 
> ,----
> | If an application decodes an ACE name using ToUnicode but cannot
> | show all of the characters in the decoded name, such as if the
> | name contains characters that the output system cannot display,
> | the application SHOULD show the name in ACE format (which always
> | includes the ACE prefix) instead of displaying the name with the
> | replacement character (U+FFFD).
> `----

The spec says that an application SHOULD NOT show the ACE form if it can
correctly display the Unicode form, and it SHOULD show the ACE form if
it cannot correctly display the Unicode form.  Both recommendations have
equal weight.  If the application cannot know which Unicode characters
can be displayed, then it cannot know which recommendation applies.
If the application is optimistic about the display capabilities, it
risks violating the second recommendation, and if the application is
pessimistic, it risks violating the first recommendation.

The situation is symmetric.  I see no reason to conclude that the spec
favors optimism or pessimism.
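
[Editorial illustration, not part of the original message.] The two SHOULD-level recommendations can be written as one decision that depends entirely on knowing what the output device can display. In this sketch the displayable predicate and the helper latin1_only are hypothetical stand-ins for exactly the knowledge a tty application lacks, and Python's built-in "idna" codec (RFC 3490, later than this thread) stands in for ToUnicode.

```python
from typing import Callable


def choose_display_form(ace_name: str, displayable: Callable[[str], bool]) -> str:
    # Decode the ACE form, then show Unicode only if every character
    # can be rendered; otherwise fall back to the ACE form.
    unicode_name = ace_name.encode("ascii").decode("idna")
    if all(displayable(ch) for ch in unicode_name):
        return unicode_name
    return ace_name


def latin1_only(ch: str) -> bool:
    # Hypothetical capability test: a terminal limited to ISO-8859-1.
    return ord(ch) < 256


print(choose_display_form("xn--bcher-kva", latin1_only))

# A label outside Latin-1 stays in ACE form on such a terminal.
japanese_ace = "日本語".encode("idna").decode("ascii")
print(choose_display_form(japanese_ace, latin1_only))
```

If displayable has to be guessed, an always-true guess is the optimistic application and an always-false guess is the pessimistic one, which is the symmetry described above.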


YangWoo Ko | 2 Sep 06:36

Re: Re: Document Status?

Dear all,

Why should an IETF WG handle implementation issues of a specific OS/GUI?
What about just letting others handle them?

Is it possible to remove every application-related feature from the IDNA
document, except for IDNs being ACE encoded somewhere and somehow outside of
the DNS? Would that help or harm?

Regards

On Mon, Sep 02, 2002 at 06:07:23AM +0200, Simon Josefsson wrote:
> "Adam M. Costello" <idn.amc+0 <at> nicemice.net.RemoveThisWord> writes:
>
> > Simon Josefsson <jas <at> extundo.com> wrote:
> >
> >> At least in X11 cut'n'paste works by transfering charset tagged but
> >> otherwise opaque character arrays.
> >
> > Cut & paste in X11 works fine when everything is ASCII.  Otherwise, in
> > my experience, it is quite broken already, even before IDNs enter the
> > picture.
>
> Recent versions of X11 and various utilities work better (e.g., in the
> Unicode based RedHat Linux beta), but there are still applications
> that doesn't work fully.  Latin-1 cut'n'paste has worked for me for
> years.
>
> >> > Even if MUTT would become IDNA-aware in the future, copy & paste
> >> > operations grab the IDN-like strings directly from the xterm, not

YangWoo Ko | 2 Sep 06:59

Re: Re: Document Status?

Dear all,

Why should an IETF WG handle implementation issues of a specific OS/GUI?
What about just letting others handle them?

Is it possible to remove every application-related feature from the IDNA
document, except for IDNs being ACE encoded somewhere and somehow outside of
the DNS? Would that help or harm?

Regards

On Mon, Sep 02, 2002 at 06:07:23AM +0200, Simon Josefsson wrote:
> "Adam M. Costello" <idn.amc+0 <at> nicemice.net.RemoveThisWord> writes:
> 
> > Simon Josefsson <jas <at> extundo.com> wrote:
> >
> >> At least in X11 cut'n'paste works by transfering charset tagged but
> >> otherwise opaque character arrays.
> >
> > Cut & paste in X11 works fine when everything is ASCII.  Otherwise, in
> > my experience, it is quite broken already, even before IDNs enter the
> > picture.
> 
> Recent versions of X11 and various utilities work better (e.g., in the
> Unicode based RedHat Linux beta), but there are still applications
> that doesn't work fully.  Latin-1 cut'n'paste has worked for me for
> years.
> 
> >> > Even if MUTT would become IDNA-aware in the future, copy & paste
> >> > operations grab the IDN-like strings directly from the xterm, not

Soobok Lee | 2 Sep 07:37

Re: Re: Document Status?


----- Original Message ----- 
From: "YangWoo Ko" <newcat <at> spsoft.co.kr>
To: "IETF idn working group" <idn <at> ops.ietf.org>
Sent: Monday, September 02, 2002 1:59 PM
Subject: Re: [idn] Re: Document Status?

> Dear all,
>    
> Why should IETF WG handle implementation issues of specific OS/GUI ?   
> What about just let others handle them ?

Implementation specifics and implementability issues are different. The latter
may be an architectural/security issue, which we should discuss.

>   
> Is it possible to remove every application-related features out of IDNA
> document except IDN being ACE encoded somewhere and somehow outside of
> DNS ? Does it help or harm ?

If you begin to remove those features, only the Punycode draft will remain in the pool.
IDN is not only a problem of DNS servers/clients, but also a problem for applications
which use domain names as *local* identifiers and do comparisons on them.

Soobok Lee

>   
> Regards
> 


Patrik Fältström | 1 Sep 20:26

Re: Document Status?

--On 2002-09-01 19.42 +0200 Simon Josefsson <jas <at> extundo.com> wrote:
> At least in X11 cut'n'paste works by transfering charset tagged but
> otherwise opaque character arrays.

Ok. Good.

> What you are proposing seem to
> require a cut'n'paste protocol to be implemented in both the MUA and
> the address book application.

Not at all.

What I say is that one should send the ACE encoded string in the paste
buffer. Further, that is what will happen when an application doesn't know
anything about IDNA at all. In cases like MacOS where one can have
alternative forms of the data, it is possible to define a new type for the
Unicode version of the domain name.

> If the
> strings are to be ACE encoded or raw encoded is not specified anywhere
> as far as I can tell, and different implementations will chose
> different strategies.

IDNA says that if no negotiation exists between two entities which exchange
domain names between them, ACE encoding should be used. There is no
difference between a protocol which uses IP or the paste buffer. It is the
same thing.
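
[Editorial illustration, not part of the original message.] Read as a rule for the sending side, "no negotiation means ACE" comes down to a single branch. The function name export_domain and the peer_accepts_unicode flag are hypothetical; how an application would actually learn that its peer accepts Unicode is the open question in this thread. Python's built-in "idna" codec (RFC 3490, later than the draft discussed here) stands in for ToASCII.

```python
def export_domain(domain: str, peer_accepts_unicode: bool) -> str:
    # Hand over the Unicode form only when the receiver is known to
    # accept it; in every other case fall back to the ACE form.
    if peer_accepts_unicode:
        return domain
    return domain.encode("idna").decode("ascii")


print(export_domain("bücher.example", peer_accepts_unicode=False))
print(export_domain("bücher.example", peer_accepts_unicode=True))
```

The same function would serve whether the "peer" is a remote protocol endpoint or the paste buffer, which is the equivalence being claimed.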

> In general, cut'n'paste of IDNA in the real world is not well defined,
> since IDNA only solves the IDNA problem for Unicode, and the real

Simon Josefsson | 1 Sep 21:21

Re: Document Status?

Patrik Fältström <paf <at> cisco.com> writes:

>> What you are proposing seem to
>> require a cut'n'paste protocol to be implemented in both the MUA and
>> the address book application.
>
> Not at all.

OK, then I misunderstood what you said.  Disregard my mail..

>> (d) Email program understands IDNA but is running in a non-Unicode
>>     environment.  The address is tagged and is transfered to address
>>     book application using e.g. ISO-8859-1.  IDNA doesn't handle or
>>     care about this scenario, but it do exists in the real world
>>     (e.g. my machine).
>
> See above, if the address is to be transferred to other application, it
> should either be negotiated that it is an IDN in a specific charset (like
> 8859-1) or sent as ACE.

Even tagging data as being an IDN encoded into a (legacy) coding
system isn't sufficient to use IDNA interoperably, as no transcoding
tables are specified.  But I'm starting to repeat myself from another
thread now.
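
[Editorial illustration, not part of the original message.] The dependence on a transcoding table can be made concrete: the same bytes yield a different name, and so a different ACE string, depending on which legacy charset they are taken to be in. The function name legacy_to_ace is hypothetical; the charset tables are the ones shipped with Python, which is precisely the kind of implementation-specific choice at issue, and the "idna" codec implements the later RFC 3490.

```python
def legacy_to_ace(raw: bytes, charset: str) -> str:
    # Step 1: transcode the legacy bytes to Unicode using the tagged charset.
    # Step 2: apply ToASCII (here: Python's "idna" codec) to get the ACE form.
    unicode_name = raw.decode(charset)
    return unicode_name.encode("idna").decode("ascii")


raw = b"b\xfccher.example"  # 0xFC is "ü" in ISO-8859-1

print(legacy_to_ace(raw, "iso-8859-1"))

# The same byte under a different tag is a different character,
# so the resulting domain name is a different one.
print(raw.decode("iso-8859-1") == raw.decode("koi8-r"))
```

Two implementations that disagree on the table for a given charset would disagree on the resulting name in the same way.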

Adam M. Costello | 1 Sep 07:24

Re: Document Status?

"JFC (Jefsey) Morfin" <jefsey <at> jefsey.com> wrote:

> 1. would it not be a good occasion of getting rid of the odd phrase
> about domain/host names and to introduce a stable wording such as
> "internet name" and "international internet names" or "multilingual
> internet names" which corresponds to the compromise we actually use?

Various people have had various discussions trying to come to a common
understanding of the precise meanings of "domain name" versus "host
name", with very little success.  Just about the only thing everyone
agrees on is that every host name is a domain name, but some domain
names might not be host names.  Settling this type-of-names issue is
beyond the scope and ability of this working group.  IDNA is a mechanism
for allowing non-ASCII characters to be used in domain names, whatever
those might be.  Internationalized domain names can be used wherever
domain names can be used, wherever that might be, except in non-IN
class DNS resource records (this exclusion is stated in the forthcoming
idna-11 draft).

> - I am concerned about using a concept (international) for another
> (multilingual) when the international concept may become another issue
> with national DNS views.

I don't know exactly what the difference is between internationalization
and multilingualization.  I think one reason the latter term was
not used is that domain names have no language tag.  Maybe there
were other reasons, or maybe it was arbitrary.

> 2.  I am confused about the implications of the proposed change of
> part 7.

James Seng | 31 Aug 11:37

Re: Document Status?

John,

> It seems to me that Dave (and I) have raised two sorts of issues
> which are very different in character than, e.g., bidi and
> unicode 3.2.   One has to do with the _style_ of the documents,
> e.g., to paraphrase Dave (I hope accurately), whether they
> specify a protocol or outline an implementation.   That is
> somewhat a matter of taste, and you could legitimately argue
> that it is an editorial matter, as long as the specification is
> complete and unambiguous.

The drafts in question have gone through the working group last call.
Regardless of the merits of these issues, they should have been brought up
before or during the last call. After the co-chairs move the documents on for
IESG consideration, we (as the group) have very little control over what
happens next.

As Patrik said, the IESG/ADs would make their own comments. Sometimes they
reject the draft (for various reasons, such as technical failure). If we are
lucky, they may have minor editorial changes. But typically, the changes fall
along the vague line between editorial and substantive technical changes.

These are done to make sure the "specification is complete and unambiguous".

My note is to clarify to the group (not just to Dave) that we are in this
process with the ADs. Most of what is going on is minor: requests for an
additional paragraph of clarification. But a few changes are substantive
enough to warrant another partial wg last call (which is what we did).

> (i) If there is any question at all about how a given codepoint

Paul Hoffman / IMC | 30 Aug 19:38

Re: Document Status?

The sky is not falling, Dave. A new draft that we believe meets the 
IESG's request for clarifications has already been turned into the 
Internet Drafts repository but has not been announced by the IETF 
Secretariat yet.

--Paul Hoffman, Director
--Internet Mail Consortium

