Dave Thaler | 14 Jun 2010 22:20

RE: IDNA and getnameinfo() and getaddrinfo()

> -----Original Message-----
> From: Nicolas Williams [mailto:Nicolas.Williams <at> oracle.com]
> Sent: Monday, June 14, 2010 12:32 PM
> To: Dave Thaler
> Cc: idna-update <at> alvestrand.no; john+ietf <at> jck.com; cheshire <at> apple.com
> Subject: Re: IDNA and getnameinfo() and getaddrinfo()
> 
> On Mon, Jun 14, 2010 at 07:14:12PM +0000, Dave Thaler wrote:
> > > Over in the NFSv4 WG we're discussing how to fix NFSv4.1 to properly
> > > handle IDNA.  In the process of doing so I ran into
> > > draft-iab-idn-encoding, which has a cogent discussion of name service
> > > switches (pictured in figure 2).
> > >
> > > draft-iab-idn-encoding aims for Informational status.  I'm wondering
> > > if we could publish a Standards-Track document describing how
> > > getnameinfo() and getaddrinfo() should handle IDNA.
> > >
> > > For example, one could say that when using DNS getnameinfo() should:
> >
> > Be careful not to confuse getnameinfo() with DNS.
> 
> I explicitly pointed out the name service switch architecture usually
> implemented.  I thought that'd suffice to clarify that I really meant "the DNS
> plug-in to the getnameinfo() entry point in the name service switch" -- I just
> didn't want to be too redundant.
> 
> > As noted in draft-iab-idn-encoding and RFC 3493, DNS is just one of a
> > number of mechanisms used under getnameinfo().
> 
[…]

Simon Josefsson | 14 Jun 2010 22:33

Re: IDNA and getnameinfo() and getaddrinfo()

Dave Thaler <dthaler <at> microsoft.com> writes:

> RFC 3493 doesn't say either way whether "char *" is ANSI or UTF-8 or whatever
> else, and as far as I know, neither does POSIX 
> (http://www.opengroup.org/onlinepubs/9699919799/functions/getaddrinfo.html).

Normally in POSIX, strings are encoded in the locale coding system.  If
you are in a UTF-8 locale, the string can be assumed (by the getaddrinfo
implementation) to be UTF-8.  Otherwise it needs to be transcoded.  This
is how GNU Libc's IDN support works, see:

http://git.savannah.gnu.org/cgit/libidn.git/tree/libc/getaddrinfo-idn.txt

/Simon
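
The locale-based approach described above can be sketched as follows. This is a minimal illustration, not the GNU Libc implementation: the function name to_ace and the explicit codeset parameter are assumptions made for the example, and Python's built-in "idna" codec implements IDNA2003 (RFC 3490), not IDNA2008.

```python
def to_ace(name: bytes, codeset: str) -> bytes:
    """Transcode a host name from the caller's codeset to ACE form.

    `codeset` stands in for the locale's coding system; a real
    implementation would obtain it from the locale instead.
    """
    text = name.decode(codeset)   # locale bytes -> Unicode
    return text.encode("idna")    # Unicode -> A-labels (IDNA2003 codec)


# The same name supplied in two different locale encodings
# ends up as the same ACE string.
print(to_ace(b"fo\xf3.example", "iso-8859-1"))
print(to_ace("foó.example".encode("utf-8"), "utf-8"))
```

In a UTF-8 locale the decode step is only a validation pass; in any other locale it is the transcoding step mentioned in the message.
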
Nicolas Williams | 15 Jun 2010 01:42

Re: IDNA and getnameinfo() and getaddrinfo()

On Mon, Jun 14, 2010 at 10:33:48PM +0200, Simon Josefsson wrote:
> Dave Thaler <dthaler <at> microsoft.com> writes:
> 
> > RFC 3493 doesn't say either way whether "char *" is ANSI or UTF-8 or whatever
> > else, and as far as I know, neither does POSIX 
> > (http://www.opengroup.org/onlinepubs/9699919799/functions/getaddrinfo.html).
> 
> Normally in POSIX, strings are encoded in the locale coding system.  If
> you are in a UTF-8 locale, the string can be assumed (by the getaddrinfo
> implementation) to be UTF-8.  Otherwise it needs to be transcoded.  This
> is how GNU Libc's IDN support works, see:
> 
> http://git.savannah.gnu.org/cgit/libidn.git/tree/libc/getaddrinfo-idn.txt

I like that approach.
Nicolas Williams | 15 Jun 2010 01:42

Re: IDNA and getnameinfo() and getaddrinfo()

On Mon, Jun 14, 2010 at 08:20:55PM +0000, Dave Thaler wrote:
> > > As discussed in draft-iab-idn-encoding section 3, it's not that simple.
> > > The ACE form applies in the public DNS but does not apply in many
> > > private DNS clouds.
> > 
> > I'm not sure I care about those, but one could always implement lists of domains
> > below which to apply alternative algorithms.
> 
> You may not care about them but unfortunately people who provide 
> getaddrinfo/getnameinfo libraries for applications in general need to
> care about them.

For the matter of this discussion, I don't care.  If I were implementing
I'd consider providing a local administrative configuration interface by
which to provide lists of private cloud domains that use alternative IDN
schemes.  (Actually, I'd probably want a distributed configuration
method for that, preferably using DNS itself, but really, that's a
tangent I don't want to go on because it's a distraction from the
purpose of this thread.)
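
A local administrative configuration of this kind could be as small as a table of domain suffixes. The sketch below is hypothetical throughout: the table contents, the scheme names, and the function ruleset_for are invented for illustration and do not describe any existing resolver.

```python
# Hypothetical local configuration: private domains mapped to the
# encoding scheme their DNS servers expect.
PRIVATE_RULES = {
    "corp.example": "utf-8",
    "legacy.corp.example": "iso-8859-1",
}
DEFAULT_RULES = "idna"  # everything else is treated as public DNS


def ruleset_for(name: str) -> str:
    """Return the scheme configured for the longest matching suffix."""
    labels = name.rstrip(".").lower().split(".")
    # Try the full name first, then progressively shorter suffixes,
    # so the most specific entry wins.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PRIVATE_RULES:
            return PRIVATE_RULES[suffix]
    return DEFAULT_RULES


print(ruleset_for("host.legacy.corp.example"))
print(ruleset_for("www.corp.example"))
print(ruleset_for("www.example.org"))
```
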

> > So remove all references to aliases from my previous post; instead these
> > functions should return the A-label as the canon name only when the U-label
> > cannot be converted to the caller's locale's codeset losslessly, else they should
> > return the U-label (in the caller's locale's codeset) as the canon name.
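
The canon-name rule proposed in the quoted text can be sketched like this. The function name canon_name and its parameters are assumptions for the example, and the stdlib "idna" codec used here implements IDNA2003 rather than IDNA2008.

```python
def canon_name(a_label_name: bytes, codeset: str) -> bytes:
    """Return the U-label form in `codeset` when that is lossless,
    otherwise fall back to the A-label form."""
    u_name = a_label_name.decode("idna")   # A-labels -> Unicode
    try:
        return u_name.encode(codeset)      # fits the caller's codeset
    except UnicodeEncodeError:
        return a_label_name                # not representable: keep ACE


print(canon_name(b"xn--fo-6ja.example", "iso-8859-1"))
print(canon_name(b"xn--fo-6ja.example", "ascii"))
```
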
> 
> I'd argue that the "canon name" should be the form in which it was 
> resolved over the wire.  So the A-label form if it was resolved in the public DNS,
> and another form (typically the U-label form) if it was resolved via something 
> else (e.g., mDNS or DNS in a private namespace using UTF-8 or whatever else).
> Also note that Windows treats "char *" as ANSI (which has no guarantee of
[…]

Nicolas Williams | 16 Jun 2010 23:54

Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Mon, Jun 14, 2010 at 06:42:19PM -0500, Nicolas Williams wrote:
> On Mon, Jun 14, 2010 at 08:20:55PM +0000, Dave Thaler wrote:
> > > > As discussed in draft-iab-idn-encoding section 3, it's not that simple.
> > > > The ACE form applies in the public DNS but does not apply in many
> > > > private DNS clouds.
> > > 
> > > I'm not sure I care about those, but one could always implement lists of domains
> > > below which to apply alternative algorithms.
> > 
> > You may not care about them but unfortunately people who provide 
> > getaddrinfo/getnameinfo libraries for applications in general need to
> > care about them.
> 
> For the matter of this discussion, I don't care.  If I were implementing
> I'd consider providing a local administrative configuration interface by
> which to provide lists of private cloud domains that use alternative IDN
> schemes.  (Actually, I'd probably want a distributed configuration
> method for that, preferably using DNS itself, but really, that's a
> tangent I don't want to go on because it's a distraction from the
> purpose of this thread.)

Alright, let's explore that.  What would that look like?  It could look
like a simple NS-like RR -- let's call it the IDNRULES RR.  The RRset
name would be a domainname for which the same zone also has NS RRs.  The
RDATA would indicate what IDN rules apply.

A contrived example for foó.example., foóbar.example. and
óther.example.:

xn--fo-6ja.example.	IN  IDNRULES "IDNA2008"
[…]

John C Klensin | 17 Jun 2010 03:28

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())


--On Wednesday, June 16, 2010 16:54 -0500 Nicolas Williams
<Nicolas.Williams <at> oracle.com> wrote:

>...
> So, to resolve tést.{foó, foóbar, óther}.example. the
> _resolver_ would first have to split the input string into
> labels using whatever fullstops are legal in the current
> locale, then lookup each of those domains' IDNA rules in the
> example. TLD zone, do whatever codeset conversions and
> pre-processing may be required to meet the rules found, then
> do the next query.  And so on.
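
The label-by-label procedure in the quoted paragraph can be sketched as below. Everything here is hypothetical: IDNRULES is only the strawman RR from this thread, the lookup is simulated with a dictionary, the rule names are invented, only "." is treated as a fullstop, and the stdlib "idna" codec is IDNA2003, not IDNA2008.

```python
def encode_label(label: str, rule: str) -> bytes:
    """Encode one label under the named (hypothetical) ruleset."""
    if rule == "IDNA":
        return label.encode("idna")
    if rule == "UTF-8":
        return label.encode("utf-8")
    raise ValueError(f"unknown ruleset: {rule}")


def encode_name(name: str, lookup_rules) -> list:
    """Walk from the TLD downwards, encoding each label with the
    rules published for its parent zone.

    `lookup_rules` stands in for an IDNRULES query: it takes a tuple
    of already-encoded labels and returns a rule name or None.
    """
    rule = "IDNA"   # assumed default for the root
    encoded = []
    for label in reversed(name.rstrip(".").split(".")):
        encoded.insert(0, encode_label(label, rule))
        rule = lookup_rules(tuple(encoded)) or "IDNA"
    return encoded


# Simulated zone data: labels below foó.example. use raw UTF-8.
rules = {(b"xn--fo-6ja", b"example"): "UTF-8"}
print(encode_name("tést.foó.example.", rules.get))
```

Each loop iteration corresponds to one extra round trip in a real resolver, which is the cost the paragraph goes on to describe.
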

Well, remember that, if fullstops are not global, one needs to
be very careful to keep local ones from leaking.  If they do
leak, a parser that tries to separate an FQDN into labels will
end up with a high error rate.  That would make the bad guys,
who have lots of fun with URLs that trick users into believing
that third- or fourth-level names are really second-level ones,
very happy.  I trust their happiness is not our goal.

> Sounds good, BUT there's issues w.r.t. stub resolvers and
> caching: stub resolvers suddenly have to get pretty fancy,
> even if they are using caching servers, because suddenly
> recursive caching servers are not useful for looking up IDNs!

Right.  And, if you start thinking about DNAME and other things
that prevent you from knowing definitively which tree someone
thinks that a name/label is in, the difficulties with caching
servers start looking easy.   Remember that there is not even an
[…]

Paul Hoffman | 17 Jun 2010 03:39

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

At 9:28 PM -0400 6/16/10, John C Klensin wrote:
>> The good thing about Punycode/IDN is that it enabled DNS.  The
>> bad thing is that suddenly any network app needs to become a
>> DNS expert.
>
>Borrowing a theme from another discussion that has been going on
>in parallel, the good thing about getnameinfo and getaddrinfo
>is that they enable IPv6.  The bad thing is that suddenly any
>network app needs to become a routing preferences expert.

+1. Some complexity can be added along a smooth curve; other complexity takes a changed mindset (and, in
these cases, a changed API). Wishing it weren't so is fine, but by doing so you are rapidly reduced to "get
off my lawn, you kids!".
Nicolas Williams | 17 Jun 2010 18:11

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Wed, Jun 16, 2010 at 09:28:34PM -0400, John C Klensin wrote:
> --On Wednesday, June 16, 2010 16:54 -0500 Nicolas Williams
> <Nicolas.Williams <at> oracle.com> wrote:
> > Makes you think that private DNS clouds with IDN rules other
> > than IETF Standards-Track IDNA rules are not desirable.  And
> > I'd agree.
> > 
> > What's the point of this post?  First: to note that private
> > DNS clouds with non-standard IDN rules are a big PITA since
> > right now they can only be supported by nodes that either
> > happen to implement those rules (and not IDNA) or which have
> > local configuration partitioning the DNS namespace by IDN
> > rulesets, and distributed configuration, though it could be
> > possible, would also be a PITA since stub resolvers would have
> > to get pretty smart.  Second: to outline a meta-IDN system
> > that could work if IDNA2008 should founder (but let's hope
> > not).  Third: I had to write this down :)
> 
> I think there may be a fundamental misunderstanding here.  If
> your point is that we have a mess on our hands, we already know
> that... and that is the starting point for this document.

See "First: to note that...".  That is: I don't think DNS, much less
_application_ implementors can be expected to support private DNS clouds
with non-standard IDN rules.  It's just too big a PITA.

My point was definitely not that we have a mess on our hands, UNLESS we
want implementors to support private DNS clouds with non-standard IDN
rules.  But how can we?  If such rules are non-standard...

[…]

Andrew Sullivan | 17 Jun 2010 22:22

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 11:11:55AM -0500, Nicolas Williams wrote:
> 
> See "First: to note that...".  That is: I don't think DNS, much less
> _application_ implementors can be expected to support private DNS clouds
> with non-standard IDN rules.  It's just too big a PITA.

Hold on, there.  The DNS allows _octets_ in domain name labels.  That
is, you can put "*&^_+é" to your heart's content in a DNS label, and it
all oughta be legal.  STD13 is perfectly clear on that:

    Each node has a label, which is zero to 63 octets in length.
    Brother nodes may not have the same label, although the same label
    can be used for nodes which are not brothers.  One label is
    reserved, and that is the null (i.e., zero length) label used for
    the root.

    […]

    The rationale for this [different-context] choice is that we may
    someday need to add full binary domain names for new services;
    existing services would not be changed.

The actual facts of the matter, and those facts' interaction with
other conventions, restrictions, and the myriad deployed stuff, is
rather different, which is how we got to IDNA2008.  But claiming that
"DNS can't be expected to support private DNS clouds with non-standard
IDN rules" misses the boat by almost 25 years.  It always did.
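
The STD 13 rule quoted above (labels are arbitrary octets, at most 63 each) maps directly onto the wire format. A minimal sketch, assuming the caller has already chosen the octets; the function name to_wire is made up for the example, and the 255-octet total limit comes from RFC 1035.

```python
def to_wire(labels: list) -> bytes:
    """Serialize labels as length-prefixed octet strings plus the
    zero-length root label."""
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("label must be 1 to 63 octets")
        out.append(len(label))   # one length octet
        out += label             # then the raw octets, uninterpreted
    out.append(0)                # null label for the root
    if len(out) > 255:
        raise ValueError("name exceeds 255 octets")
    return bytes(out)


# UTF-8 octets for "*&^_+é" are carried as-is.
print(to_wire(["*&^_+é".encode("utf-8"), b"example"]))
```

Nothing in this encoding says which character set the octets are in, which is exactly the gap the rest of the thread argues about.
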

A

[…]

Nicolas Williams | 17 Jun 2010 22:39

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 04:22:09PM -0400, Andrew Sullivan wrote:
> On Thu, Jun 17, 2010 at 11:11:55AM -0500, Nicolas Williams wrote:
> > 
> > See "First: to note that...".  That is: I don't think DNS, much less
> > _application_ implementors can be expected to support private DNS clouds
> > with non-standard IDN rules.  It's just too big a PITA.
> 
> Hold on, there.  The DNS allows _octets_ in domain name labels.  That
> ...

It does.  However, there's no way that anyone will bother making
getaddrinfo(), DNS resolver, and application implementations that
actually know when to send A-labels versus when to send something else,
much less what that something else ought to be.

> The actual facts of the matter, and those facts' interaction with
> other conventions, restrictions, and the myriad deployed stuff, is
> rather different, which is how we got to IDNA2008.  But claiming that
> "DNS can't be expected to support private DNS clouds with non-standard
> IDN rules" misses the boat by almost 25 years.  It always did.

DNS can't work interoperably with multiple IDN rulesets for the simple
reason that to do so would require code to decide amongst IDN rules to
apply in context-specific manners.  The necessary guidance for
implementors to do that is missing to begin with.  And there almost
certainly would be more than one set of IDN rules to choose from
(ISO8859-* encoded labels, UTF-8 encoded labels, with either
un-pre-processed or normalized-and/or-case-folded Unicode, IDNA2008 ACE
encoded labels, and so on).
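
To make the ambiguity concrete, the sketch below encodes one label under several of the rulesets listed above. The dictionary keys and the function name are invented for the example, the "idna" codec is Python's IDNA2003 implementation standing in for an ACE encoder, and the ISO 8859-1 entry only works for labels that charset can represent.

```python
import unicodedata


def candidate_encodings(label: str) -> dict:
    """Return the wire octets the same label would get under
    several plausible (and mutually incompatible) conventions."""
    folded = unicodedata.normalize("NFC", label).casefold()
    return {
        "iso-8859-1": label.encode("iso-8859-1"),
        "utf-8 raw": label.encode("utf-8"),
        "utf-8 nfc+casefold": folded.encode("utf-8"),
        "ace": label.encode("idna"),
    }


for rule, octets in candidate_encodings("Foó").items():
    print(f"{rule:20} {octets!r}")
```

All four results differ, so a server holding one form will not match a query sent in any of the others.
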

[…]

Nicolas Williams | 17 Jun 2010 22:55

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 03:39:43PM -0500, Nicolas Williams wrote:
> On Thu, Jun 17, 2010 at 04:22:09PM -0400, Andrew Sullivan wrote:
> > On Thu, Jun 17, 2010 at 11:11:55AM -0500, Nicolas Williams wrote:
> > > 
> > > See "First: to note that...".  That is: I don't think DNS, much less
> > > _application_ implementors can be expected to support private DNS clouds
> > > with non-standard IDN rules.  It's just too big a PITA.
> > 
> > Hold on, there.  The DNS allows _octets_ in domain name labels.  That
> > ...
> 
> It does.  However, there's no way that anyone will bother making
> getaddrinfo(), DNS resolver, and application implementations that
> actually know when to send A-labels versus when to send something else,
> much less what that something else ought to be.

I should qualify this: someone might do that, but more than one such
implementation, with IDN interoperability between them?  I seriously
doubt it.

Nico
--

-- 
Andrew Sullivan | 17 Jun 2010 22:57

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 03:39:43PM -0500, Nicolas Williams wrote:

> It does.  However, there's no way that anyone will bother making
> getaddrinfo(), DNS resolver, and application implementations that
> actually know when to send A-labels versus when to send something else,
> much less what that something else ought to be.

I think this is probably right.

> DNS can't work interoperably with multiple IDN rulesets for the simple
> reason that to do so would require code to decide amongst IDN rules to
> apply in context-specific manners.  

Right.  See John Klensin's previous remarks about this: in small
communities of well-known behaviour, your favourite encoding as octets
in the zone works fine.  But given that we have multiple different
encodings, we surely do have a problem.  It's nevertheless simply too
late to say that the only thing anyone is allowed to put in a DNS zone
is an A-label.  We don't get to reformat the Internet like that.  The
DNS rules were established a long time ago, so there _is_ non-A-label
data in zone files already.

> If you really, really want this to work, then start thinking about
> solutions along the lines of my strawman proposal for an NS-like RR that
> indicates what IDN rules apply to delegated zones.  I'd rather help make
> IDNA2008 better by working on the APIs aspect of the problem.

I suggested similar things more than once over the past couple years,
and people told me every time that I might be running for the position
of "Bad Idea Fairy".
[…]

Nicolas Williams | 17 Jun 2010 23:03

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 04:57:28PM -0400, Andrew Sullivan wrote:
> On Thu, Jun 17, 2010 at 03:39:43PM -0500, Nicolas Williams wrote:
> > It does.  However, there's no way that anyone will bother making
> > getaddrinfo(), DNS resolver, and application implementations that
> > actually know when to send A-labels versus when to send something else,
> > much less what that something else ought to be.
> 
> I think this is probably right.

Good, then we can focus on moving forward with IDNA2008 :)

> > DNS can't work interoperably with multiple IDN rulesets for the simple
> > reason that to do so would require code to decide amongst IDN rules to
> > apply in context-specific manners.  
> 
> Right.  See John Klensin's previous remarks about this: in small
> communities of well-known behaviour, your favourite encoding as octets
> in the zone works fine.  But given that we have multiple different
> encodings, we surely do have a problem.  It's nevertheless simply too
> late to say that the only thing anyone is allowed to put in a DNS zone
> is an A-label.  We don't get to reformat the Internet like that.  The
> DNS rules were established a long time ago, so there _is_ non-A-label
> data in zone files already.

I'm not sure you can even get this to work in tiny environments, since
soon enough most operating systems and applications will implement
IDNA...

> > If you really, really want this to work, then start thinking about
> > solutions along the lines of my strawman proposal for an NS-like RR that
[…]

John C Klensin | 17 Jun 2010 23:28

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())


--On Thursday, June 17, 2010 16:57 -0400 Andrew Sullivan
<ajs <at> shinkuro.com> wrote:

>> DNS can't work interoperably with multiple IDN rulesets for
>> the simple reason that to do so would require code to decide
>> amongst IDN rules to apply in context-specific manners.  
> 
> Right.  See John Klensin's previous remarks about this: in
> small communities of well-known behaviour, your favourite
> encoding as octets in the zone works fine.  But given that we
> have multiple different encodings, we surely do have a
> problem.  It's nevertheless simply too late to say that the
> only thing anyone is allowed to put in a DNS zone is an
> A-label.  We don't get to reformat the Internet like that.  The
> DNS rules were established a long time ago, so there _is_
> non-A-label data in zone files already.

And that clearly applies to server-side application of UTR46 or
any other trick matching as well.  It isn't just that it
violates the spec (since the server-side matching rules for
octets are extremely clear), it is that some servers would be
extended to handle the special mapping, some would not, and one
couldn't tell the difference.  Even then, one would have to
assume that every server that did any mapping did it the same
way.  Despite a lot of interesting ideas and no matter how many
standards were approved by whatever bodies approved them, that
is profoundly unrealistic.   With or without different mapping
variations, "some map and some don't" could, in turn, easily
yield false positives, false negatives, and a collection of
[…]

Nicolas Williams | 17 Jun 2010 23:46

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 05:28:04PM -0400, John C Klensin wrote:
> The DNS works as well as it does partially because, while caches
> have to follow a few specific rules (including those for
> octet-level matching of labels in length-label pair form),
> caches can be pretty dumb.  Asking caches to be smart and able
> to reflect whatever matching rules the authoritative servers
> (and/or their authoritative parents) think appropriate means
> _really_ smart caches.

Yes.  That might be realistic if correct implementations existed that
were BSD-licensed, portable and very self-contained.  As it is this idea
is not realistic.
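
For reference, the "dumb" octet-level matching mentioned in the quoted text amounts to roughly the following: ASCII letters compare case-insensitively (see RFC 4343) and every other octet must match exactly. The function name is an assumption made for the example.

```python
def labels_match(a: bytes, b: bytes) -> bool:
    """Compare two labels the way a plain DNS server or cache does."""
    def fold(label: bytes) -> bytes:
        # Lower-case only the octets 0x41-0x5A ("A"-"Z").
        return bytes(c + 32 if 0x41 <= c <= 0x5A else c for c in label)
    return fold(a) == fold(b)


print(labels_match(b"WWW", b"www"))
print(labels_match("é".encode("utf-8"), "É".encode("utf-8")))
print(labels_match("é".encode("utf-8"), "e\u0301".encode("utf-8")))
```

The second and third comparisons fail because the server knows nothing about Unicode case or normalization; any smarter matching would have to be added to every cache along the path.
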

> And then there is the DNAME possibility and the consequent need
> for new primitives that authoritatively identify the tree in
> which an FQDN target is really located.
> 
> If one wanted it to work, I suggest that one would want to start
> by deprecating DNAME and maybe CNAME so that there was exactly

I don't think you'd have to deprecate CNAME.  In any case, CNAME can't
really be deprecated -- it's too useful and too widely in use.  I can
see operators coming at us with pitchforks if we tried.

> one way to access a particular DNS node.  Then one would need to
> think about at least one of a new Label Type (my current
> favorite), a new Class (probably not good enough, my early
> proposal to that effect notwithstanding), or an EDNS0 option to
> permit a client to differentiate among servers applying
> different rules (as far as I know, not yet comprehensively
[…]

Dave Thaler | 17 Jun 2010 23:36

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> -----Original Message-----
> From: Nicolas Williams [mailto:Nicolas.Williams <at> oracle.com]
> Sent: Thursday, June 17, 2010 1:40 PM
> To: Andrew Sullivan
> Cc: John C Klensin; cheshire <at> apple.com; idna-update <at> alvestrand.no; Dave
> Thaler
> Subject: Re: Distributed configuration of "private" IDNA (Re: IDNA and
> getnameinfo() and getaddrinfo())
> 
> On Thu, Jun 17, 2010 at 04:22:09PM -0400, Andrew Sullivan wrote:
> > On Thu, Jun 17, 2010 at 11:11:55AM -0500, Nicolas Williams wrote:
> > >
> > > See "First: to note that...".  That is: I don't think DNS, much less
> > > _application_ implementors can be expected to support private DNS
> > > clouds with non-standard IDN rules.  It's just too big a PITA.
> >
> > Hold on, there.  The DNS allows _octets_ in domain name labels.  That
> > ...
> 
> It does.  However, there's no way that anyone will bother making getaddrinfo(),
> DNS resolver, and application implementations that actually know when to send
> A-labels versus when to send something else, much less what that something
> else ought to be.

Not true.   We already have many widely deployed applications that attempt to
do exactly that (IE is one of them, and there's a number of others).  And of
course they get it wrong in corner cases, so few people notice.

I talked about some of them near the end of the IETF plenary talk.

[…]

Nicolas Williams | 17 Jun 2010 23:40

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 09:36:40PM +0000, Dave Thaler wrote:
> > > Hold on, there.  The DNS allows _octets_ in domain name labels.  That
> > > ...
> > 
> > It does.  However, there's no way that anyone will bother making getaddrinfo(),
> > DNS resolver, and application implementations that actually know when to send
> > A-labels versus when to send something else, much less what that something
> > else ought to be.
> 
> Not true.   We already have many widely deployed applications that attempt to
> do exactly that (IE is one of them, and there's a number of others).  And of
> course they get it wrong in corner cases, so few people notice.
> 
> I talked about some of them near the end of the IETF plenary talk.

Details?  Which apps, and how do they know when to do IDNA versus
something else, and what is that something else?  Does this interop with
other implementations?

Nico
--

-- 
Dave Thaler | 17 Jun 2010 23:45

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> -----Original Message-----
> From: idna-update-bounces <at> alvestrand.no [mailto:idna-update-
> bounces <at> alvestrand.no] On Behalf Of Nicolas Williams
> Sent: Thursday, June 17, 2010 2:41 PM
> To: Dave Thaler
> Cc: Andrew Sullivan; cheshire <at> apple.com; John C Klensin; idna-
> update <at> alvestrand.no
> Subject: Re: Distributed configuration of "private" IDNA (Re: IDNA and
> getnameinfo() and getaddrinfo())
> 
> On Thu, Jun 17, 2010 at 09:36:40PM +0000, Dave Thaler wrote:
> > > > Hold on, there.  The DNS allows _octets_ in domain name labels.
> > > > That ...
> > >
> > > It does.  However, there's no way that anyone will bother making
> > > getaddrinfo(), DNS resolver, and application implementations that
> > > actually know when to send A-labels versus when to send something
> > > else, much less what that something else ought to be.
> >
> > > Not true.   We already have many widely deployed applications that
> > > attempt to do exactly that (IE is one of them, and there's a number of
> > > others).  And of course they get it wrong in corner cases, so few people
> > > notice.
> >
> > I talked about some of them near the end of the IETF plenary talk.
> 
> Details?  Which apps, and how do they know when to do IDNA versus something
> else, and what is that something else?  Does this interop with other
> implementations?

[…]

Nicolas Williams | 17 Jun 2010 23:49

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 09:45:22PM +0000, Dave Thaler wrote:
> > > I talked about some of them near the end of the IETF plenary talk.
> > 
> > Details?  Which apps, and how do they know when to do IDNA versus something
> > else, and what is that something else?  Does this interop with other
> > implementations?
> 
> The something else is UTF-8.
> 
> See the plenary slides.  IE tries to guess based on its application configuration
> for intranet vs internet sites.  Other apps like Outlook and Windows Media Player
> try one (ACE form vs UTF-8) first and then try the other.

There have been many plenaries...  Which one?  IETF76?

Does this interop with Firefox?

Nico
--

-- 
Dave Thaler | 18 Jun 2010 00:05

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> -----Original Message-----
> From: idna-update-bounces <at> alvestrand.no [mailto:idna-update-
> bounces <at> alvestrand.no] On Behalf Of Nicolas Williams
> Sent: Thursday, June 17, 2010 2:50 PM
> To: Dave Thaler
> Cc: Andrew Sullivan; cheshire <at> apple.com; John C Klensin; idna-
> update <at> alvestrand.no
> Subject: Re: Distributed configuration of "private" IDNA (Re: IDNA and
> getnameinfo() and getaddrinfo())
> 
> On Thu, Jun 17, 2010 at 09:45:22PM +0000, Dave Thaler wrote:
> > > > I talked about some of them near the end of the IETF plenary talk.
> > >
> > > Details?  Which apps, and how do they know when to do IDNA versus
> > > something else, and what is that something else?  Does this interop
> > > with other implementations?
> >
> > The something else is UTF-8.
> >
> > See the plenary slides.  IE tries to guess based on its application
> > configuration for intranet vs internet sites.  Other apps like Outlook
> > > and Windows Media Player try one (ACE form vs UTF-8) first and then try
> > > the other.
> 
> There have been many plenaries...  Which one?  IETF76?

Yes
http://www.ietf.org/proceedings/76/slides/plenaryt-1.pdf
See especially slides 55-57

[…]

Nicolas Williams | 18 Jun 2010 00:22

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 10:05:05PM +0000, Dave Thaler wrote:
> http://www.ietf.org/proceedings/76/slides/plenaryt-1.pdf
> See especially slides 55-57

Thanks.

> > Does this interop with Firefox?
> 
> Not sure what you mean by "interop" as it's purely a local algorithm.
> Can you rephrase your question?

If a user e-mails another a URL using an IDN encoded in UTF-8, and the
recipient tries to load it with FF, and the namespace in question uses
UTF-8 instead of IDNA ACE encoding, will the FF user be able to load
that URL?

What about URLs with such IDNs embedded in HTML served by servers in
that namespace?

Nico
--

-- 
Shawn Steele | 18 Jun 2010 00:56

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> If a user e-mails another a URL using an IDN encoded in UTF-8, 
> and the recipient tries to load it with FF, and the namespace in
> question uses UTF-8 instead of IDNA ACE encoding, will the FF
> user be able to load that URL?

It should, the OS sends the request to whatever your default browser is in Unicode, the browser better be
able to correctly handle the link.  AFAIK there's no big problems in this area in Windows (maybe some edge
cases).  Since we use *W APIs for everything, most of these scenarios work.  Problem is only if the client
doesn't know how to punycode, or if punycode gets injected unexpectedly into the *W API calls, or if
there's some other protocol dependency that hasn't been updated beyond ASCII.

> What about URLs with such IDNs embedded in HTML served by
> servers in that namespace?

Href="http://non-ascii" generally works in most browsers, HTML has a code page, so this isn't ambiguous
(unless the code page is wrong or missing).

-Shawn
Nicolas Williams | 18 Jun 2010 01:14

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 10:56:28PM +0000, Shawn Steele wrote:
> > If a user e-mails another a URL using an IDN encoded in UTF-8, 
> > and the recipient tries to load it with FF, and the namespace in
> > question uses UTF-8 instead of IDNA ACE encoding, will the FF
> > user be able to load that URL?
> 
> It should, the OS sends the request to whatever your default browser
> is in Unicode, the browser better be able to correctly handle the
> link.  AFAIK there's no big problems in this area in Windows (maybe

So the OS knows about the intranet/internet distinction?  Dave Thaler's
comments had led me to believe it was IE that knew this.  What if parts
of the intranet are not hosted on AD?  Where's the intranet/internet
distinction configured?

> some edge cases).  Since we use *W APIs for everything, most of these
> scenarios work.  Problem is only if the client doesn't know how to
> punycode, or if punycode gets injected unexpectedly into the *W API
> calls, or if there's some other protocol dependency that hasn't been
> updated beyond ASCII.

And if the recipient is running FF on something other than Windows?
(e.g., MacOS X, Linux, Solaris, *BSD.)

I'm guessing the answer is "no", "I don't know", or even "good luck" :)

The point is: your private-namespace-with-UTF-8-instead-of-IDNA solution
is almost certainly not interoperable in heterogeneous deployments.

(Pardon the redundancy.  I shouldn't have to say "in heterogeneous
[…]

Shawn Steele | 18 Jun 2010 01:30

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> So the OS knows about the intranet/internet distinction?

No, if you type Windows+R and http://somewhere.com, that URL is passed to the target app in Unicode, it
doesn't matter where the target is.  The intranet (utf-8) / internet (ACE, actually, I did get utf-8 across
the 'net years ago) distinction is currently handled by the application.  Basically it's a "Try UTF-8, if
that doesn't work, try ACE" approach, but some apps do it the other way around.  Dave's looking at how to make
the APIs smarter so the apps don't have to make silly guesses.
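
The "try one form, then the other" behaviour described above can be sketched as follows. This is not the Windows implementation: resolve_with_fallback and its query callback are hypothetical, the callback stands in for whatever actually sends the DNS query, and the "idna" codec is Python's IDNA2003 implementation.

```python
def resolve_with_fallback(name, query, order=("utf-8", "idna")):
    """Try each encoding of `name` in turn until `query` answers.

    `query` takes the encoded name and returns a result or None.
    Returns (encoding_used, result), or None if nothing answered.
    """
    for encoding in order:
        try:
            wire_name = name.encode(encoding)
        except UnicodeError:
            continue   # name not expressible in this form
        result = query(wire_name)
        if result is not None:
            return encoding, result
    return None


# Simulated public zone (ACE) and private zone (raw UTF-8).
public = {b"xn--fo-6ja.example": "192.0.2.1"}
private = {"foó.example".encode("utf-8"): "192.0.2.2"}
print(resolve_with_fallback("foó.example", public.get))
print(resolve_with_fallback("foó.example", private.get))
```

Passing order=("idna", "utf-8") gives the behaviour of the apps that "do it the other way around"; either order costs an extra query whenever the first guess is wrong.
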

> And if the recipient is running FF on something other than Windows?
> (e.g., MacOS X, Linux, Solaris, *BSD.)

I don't have a clue how those work :)  

My understanding is that most browsers are happy to convert URLs to Punycode as necessary; I can't imagine
why they'd have different logic on other OSes.  Certainly as a web author, I am NOT going to write
href="http://xn--punycode".  Content authors are going to use Unicode (because they can read it).  So,
unless you "fix" all the blog tools, etc. to convert Unicode to punycode, there are lots of hrefs in the wild
that are outside of the ASCII space.

I've been led to believe that non-punycode hrefs have been seen "in the wild".  (Indeed I think they
outnumber punycode ones).

I cannot imagine why you'd want to force a Unicode string to punycode to pass it between applications,
unless you are doing a DNS query.  The canonical form should be U-label and that's what apps should exchange.

-Shawn

Nicolas Williams | 18 Jun 2010 01:39

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 11:30:38PM +0000, Shawn Steele wrote:
> > So the OS knows about the intranet/internet distinction?
> 
> No, if you type Windows+R and http://somewhere.com, that URL is passed
> to the target app in Unicode, it doesn't matter where the target is.
> The intranet (utf-8) / internet (ACE, actually, I did get utf-8 across
> the 'net years ago) distinction is currently handled by the
> application.  Basically it's a "Try UTF-8, if that doesn't work, try
> ACE" approach, but some apps do it the other way around.  Dave's
> looking at how to make the APIs smarter so the apps don't have to make
> silly guesses.

OK, that (try UTF-8 first, then ACE) could work.  Do you prepare the
UTF-8 in any way?  E.g., normalize it?  Case-fold it?  Or do you rely on
user input being in NFC already due to how input methods work?

> > And if the recipient is running FF on something other than Windows?
> > (e.g., MacOS X, Linux, Solaris, *BSD.)
> 
> I don't have a clue how those work :)  
> 
> My understanding is that most browsers are happy to convert URLs to
> Punycode as necessary, I can't imagine why they'd have different logic
> on other OS's.  Certainly as a web author, I am NOT going to write
> href="http://xn--punycode".  Content authors are going to use Unicode
> (because they can read it).  So, unless you "fix" all the blog tools,
> etc. to convert Unicode to punycode, there's lots of hrefs in the wild
> that are outside of the ASCII space.
> 
> I've been led to believe that non-punycode hrefs have been seen "in

Shawn Steele | 18 Jun 2010 01:53

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> OK, that (try UTF-8 first, then ACE) could work.  Do you prepare the
> UTF-8 in any way?  E.g., normalize it?  Case-fold it?

That's an area that is really not good right now.  DNS is obviously case-insensitive in ASCII; however, the
UTF-8 that we've allowed is pretty dumb and doesn't do any mapping/filtering.  It is "obvious" that we
should only allow UTS#46 behavior type U-Labels in the future; however, getting to that point might be
tricky since internal machines could already have names that conflict with that mapping.  (We could
apply those rules to the UTF-8 string when doing lookup.)  Assuming someone's using the canonical U-Label
form, it's not a problem.  Dave's working with some folks looking at that.

> My concern was: how do you do hostname resolution in an environment with private sub-namespaces with
> non-IDNA IDN rules.

Well, that's where "try UTF-8 first, then ACE" happens.  Of course, if they don't both use UTS#46 mappings,
then you've got a problem since a name may resolve one way and not the other :(   I think the "right" answer would
be to use UTS#46 conformant U-Labels when doing UTF-8 lookup; however, that's a breaking change in some
environments.  

Historically, it seems to have worked due to coincidences, like IMEs generally entering data in the same
form, etc., so we're unlikely to see NFD requests for NFC names, but it's obviously a point of weakness.  I'd
expect other environments got away with similar things in native code page requests, etc.
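
To make the NFC/NFD weakness concrete, here is a small sketch using Python's standard `unicodedata` module. The two forms of the same visible label produce different UTF-8 bytes, so an unprepared UTF-8 lookup would treat them as different names. `same_name_after_nfc` is an illustrative helper, not an existing API, and it deliberately ignores case folding and the other UTS#46 mappings.

```python
import unicodedata

label = "tést"
nfc = unicodedata.normalize("NFC", label)  # "é" as one code point, U+00E9
nfd = unicodedata.normalize("NFD", label)  # "e" followed by combining U+0301

# Same text on screen, different bytes on the wire.
print(nfc.encode("utf-8"))  # b't\xc3\xa9st'
print(nfd.encode("utf-8"))  # b'te\xcc\x81st'


def same_name_after_nfc(a, b):
    """Compare two labels after normalizing both to NFC (illustrative only)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Applying the normalization at lookup time, rather than at registration time, is the "apply those rules to the UTF-8 string when doing lookup" option mentioned above.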

-Shawn

Nicolas Williams | 18 Jun 2010 02:04

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 11:53:44PM +0000, Shawn Steele wrote:
> > OK, that (try UTF-8 first, then ACE) could work.  Do you prepare the
> > UTF-8 in any way?  E.g., normalize it?  Case-fold it?
> 
> That's an area that is really not good right now.  DNS is obviously
> case-insensitive in ASCII, however the UTF-8 that we've allowed is
> pretty dumb and doesn't do any mapping/filtering.  It is "obvious"
> that we should only allow UTS#46 behavior type U-Labels in the future,
> however getting to that point might be tricky since internal machines
> could already have names that conflict with that mapping..  (We could
> apply those rules to the UTF-8 string when doing lookup).  Assuming
> someone's using the canonical U-Label form, it's not a problem.
> Dave's working with some folks looking at that.

Thanks, this is useful information.  Particularly if you want anyone
else to interop with this.

> > My concern was: how do you do hostname resolution in an environment
> > with private sub-namespaces with non-IDNA IDN rules.
> 
> Well, that's where "try utf-8 1st, then ACE" happens.  Of course, if
> they don't both use UTS#46 mappings, then you've got a problem since
> one may resolve one way and not the other :(   I think the "right"
> answer would be to use UTS#46 conformant U-Labels when doing UTF-8
> lookup, however that's breaking in some environments.  
> 
> Historically, it seems to have worked due to coincidences like the
> IME's generally enter data in the same form, etc., so we're unlikely
> to see NFD requests for NFC names, but it's obviously a point of
> weakness.  I'd expect other environments got away with similar things

Shawn Steele | 18 Jun 2010 02:57

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> NFD can and has leaked.  MacOS X's HFS+ normalizes filenames to 
> NFD on create, and when you list directories the names appear in NFD.

We're aware of that issue.

> (Names that break because of differences in IDNA2003 and 2008 will
>  be few and far between, and not that big a deal.)

Actually IDNA2008 doesn't provide any standard mapping form, so users expecting reasonable IDNA2003
mappings will break quite often :(  Fortunately UTS#46 helps with that.

> Affected users may be annoyed at first, but also presumably overjoyed as well at the aesthetic/semantic improvement.)

Primarily we hear when they're annoyed :)  When something "breaks" we are bound to have tons of feedback
about the change.  Obviously we want to "do the right thing."  Getting there from the current installed base
is sometimes tricky.

-Shawn

Andrew Sullivan | 18 Jun 2010 19:40

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Thu, Jun 17, 2010 at 11:30:38PM +0000, Shawn Steele wrote:
> I cannot imagine why you'd want to force a Unicode string to punycode to pass it between applications,
> unless you are doing a DNS query.  The canonical form should be U-label and that's what apps should exchange.
> 

It seems to me that if you're going to require the exchange of only
U-labels, and you're going to validate your input, then it actually
doesn't matter whether you exchange U-labels or A-labels: they're
freely convertible back and forth, and if you want to know whether
something is a valid U-label (for instance), the easiest way might
well be just to convert it to an A-label, and back into a U-label, and
see if you get the binary equivalent as output.

So an application that expects to hand around DNS label slots as
U-labels actually needs to be able to cope with A-labels, and
conversely, it seems to me.
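
The round-trip test described above can be sketched like this. It is an approximation under a stated assumption: Python's built-in "idna" codec implements IDNA2003 (ToASCII/ToUnicode with Nameprep), not the IDNA2008 rules, so it only illustrates the technique. `roundtrips_as_u_label` is an invented name.

```python
def roundtrips_as_u_label(label):
    """Return True if label -> A-label -> U-label gives back the same string.

    Uses Python's IDNA2003 codec as a stand-in for a real IDNA2008 library.
    """
    try:
        a_label = label.encode("idna")   # ToASCII, e.g. b"xn--bcher-kva"
        back = a_label.decode("idna")    # ToUnicode
    except UnicodeError:
        return False
    return back == label


print(roundtrips_as_u_label("bücher"))         # already in canonical form
print(roundtrips_as_u_label("Bücher"))         # upper case gets mapped away
print(roundtrips_as_u_label("bu\u0308cher"))   # decomposed "ü" gets composed
```

Anything that comes back changed was not in canonical form to begin with, which is why code that hands U-labels around still has to understand A-labels.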

A

-- 
Andrew Sullivan
ajs <at> shinkuro.com
Shinkuro, Inc.

Shawn Steele | 18 Jun 2010 21:08

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> It seems to me that if you're going to require the exchange of only
> U-labels, and you're going to validate your input

How often is data actually validated?  Often hrefs aren't (at least not when initially entered). 
Applications just assume a domain name will resolve, and, if it doesn't, it fails then.

-Shawn

Patrik Fältström | 18 Jun 2010 21:22

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())


On 18 jun 2010, at 21.08, Shawn Steele wrote:

>> It seems to me that if you're going to require the exchange of only
>> U-labels, and you're going to validate your input
> 
> How often is data actually validated?  Often hrefs aren't (at least not when initially entered). 
> Applications just assume a domain name will resolve, and, if it doesn't, it fails then.

This is because programmers think they are working with ASCII, and the comparison algorithm for ASCII is so
simple. It is not simple for "Unicode stuff". It is for U-labels, but not for Unicode strings in general.

   Patrik

_______________________________________________
Idna-update mailing list
Idna-update <at> alvestrand.no
http://www.alvestrand.no/mailman/listinfo/idna-update

Shawn Steele | 18 Jun 2010 22:00

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

> This is because programmers think they are working with ascii, 
> and comparison algorithm with ascii is so simple. It is not for 
> "unicode stuff". It is for U-labels, but not for unicode strings in 
> general.

Well, yeah, people don't handle Unicode very well, which is why it's a good idea to use globalized
comparison APIs from whatever OS you're using to compare strings, so you don't have to rebuild them
yourself.  (Though you do need to know what flags to use and what APIs to call when :) 

If an app needs to compare domain names, it should call the domain name SDK's "CompareNames()" function. 
Then that can ensure they're in canonical form or whatnot and do the right thing.  Of course that requires
that someone make such a method for apps to call.
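
As a rough idea of what such a library call could look like, here is a sketch. Both function names are hypothetical (no such SDK is named in this thread), and the canonical form chosen here, lower-cased A-labels produced by Python's IDNA2003 "idna" codec, is just one possible choice.

```python
def make_canonical_domain_name(name):
    """Hypothetical helper: reduce a domain name to lower-case A-label bytes.

    Non-ASCII labels are case-mapped and converted by the IDNA2003 codec;
    plain ASCII labels pass through unchanged, so they are lower-cased here.
    """
    return name.encode("idna").lower()


def compare_domain_names(a, b):
    """Hypothetical CompareNames(): True if both names canonicalize identically."""
    return make_canonical_domain_name(a) == make_canonical_domain_name(b)


print(compare_domain_names("Bücher.example", "xn--bcher-kva.EXAMPLE"))
print(compare_domain_names("bücher.example", "bucher.example"))
```

The application never sees Punycode, Nameprep or bidi rules; if the rules change, only the body of `make_canonical_domain_name` has to change.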

Abstraction is important.  Any app that's validating the IDNA2003 bidi rules by itself now has to
change because the bidi rules changed for IDNA2008.  Nothing about that app's space changed, only the
rules for DNS it depends on.

All I'm trying to get at is that the abstraction is important.  People shouldn't embed detailed knowledge in
layers where it doesn't belong.  Suppose people embed ACE and then "we (the WG/IETF)" decide that we'll
allow UTF-8 DNS after all.  What happens then?  If they just embed Unicode and call "IsValidDomainName()",
"CompareDomainName()", "MakeCanonicalDomainName()" or whatever's necessary, then it doesn't
matter what happens to IDN in the future, when their library gets updated with any updated behavior,
they'll still work.

-Shawn

Andrew Sullivan | 18 Jun 2010 21:33

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Fri, Jun 18, 2010 at 07:08:16PM +0000, Shawn Steele wrote:
> 
> How often is data actually validated?  Often hrefs aren't (at least not when initially entered). 
> Applications just assume a domain name will resolve, and, if it doesn't, it fails then.
> 

And of course, this blind acceptance of any data from any random place
in the Net including random evil humans has caused no trouble?  This
can't, surely, be the plan, even if it is in fact how things are done.
If you're going to insist on U-labels for interchange, you have _no
choice_ but to validate them as actually being U-labels, or they are
all but guaranteed to have crap in them that will never make it
through the IDNA2008 algorithms when it is finally time to do this.

I completely agree with you that it would be insane to require every
application to "do DNS".  But they can't handle domain name slots in
an "internationalized" way, and expect a standard interchange format
(with all its restrictions), but take whatever binary data they get
(or anyway, not reliably).

A

-- 
Andrew Sullivan
ajs <at> shinkuro.com
Shinkuro, Inc.

Shawn Steele | 18 Jun 2010 21:50

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

Not "crapping out" is simple: the app can make a DNS request and see if something comes back.  It hardly
matters if it matches all the DNS rules and dots its i's and crosses its t's if there's no record to back it up. 
It seems like it should be up to the app to decide if it's worth validating up-front, or allowing the failure
to happen rather late.

If the application needs to do validation, then there should be a "CheckValidDomainName" API or
something.  Applications shouldn't be doing those things themselves.  (If they did, then
IDNA2003->IDNA2008 would require all apps be touched, not just whatever APIs they call.)  Email for
example constantly gets complaints about apps inconsistently validating "valid" (but unusual) addresses.

Apps (& even other standards) should NOT be required to know how the bidi rules work and all that.  Instead
they should point to an SDK to handle that (or standards should point to the standard), instead of
rebuilding everything themselves.

-Shawn

 
http://blogs.msdn.com/shawnste



________________________________________
From: Andrew Sullivan [ajs <at> shinkuro.com]
Sent: Friday, June 18, 2010 12:33 PM
To: Shawn Steele
Cc: cheshire <at> apple.com; idna-update <at> alvestrand.no; John C Klensin; Dave Thaler; Nicolas Williams
Subject: Re: Distributed configuration of "private" IDNA (Re: IDNA and  getnameinfo() and getaddrinfo())

On Fri, Jun 18, 2010 at 07:08:16PM +0000, Shawn Steele wrote:
>
> How often is data actually validated?  Often hrefs aren't (at least not when initially entered). 

Nicolas Williams | 17 Jun 2010 19:55

Re: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

On Wed, Jun 16, 2010 at 09:28:34PM -0400, John C Klensin wrote:
> --On Wednesday, June 16, 2010 16:54 -0500 Nicolas Williams
> <Nicolas.Williams <at> oracle.com> wrote:
> > So, to resolve tést.{foó, foóbar, óther}.example. the
> > _resolver_ would first have to split the input string into
> > labels using whatever fullstops are legal in the current
> > locale, then lookup each of those domains' IDNA rules in the
> > example. TLD zone, do whatever codeset conversions and
> > pre-processing may be required to meet the rules found, then
> > do the next query.  And so on.
> 
> Well, remember that, if fullstops are not global, one needs to
> be very careful to keep local ones from leaking.  If they do

Since I was concerning myself with the DNS protocol in particular, there
is no such concern (full stops don't appear in DNS the protocol).

> leak, a parser that tries to separate an FQDN into labels will
> end up with a high error rate.  That would make the bad guys,
> who have lots of fun with URLs that trick users into believing
> that third- or fourth-level names are really second-level ones,
> very happy.  I trust their happiness is not our goal.

Very good point.  Full stops need to be globally defined for all
locales.
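
For reference, IDNA2003 (RFC 3490, section 3.1) already fixes a single, locale-independent set of label separators: U+002E, U+3002, U+FF0E and U+FF61. A minimal sketch of splitting a name on exactly that set follows; `split_labels` is an illustrative name, and a real resolver would go on to validate each label.

```python
import re

# The four "dots" that IDNA2003 treats as label separators:
# U+002E full stop, U+3002 ideographic full stop,
# U+FF0E fullwidth full stop, U+FF61 halfwidth ideographic full stop.
IDNA2003_DOTS = re.compile("[\u002e\u3002\uff0e\uff61]")


def split_labels(name):
    """Split a domain name into labels on the fixed IDNA2003 separator set."""
    return IDNA2003_DOTS.split(name)


print(split_labels("tést\u3002foó.example"))
```

Because the set is the same everywhere, a name split in one locale yields the same labels in every other one, which is the property needed to keep local full stops from leaking.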

Of course, my proposal was a strawman, intended primarily to show that
we cannot be expected to support private DNS clouds with non-standard
IDN rules.


Dave Thaler | 17 Jun 2010 23:42

RE: Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

Nico writes:
> > Ned Freed pointed out in that context, if you really want this to be
> > transparent to the application, the relevant interface is some flavor
> > of "SetupConnectionByName" with which the application starts with an
> > opaque name and then, subject to some parameters or function-name
> > variations, ends up with a connection.  Sadly, taking away the need
> > for expert knowledge of the DNS alone really doesn't help a lot.
> 
> Exactly!  Ned's "SetupConnectionByName" is an example of "better APIs".

As Stuart mentioned in the IPv6 panel plenary, such apis already exist on both
Windows and MacOS.

-Dave

Shawn Steele | 15 Jun 2010 18:09

Re: IDNA and getnameinfo() and getaddrinfo()

FWIW: I don't think that applications should need to understand how DNS works.  (Something of a separation
of business logic concept as probably taught in, like, CS101 - "Don't make your app know more than it has
to.")  IMO it'd be nice if app developers that need to open a connection to a server had all the Punycode
ugliness layered away by some nice set of DNS APIs, or even higher level at the open connection APIs or whatever.

Unfortunately, Punycode means that some apps will want to decode the string anyway because they'd like
pretty names.  Some sort of getcanonicalname() or something could help there.  I realize there's an "A" in
"IDNA", but if every app has to do punycode conversion itself there's going to be tons of odd
inconsistencies in what they're doing.  It could also mean "tweaking" thousands of apps if the IDNA20xx
rules change a little.  (E.g., like the bidi rules did this go-around.)

The good thing about Punycode/IDN is that it enabled DNS.  The bad thing is that suddenly any network app
needs to become a DNS expert.

-Shawn

 
http://blogs.msdn.com/shawnste

Nicolas Williams | 15 Jun 2010 18:51

Re: IDNA and getnameinfo() and getaddrinfo()

On Tue, Jun 15, 2010 at 04:09:13PM +0000, Shawn Steele wrote:
> FWIW: I don't think that applications should need to understand how
> DNS works.  (Something of a separation of business logic concept as
> probably taught in, like, CS101 - "Don't make your app know more than
> it has to.")  IMO it'd be nice if app developers that need to open a
> connection to a server had all the Punycode ugliness layered away by
> some nice set of DNS APIs, or even higher level at the open connection
> APIs or whatever.

Simon's extensions for getname/addrinfo() are the kind of APIs I'm
looking for.  I'm not sure if that's what you have in mind, but I'd love
to hear about it.

> Unfortunately, Punycode means that some apps will want to decode the
> string anyway because they'd like pretty names.  Some sort of
> getcanonicalname() or something could help there.  I realize there's
> an "A" in "IDNA", but if every app has to do punycode conversion
> themselves there's going to be tons of odd inconsistencies in what
> they're doing.  It could also mean "tweaking" thousands of apps if the
> IDNA20xx rules change a little.  (Eg: like the bidi rules did this go
> around).

If an application is just dealing with hostnames then it's easy:
ToUnicode().  If the application needs hostname<->IP address resolution
then it needs something more like getname/addrinfo() with suitable
enhancements.  If the application needs to deal with e-mail addresses,
IRIs, etcetera, then the application is going to need APIs that are
specific to those.  The alternative is that the application must
implement all the relevant rules, all the time, and as draft-iab-idn-
encoding explains, that isn't always possible (the example being the

Shawn Steele | 15 Jun 2010 18:54

RE: IDNA and getnameinfo() and getaddrinfo()

We're in agreement, I think.  I'd rather have IDN work for getname, etc. by default though.  (Then maybe it'd
"just work"?)  Instead of ToUnicode(), which is very specific, I'd prefer a more general
"GetPrettyDNSName()."  Then, if more steps than just ToUnicode() are ever interesting, the gory details
are hidden from the app.  Specifically, on machines hosting both UTF-8 (or other code page) and Punycode
DNS, the actual form of "pretty" could differ depending on how a name was looked up.
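
A sketch of what a "GetPrettyDNSName()" wrapper might look like today, when ToUnicode() is the only step involved. The function name is hypothetical, and the conversion relies on Python's IDNA2003 "idna" codec; names that are already Unicode, or that fail to decode, are returned unchanged.

```python
def get_pretty_dns_name(name):
    """Hypothetical wrapper: return a display form for a domain name.

    For now this is only ToUnicode() on each label; any extra steps could be
    added here later without touching the callers.
    """
    try:
        # ASCII input may contain A-labels ("xn--..."); decode them.
        return name.encode("ascii").decode("idna")
    except UnicodeError:
        # Already non-ASCII, or not valid ACE: show it as it came in.
        return name


print(get_pretty_dns_name("xn--bcher-kva.example"))
```

If the name was actually looked up as raw UTF-8, the except branch leaves it as-is, so the "pretty" form can differ by lookup path without the caller having to know.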

For email, at least, UTF8SMTP is much smarter.  The DNS layer is muddy still, but at least the addresses are
"just utf-8", and don't get to this state of having strange encodings leaking all over the place.

-Shawn

 
http://blogs.msdn.com/shawnste



________________________________________
From: idna-update-bounces <at> alvestrand.no [idna-update-bounces <at> alvestrand.no] on behalf of Nicolas
Williams [Nicolas.Williams <at> oracle.com]
Sent: Tuesday, June 15, 2010 9:51 AM
To: Shawn Steele
Cc: idna-update <at> alvestrand.no
Subject: Re: IDNA and getnameinfo() and getaddrinfo()

On Tue, Jun 15, 2010 at 04:09:13PM +0000, Shawn Steele wrote:
> FWIW: I don't think that applications should need to understand how
> DNS works.  (Something of a separation of business logic concept as
> probably taught in, like, CS101 - "Don't make your app know more than
> it has to.")  IMO it'd be nice if app developers that need to open a
> connection to a server had all the Punycode ugliness layered away by
> some nice set of DNS APIs, or even higher level at the open connection

Nicolas Williams | 15 Jun 2010 19:40

Re: IDNA and getnameinfo() and getaddrinfo()

On Tue, Jun 15, 2010 at 04:54:46PM +0000, Shawn Steele wrote:
> We're in agreement, I think.  I'd rather have IDN work for getname,

Probably.

> etc. by default though.  (Then maybe it'd "just work"?)  Instead of

Indeed, I think I might even want to reverse Simon's flags' semantics:
the default behavior should be to use U-labels as canonical and to
support searches by un-prepared text.

However, if Simon has already deployed his extensions...  Simon?

I think Simon's playing it safe, and perhaps a conservative approach
that results in A-labels by default in UIs is better.  I'll have to
think about this, but my gut feeling is that there's no reason that we
couldn't reverse the default sense of Simon's extensions.

> ToUnicode(), which is very specific, I'd prefer a more general
> "GetPrettyDNSName()."  Then, if more steps than just ToUnicode() are
> ever interesting, the gory details are hidden from the app.

At the abstract API level I think ToUnicode() is perfectly fine.  For
actual programming language bindings something else is needed to provide
the context.  For a language with "package" names that something else
could be a package name -- "DNS::IDNA", say.  For a language like C,
with a flat namespace we'd need the function name to be more indicative
of what it does, as in your suggestion.

Nico

Shawn Steele | 15 Jun 2010 20:18

RE: IDNA and getnameinfo() and getaddrinfo()

> I think Simon's playing it safe, and perhaps a conservative approach
> that results in A-labels by default in UIs is better.

Depends, perhaps, on your perspective.  I think international users would be happiest if the Unicode forms
worked by default.  It makes sense if some app didn't want that behavior, but adding a flag means everyone
has to recompile.  Hopefully (perhaps naively), some apps might work without that step if it handled IDN by
default.  My personal preference anyway.

- Shawn

Nicolas Williams | 15 Jun 2010 20:29

Re: IDNA and getnameinfo() and getaddrinfo()

On Tue, Jun 15, 2010 at 06:18:07PM +0000, Shawn Steele wrote:
> > I think Simon's playing it safe, and perhaps a conservative approach
> > that results in A-labels by default in UIs is better.
> 
> Depends, perhaps, on your perspective.  I think international users
> would be happiest if the Unicode forms worked by default.  It makes
> sense if some app didn't want that behavior, but adding a flag means
> everyone has to recompile.  Hopefully (perhaps naively), some apps
> might work without that step if it handled IDN by default.  My
> personal preference anyway.

I don't have enough experience here, but intuitively I believe that
defaults which result in less A-label UI leakage should be better (as
long as we don't then end up with U-label leakage into IDN-unaware
domainname slots -- that's the risk).

Since Simon's extensions have shipped, we cannot reverse their sense.
Therefore I propose that we add a pair of flags with the reverse sense,
and then allow the default behavior to be implementation-specific and/or
locally configurable.  Then we'll be able to see what, if anything,
breaks by having getnameinfo() return U-labels by default and
getaddrinfo() supporting unprepared inputs.

Nico