Randy Presuhn | 1 May 17:57

draft-4646bis, Section 4.1 (4)(4)

Hi -

I think the following text in section 4.1 (4)(4) is misleading, if
not simply incorrect.  The current text says
           Note: where there are fragments of
           linguistic content, such as programming source code
           containing comments written in English, the subtag 'zxx'
           might still be used to indicate the primary status of the
           content, just as 'en' can be applied to a predominantly
           English text that contains a few French phrases.

I realize that this is already quite weak, with no normative force,
but it is misleading, and I believe it points in the wrong direction
for the tagging of artifacts like programming source code.  Now this
may be a function of the quality of source code in question, but I
think the analogy to an English document containing a few French phrases
is incorrect.  For well-maintained source code, the embedded comments
make up a substantial percentage, if not the majority of the
textual content.  In some ways, the comments are more important than
the code itself, since they (rather than anything machine-readable)
provide the interface semantics in most environments.  A more
appropriate analogy would be to an English-language document containing
some mathematical stuff.  I've seen (unmaintainable) chunks of code
with no comments at all.  For such abominations zxx makes a lot of sense.
But the vast majority of the production-grade software I've seen would
be much more sensibly tagged 'en'.

Proposal: delete the note.

Randy

Phillips, Addison | 1 May 19:15

Re: draft-4646bis, Section 4.1 (4)(4)

I agree that we should eliminate the note. It really doesn't belong here. The preceding sentence says more
than enough about source code:

--
The 'zxx' (Non-Linguistic) primary language subtag identifies content that has no language. Some
examples might include instrumental or electronic music; sound recordings consisting of nonverbal
sounds; audiovisual materials with no narration, printed titles, or subtitles; machine-readable data
files consisting of machine languages or character codes; or programming source code.
--

Addison

Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization Core WG

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: ltru-bounces <at> ietf.org [mailto:ltru-bounces <at> ietf.org] On Behalf Of
> Randy Presuhn
> Sent: Thursday, May 01, 2008 8:58 AM
> To: LTRU Working Group
> Subject: [Ltru] draft-4646bis, Section 4.1 (4)(4)
>
> Hi -
>
> I think the following text in section 4.1 (4)(4) is misleading, if
> not simply incorrect.  The current text says

Karen_Broome | 1 May 19:22

Re: draft-4646bis, Section 4.1 (4)(4)


How about:

The 'zxx' (Not Applicable) primary language subtag identifies content for which a language classification is inappropriate. Some examples might include instrumental or electronic music; sound recordings consisting of nonverbal sounds; audiovisual materials with no narration, dialog, printed titles, or subtitles; machine-readable data files consisting of machine languages or character codes; or programming source code.

Karen Broome

_______________________________________________
Ltru mailing list
Ltru <at> ietf.org
https://www.ietf.org/mailman/listinfo/ltru


Phillips, Addison | 1 May 19:54

Re: draft-4646bis, Section 4.1 (4)(4)

I agree with this edit and have incorporated it. Upon further reflection, though, it seems like we should
provide some SHOULD/SHOULD NOT recommendation for this subtag, as we have with ‘mul’, ‘und’,
and ‘mis’. I would therefore suggest:


<t>The 'zxx' (Non-Linguistic, Not Applicable) primary language subtag identifies content for which a
language classification is inappropriate or does not apply. This subtag SHOULD NOT be used unless a
language tag is required, since the content to which it applies typically is not suitable for
identification with a language tag. It also SHOULD NOT be combined with other subtags, since other
subtags have no real meaning in a non-lingusitic context. Some examples might include instrumental or
electronic music; sound recordings consisting of nonverbal sounds; audiovisual materials with no
narration, printed titles, or subtitles; machine-readable data files consisting of machine languages
or character codes; or programming source code.</t>



Addison

Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization Core WG

Internationalization is not a feature.
It is an architecture.

Peter Constable | 2 May 06:28

Re: draft-4646bis, Section 4.1 (4)(4)

> From: ltru-bounces <at> ietf.org [mailto:ltru-bounces <at> ietf.org] On Behalf Of
> Phillips, Addison
> Sent: Thursday, May 01, 2008 10:55 AM

> I would therefore suggest:
>
>
> <t>The 'zxx' (Non-Linguistic, Not Applicable) primary language subtag
> identifies content for which a language classification is inappropriate
> or does not apply. This subtag SHOULD NOT be used unless a language tag
> is required, since the content to which it applies typically is not
> suitable for identification with a language tag. It also SHOULD NOT be
> combined with other subtags, since other subtags have no real meaning
> in a non-lingusitic context.

I'm not sure I see the need for that last sentence, and wonder if it might be overly restricting. BCP47
provides a tagging scheme that can be usefully applied to a wider range of applications than those which
any of us may have had in mind, and if it's useful in a given application context to create tags along the
lines of "zxx-Hebr" then it might be unhelpful for us to discourage that. What would we be trying to protect
ourselves from?

Peter
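
For illustration only, here is a minimal Python sketch of how an application might pull the script subtag out of a tag along the lines of "zxx-Hebr". The function name zxx_script is hypothetical, and the check is by shape alone (four ASCII letters directly after the primary subtag), not against the IANA registry; extended language subtags are ignored.

```python
import re

# Shape of a script subtag: exactly four ASCII letters.
SCRIPT_SHAPE = re.compile(r"^[A-Za-z]{4}$")


def zxx_script(tag):
    """Return the script subtag of a 'zxx' tag in title case, or None.

    Hypothetical helper: it checks shape only and does not consult
    the subtag registry.
    """
    subtags = tag.split("-")
    if subtags[0].lower() != "zxx":
        return None
    # Only the subtag right after the primary language is examined.
    if len(subtags) > 1 and SCRIPT_SHAPE.match(subtags[1]):
        return subtags[1].title()
    return None


if __name__ == "__main__":
    for example in ("zxx-Hebr", "zxx", "zxx-x-perl", "en-Latn"):
        print(example, "->", zxx_script(example))
```

An application that cares only about the writing system of otherwise non-linguistic content could branch on the returned value and treat None as "no script stated".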

Mark Davis | 2 May 06:46

Re: draft-4646bis, Section 4.1 (4)(4)

Now that you point it out, I agree. It is, as I had said, meaningful to have zxx-Arab or zxx-Hant.

Mark

Karen_Broome | 1 May 20:06

Re: draft-4646bis, Section 4.1 (4)(4)

Typo on "non-linguistic" in the third sentence.  I also added "dialog" to 
the last sentence. See my version. "Dialog" is not the same as narration. 

I would not include the third sentence. I would think it's my right to 
create, say:

zxx-x-music
zxx-x-silent
zxx-x-code
zxx-x-perl

etc.

In this case, I think no guidance is better. 

Regards,

Karen Broome

Phillips, Addison | 1 May 20:35

Re: draft-4646bis, Section 4.1 (4)(4)

Karen wrote:
>
> Typo on "non-linguistic" in the third sentence.  I also added "dialog"
> to
> the last sentence. See my version. "Dialog" is not the same as
> narration.

Both changes done. Thanks.

>
> I would not include the third sentence. I would think it's my right to
> create, say:
>
> zxx-x-music
> zxx-x-silent
> zxx-x-code
> zxx-x-perl

You're correct that it is your right to do so. This is actually the meaning of SHOULD NOT in my mind: you
shouldn't do it, unless you have a good reason to do so. If you have a good reason, you're free to do so. What I
think might be important (given certain threads on ietf-languages) is to indicate that "zxx-Latn",
"zxx-GB", "zxx-scotland", and definitely "zxx-Latn-GB-scotland" are ill-considered.

>
> etc.
>
> In this case, I think no guidance is better.
>
You're probably right. Given Randy's note, I'll omit my two proposed bits of mustard and make the para:

<t>The 'zxx' (Non-Linguistic, Not Applicable) primary language subtag identifies content for which a
language classification is inappropriate or does not apply. Some examples might include instrumental
or electronic music; sound recordings consisting of nonverbal sounds; audiovisual materials with no
narration, dialog, printed titles, or subtitles; machine-readable data files consisting of machine
languages or character codes; or programming source code.</t>

~Addison
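
As a rough sketch of what the tags discussed in this message are made of, the Python below labels each subtag after the primary language by its shape (script: four letters; region: two letters or three digits; variant: five to eight alphanumerics, or four starting with a digit; private use: everything after "x"). The function name classify_subtags is hypothetical, nothing is checked against the registry, and extended language and extension subtags are not handled.

```python
import re


def classify_subtags(tag):
    """Label the subtags of a language tag by shape only (no registry lookup)."""
    subtags = tag.split("-")
    parts = {
        "language": subtags[0].lower(),
        "script": None,
        "region": None,
        "variants": [],
        "private": [],
    }
    rest = subtags[1:]
    for index, subtag in enumerate(rest):
        if subtag.lower() == "x":
            # Everything after the singleton 'x' is private use.
            parts["private"] = [p.lower() for p in rest[index + 1:]]
            break
        no_region_or_variant = parts["region"] is None and not parts["variants"]
        if (re.fullmatch(r"[A-Za-z]{4}", subtag)
                and parts["script"] is None and no_region_or_variant):
            parts["script"] = subtag.title()
        elif re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", subtag) and no_region_or_variant:
            parts["region"] = subtag.upper()
        elif re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", subtag):
            parts["variants"].append(subtag.lower())
        else:
            raise ValueError("unrecognized subtag shape: " + subtag)
    return parts


if __name__ == "__main__":
    for example in ("zxx-Latn", "zxx-GB", "zxx-scotland",
                    "zxx-Latn-GB-scotland", "zxx-x-perl"):
        print(example, classify_subtags(example))
```

The sketch only shows that all of these tags are well-formed by shape; whether a given combination is sensible is exactly the judgment under discussion here.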

Mark Davis | 1 May 22:21

Re: draft-4646bis, Section 4.1 (4)(4)

I like the text you propose.

Note: I see nothing particularly wrong with zxx-Latn vs zxx-Grek. In the former case I have content where language doesn't really apply and the content is written in Latin characters; in the latter it is in Greek characters. While the application for such tags is probably rare, it is not meaningless.

Mark

Karen_Broome | 1 May 20:42

Re: draft-4646bis, Section 4.1 (4)(4)

Your text proposal below is fine with me. Thank you.

Karen Broome

Randy Presuhn | 1 May 19:18

Re: draft-4646bis, Section 4.1 (4)(4)

Hi -

> From: "Phillips, Addison" <addison <at> amazon.com>
> To: <Karen_Broome <at> spe.sony.com>; "LTRU WorkingGroup" <ltru <at> ietf.org>
> Sent: Thursday, May 01, 2008 11:54 AM
> Subject: Re: [Ltru] draft-4646bis, Section 4.1 (4)(4)
>
> I agree with this edit and have incorporated it. Upon further reflection,
> though, it seems like we should provide some SHOULD/SHOULD NOT recommendation
> for this subtag, as we have with ‘mul’, ‘und’, and ‘mis’. I would therefore suggest:
...

I think that would be overkill, since (4), of which (4)(4) is just an elaboration,
already has that language.

Randy
