Thomas Roessler | 11 Feb 02:05

Re: host-meta file format comments (draft-nottingham-site-meta-01)

(diverting to www-talk, too...)

On 11 Feb 2009, at 01:20, Mark Nottingham wrote:

> Yeah, I'm not completely happy with it yet. The thought was that  
> since blank lines don't introduce ambiguity here, they're not  
> harmful. OTOH one of my goals for the format is to allow existing  
> HTTP header and MIME parsers (e.g., in Python) to be used on the  
> format, and they very well may barf on a blank line.

Well, they'll barf on blank lines and declare the header over;  
changing that within the parser (or just restarting it on the rest of  
the file) should be relatively cheap.
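A minimal sketch of the "restart it on the rest of the file" option, assuming Python's standard email.parser.HeaderParser is the reused parser. The function name parse_blocks and the field names in the usage note are illustrative only and do not come from the draft.

```python
from email.parser import HeaderParser


def parse_blocks(text):
    """Collect (name, value) pairs from every header block in text.

    HeaderParser stops at the first blank line and leaves the remainder
    in the message payload, so the parser is simply re-run on that
    remainder until nothing is left.
    """
    fields = []
    parser = HeaderParser()
    while text.strip():
        msg = parser.parsestr(text)
        fields.extend(msg.items())
        rest = msg.get_payload()
        if rest == text:
            # The parser made no progress (e.g. a line that is not a
            # header field); stop instead of looping forever.
            break
        text = rest
    return fields
```

For example, parse_blocks("Field-A: one\n\nField-B: two\n") yields both fields, whereas a single parser pass would report only Field-A and treat the rest as a body.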

BTW, I notice that this draft is silent on the HTTP header syntax's  
combining feature for multiple occurrences of the same field (last  
paragraph of 4.2, RFC 2616); I suspect that to be one of the more  
likely causes for surprises if HTTP header parsers are re-used.  (No  
such risk with MIME parsers.)
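To illustrate the combining rule: RFC 2616 section 4.2 allows multiple fields with the same name to be merged into one field whose values are joined with commas. The sketch below mimics what an HTTP-style parser might do; the function name combine_fields and the sample values are hypothetical.

```python
def combine_fields(fields):
    """Merge repeated fields the way RFC 2616 section 4.2 permits.

    fields is a list of (name, value) pairs in file order. Field names
    are compared case-insensitively, and later values are appended to
    earlier ones, separated by a comma.
    """
    combined = {}
    for name, value in fields:
        key = name.lower()
        if key in combined:
            combined[key] = combined[key] + ", " + value
        else:
            combined[key] = value
    return combined
```

The surprise is that after merging, a consumer can no longer tell two occurrences apart from a single occurrence whose value happens to contain a comma.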

Finally, why disallow whitespace-stuffed folding?  It's pretty useful  
to make long lines editable, and I suspect that we're assuming  
/host-meta to be the product of some human with emacs in their hands. ;-)  
Implementing it is easy, and a given if existing parsers are used.
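As a rough indication of how little code unfolding takes, here is a sketch assuming the usual header convention that a line starting with a space or tab continues the previous line. The function name unfold and the sample field are made up for illustration.

```python
def unfold(text):
    """Join whitespace-prefixed continuation lines onto the previous line.

    Each fold is replaced by a single space, which is how HTTP-style
    header parsers commonly treat folded whitespace.
    """
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += " " + raw.strip()
        else:
            lines.append(raw)
    return lines
```

Given 'Field-A: first part;\n second part\nField-B: x\n', this returns ['Field-A: first part; second part', 'Field-B: x'].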

> So, the right thing to do might be to explicitly disallow them, both  
> in BNF and prose. Eran, thoughts?

I'd just prefer to not have the BNF say "no empty lines", and then  
have prose that says the opposite, but with a SHOULD.

Mark Nottingham | 11 Feb 02:18

Re: host-meta file format comments (draft-nottingham-site-meta-01)


On 11/02/2009, at 12:05 PM, Thomas Roessler wrote:

> (diverting to www-talk, too...)
>
> On 11 Feb 2009, at 01:20, Mark Nottingham wrote:
>
>> Yeah, I'm not completely happy with it yet. The thought was that  
>> since blank lines don't introduce ambiguity here, they're not  
>> harmful. OTOH one of my goals for the format is to allow existing  
>> HTTP header and MIME parsers (e.g., in Python) to be used on the  
>> format, and they very well may barf on a blank line.
>
> Well, they'll barf on blank lines and declare the header over;  
> changing that within the parser (or just restarting it on the rest  
> of the file) should be relatively cheap.

This assumes that people will be comfortable modifying libraries. IME  
people tend to treat them as magical black boxes that shouldn't be  
opened (or even questioned) under any circumstances...

> BTW, I notice that this draft is silent on the HTTP header syntax's  
> combining feature for multiple occurrences of the same field (last  
> paragraph of 4.2, RFC 2616); I suspect that to be one of the more  
> likely causes for surprises if HTTP header parsers are re-used.  (No  
> such risk with MIME parsers.)

I'll add a note.

> Finally, why disallow whitespace stuffed folding?  It's pretty  

Thomas Roessler | 11 Feb 02:28

Re: host-meta file format comments (draft-nottingham-site-meta-01)

On 11 Feb 2009, at 02:18, Mark Nottingham wrote:

[ASCII vs UTF-8]

> OTOH we're talking about a SHOULD here. Maybe it just needs more  
> careful guidance; i.e., that you should stick to ASCII unless you're  
> conveying elements for presentation to end users.

Well, one point to consider is how you expect IRIs and IRI references  
to be represented.

There's one school of thought (more common in the IETF crowd) that  
says that these should be converted to ASCII early, and therefore  
shouldn't occur here.

The other school of thought (more common at W3C) says that they're  
fine in the places where XML and other document formats have always  
accepted URIs, and therefore should be representable in this spot.

There are some properties of the direction that the IDNA update effort  
is going into that suggest that the IETF school of thought is less  
likely to cause interoperability problems.
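For concreteness, the "convert to ASCII early" approach might look roughly like the sketch below: the host is turned into its IDNA form and the remaining non-ASCII characters are percent-encoded as UTF-8. The function name iri_to_uri is hypothetical, userinfo is ignored for brevity, and Python's built-in "idna" codec implements the older IDNA 2003 rules, not whatever the update effort ends up specifying.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Map an IRI to an ASCII-only URI (simplified; drops any userinfo)."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # Built-in codec: IDNA 2003 (an assumption of this sketch).
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += ":%d" % parts.port
    # "%" is kept safe so already-escaped octets are not escaped twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))
```

With this, iri_to_uri("http://bücher.example/ä") gives "http://xn--bcher-kva.example/%C3%A4", so a file that only ever carries the converted form stays pure ASCII.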

The other question is what the cost of violating this SHOULD is.   
Assume that some people have a really good reason to violate an ASCII  
or ISO-8859-1 SHOULD, and actually go for UTF-8.  You now get mixed  
character sets in a single metadata file.  I'm not sure that's  
desirable...

(BTW, are we just going down the rathole of defining yet another tag- 

Mark Nottingham | 11 Feb 02:41

Re: host-meta file format comments (draft-nottingham-site-meta-01)


On 11/02/2009, at 12:28 PM, Thomas Roessler wrote:

> On 11 Feb 2009, at 02:18, Mark Nottingham wrote:
>
> [ASCII vs UTF-8]
>
>> OTOH we're talking about a SHOULD here. Maybe it just needs more  
>> careful guidance; i.e., that you should stick to ASCII unless  
>> you're conveying elements for presentation to end users.
>
> Well, one point to consider is how you expect IRIs and IRI  
> references to be represented.
>
> There's one school of thought (more common in the IETF crowd) that  
> says that these should be converted to ASCII early, and therefore  
> shouldn't occur here.
>
> The other school of thought (more common at W3C) says that they're  
> fine in the places where XML and other document formats have always  
> accepted URIs

IRIs?

> , and therefore should be representable in this spot.
>
> There are some properties of the direction that the IDNA update  
> effort is going into that suggest that the IETF school of thought is  
> less likely to cause interoperability problems.
