Julien ÉLIE | 31 Dec 2009 17:31

Syntax validation of articles by injecting agents


Hi,

RFC 5537 mentions that an injecting agent MUST reject any proto-article
that is not syntactically valid as defined by RFC 5536.

What is the best way to do that then?
Is it safe to implement that requirement?  RFC 5536 is said to
"reflect current practice", but if we enforce that MUST, I believe
it will break lots of news readers.

NN for instance does not generate MIME-Version: header fields
although "user agents MUST meet the definition of MIME conformance"
("a mail user agent that is MIME-conformant MUST always generate
a "MIME-Version: 1.0" header field in any message it creates").
I believe this sentence applies to news user agents too, otherwise
a reference to MIME is useless.

And what if a news reader generates an incorrect User-Agent: header
field?  or if it always adds a tail-entry which is not a path-nodot
in Path:?  All its posts will be rejected by an RFC-compliant injecting
agent...
Is it the intention?

I quite understand that it would help to have better compliant
articles.  For instance, rejecting articles with "all" in their
distribution list.

But in some cases, people would need to upgrade their news
readers...  (and maybe change their news readers if it is

Russ Allbery | 31 Dec 2009 20:43

Re: Syntax validation of articles by injecting agents


Julien ÉLIE <julien <at> trigofacile.com> writes:

> RFC 5537 mentions that an injecting agent MUST reject any proto-article
> that is not syntactically valid as defined by RFC 5536.

In retrospect, I suspect that should either have been a SHOULD or it
should have singled out the netnews-specific restrictions.  I don't think
anyone pointed out at the time that it would mean rejecting all non-MIME
messages, and I suspect we would have changed it if we'd realized that as
fairly impractical.

The point was more to reject messages with syntactically invalid
Newsgroups headers and whatnot.
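
To illustrate the kind of check meant here, the following is a minimal Python sketch of a Newsgroups syntax test. It assumes a simplified reading of the RFC 5536 grammar (components made of letters, digits, "+", "-" and "_", joined by single dots, with names separated by commas). The function name is invented for the example, and header folding and other corner cases are not handled.

```python
import re

# Simplified component grammar: 1*( ALPHA / DIGIT / "+" / "-" / "_" ),
# with components joined by single dots.  This approximates RFC 5536;
# it is not a full implementation of its ABNF.
_NEWSGROUP_NAME = re.compile(r"[A-Za-z0-9+_-]+(?:\.[A-Za-z0-9+_-]+)*")


def newsgroups_is_valid(value):
    """Return True if every comma-separated name looks well formed."""
    names = [name.strip(" \t") for name in value.split(",")]
    return all(_NEWSGROUP_NAME.fullmatch(name) for name in names)
```

Under these assumptions, "comp.lang.python, misc.test" passes, while "comp..lang" or an empty field body does not.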

> And what if a news reader generates an incorrect User-Agent: header
> field?  or if it always adds a tail-entry which is not a path-nodot
> in Path:?  All its posts will be rejected by a RFC-compliant injecting
> agent...
> Is it the intention?

I wonder how many user agents generate invalid Path headers.  Hm.

I have a hard time justifying rejecting articles on the basis of syntactic
problems in purely informational headers like User-Agent.

> I quite understand that it would help to have better compliant
> articles.  For instance, rejecting articles with "all" in their
> distribution list.


Charles Lindsey | 4 Jan 2010 13:27

Re: Syntax validation of articles by injecting agents


In <87aawzdj5e.fsf <at> windlord.stanford.edu> Russ Allbery <rra <at> stanford.edu> writes:

>Julien ÉLIE <julien <at> trigofacile.com> writes:

>> RFC 5537 mentions that an injecting agent MUST reject any proto-article
>> that is not syntactically valid as defined by RFC 5536.

>In retrospect, I suspect that should either have been a SHOULD or it
>should have singled out the netnews-specific restrictions.  I don't think
>anyone pointed out at the time that it would mean rejecting all non-MIME
>messages, and I suspect we would have changed it if we'd realized that as
>fairly impractical.

Eh? How does it imply that? If there are no Content-* headers, then there
is no requirement for MIME-Version.

>The point was more to reject messages with syntactically invalid
>Newsgroups headers and whatnot.

>> And what if a news reader generates an incorrect User-Agent: header
>> field?  or if it always adds a tail-entry which is not a path-nodot
>> in Path:?  All its posts will be rejected by a RFC-compliant injecting
>> agent...
>> Is it the intention?

>I wonder how many user agents generate invalid Path headers.  Hm.

But aren't injecting agents allowed to remove a Path header that is
received? For sure, many of them routinely do. In any case, fixing (or

Russ Allbery | 5 Jan 2010 03:12

Re: Syntax validation of articles by injecting agents


"Charles Lindsey" <chl <at> clerew.man.ac.uk> writes:

> Eh? How does it imply that? If there are no Content-* headers, then
> there is no requirement for MIME-Version.

See Julien's post:

| NN for instance does not generate MIME-Version: header fields
| although "user agents MUST meet the definition of MIME conformance"
| ("a mail user agent that is MIME-conformant MUST always generate
| a "MIME-Version: 1.0" header field in any message it creates").
| I believe this sentence applies to news user agents too, otherwise
| a reference to MIME is useless.

The first quoted statement is from RFC 5536 section 2.3, with an
accompanying reference to RFC 2049.  The second quoted statement is from
RFC 2049 section 2.

Now that I think about it, though, this only places a requirement on the
user agent.  It doesn't require that the server reject the message, so I
think the original problem isn't actually that significant of a problem.
MIME-compliant agents, such as injecting agents, are allowed to accept
non-MIME messages.

> But aren't injecting agents allowed to remove a Path header that is
> received?

See point 2 in Duties of an Injecting Agent in RFC 5537, which requires a
syntax check on the Path header.

Charles Lindsey | 5 Jan 2010 12:56

Re: Syntax validation of articles by injecting agents


In <873a2lz4dk.fsf <at> windlord.stanford.edu> Russ Allbery <rra <at> stanford.edu> writes:

>"Charles Lindsey" <chl <at> clerew.man.ac.uk> writes:

>> Eh? How does it imply that? If there are no Content-* headers, then
>> there is no requirement for MIME-Version.

>See Julien's post:

>| NN for instance does not generate MIME-Version: header fields
>| although "user agents MUST meet the definition of MIME conformance"
>| ("a mail user agent that is MIME-conformant MUST always generate
>| a "MIME-Version: 1.0" header field in any message it creates").
>| I believe this sentence applies to news user agents too, otherwise
>| a reference to MIME is useless.

>The first quoted statement is from RFC 5536 section 2.3, with an
>accompanying reference to RFC 2049.  The second quoted statement is from
>RFC 2049 section 2.

But I don't think RFC 2049 ever intended to imply that MIME-Version was
needed for a message that did not actually use any of the MIME features.
As evidence of that, I can cite RFC 2047, which states:

   (4) A MIME-Version header field is NOT required to be present for
       'encoded-word's to be interpreted according to this
       specification.  One reason for this is that the mail reader is
       not expected to parse the entire message header before displaying
       lines that may contain 'encoded-word's.

Richard Clayton | 5 Jan 2010 13:33

Re: Syntax validation of articles by injecting agents


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In message <KvrvuC.F57 <at> clerew.man.ac.uk>, Charles Lindsey
<chl <at> clerew.man.ac.uk> writes

>But I don't think RFC 2049 ever intended to imply that MIME-Version was
>needed for a message that did not actually use any of the MIME features.

Hardly on topic for this list, but yes that's exactly what it means, and
what it was intended to mean.  It's no big deal for a client to generate
the header field, so it's hardly onerous!  (see this message for an
example of a compliant client)

What it does for you is to ensure that if you are corresponding with
someone whose client is MIME compatible, you learn that is so "for free"

You might note that adding "MIME-Version: 1.0" to a message that is
otherwise "not MIME" (but correctly eschews the use of "non-ASCII"
characters etc), will make no difference to its interpretation.

- -- 
richard  <at>  highwayman . com                       "Nothing seems the same
                          Still you never see the change from day to day
                                And no-one notices the customs slip away"

-----BEGIN PGP SIGNATURE-----
Version: PGPsdk version 1.7.1


Charles Lindsey | 6 Jan 2010 17:37

Re: Syntax validation of articles by injecting agents


In <7Ey9XHElGzQLFAQU <at> highwayman.com> Richard Clayton <richard <at> highwayman.com> writes:

>In message <KvrvuC.F57 <at> clerew.man.ac.uk>, Charles Lindsey
><chl <at> clerew.man.ac.uk> writes

>>But I don't think RFC 2049 ever intended to imply that MIME-Version was
>>needed for a message that did not actually use any of the MIME features.

>Hardly on topic for this list, but yes that's exactly what it means, and
>what it was intended to mean.  It's no big deal for a client to generate
>the header field, so it's hardly onerous!  (see this message for an
>example of a compliant client)

Then how do you account for the option of not including it in RFC 2047?

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131            Web: http://www.cs.man.ac.uk/~chl
Email: chl <at> clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

Julien ÉLIE | 6 Jan 2010 18:52

Re: Syntax validation of articles by injecting agents


Hi Charles,

>>>But I don't think RFC 2049 ever intended to imply that MIME-Version was
>>>needed for a message that did not actually use any of the MIME features.
>
>>Hardly on topic for this list, but yes that's exactly what it means, and
>>what it was intended to mean.  It's no big deal for a client to generate
>>the header field, so it's hardly onerous!  (see this message for an
>>example of a compliant client)
>
> Then how do you account for the option of not including it in RFC 2047?

The Section 6.1 you quote in RFC 2047 only specifies how a MIME-compliant
reader MUST parse any messages.
The fact that a MIME-Version: header field is not required to be present
for <encoded-word>s to be interpreted according to the specification
only means, to my mind, that a MIME-compliant reader should properly decode
<encoded-word>s of broken agents.

This way, a MIME-compliant agent will not write
    =?iso-8859-1?q?this=20is=20some=20text?=
to the user if the MIME-Version: header field is missing!  It will try
to decode it and properly display "this is some text".
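
As a small illustration of that behaviour, here is a sketch using Python's standard `email.header` module, which decodes encoded-words from a bare field value without looking at any MIME-Version: header field. The helper name `display_form` is made up for this example.

```python
from email.header import decode_header, make_header


def display_form(raw_value):
    """Decode any RFC 2047 encoded-words found in a header field value."""
    return str(make_header(decode_header(raw_value)))


# The field value is decoded on its own; no other header is consulted.
print(display_form("=?iso-8859-1?q?this=20is=20some=20text?="))
# prints: this is some text
```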

Another reason is given by RFC 2047:

       One reason for this is that the mail reader is
       not expected to parse the entire message header before displaying
       lines that may contain 'encoded-word's.

Charles Lindsey | 7 Jan 2010 12:30

Re: Syntax validation of articles by injecting agents


In <FD8718C15E1440E993AEFBF849934D82 <at> Iulius> Julien ÉLIE
<julien <at> trigofacile.com> writes:

>If not, see Section 4 of RFC 2045:

>   Therefore, this document defines a new header field, "MIME-Version",
>   which is to be used to declare the version of the Internet message
>   body format standard in use.

>   Messages composed in accordance with this document MUST include such
>   a header field, with the following verbatim text:

>     MIME-Version: 1.0

>   The presence of this header field is an assertion that the message
>   has been composed in compliance with this document.

Well that implies, to me, that if the message contains nothing that
purports to be "composed in compliance with this document", then there is
no obligation to include the MIME-Version.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131            Web: http://www.cs.man.ac.uk/~chl
Email: chl <at> clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

Seth | 7 Jan 2010 18:59

Re: Syntax validation of articles by injecting agents


"Charles Lindsey" <chl <at> clerew.man.ac.uk> wrote:
> In <FD8718C15E1440E993AEFBF849934D82 <at> Iulius>
> Julien ÉLIE <julien <at> trigofacile.com> writes:
> (quotes)

>>     MIME-Version: 1.0
>
>>   The presence of this header field is an assertion that the message
>>   has been composed in compliance with this document.
>
> Well that implies, to me, that if the message contains nothing that
> purports to be "composed in compliance with this document", then there is
> no obligation to include the MIME-Version.

Even if it is composed in compliance with the document, there is no
obligation to *assert* that fact.

Seth

Russ Allbery | 7 Jan 2010 20:10

Re: Syntax validation of articles by injecting agents


Seth <sethb <at> panix.com> writes:
> "Charles Lindsey" <chl <at> clerew.man.ac.uk> wrote:

>> Well that implies, to me, that if the message contains nothing that
>> purports to be "composed in compliance with this document", then there
>> is no obligation to include the MIME-Version.

> Even if it is composed in compliance with the document, there is no
> obligation to *assert* that fact.

Actually, yes, there is.  I really don't see how RFC 2049 could possibly
be any clearer.

|    A mail user agent that is MIME-conformant MUST:
| 
|     (1)   Always generate a "MIME-Version: 1.0" header field in
|           any message it creates.

I don't think there's much need to go farther afield into other MIME
documents when RFC 2049 is as clear as crystal and when we explicitly
stated in RFC 5536 that all user agents complying with RFC 5536 MUST meet
the definition of MIME conformance in RFC 2049.

(This does not, as pointed out elsewhere in the thread, imply that servers
have to reject messages generated by non-conformant user agents.)
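
For a sense of how little this asks of a client, here is a minimal sketch of a posting agent building an article with Python's standard `email.message.EmailMessage`, which adds "MIME-Version: 1.0" when the body is set. The function name, address and newsgroup are placeholders invented for the example, not anything taken from a real news reader.

```python
from email.message import EmailMessage


def build_article(body):
    """Build a simple proto-article; all field values are placeholders."""
    msg = EmailMessage()
    msg["From"] = "poster@example.invalid"
    msg["Newsgroups"] = "misc.test"
    msg["Subject"] = "MIME-Version demo"
    # set_content() labels the body (Content-Type and so on) and also
    # adds MIME-Version: 1.0 if the field is not already present.
    msg.set_content(body)
    return msg


article = build_article("Plain ASCII body.\n")
print(article["MIME-Version"])
```

The body here is pure ASCII, so the extra field changes nothing about how the article is interpreted.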

-- 
Russ Allbery (rra <at> stanford.edu)             <http://www.eyrie.org/~eagle/>


Richard Clayton | 6 Jan 2010 19:35

Re: Syntax validation of articles by injecting agents


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In message <Kvu3Ht.BG1 <at> clerew.man.ac.uk>, Charles Lindsey
<chl <at> clerew.man.ac.uk> writes
>
>In <7Ey9XHElGzQLFAQU <at> highwayman.com> Richard Clayton <richard <at> highwayman.com> 
>writes:
>
>>In message <KvrvuC.F57 <at> clerew.man.ac.uk>, Charles Lindsey
>><chl <at> clerew.man.ac.uk> writes
>
>>>But I don't think RFC 2049 ever intended to imply that MIME-Version was
>>>needed for a message that did not actually use any of the MIME features.
>
>>Hardly on topic for this list, but yes that's exactly what it means, and
>>what it was intended to mean.  It's no big deal for a client to generate
>>the header field, so it's hardly onerous!  (see this message for an
>>example of a compliant client)
>
>Then how do you account for the option of not including it in RFC 2047?

I think it's saying that you must expect MUAs to parse encoded words
without checking if the Mime-Version: header field is or is not present
(and giving permission for MUAs to be optimised that way)

- -- 
richard                                                   Richard Clayton


Harald Alvestrand | 6 Jan 2010 15:56

Re: Syntax validation of articles by injecting agents


Julien ÉLIE wrote:
>
> Hi,
>
> RFC 5537 mentions that an injecting agent MUST reject any proto-article
> that is not syntactically valid as defined by RFC 5536.
RFC 5536 says about articles:

   An article is said to be conformant to this specification if it
   conforms to the format specified in Section 3 of [RFC5322] and to the
   additional requirements of this specification.

This doesn't require MIME.
It says about agents:

   User agents MUST meet the definition of MIME conformance in [RFC2049]
   and MUST also support [RFC2231].  This level of MIME conformance
   provides support for internationalization and multimedia in message
   bodies [RFC2045], [RFC2046], and [RFC2231], and support for
   internationalization of header fields [RFC2047] and [RFC2231].  Note
   that [Errata] currently exist for [RFC2045], [RFC2046], [RFC2047] and
   [RFC2231].

Yes, that requires that user agents conforming to RFC 5536 must add 
MIME-version headers.

I think the *important* part of all that conformance is that Just-Send-8 
messages are strictly illegal, and that if you use MIME, 2047 or 2231, 
you have to use it correctly.

Julien ÉLIE | 6 Jan 2010 19:20

Re: Syntax validation of articles by injecting agents


Hi Harald,

> I think the *important* part of all that conformance is that Just-Send-8
> messages are strictly illegal, and that if you use MIME, 2047 or 2231,
> you have to use it correctly.

Yes, I agree with that.

And unfortunately, there are still posts in the French hierarchy fr.*
that either have no MIME headers or that have Content-Type: without
any MIME-Version: header field.
They contain 8-bit chars.

Examples of such readers:  Xnews, MacSOUP, lots of malconfigured Outlook
Express...

So I believe it would currently be too harmful to reject posts with illegal
8-bit chars, even though such articles are not valid.
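
To make the two cases above concrete, here is a rough Python sketch of how a server could flag such articles (for logging, say, rather than rejection). It only looks for bytes above 0x7F together with a missing MIME-Version: or Content-Type: field. The function names are invented for the example, and the check is much cruder than real MIME validation.

```python
def header_names(raw):
    """Return the lower-cased header field names of a raw article (bytes)."""
    head = raw.replace(b"\r\n", b"\n").partition(b"\n\n")[0]
    names = set()
    for line in head.split(b"\n"):
        if line[:1] in (b" ", b"\t"):
            continue  # folded continuation of the previous field
        name, sep, _ = line.partition(b":")
        if sep:
            names.add(name.strip().lower())
    return names


def has_undeclared_8bit(raw):
    """True if the article has 8-bit bytes but lacks MIME labelling."""
    has_8bit = any(byte > 0x7F for byte in raw)
    names = header_names(raw)
    labelled = b"mime-version" in names and b"content-type" in names
    return has_8bit and not labelled
```

An article with a Content-Type: field but no MIME-Version: field is flagged just like one with no MIME header fields at all, which matches the two kinds of posts described above.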

-- 
Julien ÉLIE

« Medicus dedit qui temporis morbo curam,
  Is plus remedii quam cutis sector dedit. »

