pmaydell | 10 Dec 2004 08:42

Re: Braindump: Extended MH Format

Chad Walstrom wrote:
>I posted this as my ~/.plan file on my website... my crappy
>web-log-ish-thing.  I highly doubt anything I have to say is new, but it
>helped me form my opinion about Maildir, that it's not really worth the
>attention it's getting.

While I'm not necessarily a fan of maildir, it does have some nice
locking semantics (and locking that works over NFS is a Hard Problem).

>    If the only compelling reason to switch to Maildir from MH is the
>    file locking semantics, why not fix MH? Rather than storing index
>    data as the file names themselves, why not leave it up to the email
>    client or IMAP server to store sequences in meta-data files? Perhaps
>    as an additional field in .mh_context or a separate file.

If you're going to change format, it might be nice to have something
with a separate overview index. I played around with adding threading
support to nmh a while ago but the trouble is that you wind up having
to read all the messages concerned to build up the thread tree before
you can start doing things, so "scan -threaded last:100" feels far
too slow... (GNUS already has a format like this, which it calls nnml,
which is nmh with a .overview file containing some headers from each
message. I haven't actually looked at the implementation, though.)
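
As a rough illustration of the overview idea (not GNUS's actual nnml code, which was not examined here), the following Python sketch caches a few headers per message so that a threaded scan could work from one small index instead of opening every message. The names read_headers, build_overview and FIELDS are hypothetical, and the tab-separated layout is an assumption, not a documented format.

```python
import email
import os

# Headers a threading pass would typically need (an assumed selection).
FIELDS = ("Subject", "From", "Date", "Message-ID", "References")


def read_headers(path):
    """Parse only the header block of a message file, skipping the body."""
    chunks = []
    with open(path, "rb") as fh:
        for line in fh:
            if line in (b"\n", b"\r\n"):  # blank line ends the headers
                break
            chunks.append(line)
    return email.message_from_bytes(b"".join(chunks))


def build_overview(folder):
    """Return one tab-separated overview line per numbered MH message."""
    names = sorted((n for n in os.listdir(folder) if n.isdigit()), key=int)
    lines = []
    for name in names:
        msg = read_headers(os.path.join(folder, name))
        # Collapse folded headers and stray tabs so each field stays on one line.
        values = [" ".join(str(msg.get(f, "")).split()) for f in FIELDS]
        lines.append("\t".join([name] + values))
    return lines
```

Writing the lines to a file in the folder, and keeping that file up to date as messages arrive or are renumbered, is deliberately left out of the sketch.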

>Any merit to this idea?  I understand it would change the way sequences
>would need to be handled, but we could hide that in a library call.  The
>command-line utilities don't need to change the way they reference an
>email.

It makes it harder to do things like 'grep foo ~/Mail/inbox/3???', of

Chad Walstrom | 10 Dec 2004 21:24

Re: Braindump: Extended MH Format

pmaydell <at> chiark.greenend.org.uk wrote:
> While I'm not necessarily a fan of maildir, it does have some nice
> locking semantics (and locking that works over NFS is a Hard Problem).

*nod*

> If you're going to change format, it might be nice to have something
> with a separate overview index.

Perhaps that's part of the key solution.  This type of meta-data is more
about helping the client be quicker at finding things.  I understand the
benefits of having standards so that all clients are on a level playing
field.  In any case, standardizing a way to create custom indexes that
don't rely upon the file name might be a useful step forward.
Obviously, you don't want to REQUIRE clients to update the file if it
works out better to have their own indexing mechanisms.  Sylpheed
doesn't rely upon .mh_sequences, GNUS' .overview files, or .xmhcache
files.

Note: my bad with regards to the non-existent .mh_context.  I meant to
refer to .mh_sequences. (I fixed this on my .plan file.)

If you re-sort the folder, the .mh_sequences file currently needs to be
updated anyway to reflect the change, right?  Well, rather than using
the index number to indicate members of a sequence, perhaps include the
actual file names and let the command line utilities enumerate the list.
The paradigm shifts from referencing the email name as the index itself
to referencing its enumerated index of the sequence.  We'd have to give
mhpath a switch for specifying which sequence to use:


pmaydell | 11 Dec 2004 06:02

Re: Braindump: Extended MH Format

Chad Walstrom wrote:
>Obviously, you don't want to REQUIRE clients to update the file if it
>works out better to have their own indexing mechanisms.  Sylpheed
>doesn't rely upon .mh_sequences, GNUS' .overview files, or .xmhcache
>files.

Um. If you don't require everybody to keep the index in step then there's
no point, because an index-aware client can't rely on the index being
in sync with the actual data.

Snip various. I'm afraid I've lost track of what the actual problem
you're trying to solve is; can you reexplain, please?

>> (GNUS already has a format like this, which it calls nnml, which is
>> nmh with a .overview file containing some headers from each message. I
>> haven't actually looked at the implementation, though.)
>
>Might be a good place to start.
>
>> It makes it harder to do things like 'grep foo ~/Mail/inbox/3???', of
>> course...

>    $ sed -ne '3000,3999p' ~/Mail/inbox/sequence.all | xargs -r grep foo

Er, that's harder :-)

>Again, my bad since we're probably referring to .mh_sequences rather than
>.mh_context.  Besides, how many IMAP servers do you know of that
>currently care about .mh_sequences?


Chad Walstrom | 11 Dec 2004 08:00

Re: Braindump: Extended MH Format

pmaydell <at> chiark.greenend.org.uk wrote:
> Snip various. I'm afraid I've lost track of what the actual problem
> you're trying to solve is; can you reexplain, please?

I did forewarn you with the subject titled, "Braindump".  In any case,
the ideas that spewed out on to this list originated from trying to
incorporate some safer locking mechanisms for MH mail and preventing
IMAP servers (or clients) from losing track of the emails in the
filesystem because of a folder re-sort, regardless of whether they
participate in using common meta-data files.

OK, just to bring RFC 2060 [2.3.1.1. Unique Identifier (UID) Message
Attribute] into the discussion:

    Note: Unique identifiers MUST be strictly ascending in the mailbox
    at all times.  If the physical message store is re-ordered by a
    non-IMAP agent, this requires that the unique identifiers in the
    mailbox be regenerated,...

shell$ sortm -textfield subject -limit 4 +inbox 

    ...since the former unique identifiers are no longer strictly
    ascending as a result of the re-ordering.  Another instance in which
    unique identifiers are regenerated is if the message store has no
    mechanism to store unique identifiers.  Although this specification
    recognizes that this may be unavoidable in certain server
    environments, it STRONGLY ENCOURAGES message store implementation
    techniques that avoid this problem.
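
To make the quoted requirement concrete, here is a minimal Python sketch of the check a server would have to make after a re-sort. It assumes each message has some stable key that survives renumbering (a Message-ID, say); that key, and the names strictly_ascending and uids_after_resort, are hypothetical and not part of nmh or any IMAP server.

```python
def strictly_ascending(uids):
    """True if every UID is greater than the one before it."""
    return all(a < b for a, b in zip(uids, uids[1:]))


def uids_after_resort(folder_order, uid_of, next_uid):
    """Decide whether UIDs survive a re-sort of the message store.

    folder_order: stable message keys in their new on-disk order
                  (assumed to exist; MH file names alone would not do).
    uid_of:       key -> UID assigned before the re-sort.
    next_uid:     first UID value not yet used in this mailbox.

    Returns (mapping, regenerated).
    """
    current = [uid_of[key] for key in folder_order]
    if strictly_ascending(current):
        # The order still agrees with the UIDs, so nothing changes.
        return dict(uid_of), False
    # The order no longer agrees: hand out fresh, ascending UIDs.
    fresh = {key: next_uid + i for i, key in enumerate(folder_order)}
    return fresh, True
```

With UIDs 1, 2, 3 on messages a, b, c, re-sorting to c, a, b and passing next_uid=4 gives c=4, a=5, b=6 and reports that the UIDs were regenerated. A real server would also have to increase the mailbox's UIDVALIDITY value so that clients discard their cached UIDs.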

Basically, the IMAP server needs to detect new emails and assign them

Chad Walstrom | 10 Dec 2004 21:49

Re: Braindump: Extended MH Format

Chad Walstrom wrote:
> If we're no longer using filename for sort order, the server and
> client won't loose track of internal tracking indexes for seen and
> unseen might be building (not using .mh_sequences).

Wow, how's that for editing discontinuity?!  It should read, "...the
server and client won't lose track of seen and unseen messages as it
might be caching internally (not using .mh_sequences)"

-- 
Chad Walstrom <chewie <at> wookimus.net>           http://www.wookimus.net/
           assert(expired(knowledge)); /* core dump */
_______________________________________________
Nmh-workers mailing list
Nmh-workers <at> nongnu.org
http://lists.nongnu.org/mailman/listinfo/nmh-workers
Mike O'Dell | 10 Dec 2004 21:37

Re: Braindump: Extended MH Format


there is a school of thought that says if you have a *fast* full-text
database with good wild-carding, you just dump the message into
the full-text database and then what we call sequences and folders
become "views" of the database.  in a system like Plan 9 that
could even be elegant.

	-mo
