Eric Jensen | 11 Feb 2005 18:34

Re: Error from mh-index-search

OK, thanks for your help.

Below is that buffer's content (between the --- lines).

Aha!  I see from this output that it found a match in my "aliases" file,
which is obviously not a mail message, so I should put "aliases" in my
ignore list in the swish-e config...  I wonder if part of the problem is
that the indexing is catching a few files that aren't mail messages, and
that's confusing the scan formatting later?

OK, I just fixed my swish-e config file to ignore this file, re-made the
index, and re-ran the indexed search, and sure enough, that fixed it!

So it looks like part of the problem is how to handle more robustly the
case when search results are returned that aren't actually mh messages.
And correspondingly, how to modify the recommended default .swish/config
in the docs to help avoid this in the first place.  Maybe a line like:

 FileMatch filename contains ^\d+$

By the way, I'd also recommend "DefaultContents TXT*" since otherwise
swish-e uses HTML2, which takes a lot longer to do the indexing.
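
To make the filename test concrete, here is a rough sketch in Python of
what the pattern above is meant to accept and reject.  This is only an
illustration: the helper name and the sample paths are made up, and it is
not how swish-e or MH-E actually implement anything.

```python
import os
import re

# Hypothetical pattern: a name made up of ASCII digits only, which is
# what the proposed ^\d+$ rule is intended to select.
MH_MESSAGE_NAME = re.compile(r"[0-9]+")

def looks_like_mh_message(path):
    """Return True if the last path component is all digits."""
    name = os.path.basename(path)
    return MH_MESSAGE_NAME.fullmatch(name) is not None

# Made-up search hits: one real-looking message and two non-message files.
hits = [
    "/home/user/Mail/inbox/42",
    "/home/user/Mail/aliases",
    "/home/user/Mail/inbox/.mh_sequences",
]

# Keep only the hits that look like MH messages.
messages = [p for p in hits if looks_like_mh_message(p)]
```

The same check could be applied to search results before they are
formatted, so that a stray non-message hit is skipped instead of
confusing the scan step.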

Are there any cases where messages would be stored with filenames that
contain characters other than digits?

Thanks as always for the quick help!

Eric


Bill Wohler | 12 Feb 2005 17:09

Re: Error from mh-index-search

Thanks Eric, Humberto, and Satyaki. As it so happens, I'm updating the
searching chapter of the manual now and will incorporate your fixes.

-- 
Bill Wohler <wohler <at> newt.com>  http://www.newt.com/wohler/  GnuPG ID:610BD9AD
Maintainer of comp.mail.mh FAQ and MH-E. Vote Libertarian!
If you're passed on the right, you're in the wrong lane.

Satyaki Das | 12 Feb 2005 06:37

Re: Error from mh-index-search

Eric Jensen <ejensen1 <at> swarthmore.edu> writes:

> So it looks like part of the problem is how to handle more robustly the
> case when search results are returned that aren't actually mh messages.

I've checked in a fix that will at least avoid the problem that
you were seeing.

> And correspondingly, how to modify the recommended default .swish/config
> in the docs to help avoid this in the first place.  Maybe a line like:
> 
>  FileMatch filename contains ^\d+$
> 
> By the way, I'd also recommend "DefaultContents TXT*" since otherwise
> swish-e uses HTML2, which takes a lot longer to do the indexing.

I think adding these is a good idea.  So I've added these to the
sample config file as well.  It would be great if you could check
if I did it right.

> Are there any cases where messages would be stored with filenames that
> contain characters other than digits?

I can't think of any.

Thanks,
Satyaki

