Michael Elsdörfer | 17 Mar 16:50

Re: Confused about case-sensitivity


Hi Richard,

here's an example:

http://dpaste.com/39820/

On my system, the output is:

(term, num results)
buffed  0
Buffed  1
gee     0
Gee     1
GeE     1
shock   1
evolution       0

Normally, and according to your explanation, I would expect to see
exactly one result for each query.

Also (I didn't mention this in my original post), as you can see the
fields "title" and "text" are defined exactly the same way, but appear
to behave differently. The all-lowercase query "shock" finds the
document through the "title" field, while "evolution" through the
"text" field doesn't seem to work.

Thanks for your help,

Michael

Richard Boulton | 17 Mar 18:03

Re: Confused about case-sensitivity


Michael Elsdörfer wrote:
> Hi Richard,
> 
> here's an example:
> 
> http://dpaste.com/39820/

Thanks - that was very helpful.

> Normally, and according to your explanation, I would expect to see
> exactly one result for each query.

Yes, that would be reasonable.  I've just done a quick investigation of 
what happens, and found the problem; we don't currently cope with mixed 
stemming settings correctly.

If you try setting all the field actions to use the same language, or 
all of them to use no language (so no stemming), it works as expected. 
However, when any of the fields have a stemmer, the query parser fails 
to build the search terms for those fields correctly.
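The mismatch described above can be illustrated with a toy model. This is only a sketch: it does not use Xappy or Xapian, and `toy_stem`, `build_index` and `matches` are hypothetical names made up for the illustration. It assumes the query side applies one stemming policy to every field, which is a simplification of the behaviour described, not a description of the real query parser.

```python
def toy_stem(word):
    """Crude stand-in for a real stemmer: strips a few common suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word


def build_index(doc, stemmed_fields):
    """Index each field as a set of lowercased terms.

    Fields named in stemmed_fields store stemmed terms; all other
    fields store the words unchanged.
    """
    index = {}
    for field, text in doc.items():
        words = text.lower().split()
        if field in stemmed_fields:
            words = [toy_stem(w) for w in words]
        index[field] = set(words)
    return index


def matches(index, field, query, stem_query):
    """Look up a single-word query using ONE stemming policy for all fields."""
    term = query.lower()
    if stem_query:
        term = toy_stem(term)
    return term in index[field]


doc = {"title": "buffed", "text": "buffed"}

# Mixed settings: only "text" is stemmed at index time.
mixed = build_index(doc, stemmed_fields={"text"})

# Consistent settings: both fields are stemmed at index time.
consistent = build_index(doc, stemmed_fields={"title", "text"})

for field in ("title", "text"):
    print(field,
          "mixed:", matches(mixed, field, "buffed", stem_query=True),
          "consistent:", matches(consistent, field, "buffed", stem_query=True))
```

In this toy model no single query-side policy finds the document through both fields when the index-time settings are mixed, whereas with the same setting on every field the lookup succeeds for both.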

I can see a "quick hack" solution, but I'm not certain it won't degrade 
performance elsewhere, so I'll do a few tests to check on that.  I'm 
hoping to have time in the near future to clean up the way in which the 
field settings are set, which will make this kind of conflict 
impossible, so I'm not going to spend too much effort on a short-term 
solution.

For now, I suggest you use the same stemming strategy for all free text 
fields.

[...]

Michael Elsdörfer | 18 Mar 17:38

Re: Confused about case-sensitivity


Richard,

> For now, I suggest you use the same stemming strategy for all free text
> fields.

That did the trick, and is good enough for now.

> in other words, you don't actually add the contents of the text field to
> the UnprocessedDocument anywhere!
> (Easy mistake to make - it took me a while to spot it...)

Oh - sorry about that. Too much experimenting with the search in my
actual app must have had me mixing things up.

Thank you very much for investigating this. Xappy is great, keep up
the good work.

Regards,

Michael
