9 Sep 2008 09:31
Replacing FAST functionality at sesam.no - ShingleFilter+ exact matching
Mck <mick <at> semb.wever.org>
2008-09-09 07:31:56 GMT
-- original post was on solr's user list. --
-- I've reposted here as it's centered on the ShingleFilter, which comes from lucene --

*ShortVersion*
Is there a way to make the ShingleFilter perform exact matching by inserting ^ $ begin/end markers?

*LongVersion*
At sesam.no we want to replace a FAST (fast.no) Query Matching Server with a Solr index. The index we are trying to replace is not a regular index, but one specially configured to perform phrase (and sub-phrase) matches against several large lists (like an index with only a 'title' field).

I'm not sure of a correct, or logical, name for the behaviour we are after, but it is like a combination of Shingles and exact matching.

Our test list has 9 entries:
  "abcd efgh ijkl",
  "abcd efgh",
  "efgh ijkl",
  "abcd",
  "efgh",
  "ijkl",
  "ijkl efgh",
  "efgh abcd", and
  "ijkl efgh abcd".

The query behaviour we are looking for is like this (I've included ^$ to denote the exact matching):

  Original Query    --> Filtered Query
  abcd              --> ^abcd$
  "abcd efgh"       --> (^abcd$ ^"abcd efgh"$ ^efgh$)
  "abcd efgh ijkl"  --> (^abcd$ ^"abcd efgh"$ ^"abcd efgh ijkl"$ ^efgh$ ^"efgh ijkl"$ ^ijkl$)
I see no way of configuring this behaviour though.
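
To make the desired behaviour concrete, here is a minimal Python sketch of the query expansion and the exact matching described above. It is only an illustration of the intended semantics, not Lucene or Solr code: the function names (exact_shingles, to_filtered_query, matching_entries) are hypothetical, tokens are assumed to be split on whitespace, and the ^ $ markers are just the notation used in this post, not real query syntax.

```python
def exact_shingles(query):
    """Return every contiguous sub-phrase (shingle) of the query,
    ordered by start position and then by length."""
    tokens = query.split()  # assumption: plain whitespace tokenization
    shingles = []
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            shingles.append(" ".join(tokens[start:end]))
    return shingles


def to_filtered_query(query):
    """Render the shingles in the ^...$ notation used in this post."""
    terms = []
    for shingle in exact_shingles(query):
        # Multi-word shingles are quoted, single words are left bare.
        body = '"%s"' % shingle if " " in shingle else shingle
        terms.append("^%s$" % body)
    return terms[0] if len(terms) == 1 else "(" + " ".join(terms) + ")"


def matching_entries(query, entries):
    """Exact matching: an entry matches only if it equals a whole shingle."""
    wanted = set(exact_shingles(query))
    return [entry for entry in entries if entry in wanted]


# The 9-entry test list from the post.
TEST_LIST = [
    "abcd efgh ijkl", "abcd efgh", "efgh ijkl",
    "abcd", "efgh", "ijkl",
    "ijkl efgh", "efgh abcd", "ijkl efgh abcd",
]

if __name__ == "__main__":
    for q in ("abcd", "abcd efgh", "abcd efgh ijkl"):
        print(q, "-->", to_filtered_query(q))
        print("   matches:", matching_entries(q, TEST_LIST))
```

Under these assumptions the query "abcd efgh" matches only the entries "abcd efgh", "abcd" and "efgh". Entries such as "abcd efgh ijkl" or "efgh abcd" do not match, because exact matching requires the whole entry to equal a shingle rather than merely contain one.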