Doug Cutting | 5 Aug 2002 22:23

Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/search QueryFilter.java

Scott Ganyo wrote:
> My thought was that the Filter.bits() method on Hits would only resolve the
> BitSet if it was asked for (and probably wouldn't even cache it), so in the
> common case Hits wouldn't suffer any ill effect.  Would that work?  (I feel
> like I'm missing something obvious...)

One could do this, but I'm not sure what the advantage would be.

In your original message on this topic, you wrote:

Scott Ganyo wrote:
 > But instead of adding a new class, why not change Hits to
 > inherit from Filter and add the bits() method to it?
 > Then one could "pipe" the output of one Query into another
 > search without modifying the Queries...

If that's the goal, then a bits() method is not a great way to achieve
it: a bit vector records only which documents matched, so the ranking
from the first search is ignored when ranking the second.  Since that
is a material difference, I prefer to make it explicit.
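
To illustrate, here is a minimal sketch in plain Java.  It does not use
Lucene's classes: the maps of document id to score, the class name and
the helper methods are all hypothetical stand-ins, chosen only to show
what gets lost when a ranked result set is reduced to a BitSet.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are Lucene API.
final class BitsDropRanking {

    // Stand-in for the ranked hits of a first search: doc id -> score.
    public static Map<Integer, Float> firstSearch() {
        Map<Integer, Float> hits = new LinkedHashMap<>();
        hits.put(7, 0.9f);   // best match for the first query
        hits.put(2, 0.4f);
        hits.put(5, 0.1f);   // weakest match for the first query
        return hits;
    }

    // Stand-in for the ranked hits of a second search over the whole index.
    public static Map<Integer, Float> secondSearch() {
        Map<Integer, Float> hits = new LinkedHashMap<>();
        hits.put(5, 0.8f);
        hits.put(9, 0.7f);
        hits.put(7, 0.3f);
        return hits;
    }

    // Reduce ranked hits to a bit vector: membership survives, scores do not.
    public static BitSet toBits(Map<Integer, Float> hits, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (Integer doc : hits.keySet()) {
            bits.set(doc);
        }
        return bits;
    }

    // Keep only the hits whose document id is set in the bit vector.
    public static Map<Integer, Float> filter(Map<Integer, Float> hits, BitSet allowed) {
        Map<Integer, Float> out = new LinkedHashMap<>();
        for (Map.Entry<Integer, Float> e : hits.entrySet()) {
            if (allowed.get(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet bits = toBits(firstSearch(), 10);
        // The result is ordered by the second search's scores alone.
        System.out.println(filter(secondSearch(), bits));
    }
}
```

In this sketch document 7 was the best hit for the first query, yet it
ends up below document 5, because only the second query's scores are
left once the first result set has passed through the bit vector.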

Filters are not designed for searching within an arbitrary result set.
For that you really should take the ranking of the first query into
account: a new query should be formed by adding clauses to the original
query.  Filters are instead designed to search subsets of an index
defined by boolean criteria, criteria that do not affect ranking, like
date, language, postal code, document type, etc.  They are particularly
useful when the same criterion is used repeatedly and the bit vector
can be cached, as the construction and storage of a new bit vector per
query is expensive.  Thus the canonical uses of a filter should be to
implement things like "modified in the last week", or "written in English"