Re: Bloom filters
Sanjay Ghemawat <sanjay <at> google.com>
2011-09-23 23:33:43 GMT
On Fri, Sep 23, 2011 at 3:37 PM, Dave Smith <dizzyd <at> gmail.com> wrote:
> Hi all,
> I have now seen (in production) a situation where a user is writing
> uuids into a LevelDB instance. Unfortunately, before the write can be
> done we need to do a read. This, obviously, gets very expensive as the
> dataset grows in size. I was wondering if the Google team had any
> thoughts or plans to put Bloom filters into LevelDB, or even,
> suggestions for best place to hang one?
There are two reasons I am reluctant to put bloom filter support directly
(1) It increases the footprint/complexity of the library.
(2) Many storage system wrappers will want a bloom filter key
that does not exactly match a leveldb key. Suppose for example,
there is a higher-level storage engine that creates leveldb keys that
The reads generated by such a storage engine will be range reads
(not plain gets) and for bloom filters to be effective, we would need to
have enough hooks so that leveldb can know the range that is being
read, and map down from there to the userkey to look for in bloom
This might be doable, but no simple interface springs to mind. That's
why I am hesitant to add support inside leveldb.
On the other hand, I see no easy way for the application to build