Nilesh Bansal | 1 Apr 2007 03:30

Re: Lock files in a read-only application

Thanks for your response. Is there a way that I can disable these read
locks without upgrading to 2.1? Our application uses its own custom
locking mechanism, so Lucene's locking is actually redundant. We
are currently using Lucene version 2.0.

The application has multiple threads (different web requests) reading
the same index simultaneously (say 20 concurrent threads). Could that be
the cause of this problem? Sometimes the lock files remain there for
long periods of time (more than a few minutes, which is bad).

Yes, the JVM sometimes crashes when it runs out of memory. There should be
some way to remove the lock files after such a crash (any fixes in
2.1?).

thanks
Nilesh

On 3/31/07, Michael McCandless <lucene <at> mikemccandless.com> wrote:
> "Nilesh Bansal" <nileshbansal <at> gmail.com> wrote:
> > We have a web-based application that searches a large Lucene index.
> > This application only creates objects of type IndexSearcher (and
> > no IndexWriters) for searching the index. After the application runs
> > for some time (a few hours), I can see lock files in the temp
> > directory of the form
> > /opt/tomcat/temp/lucene-5f77ffdc821b3f8e861949e9ecc35a53-commit.lock
> > The temp dir is set to /opt/tomcat/temp/ as we are using tomcat.
> >
> > Since the application is read-only, there is no point in it using the
> > lock files. These lock files are creating a lot of trouble for me, as
> > their presence leads to a lock obtain timeout in other threads. It
> > [...]

Chris Hostetter | 1 Apr 2007 03:51

Re: Lock files in a read-only application


: locks without upgrading to 2.1? Our application uses its own custom
: locking mechanism, so Lucene's locking is actually redundant. We
: are currently using Lucene version 2.0.

since before the 2.0.0 release there has been a static
FSDirectory.setDisableLocks that can be called before opening any indexes
to prevent locking -- it's only intended to be used on indexes on
read-only disk -- which is not the case in your situation, since a separate
process is in fact modifying the index, but if you are confident in your
own locking mechanism you can use it.

: The application has multiple threads (different web requests) reading
: the same index simultaneously (say 20 concurrent threads). Could that be
: the cause of this problem? Sometimes the lock files remain there for
: long periods of time (more than a few minutes, which is bad).

multiple reader threads should not cause the commit lock to stay around
that long, even if each thread is opening its own IndexReader (which they
should not do; it's better to open one and reuse it among many threads).
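
The advice to open one reader and reuse it among many threads can be sketched as below. This is a minimal, Lucene-free illustration, not code from the thread: `SharedHolder` is a hypothetical helper, and its `Supplier` stands in for whatever call opens the IndexSearcher or IndexReader in the real application. It assumes the opened object is safe to share across request threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical holder: opens one expensive resource lazily and shares it. */
final class SharedHolder<T> {
    private final Supplier<T> opener; // stands in for opening a searcher/reader
    private volatile T instance;

    SharedHolder(Supplier<T> opener) {
        this.opener = opener;
    }

    /** Returns the single shared instance, opening it on first use only. */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = opener.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}

class SharedHolderDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger opens = new AtomicInteger();
        SharedHolder<Object> holder = new SharedHolder<>(() -> {
            opens.incrementAndGet(); // count how often the resource is opened
            return new Object();
        });

        // Simulate 20 concurrent web requests asking for the searcher.
        ExecutorService pool = Executors.newFixedThreadPool(20);
        Runnable request = () -> holder.get();
        for (int i = 0; i < 20; i++) {
            pool.execute(request);
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("times opened: " + opens.get());
    }
}
```

With this shape, each request calls `holder.get()` instead of constructing its own searcher, so the index is opened once per application rather than once per request.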

: Yes, the JVM sometimes crashes when it runs out of memory. There should be
: some way to remove the lock files after such a crash (any fixes in
: 2.1?).

As Michael said, in 2.1 the commit lock doesn't even exist, and in general
there is a much more robust lock management system that lets you decide
what type of lock mechanism to use.

in 2.0.0 your only option for dealing with stale locks is to forcibly
[...]

Michael McCandless | 1 Apr 2007 10:38

Re: Lock files in a read-only application

"Chris Hostetter" <hossman_lucene <at> fucit.org> wrote:
>
> : locks without upgrading to 2.1? Our application uses its own custom
> : locking mechanism, so Lucene's locking is actually redundant. We
> : are currently using Lucene version 2.0.
> 
> since before the 2.0.0 release there has been a static
> FSDirectory.setDisableLocks that can be called before opening any indexes
> to prevent locking -- it's only intended to be used on indexes on
> read-only disk -- which is not the case in your situation, since a separate
> process is in fact modifying the index, but if you are confident in your
> own locking mechanism you can use it.

You need to be really certain your own locking protects Lucene
properly.  Specifically, no IndexReader can be created (restarted)
while a writer is open against the index, and, only one writer can be
open on the index at once (it sounds like you already have that).  If
you're sure about that then disabling the locks as Hoss describes
above is OK.
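
The two rules above (no reader is opened while a writer is open, and only one writer at a time) could be enforced inside a single JVM roughly as follows. This is only a sketch under stated assumptions: `IndexAccessGuard` and its method names are hypothetical and are not Lucene APIs, the `Supplier` and `Runnable` arguments stand in for the real open and update calls, and a lock like this does nothing for a writer running in a separate process, which would need a cross-process mechanism instead.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical application-level guard; none of these names are Lucene APIs. */
final class IndexAccessGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /**
     * Runs the reader-opening step only while no writer session is active.
     * Several threads may open readers at the same time.
     */
    <R> R openReader(Supplier<R> openStep) {
        lock.readLock().lock();
        try {
            return openStep.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    /**
     * Runs a whole writer session (open, update, close) exclusively:
     * no second writer and no reader open can overlap with it.
     */
    void withWriter(Runnable writerSession) {
        lock.writeLock().lock();
        try {
            writerSession.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** True while some thread is inside withWriter. */
    boolean writerActive() {
        return lock.isWriteLocked();
    }
}

class IndexAccessGuardDemo {
    public static void main(String[] args) {
        IndexAccessGuard guard = new IndexAccessGuard();
        guard.withWriter(() -> System.out.println("writer active: " + guard.writerActive()));
        String reader = guard.openReader(() -> "reader opened");
        System.out.println(reader);
    }
}
```

Only the open step takes the read lock; a reader that is already open can keep searching while a later writer session runs.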

> : The application has multiple threads (different web requests) reading
> : the same index simultaneously (say 20 concurrent threads). Could that be
> : the cause of this problem? Sometimes the lock files remain there for
> : long periods of time (more than a few minutes, which is bad).
> 
> multiple reader threads should not cause the commit lock to stay around
> that long, even if each thread is opening its own IndexReader (which they
> should not do; it's better to open one and reuse it among many threads).

This part (commit lock staying around for so long) is definitely odd
[...]

Nilesh Bansal | 2 Apr 2007 08:11

Re: Lock files in a read-only application

Thanks for your replies. I have two more questions.

> You need to be really certain your own locking protects Lucene
> properly.  Specifically, no IndexReader can be created (restarted)
> while a writer is open against the index, and, only one writer can be
> open on the index at once (it sounds like you already have that).  If
> you're sure about that then disabling the locks as Hoss describes
> above is OK.

1. If our locking fails, what happens in the worst case, i.e., when an
IndexSearcher tries to read while an IndexWriter is updating the
index? Can it lead to index corruption, or will the searcher just
give garbage results (or fail with an exception) for that query?

2. Currently we are not using any IndexReader. When a request arrives,
we create a new IndexSearcher, and destroy it when it finishes
searching. Is it more efficient to create just one IndexSearcher and
share it with all threads? Or should we create one IndexReader and use
it to create all the IndexSearchers?

thanks again,
Nilesh

-- 
Nilesh Bansal.
http://queens.db.toronto.edu/~nilesh/

Michael McCandless | 2 Apr 2007 10:42

Re: Lock files in a read-only application

"Nilesh Bansal" <nileshbansal <at> gmail.com> wrote:
> Thanks for your replies. I have two more questions.
> > You need to be really certain your own locking protects Lucene
> > properly.  Specifically, no IndexReader can be created (restarted)
> > while a writer is open against the index, and, only one writer can be
> > open on the index at once (it sounds like you already have that).  If
> > you're sure about that then disabling the locks as Hoss describes
> > above is OK.
> 1. If our locking fails, what happens in the worst case, i.e., when an
> IndexSearcher tries to read while an IndexWriter is updating the
> index? Can it lead to index corruption, or will the searcher just
> give garbage results (or fail with an exception) for that query?

As long as you only have one writer and the only risk is that
IndexSearcher is opening without proper locking, I believe the worst
that will happen is actually quite benign and quickly detected: the
IndexReader (IndexSearcher just creates IndexReader under the hood)
will hit some sort of IOException while instantiating.  If it
instantiates without an IOException it should then do searches just
fine, correctly.  I *think* this is the case but I honestly haven't
explored this area too heavily so this is somewhat speculation :)
But I'm quite sure your index will not be corrupted, i.e., the worst
that happens is the IndexReader has problems.

And, this is assuming your index is not on NFS, which has its own
challenges (even beyond locking).
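
If a reader does hit an IOException while instantiating, as described above, one plausible response is to pause briefly and try the open again. The thread does not prescribe this; it is a sketch under the assumption that the failure is transient (the writer finishes its update shortly afterwards). `RetryingOpener` and its `Opener` interface are hypothetical names, and `Opener.open()` stands in for whatever call creates the IndexSearcher.

```java
import java.io.IOException;

/** Hypothetical helper that retries an open step when it throws IOException. */
final class RetryingOpener {

    /** Stand-in for an open call that may fail with IOException. */
    interface Opener<T> {
        T open() throws IOException;
    }

    /**
     * Calls opener.open() up to maxAttempts times, sleeping pauseMillis
     * between failed attempts. Rethrows the last IOException if every
     * attempt fails.
     */
    static <T> T openWithRetry(Opener<T> opener, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return opener.open();
            } catch (IOException e) {
                last = e; // remember the failure and maybe try again
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }
}

class RetryingOpenerDemo {
    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String reader = RetryingOpener.openWithRetry(() -> {
            // Fail twice to imitate an index that is mid-update.
            if (++calls[0] < 3) {
                throw new IOException("simulated failure while opening");
            }
            return "reader";
        }, 5, 10L);
        System.out.println(reader + " opened after " + calls[0] + " attempts");
    }
}
```

A bounded retry like this only helps with transient failures; if the open keeps failing, the last exception is surfaced to the caller rather than hidden.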

> 2. Currently we are not using any IndexReader. When a request arrives,
> we create a new IndexSearcher, and destroy it when it finishes
> searching. Is it more efficient to create just one IndexSearcher and
> [...]

