Rian Hunter | 6 Jan 05:04 2012

accessing persistent from multiple threads

Hello,

Haskell newbie here, but long term c programmer. I need to access my persistent (sqlite) database from both
my yesod handlers but also a separate thread. I know that sqlite doesnt do well (no isolation) when
multiple threads are accessing the db at the same time so I had two solutions:

1. Lock some mutex before running a persistent action via "runPool"
2. Have a single thread running persistent actions on behalf of other threads.

#1 is easy to implement and works but I'd much rather have all sqlite accesses happen on one thread with #2.
The problem is passing the specific actions (instances of PersistentConfigBackend c m a) over an
inter-thread channel. I think the best way is to use Data.Dynamic, unfortunately SqlPersist isn't an
instance of Typeable2, also GGHandler isn't an instance of Typeable1.

I wanted to ask the experts. Is there a way to do this, or should I give up and use mutexes?

Rian
Michael Snoyman | 6 Jan 05:55 2012

Re: accessing persistent from multiple threads

On Fri, Jan 6, 2012 at 6:04 AM, Rian Hunter <rian@...> wrote:
> Hello,
>
> Haskell newbie here, but long term c programmer. I need to access my persistent (sqlite) database from
both my yesod handlers but also a separate thread. I know that sqlite doesnt do well (no isolation) when
multiple threads are accessing the db at the same time so I had two solutions:
>
> 1. Lock some mutex before running a persistent action via "runPool"
> 2. Have a single thread running persistent actions on behalf of other threads.
>
> #1 is easy to implement and works but I'd much rather have all sqlite accesses happen on one thread with #2.
The problem is passing the specific actions (instances of PersistentConfigBackend c m a) over an
inter-thread channel. I think the best way is to use Data.Dynamic, unfortunately SqlPersist isn't an
instance of Typeable2, also GGHandler isn't an instance of Typeable1.
>
> I wanted to ask the experts. Is there a way to do this, or should I give up and use mutexes?
>
> Rian

Actually, I don't think this is necessary at all. runPool ensures that
a single SQLite connection is being used by only a single thread. I'm
pretty certain that SQLite can handle multiple connections to a single
database.

Of course, if you're using an in-memory database, you'll end up with
essentially separate databases, but it doesn't sound like that's the
issue at hand.

Michael

(Continue reading)

Max Cantor | 6 Jan 15:23 2012
Picon

Re: accessing persistent from multiple threads

I believe Michael is right and you don't need them, but just to give you a head start on googling terms, the nearest Haskell equivalent to mutexes are MVars.  However, for a lot of concurrency problems, software transactional memory (STM) and its associated TVars can provide better performance and is deadlock proof.  MVars have pessimistic semantics, TVars are optimistic.  

Lastly, when only a single value needs to be atomically retrieved and updated, IORefs, which are nonblocking, do have an atomicModifyIORef function which will vastly outperform MVars or TVars.

-- 
Max Cantor
Sent with Sparrow

On Thursday, January 5, 2012 at 11:55 PM, Michael Snoyman wrote:

On Fri, Jan 6, 2012 at 6:04 AM, Rian Hunter <rian <at> dropbox.com> wrote:
Hello,

Haskell newbie here, but long term c programmer. I need to access my persistent (sqlite) database from both my yesod handlers but also a separate thread. I know that sqlite doesnt do well (no isolation) when multiple threads are accessing the db at the same time so I had two solutions:

1. Lock some mutex before running a persistent action via "runPool"
2. Have a single thread running persistent actions on behalf of other threads.

#1 is easy to implement and works but I'd much rather have all sqlite accesses happen on one thread with #2. The problem is passing the specific actions (instances of PersistentConfigBackend c m a) over an inter-thread channel. I think the best way is to use Data.Dynamic, unfortunately SqlPersist isn't an instance of Typeable2, also GGHandler isn't an instance of Typeable1.

I wanted to ask the experts. Is there a way to do this, or should I give up and use mutexes?

Rian

Actually, I don't think this is necessary at all. runPool ensures that
a single SQLite connection is being used by only a single thread. I'm
pretty certain that SQLite can handle multiple connections to a single
database.

Of course, if you're using an in-memory database, you'll end up with
essentially separate databases, but it doesn't sound like that's the
issue at hand.

Michael

Darrin Thompson | 6 Jan 17:47 2012
Picon

Re: accessing persistent from multiple threads

On Thu, Jan 5, 2012 at 11:55 PM, Michael Snoyman <michael <at> snoyman.com> wrote:
Actually, I don't think this is necessary at all. runPool ensures that
a single SQLite connection is being used by only a single thread. I'm
pretty certain that SQLite can handle multiple connections to a single
database.


Some hard experience from Rails land, which I think is applicable here...

1. Don't use SQLite in production for an online system with more than one user. Two users is one too many. Zero might work but I'd do some load testing.

2. "SQLite can handle can handle multiple connections" is strictly true but that "can handle" might include your app response time slowing down to a crawl over a surprisingly small number of concurrent users. For us it was in the low tens. And that's users, not even true active concurrent connections. Your data is ok but your users and stakeholders are very frustrated.

3. When you upgrade to a more capable db under pressure you will introduce several haste bugs due to your SQLite query assumptions not holding in the new environment. I think ours was mostly due to schema management/migration, IIRC. See 1.

4. "But I'm using persistent," you might say, "I can upgrade later." I was using Rails schemas and Rails ActiveRecord. Didn't help me. Data was damaged.

5. If runPool can be convinced to force use of exactly one connection then that's the right answer. But see 1.

--
Darrin

Rian Hunter | 6 Jan 18:37 2012

Re: accessing persistent from multiple threads


On Jan 5, 2012, at 8:55 PM, Michael Snoyman wrote:

Actually, I don't think this is necessary at all. runPool ensures that
a single SQLite connection is being used by only a single thread. I'm
pretty certain that SQLite can handle multiple connections to a single
database.

you're right it isn't strictly necessary but what i don't want is for one connection  to see an intermediate state during a transaction while another thread is modifying the database row by row. it turns out that even though sqlite allows concurrent readers & writers, the default isolation level is actually SERIALIZABLE (http://www.sqlite.org/pragma.html#pragma_read_uncommitted) which is exactly what i want. i may get spurious SQLITE_BUSY exceptions if i have two concurrent writers but that's not a big deal. i'll keep things as they are for now.

still... if you had to solve a problem like this, for instance on an in-memory database. is there a cleaner way to do it than using Data.Dynamic? :)

rian
Michael Snoyman | 7 Jan 22:35 2012

Re: accessing persistent from multiple threads

On Fri, Jan 6, 2012 at 7:37 PM, Rian Hunter <rian@...> wrote:
>
> On Jan 5, 2012, at 8:55 PM, Michael Snoyman wrote:
>
> Actually, I don't think this is necessary at all. runPool ensures that
>
> a single SQLite connection is being used by only a single thread. I'm
>
> pretty certain that SQLite can handle multiple connections to a single
>
> database.
>
>
> you're right it isn't strictly necessary but what i don't want is for one
> connection  to see an intermediate state during a transaction while another
> thread is modifying the database row by row. it turns out that even though
> sqlite allows concurrent readers & writers, the default isolation level is
> actually SERIALIZABLE
> (http://www.sqlite.org/pragma.html#pragma_read_uncommitted) which is exactly
> what i want. i may get spurious SQLITE_BUSY exceptions if i have two
> concurrent writers but that's not a big deal. i'll keep things as they are
> for now.
>
> still... if you had to solve a problem like this, for instance on an
> in-memory database. is there a cleaner way to do it than using Data.Dynamic?
> :)
>
> rian

Just to be clear: are you talking about some kind of an API where you
would register that an action should occur, and then you get back a
future (i.e., promise) of a result?

I was about to give some vague description of how to do this, but then
I realized that it was far too much fun to actually write all the
code. It hasn't been tested outside of compiling it, but it should be
correct.

https://gist.github.com/1576120

Michael

Rian Hunter | 7 Jan 23:04 2012

Re: accessing persistent from multiple threads

Haha this is great. The trick here is you concatenated the action that writes the mvar with the input action
to keep the chan of type (). I was instead passing the reply mvar over the channel and expecting the worker
thread to write to it explicitly. Thanks for this Michael!

On Jan 7, 2012, at 1:35 PM, Michael Snoyman <michael@...> wrote:

> On Fri, Jan 6, 2012 at 7:37 PM, Rian Hunter <rian@...> wrote:
>> 
>> On Jan 5, 2012, at 8:55 PM, Michael Snoyman wrote:
>> 
>> Actually, I don't think this is necessary at all. runPool ensures that
>> 
>> a single SQLite connection is being used by only a single thread. I'm
>> 
>> pretty certain that SQLite can handle multiple connections to a single
>> 
>> database.
>> 
>> 
>> you're right it isn't strictly necessary but what i don't want is for one
>> connection  to see an intermediate state during a transaction while another
>> thread is modifying the database row by row. it turns out that even though
>> sqlite allows concurrent readers & writers, the default isolation level is
>> actually SERIALIZABLE
>> (http://www.sqlite.org/pragma.html#pragma_read_uncommitted) which is exactly
>> what i want. i may get spurious SQLITE_BUSY exceptions if i have two
>> concurrent writers but that's not a big deal. i'll keep things as they are
>> for now.
>> 
>> still... if you had to solve a problem like this, for instance on an
>> in-memory database. is there a cleaner way to do it than using Data.Dynamic?
>> :)
>> 
>> rian
> 
> Just to be clear: are you talking about some kind of an API where you
> would register that an action should occur, and then you get back a
> future (i.e., promise) of a result?
> 
> I was about to give some vague description of how to do this, but then
> I realized that it was far too much fun to actually write all the
> code. It hasn't been tested outside of compiling it, but it should be
> correct.
> 
> https://gist.github.com/1576120
> 
> Michael

Michael Snoyman | 8 Jan 05:54 2012

Re: accessing persistent from multiple threads

One thing my code didn't account for was exceptions. You most likely
want to catch any exception inside fullAction and then return it to
the caller, either by having something like `Either SomeException a`
or just by rethrowing it.

On Sun, Jan 8, 2012 at 12:04 AM, Rian Hunter <rian@...> wrote:
> Haha this is great. The trick here is you concatenated the action that writes the mvar with the input action
to keep the chan of type (). I was instead passing the reply mvar over the channel and expecting the worker
thread to write to it explicitly. Thanks for this Michael!
>
> On Jan 7, 2012, at 1:35 PM, Michael Snoyman <michael@...> wrote:
>
>> On Fri, Jan 6, 2012 at 7:37 PM, Rian Hunter <rian@...> wrote:
>>>
>>> On Jan 5, 2012, at 8:55 PM, Michael Snoyman wrote:
>>>
>>> Actually, I don't think this is necessary at all. runPool ensures that
>>>
>>> a single SQLite connection is being used by only a single thread. I'm
>>>
>>> pretty certain that SQLite can handle multiple connections to a single
>>>
>>> database.
>>>
>>>
>>> you're right it isn't strictly necessary but what i don't want is for one
>>> connection  to see an intermediate state during a transaction while another
>>> thread is modifying the database row by row. it turns out that even though
>>> sqlite allows concurrent readers & writers, the default isolation level is
>>> actually SERIALIZABLE
>>> (http://www.sqlite.org/pragma.html#pragma_read_uncommitted) which is exactly
>>> what i want. i may get spurious SQLITE_BUSY exceptions if i have two
>>> concurrent writers but that's not a big deal. i'll keep things as they are
>>> for now.
>>>
>>> still... if you had to solve a problem like this, for instance on an
>>> in-memory database. is there a cleaner way to do it than using Data.Dynamic?
>>> :)
>>>
>>> rian
>>
>> Just to be clear: are you talking about some kind of an API where you
>> would register that an action should occur, and then you get back a
>> future (i.e., promise) of a result?
>>
>> I was about to give some vague description of how to do this, but then
>> I realized that it was far too much fun to actually write all the
>> code. It hasn't been tested outside of compiling it, but it should be
>> correct.
>>
>> https://gist.github.com/1576120
>>
>> Michael

Felipe Almeida Lessa | 8 Jan 06:26 2012
Picon

Re: accessing persistent from multiple threads

On Sun, Jan 8, 2012 at 2:54 AM, Michael Snoyman <michael@...> wrote:
> One thing my code didn't account for was exceptions. You most likely
> want to catch any exception inside fullAction and then return it to
> the caller, either by having something like `Either SomeException a`
> or just by rethrowing it.

Note also that there are some comments on the gist =).

--

-- 
Felipe.


Gmane