Matthew Scott | 15 Sep 2009 06:28

server-less multi-process concurrency for durus

David (and anyone else interested),

I'd like you to nitpick this idea I have to add server-less,
multi-process concurrency to Durus.

Goal:  Allow more than one process and/or thread on a single machine  
to reliably access and write to a Durus file, without having to  
maintain a separate process for a Durus server, and without having to  
install more Python packages.

I'll present several actions that a process or thread might take, and  
how Durus would behave to support those actions.

Please let me know if you think this will work, any details or quirks  
you can think of that might get in the way, etc.  In return I will put  
some time into this to see if I can produce a working patch.

== Lock file ==
Kept as a separate file, "mydatabase.durus.lock", which is locked only
during write operations.
This is to allow all processes to at least read up to a "known good"
EOF marker while another process is writing a transaction.
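A minimal sketch of the lock-file idea, assuming POSIX advisory locks from the standard-library fcntl module (so no extra packages, but not portable to Windows). The helper names write_lock and append_transaction are hypothetical and are not part of Durus; the raw byte payload stands in for a real transaction record.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def write_lock(db_path):
    """Hold an exclusive lock on '<db_path>.lock' for the duration of a write."""
    # The lock lives in a separate file, so readers never have to touch it.
    lock_file = open(db_path + ".lock", "a")
    try:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks while another writer holds it
        yield
    finally:
        fcntl.flock(lock_file, fcntl.LOCK_UN)
        lock_file.close()


def append_transaction(db_path, payload):
    """Append bytes under the write lock and return the new end-of-file offset."""
    with write_lock(db_path):
        with open(db_path, "ab") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk before unlocking
            return f.tell()
```

Readers in this sketch never take the lock; they only need to stop at an offset they already know to be complete.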

== Packing ==
I haven't thought through this part as of yet, but I'm not worried  
about it at the moment.

== Initial opening of a file ==
1. seek(SEEK_END), tell(), keep as current EOF offset
2. read initial state from file, create in-memory state
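The two steps above can be sketched roughly as follows. In Python the seek call is spelled seek(0, os.SEEK_END). The function names open_snapshot and read_up_to are hypothetical, and the sketch reads raw bytes rather than real Durus records.

```python
import os


def open_snapshot(path):
    """Open the file and record the end-of-file offset seen at open time."""
    f = open(path, "rb")
    f.seek(0, os.SEEK_END)  # step 1: jump to the current end of the file
    known_eof = f.tell()    # remember it as the "known good" EOF offset
    f.seek(0)               # step 2 would read the initial state from the start
    return f, known_eof


def read_up_to(f, known_eof, size):
    """Read at most `size` bytes without crossing the recorded EOF offset."""
    remaining = known_eof - f.tell()
    return f.read(max(0, min(size, remaining)))
```

Anything another process appends after open_snapshot returns stays invisible until the caller chooses to refresh known_eof.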

Binger David | 15 Sep 2009 12:28

Re: server-less multi-process concurrency for durus

Hi Matt,

I'm glad to hear from you.

It seems like we've looked down this road before, and never decided
to take it, but maybe it will look different this time.  One thing
that seems to make it more attractive now than before is the on-disk
index provided by ShelfStorage.  That means that the in-memory index
is not much of a burden.

On Sep 15, 2009, at 12:28 AM, Matthew Scott wrote:

> David (and anyone else interested),
>
> I'd like you to nitpick this idea I have to add server-less,
> multi-process concurrency to Durus.
>
> Goal:  Allow more than one process and/or thread on a single machine  
> to reliably access and write to a Durus file, without having to  
> maintain a separate process for a Durus server, and without having  
> to install more Python packages.

The net improvement, then, is that you don't need to maintain a
separate process for the durus server.  Maintaining the process seems
like no inconvenience at all:  am I right that having to *start* the
separate process is the requirement that we would like to eliminate?

Matthew Scott | 15 Sep 2009 18:02

Re: server-less multi-process concurrency for durus

_______________________________________________
Durus-users mailing list
Durus-users <at> mems-exchange.org
http://mail.mems-exchange.org/mailman/listinfo/durus-users
Binger David | 15 Sep 2009 18:58

Re: server-less multi-process concurrency for durus


On Sep 15, 2009, at 12:02 PM, Matthew Scott wrote:

>
> This could pose a "handoff" problem, where process A (an
> application) starts the Durus server, process B (a Python shell
> inspecting some things under the hood) connects to the server
> started by process A, then A shuts down, taking the Durus server
> with it, and process B is left hanging without Durus.

I think this could be addressed by forking and having the parent run  
the server so that the server still has
a live parent when the client quits.
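A rough sketch of that fork arrangement, assuming a POSIX system where os.fork is available. The function run_with_server_parent and its serve and client callables are hypothetical placeholders; nothing here is actual Durus server code.

```python
import os


def run_with_server_parent(serve, client):
    """Fork once: the parent runs `serve`, the child runs `client`.

    Returns in the parent only, with the child's wait status.
    """
    pid = os.fork()
    if pid == 0:
        # Child: act as the application (client), then exit without
        # running the parent's cleanup handlers.
        try:
            client()
        finally:
            os._exit(0)
    # Parent: keep serving even though the client may already have quit.
    serve()
    _, status = os.waitpid(pid, 0)  # reap the child so it does not linger
    return status
```

Because the server lives in the parent, the child application can exit at any time without taking the server down with it.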

>

>> This can be done, but it sounds easier than it really is.  You'll
>> need to read the tail, find the oids, and make sure that none of
>> them have states loaded in your cache during this transaction.  It
>> isn't enough just to look for conflict with the oids you are
>> writing.  This could potentially be a slow operation, and it has a
>> cost that the server-based durus avoids.
>
> I'm guessing that using a Durus server avoids this because the Durus
> server keeps such information in memory?

Yes, the server keeps track of the oids as it writes and makes sure  
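
The tail check described in the quoted text might look like this in outline. It is only a sketch over plain sets of oids: parsing the file tail into oids is assumed to happen elsewhere, and TailConflict, stale_oids and check_tail are hypothetical names, not Durus APIs.

```python
class TailConflict(Exception):
    """Raised when another process committed an oid this transaction has loaded."""


def stale_oids(tail_oids, loaded_oids):
    """Return the oids that appear both in the file tail and in the local cache."""
    return set(tail_oids) & set(loaded_oids)


def check_tail(tail_oids, loaded_oids):
    """Refuse to commit if any oid in the tail has its state loaded locally."""
    stale = stale_oids(tail_oids, loaded_oids)
    if stale:
        raise TailConflict(sorted(stale))


# Oid 2 was only read here, never written, yet it still has to block the commit.
written = {3}
loaded = {1, 2, 3}
tail = [2]
assert not (set(tail) & written)        # a check against written oids misses it
assert stale_oids(tail, loaded) == {2}  # the check against loaded oids catches it
```

The cost mentioned above comes from having to read and scan the tail in every process before each commit.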

Matthew Scott | 15 Sep 2009 19:10

Re: server-less multi-process concurrency for durus



On Tue, Sep 15, 2009 at 09:58, Binger David <dbinger <at> mems-exchange.org> wrote:

> On Sep 15, 2009, at 12:02 PM, Matthew Scott wrote:
>
>> This could pose a "handoff" problem, where process A (an application) starts the Durus server, process B (a Python shell inspecting some things under the hood) connects to the server started by process A, then A shuts down, taking the Durus server with it, and process B is left hanging without Durus.
>
> I think this could be addressed by forking and having the parent run the server so that the server still has a live parent when the client quits.

I'll keep this in mind; your solution sounds good for that particular scenario.


>> So, rather than read and invalidate on each file length change, we'd just "pretend" that the file isn't growing at all, and perform an analysis on a snapshot of the database.  When the client code was finished, it would call continue(), at which point the database state would be allowed to sync with the latest changes.  Client code wouldn't care, though, since it is done with its analysis.
>
> That behavior is okay, as long as the client has nothing to write, and doesn't really care about being perfectly up-to-date.
> If the thing that I miss is the sale of my seat to someone else, I'm unhappy.

Understandable.  The client would be forbidden to write while paused.  Doing so would have undefined results or (even better) would raise an exception.

This is more for a scenario where you want to continue allowing writes to a certain database, but where you also want to generate some sort of report or historical record based on a consistent snapshot in time.

Question:  Is the client/server connection at a low-enough level that the protocol could be extended to support this "pause/resume" behavior?  (Packing would not be permitted while any client had "paused" their connection, so that invalidated records in use by a paused connection would be preserved.)
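
A small sketch of the pause/resume behavior under discussion. Since continue is a reserved word in Python, the sketch uses resume() instead. SnapshotView, PausedWriteError and the current_eof callable are all hypothetical; this is not the Durus connection API.

```python
class PausedWriteError(Exception):
    """Raised when a paused client tries to write."""


class SnapshotView:
    """Tracks which end-of-file offset a client is allowed to see."""

    def __init__(self, current_eof):
        self._current_eof = current_eof  # callable returning the latest file length
        self._pinned_eof = None          # set only while paused

    @property
    def paused(self):
        return self._pinned_eof is not None

    def pause(self):
        # Freeze the view at the current length; later growth is ignored.
        self._pinned_eof = self._current_eof()

    def resume(self):
        # Allow the view to sync with the latest changes again.
        self._pinned_eof = None

    def visible_eof(self):
        return self._pinned_eof if self.paused else self._current_eof()

    def commit(self, write):
        # Writing while paused is forbidden, so fail loudly rather than
        # leave the result undefined.
        if self.paused:
            raise PausedWriteError("cannot write while paused")
        return write()
```

A report generator would call pause(), read only up to visible_eof(), and call resume() once its analysis is done.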


--
Matthew R. Scott

Matthew Scott | 15 Sep 2009 07:05

Re: server-less multi-process concurrency for durus

