Daniel Phillips | 26 Jul 00:13

Re: Comparison to Hammer fs design

On Friday 25 July 2008 11:53, Matthew Dillon wrote:
> 
> :Hi;
> :
> :The announcement of yet another filesystem:
> :
> :http://lkml.org/lkml/2008/7/23/257
> :
> :led to some comments about hammer fs:
> :
> :http://tux3.org/pipermail/tux3/2008-July/000006.html
> :
> :enjoy,
> :
> :     Pedro.
> 
>     Those are interesting comments.   I think I found Daniel's email address
>     so I am adding him to the To:   Dan, feel free to post this on your Tux
>     groups if you want.

How about a cross-post?

>     I did consider multiple-parentage...  that is the ability to have a
>     writable snapshot that 'forks' the filesystem history.  It would be
>     an ultra cool feature to have but I couldn't fit it into the B-Tree
>     model I was using.  Explicit snapshotting would be needed to make it
>     work, and the snapshot id would have to be made part of the B-Tree key,
>     which is fine.  HAMMER is based around implicit snapshots (being able
>     to do an as-of historical lookup without having explicitly snapshotted
>     bits of the history).
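
To illustrate the as-of lookup mentioned above, here is a minimal sketch. It assumes each record carries a creation and a deletion transaction id, so that any past state can be read without an explicit snapshot. The names (Record, create_tid, delete_tid, lookup_as_of) are hypothetical and are not taken from the HAMMER sources, and a flat list stands in for the B-Tree.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    key: str
    create_tid: int            # transaction id at which this version appeared
    delete_tid: Optional[int]  # transaction id that superseded it; None = still live
    value: str


def lookup_as_of(records: List[Record], key: str, as_of_tid: int) -> Optional[str]:
    """Return the value of `key` as it was visible at `as_of_tid`, or None."""
    for rec in records:
        if rec.key != key:
            continue
        born = rec.create_tid <= as_of_tid
        alive = rec.delete_tid is None or as_of_tid < rec.delete_tid
        if born and alive:
            return rec.value
    return None


# Two versions of the same key; the first was superseded at tid 25.
history = [
    Record("motd", 10, 25, "hello"),
    Record("motd", 25, None, "hello, world"),
]
```

Calling lookup_as_of(history, "motd", 12) reads the older version even though nobody took a snapshot at tid 12, which is the "implicit snapshot" property. A writable fork would need something extra in the key (the snapshot id discussed above), which this sketch does not attempt.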

Matthew Dillon | 26 Jul 04:02

Re: Comparison to Hammer fs design

:>     so I am adding him to the To:   Dan, feel free to post this on your Tux
:>     groups if you want.
:
:How about a cross-post?

    I don't think it will work, only subscribers can post to the DFly groups,
    but we'll muddle through it :-)  I will include the whole of the previous
    posting so the DFly groups see the whole thing, if you continue to get
    bounces.

    I believe I have successfully added you as an 'alias address' to the
    DragonFly kernel list so you shouldn't get bounced if you Cc it now.

:Yes, that is the main difference indeed, essentially "log everything" vs
:"commit" style versioning.  The main similarity is the lifespan oriented
:version control at the btree leaves.

    Reading this, and a little more that you describe later, let me make
    sure I understand the forward-logging methodology you are using.
    You would have multiple individually-tracked transactions in
    progress due to parallelism in operations initiated by userland and each
    would be considered committed when the forward-log logs the completion
    of that particular operation?

    If the forward log entries are not (all) cached in-memory that would mean
    that accesses to the filesystem would have to be run against the log
    first (scanning backwards), and then through to the B-Tree?  You
    would solve the need for having an atomic commit ('flush groups' in
    HAMMER), but it sounds like the algorithmic complexity would be
    very high for accessing the log.
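
    The access pattern in question can be sketched roughly as follows.
    This is only an illustration under assumptions: a Python list stands
    in for the forward log, a dict stands in for the B-Tree, and all
    function names are made up for the example.

```python
from typing import Dict, List, Optional, Tuple

LogEntry = Tuple[str, str]  # (key, value), appended in commit order


def lookup_scanning_log(key: str, forward_log: List[LogEntry],
                        btree: Dict[str, str]) -> Optional[str]:
    # Newest entries sit at the end, so walk the log backwards.
    # Cost grows with the number of log entries not yet folded into the tree.
    for logged_key, logged_value in reversed(forward_log):
        if logged_key == key:
            return logged_value
    # Nothing pending in the log: fall through to the tree.
    return btree.get(key)


def build_log_index(forward_log: List[LogEntry]) -> Dict[str, str]:
    # Later entries overwrite earlier ones, so the index keeps the newest value.
    index: Dict[str, str] = {}
    for logged_key, logged_value in forward_log:
        index[logged_key] = logged_value
    return index


def lookup_with_index(key: str, log_index: Dict[str, str],
                      btree: Dict[str, str]) -> Optional[str]:
    # With the log cached in memory the backwards scan disappears.
    if key in log_index:
        return log_index[key]
    return btree.get(key)


btree = {"a": "old-a", "b": "old-b"}
forward_log = [("a", "new-a"), ("c", "new-c"), ("a", "newer-a")]
```

    Both lookups return the same answers; the difference is that the
    first pays for a scan on every access, which is the cost being asked
    about when the log entries are not all cached in memory.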

Daniel Phillips | 27 Jul 13:51

Re: Comparison to Hammer fs design

Subscribed now, everything should be OK.

On Friday 25 July 2008 19:02, Matthew Dillon wrote:
> :Yes, that is the main difference indeed, essentially "log everything" vs
> :"commit" style versioning.  The main similarity is the lifespan oriented
> :version control at the btree leaves.
> 
>     Reading this, and a little more that you describe later, let me make
>     sure I understand the forward-logging methodology you are using.
>     You would have multiple individually-tracked transactions in
>     progress due to parallelism in operations initiated by userland and each
>     would be considered committed when the forward-log logs the completion
>     of that particular operation?

Yes.  Writes tend to be highly parallel in Linux because they are
mainly driven by the VMM attempting to clean cache dirtied by active
writers, who generally do not wait for syncing.  So this will work
really well for buffered IO, which is most of what goes on in Linux.
I have not thought much about how well this works for O_SYNC or
O_DIRECT from a single process.  I might have to do it slightly
differently to avoid performance artifacts there, for example, guess
where the next few direct writes are going to land based on where the
most recent ones did and commit a block that says "the next few commit
blocks will be found here, and here, and here...".

When a forward commit block is actually written it contains a sequence
number and a hash of its transaction in order to know whether the
commit block write ever completed.  This introduces a risk that data
overwritten by the commit block might contain the same hash and same
sequence number in the same position, causing corruption on replay.
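
A minimal sketch of that check, to make the idea concrete. The block layout, the choice of SHA-256 and every name below are assumptions made for illustration only; this is not the Tux3 on-disk format.

```python
import hashlib
import struct
from typing import List, Optional

# Assumed layout: 8-byte little-endian sequence number, 32-byte digest, payload.
HEADER = struct.Struct("<Q32s")


def _digest(seq: int, payload: bytes) -> bytes:
    # The hash covers the sequence number and the transaction payload.
    return hashlib.sha256(struct.pack("<Q", seq) + payload).digest()


def make_commit_block(seq: int, payload: bytes) -> bytes:
    return HEADER.pack(seq, _digest(seq, payload)) + payload


def check_commit_block(block: bytes, expected_seq: int) -> Optional[bytes]:
    """Return the payload if the block is a complete commit, else None."""
    if len(block) < HEADER.size:
        return None
    seq, digest = HEADER.unpack_from(block)
    payload = block[HEADER.size:]
    if seq != expected_seq or _digest(seq, payload) != digest:
        return None  # stale data or a write that never completed
    return payload


def replay(blocks: List[bytes], first_seq: int) -> List[bytes]:
    """Apply commits in order, stopping at the first block that fails the check."""
    applied = []
    seq = first_seq
    for block in blocks:
        payload = check_commit_block(block, seq)
        if payload is None:
            break
        applied.append(payload)
        seq += 1
    return applied
```

Replay accepts a block only when both the sequence number and the hash match. The risk described above is the case where leftover data at that location passes both tests by accident.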

Matthew Dillon | 27 Jul 23:31

Re: [Tux3] Comparison to Hammer fs design


:When a forward commit block is actually written it contains a sequence
:number and a hash of its transaction in order to know whether the
:...
:Note: I am well aware that a debate will ensue about whether there is
:any such thing as "acceptable risk" in relying on a hash to know if a
:commit has completed.  This occurred in the case of Graydon Hoare's
:Monotone version control system and continues to this day, but the fact
:is, the cool modern version control systems such as Git and Mercurial
:now rely very successfully on such hashes.  Nonetheless, the debate
:will keep going, possibly as FUD from parties who just plain want to
:use some other filesystem for their own reasons.  To quell that
:definitively I need a mount option that avoids all such commit risk,
:perhaps by providing modest sized journal areas salted throughout the
:volume whose sole purpose is to record log commit blocks, which then
:are not forward.  Only slightly less efficient than forward logging
:and better than journalling, which has to seek far away to the journal
:and has to provide journal space for the biggest possible journal
:transaction as opposed to the most commit blocks needed for the largest
:possible VFS transaction (probably one).

    Well, you've described both sides of the debate quite well.  I for
    one am in the 0-risk camp.  ReiserFS got into trouble depending on
    a collision space (albeit a small one).  I did extensive testing of
    64 bit collision spaces when I wrote Diablo (A USENET news system
    contemporary with INN, over 10 years ago) and even with a 64 bit space
    collisions could occur quite often simply due to the volume of article
    traffic.
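
    For a rough feel of the numbers, the usual birthday approximation
    can be computed as below.  This is only a sketch: it assumes
    identifiers are uniformly random over the whole space (real hashes
    of real inputs can do worse), and the function name is made up for
    the example.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance of at least one collision among n_items
    uniformly random values drawn from a 2**bits space."""
    space = 2.0 ** bits
    # 1 - exp(-n(n-1) / (2 * space)), written with expm1 to keep
    # precision when the probability is very small.
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))
```

    With a 64 bit space the chance passes one in three at around 2**32
    items, and the exponent grows with the square of the item count, so
    the risk rises much faster than the volume does.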

    I think the cost can be reduced to the point where there's no need

Pedro F. Giffuni | 26 Jul 07:02

Re: [Tux3] Comparison to Hammer fs design

Matthew Dillon wrote:
..
> 
>     I do wish we had something like LVM on BSD systems.  You guys are very
>     lucky in that regard.  LVM is really nice.
> 

AFAICT, there are patches for NetBSD:

http://netbsd-soc.sourceforge.net/projects/lvm/

enjoy,

     Pedro.

ps. Feel free to drop me from the CC list, I'm happily following the 
group :).

Matthew Dillon | 26 Jul 19:03

Re: [Tux3] Comparison to Hammer fs design


:>     I do wish we had something like LVM on BSD systems.  You guys are very
:>     lucky in that regard.  LVM is really nice.
:> 
:
:AFAICT, there are patches for NetBSD:
:
:http://netbsd-soc.sourceforge.net/projects/lvm/
:
:enjoy,
:
:     Pedro.

    It looks like something to watch.  He hasn't finished it yet, judging
    by the code.

					-Matt
					Matthew Dillon 
					<dillon <at> backplane.com>

Pedro F. Giffuni | 27 Jul 00:24

Re: [Tux3] Comparison to Hammer fs design

Matthew Dillon wrote:
> :>     I do wish we had something like LVM on BSD systems.  You guys are very
> :>     lucky in that regard.  LVM is really nice.
> :> 
> :
> :AFAICT, there are patches for NetBSD:
> :
> :http://netbsd-soc.sourceforge.net/projects/lvm/
> :
> :enjoy,
> :
> :     Pedro.
> 
>     It looks like something to watch.  He hasn't finished it yet, judging
>     by the code.
> 

The GSoC project isn't finished yet but from the reports it seems 
functional:

http://mail-index.netbsd.org/tech-kern/2008/07/16/msg002121.html
http://mail-index.netbsd.org/tech-kern/2008/07/20/msg002178.html

Pedro.

Maslan | 27 Jul 02:31

Re: [Tux3] Comparison to Hammer fs design

Why LVM? Why not follow the ZFS path?

On Sat, Jul 26, 2008 at 10:24 PM, Pedro F. Giffuni
<pfgshield-freebsd <at> yahoo.com> wrote:
> Matthew Dillon wrote:
>>
>> :>     I do wish we had something like LVM on BSD systems.  You guys are very
>> :>     lucky in that regard.  LVM is really nice.
>> :>
>> :
>> :AFAICT, there are patches for NetBSD:
>> :
>> :http://netbsd-soc.sourceforge.net/projects/lvm/
>> :
>> :enjoy,
>> :
>> :     Pedro.
>>
>>    It looks like something to watch.  He hasn't finished it yet, judging
>>    by the code.
>>
>
>
> The GSoC project isn't finished yet but from the reports it seems
> functional:
>
> http://mail-index.netbsd.org/tech-kern/2008/07/16/msg002121.html
> http://mail-index.netbsd.org/tech-kern/2008/07/20/msg002178.html
>
> Pedro.

Pedro F. Giffuni | 27 Jul 02:53

Re: [Tux3] Comparison to Hammer fs design

Maslan wrote:
> Why LVM? Why not follow the ZFS path?
> 

In FreeBSD's case I think the ZFS approach fits very well with GEOM.

Neither NetBSD nor DragonFly has an operational ZFS (yet), so instead of 
growing their own fs-specific implementation from the ground up, it looks 
reasonable to leverage the LVM effort.

Just my $0.02, of course.

Pedro.

Matthew Dillon | 27 Jul 02:50

Re: [Tux3] Comparison to Hammer fs design


:
:Why LVM? Why not follow the ZFS path?

    I think it's a mistake to integrate the storage layer into the
    filesystem.  That would effectively limit us to a single filesystem,
    ZFS.

					-Matt

Ivan Voras | 27 Jul 03:50

Re: [Tux3] Comparison to Hammer fs design

Matthew Dillon wrote:
> :
> :Why LVM? Why not follow the ZFS path?
> 
>     I think it's a mistake to integrate the storage layer into the
>     filesystem.  That would effectively limit us to a single filesystem,
>     ZFS.

Well, ZFS itself isn't as integrated as is often said. You can use the 
low-level volume manager by itself and create UFS volumes on top of it. 
It isn't as straightforward as a simple volume manager, since you still 
get the superfluous caching and checksumming, but it works.

Daniel Phillips | 27 Jul 05:02

Re: [Tux3] Comparison to Hammer fs design

On Saturday 26 July 2008 17:50, Matthew Dillon wrote:
> :Why LVM? Why not follow the ZFS path?
> 
>     I think it's a mistake to integrate the storage layer into the
>     filesystem.  That would effectively limit us to a single filesystem,
>     ZFS.

"Me too".

Daniel

