Michael Lesniak | 5 Jul 10:12

Implementation of Red-Black Trees

Hello,

First, sorry, if it is the wrong mailing list, I wasn't sure which to
choose, since my question is neither language nor implementation
specific and there's no library mailing list.

My name is Michael Lesniak and I'm working at the parallel programming
research group of the university of Kassel. I'm interested in writing
an implementation of Red-Black Trees (and maybe other collections) in
Fortress and evaluating my observations. But there are still some
questions:

- Is there any prior work?
- (more general) do I have to consider anything special?
- Is it advisable to get full repository access, i.e. sign the Sun
Contributor Agreement? (This won't be a problem)

Fortress seems to grow into a really exciting language and I'd like to
help it grow :)

Best regards,
Michael

Michael Spiegel | 5 Jul 11:58

Re: Implementation of Red-Black Trees

Hi Michael,

I think Jan is best to answer the question, as he is the unofficial
high officer and keeper of the libraries.  As someone who has done a
little contribution to the libraries, I have the following advice for
you:

a) test early, and test often.  Since type-checking is currently
turned off by default, it's very likely that the code will look ok,
except some tweaky type error will be triggered somewhere during
runtime.

b) explicit type annotation is your friend.  Until type-inference is
implemented, the runtime system is guessing the types as best it can.
And sometimes it guesses wrong.  So make use of the type annotation
scheme.

c) don't write this foo[3] = 5, unless you expect to do boolean
comparison instead of mutable assignment.  You probably want to write
foo[3] := 5.  Eventually the former expression will be a type error if
used in the wrong context, but I don't know if the interpreter is
currently checking for that.

Best regards,
--Michael

On Sat, Jul 5, 2008 at 11:16 AM, Michael Lesniak <mlesniak@...> wrote:
> Hello,
>
> First, sorry, if it is the wrong mailing list, I wasn't sure which to
(Continue reading)

Jan-Willem Maessen | 5 Jul 16:25
Favicon

Re: Implementation of Red-Black Trees

On Jul 5, 2008, at 4:16 AM, Michael Lesniak wrote:

> Hello,
>
> First, sorry, if it is the wrong mailing list, I wasn't sure which to
> choose, since my question is neither language nor implementation
> specific and there's no library mailing list.

This is the correct mailing list.  The implementation mailing list is  
a good place to look if you're actively developing the core  
interpreter or FortressLibrary.  This list is the right place for  
general questions.

> My name is Michael Lesniak and I'm working at the parallel programming
> research group of the university of Kassel. I'm interested in writing
> an implementation of Red-Black Trees (and maybe other collections) in
> Fortress and evaluating my observations. But there are still some
> questions:
>
> - Is there any prior work?

The one interesting piece of prior work is the Map library; this uses  
a different data structure (weight-balanced trees), but has an API  
that you might want to consider matching.  This would permit you to  
use your Red-Black Tree implementation as a drop-in replacement for  
the default Map.  It also means that if your Map is better than what  
we've got, it'll be easier to make it the new default.

Realistically, though, you'll want to start with a small subset of  
this and work outward from there.
(Continue reading)

Raymond Myers | 6 Jul 18:28

Re: Implementation of Red-Black Trees

Michael:

I am also working on a Red-Black tree implementation. Perhaps we could combine forces?

The Red-Black Tree in "Purely Functional Data Structures", by Chris Okasaki is a good starting point. However, it is very minimal and does not include things like Delete and Union/Merge. For a linear time implementation of Union, you might look at the method addAll in Java's TreeMap, and its helpers.

One of the trickier parts of a Red-Black Tree is balancing after insertion.

(Adapted from Chris Okasaki's Haskell)
balance :: RB a -> a -> RB a -> RB a
balance (T R a x b) y (T R c z d) = T R (T B a x b) y (T B c z d)
balance (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance a x b = T B a x b

Here is a balance() from one of the Fortress versions I am kicking around. (I hope it's ok to use the Unicode representation over the mailing list.)
This version only implements a Set, not a Map, to keep things simple. I also made RedNode and BlackNode separate types, both extending the NodeSet trait. Overall, it is probably better to make NodeSet an object with a color field, which makes balance() a bit more hairy.

Note that the Fortress code handles the base case "BlackNode⟦E⟧(l,v,r)" three different times, due to the nature of typecase. There may be ways of avoiding this.

balance⟦E⟧(l:Set⟦E⟧, v:E, r:Set⟦E⟧) : Set⟦E⟧ = do
  typecase (l,r) of    
    (RedNode⟦E⟧, RedNode⟦E⟧) ⇒ RedNode(l.toBlack(), v, r.toBlack())
    (RedNode⟦E⟧, Set⟦E⟧)     ⇒   
      typecase (ll,lr) = (l.l,l.r) of
        (RedNode⟦E⟧, Set⟦E⟧) ⇒ RedNode(ll.toBlack(), l.v, BlackNode(l.r, v, r))
        (Set⟦E⟧, RedNode⟦E⟧) ⇒ RedNode(ll.toBlack(), l.v, BlackNode(l.r, v, r))
        (Set⟦E⟧, Set⟦E⟧)     ⇒ BlackNode⟦E⟧(l,v,r) 
      end
    (Set⟦E⟧, RedNode⟦E⟧)     ⇒ 
      typecase (rl,rr) = (r.l,r.r) of 
        (Set⟦E⟧, RedNode⟦E⟧) ⇒ RedNode(BlackNode(l, v, r.l), r.v, rr.toBlack())
        (RedNode⟦E⟧, Set⟦E⟧) ⇒ RedNode(BlackNode(l, v, rl.l), rl.v, BlackNode(rl.r, r.v, r.r))
        (Set⟦E⟧, Set⟦E⟧)     ⇒ BlackNode⟦E⟧(l,v,r)
      end                    
    (Set⟦E⟧, Set⟦E⟧)         ⇒ BlackNode⟦E⟧(l,v,r)
  end
end 

Any feedback is appreciated! Hopefully, I'll be able to commit what I have soon. (I submitted my SCA on Tuesday, but haven't heard back yet.)

Cheers,

-- Ray Myers

On Sat, Jul 5, 2008 at 3:16 AM, Michael Lesniak <mlesniak <at> uni-kassel.de> wrote:

Hello,

First, sorry, if it is the wrong mailing list, I wasn't sure which to
choose, since my question is neither language nor implementation
specific and there's no library mailing list.

My name is Michael Lesniak and I'm working at the parallel programming
research group of the university of Kassel. I'm interested in writing
an implementation of Red-Black Trees (and maybe other collections) in
Fortress and evaluating my observations. But there are still some
questions:

- Is there any prior work?
- (more general) do I have to consider anything special?
- Is it advisable to get full repository access, i.e. sign the Sun
Contributor Agreement? (This won't be a problem)

Fortress seems to grow into a really exciting language and I'd like to
help it grow :)

Best regards,
Michael

Victor Luchangco | 6 Jul 23:13
Favicon

Re: Implementation of Red-Black Trees


On Jul 6, 2008, at 12:28 PM, Raymond Myers wrote:

> The Red-Black Tree in "Purely Functional Data Structures", by Chris  
> Okasaki is a good starting point. However, it is very minimal and  
> does not include things like Delete and Union/Merge. For a linear  
> time implementation of Union, you might look at the method addAll  
> in Java's TreeMap, and its helpers.

I haven't thought much about a functional red-black tree, but if you  
are looking for the standard imperative kind, the article in  
Wikipedia on red-black trees is excellent--one of the clearest I've  
seen.  And it covers Delete, which I find much trickier than Insert.   
I looked at Okasaki's "Red-Black Trees in a Functional Setting" (an  
earlier paper that I could get on the web), which describes a  
slightly different algorithm, but close enough.

> One of the trickier parts of a Red-Black Tree is balancing after  
> insertion.
>
> (Adapted from Chris Okasaki's Haskell)
> balance :: RB a -> a -> RB a -> RB a
> balance (T R a x b) y (T R c z d) = T R (T B a x b) y (T B c z d)
> balance (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
> balance (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
> balance a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
> balance a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
> balance a x b = T B a x b
>
> Here is a balance() from one of the Fortress versions I am kicking  
> around. (I hope it's ok to use the Unicode representation over the  
> mailing list.)
> This version only implements a Set, not a Map, to keep things  
> simple. I also made RedNode and BlackNode separate types, both  
> extending the NodeSet trait. Overall, it is probably better to make  
> NodeSet an object with a color field, which makes balance() a bit  
> more hairy.

Hmm... not sure which is preferable.  Making object a color field is  
good in an imperative version because then you can just change the  
field rather than allocating a new node.  In a functional version,  
you don't have that option, and distinguishing RedNode and BlackNode  
types allows you to use dispatch (or typecase).

> Note that the Fortress code handles the base case "BlackNode⟦E⟧ 
> (l,v,r)" three different times, due to the nature of typecase.  
> There may be ways of avoiding this.
>
> balance⟦E⟧(l:Set⟦E⟧, v:E, r:Set⟦E⟧) : Set⟦E⟧ = do
>   typecase (l,r) of
>     (RedNode⟦E⟧, RedNode⟦E⟧) ⇒ RedNode(l.toBlack(), v,  
> r.toBlack())
>     (RedNode⟦E⟧, Set⟦E⟧)     ⇒
>       typecase (ll,lr) = (l.l,l.r) of
>         (RedNode⟦E⟧, Set⟦E⟧) ⇒ RedNode(ll.toBlack(), l.v,  
> BlackNode(l.r, v, r))
>         (Set⟦E⟧, RedNode⟦E⟧) ⇒ RedNode(ll.toBlack(), l.v,  
> BlackNode(l.r, v, r))

I think this case should yield RedNode(BlackNode(ll, l.v, lr.l),  
lr.v, BlackNode(lr.r, v, r)).  (It's the "double rotation" case.)

Also, you can use lr rather than l.r in the two previous cases, and  
rl and rr instead of r.l and r.r in the two analogous cases below.

>         (Set⟦E⟧, Set⟦E⟧)     ⇒ BlackNode⟦E⟧(l,v,r)
>       end
>     (Set⟦E⟧, RedNode⟦E⟧)     ⇒
>       typecase (rl,rr) = (r.l,r.r) of
>         (Set⟦E⟧, RedNode⟦E⟧) ⇒ RedNode(BlackNode(l, v,  
> r.l), r.v, rr.toBlack())
>         (RedNode⟦E⟧, Set⟦E⟧) ⇒ RedNode(BlackNode(l, v,  
> rl.l), rl.v, BlackNode(rl.r, r.v, r.r))
>         (Set⟦E⟧, Set⟦E⟧)     ⇒ BlackNode⟦E⟧(l,v,r)
>       end
>     (Set⟦E⟧, Set⟦E⟧)         ⇒ BlackNode⟦E⟧(l,v,r)
>   end
> end

Since the first typecase is just dispatching on the arguments of  
balance, you can just overload balance and let dispatch do the work:

balance[\E\](l: RedNode[\E\], v: E, r: RedNode[\E\]) = RedNode 
(l.toBlack(), v, r.toBlack())
balance[\E\](l: RedNode[\E\], v: E, r: Set[\E\]) = typecase (ll, lr)  
= (l.l, l.r) of
   (RedNode[\E\], Set[\E\]) => RedNode(ll.toBlack(), l.v, BlackNode 
(lr, v, r))
   (Set[\E\], RedNode[\E\]) => RedNode(BlackNode(ll, l.v, lr.l),  
lr.v, BlackNode(lr.r, v, r))
   else => BlackNode(l,v,r)
end
balance[\E\](l: Set[\E\], v: E, r: RedNode[\E\]) = typecase (rl, rr)  
= (r.l, r.r) of
   (Set[\E\], RedNode[\E\]) => RedNode(BlackNode(l, v, rl), r.v,  
rr.toBlack())
   (RedNode[\E\], Set[\E\]) => RedNode(BlackNode(l, v, rl.l), rl.v,  
BlackNode(rl.r, r.v, rr))
   else => BlackNode(l,v,r)
end
balance[\E\](l: Set[\E\], v: E, r: Set[\E\]) =  BlackNode(l,v,r)

It's a matter of style which is preferable, but I prefer the  
overloading form.

Alternatively, if you defined isRed and isBlack predicates (and you  
could use these whether you chose to represent colors by types (as  
you do) or using fields), you could write the following:

balance[\E\](l: Set[\E\], v: E, r: Set[\E\]) =
   if l.isRed() AND r.isRed() then  (*  (R a x b) y (R c z d)  *)
     RedNode(l.toBlack(), v, r.toBlack())
   elif l.isRed() AND: l.l.isRed() then  (* (R (R a x b) y c) z d  *)
     RedNode(l.l.toBlack(), l.v, BlackNode(l.r, v, r))
   elif l.isRed() AND: l.r.isRed() then  (* (R a x (R b y c)) z d  *)
     RedNode(BlackNode(l.l, l.v, l.r.l), l.r.v, BlackNode(l.r.r, v, r))
   elif r.isRed() AND: r.r.isRed() then  (* a x (R b y (R c z d))  *)
     RedNode(BlackNode(l, v, r.l), r.v, r.r.toBlack())
   elif r.isRed() AND: r.l.isRed() then  (* a x (R (R b y c) z d))  *)
     RedNode(BlackNode(l, v, r.l.l), r.l.v, BlackNode(r.l.r, r.v, r.r))
   else
     BlackNode(l,v,r)
   end

This matches the structure of the Haskell code more closely--it only  
lacks the nice pattern-matching syntax.

By the way, I don't think the first case (l.isRed() AND r.isRed()) is  
necessary.  It just pushes red nodes up the tree, which is an  
advantage for imperative red-black trees when insertions  
significantly outnumber deletions (the usual case) because they allow  
balancing after insertion to terminate early.  But for functional red- 
black trees, you have to go all the way to the root anyway.

One subtlety in the code above is the use of conditional AND (i.e.,  
AND:), so that in the second case, for example, l.l is not evaluated  
unless l.isRed() returns true.  If l.isRed() is false, then l may be  
empty, in which case l.l would not be well defined.  We could also  
avoid the problem by having an empty node return itself for .l  
and .r.  Alternatively, we could nest the if expressions, but that  
reintroduces the extra BlackNode cases.

- Victor

Damon McCormick | 8 Jul 10:48

Re: Implementation of Red-Black Trees

If you are not already familiar with it, you might want to look at Sedgewick's recent work on Left-Leaning Red Black Trees.

http://www.cs.princeton.edu/~rs/talks/LLRB/RedBlack.pdf

-Damon


On Sat, Jul 5, 2008 at 1:16 AM, Michael Lesniak <mlesniak-uCjPqYABYO6OPzg9A3tDVg@public.gmane.org> wrote:
Hello,

First, sorry, if it is the wrong mailing list, I wasn't sure which to
choose, since my question is neither language nor implementation
specific and there's no library mailing list.

My name is Michael Lesniak and I'm working at the parallel programming
research group of the university of Kassel. I'm interested in writing
an implementation of Red-Black Trees (and maybe other collections) in
Fortress and evaluating my observations. But there are still some
questions:

- Is there any prior work?
- (more general) do I have to consider anything special?
- Is it advisable to get full repository access, i.e. sign the Sun
Contributor Agreement? (This won't be a problem)

Fortress seems to grow into a really exciting language and I'd like to
help it grow :)

Best regards,
Michael


Gmane