Richard O'Keefe | 31 Aug 06:45 2012
Picon

Sliding Window functional data structure

Consider the following interface

	type Ord k => Sliding_Window k v

	entries :: Sliding_Window k v -> [(k,v)]
	    The cost is expected to be linear in the length of
	    the result.  The pairs are listed in increasing
	    order of k.

	add :: Ord k => k -> v -> Sliding_Window k v -> Sliding_Window k v
	    precondition: all (< k) [k' | (k',_) <- entries q]
	    The cost should be at most O((log . length . entries) q).
	    post: entries (add k v q) = entries q ++ [(k,v)]

	since :: Ord k => k -> Sliding_Window k v -> [(k,v)]
	    answers [(k',v) | (k',v) <- entries q, k' > k]
	    The cost should be at most O((log . length . entries) q
                                         + length result)

	purge :: Ord k => k -> Sliding_Window k v -> Sliding_Window k v
	    answers q' such that entries q' = [(k',v) | (k',v) <- entries q,
                                               k' > k]
	    The cost should be at most O((log . length . entries) q
					 + length [k' | (k',v) <- entries q,
							k' <= k])

Ignoring costs, this can obviously be done trivially using
a list of pairs.  Paying some attention to costs, it can also
be done using some sort of balanced search tree.  The data
structure is close to a priority queue, but subtly different.
(Continue reading)

Ross Paterson | 31 Aug 10:03 2012
Picon

Re: Sliding Window functional data structure

On Fri, Aug 31, 2012 at 05:45:27AM +0100, Richard O'Keefe wrote:
> Consider the following interface
> 
> 	type Ord k => Sliding_Window k v
> 
> 	entries :: Sliding_Window k v -> [(k,v)]
> 	    The cost is expected to be linear in the length of
> 	    the result.  The pairs are listed in increasing
> 	    order of k.
> 
> 	add :: Ord k => k -> v -> Sliding_Window k v -> Sliding_Window k v
> 	    precondition: all (< k) [k' | (k',_) <- entries q]
> 	    The cost should be at most O((log . length . entries) q).
> 	    post: entries (add k v q) = entries q ++ [(k,v)]
> 
> 	since :: Ord k => k -> Sliding_Window k v -> [(k,v)]
> 	    answers [(k',v) | (k',v) <- entries q, k' > k]
> 	    The cost should be at most O((log . length . entries) q
>                                          + length result)
> 
> 	purge :: Ord k => k -> Sliding_Window k v -> Sliding_Window k v
> 	    answers q' such that entries q' = [(k',v) | (k',v) <- entries q,
>                                                k' > k]
> 	    The cost should be at most O((log . length . entries) q
> 					 + length [k' | (k',v) <- entries q,
> 							k' <= k])

Any search tree implementation will do add and purge in O(log n) time.
A finger tree will do add in O(1) and purge in O(log(min(r, n-r))) time,
where r in the length of the result.
(Continue reading)

ok | 31 Aug 14:24 2012
Picon

Re: Sliding Window functional data structure


> Any search tree implementation will do add and purge in O(log n) time.

Add's obvious, but could you explain to me about purge?
All of the explanations of search trees I'm familiar with,
if they bother to explain deletion at all, talk about how
to repair the balance of a tree after deleting *one* element.
It's not at all obvious to me how to quickly rebalance an
AVL tree after purging it, for example.

> A finger tree will do add in O(1) and purge in O(log(min(r, n-r))) time,

Thanks for that and the sample code.
Picon

Sliding Window functional data structure

One possibility would be a random-access list (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.55.5156), which is a kind of list of trees.

 

Then 'add' is trivially O(1) and 'since' is trivially O(length result).

 

The only tricky operation is purge, which naively would be O(N), where N is the length of entries.  But that can be reduced using a kind of lazy deletion.  Pair the random-access list with a single k, meaning "everything <= k is considered deleted, even if it is still physically present in the data structure".  Purge would update the k and scan the list of trees, removing any tree with a root <= k.  There are only a logarithmic number of trees, so this scan would be O(log N).  In fact, it could be made O(log #remaining entries) by noting that, when one such tree is found, the list can just be chopped off there because all later trees should also be removed.

 

The downside of lazy deletion is that you might use a bit more space than you need, because a portion of the last tree in the list might be logically deleted but still present.  I would expect this effect to be minor in practice.

 

Taking this approach, there is one situation that would require clarification, and possibly special handling: If you call purge with a k that is larger than the most recent add, thereby purging everything, are you later permitted to add a k' that is smaller than the purged k?  This might be disallowed by extending the precondition of add.  But if it is allowed, then should the new k' be affected by the previous purge, or not?  It's easy to implement either way; it just depends on how you want to specify the desired behavior.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Gmane