6.3.90.900
data-red-black: augmented red black tree structures
The data/red-black library consists of several
red-black tree data structures. Its contents include two red-black
tree implementations:
as well as an application of augmented red-black trees to support an ordered set collection in data/red-black/ordered-set.
1 Positional Red-Black Trees
Danny Yoo <dyoo@hashcollision.org>
This is an implementation of an augmented red-black tree with extra information
to support position-based queries.
The intended usage case of this structure is to maintain an ordered sequence of
items, where each item has an internal length. Given such a sequence, we want
to support quick lookup by position and in-place insertions and deletions.
We also want to support the catenation and splitting of sequences.
For example:
This implementation follows the basic outline for order-statistic red-black
trees described in [clrs2009] and incorporates a few extensions suggsted
in [wein2005]. As a red-black tree, the structure ensures that the tree’s
height is never greater than 2*lg(#-of-nodes + 1), guaranteeing good
worst-case behavior for its operations.
The main types of values used in the library are trees and nodes.
A tree has a root node, and each node has holds arbitrary data
and a natural self-width, along with a reference to the elements smaller
(node-left) and larger (node-right). Each node also
remembers the entire width of its subtree, which can be accessed with
node-subtree-width. The tree holds first and last pointers into the
structure to allow for fast access to the beginning and end of the sequence. A
distinguished nil node lies at the leaves of the tree.
1.1 API
1.1.1 Data types
Constructs a new tree. The tree’s root is initially
nil.
Returns #t if x is a tree.
Returns the root node of the tree
t.
If the tree is empty, returns the distinguished
nil node.
Returns the first node in the tree.
Returns the last node in the tree.
Returns #t if x is a node.
Returns
#t if
x is a
singleton node. A singleton node
is unattached to any tree, and is not the
nil node.
Returns #t if x is a non-nil node.
Returns #t if x is the nil node.
Updates the self-width associated to node
n. When attached to a tree,
also propagates the width’s change to the widths of subtrees, upward through
its parents to the root. Note that the
node-data and
node-self-width are entirely independent.
Returns the width of the entire subtree at node n. This sums the
width of the left and right child subtrees, as well as its self-width.
Returns the parent of the node n.
Returns the left child of the node n.
Returns the right child of the node n.
Returns the color of the node n. The red-black tree structure uses
this value to maintain balance.
Returns #t if node n is red.
Returns #t if node n is black.
1.1.2 Operations
Adds node n as the first element in tree t.
Note that attempting to add an attached, non-singleton node to a tree will
raise a contract error.
> (define a-tree (new-tree)) |
|
> (define a-node (new-node "persimmon" 1)) |
|
> (insert-first! a-tree a-node) |
|
> (insert-first! a-tree a-node) |
insert-first!: contract violation |
expected: singleton-node? |
given: #<node> |
in: the 2nd argument of |
(-> tree? singleton-node? any) |
contract from: |
<pkgs>/data-red-black/data/red-black/positional.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/positional.rkt:81.11 |
Adds node n as the last element in tree t.
Note that attempting to add an attached, non-singleton node to a tree will
raise a contract error.
> (define a-tree (new-tree)) |
|
> (define a-node (new-node "orange" 1)) |
|
> (insert-last! a-tree a-node) |
|
> (insert-last! a-tree a-node) |
insert-last!: contract violation |
expected: singleton-node? |
given: #<node> |
in: the 2nd argument of |
(-> tree? singleton-node? any) |
contract from: |
<pkgs>/data-red-black/data/red-black/positional.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/positional.rkt:82.11 |
Adds node
n2 before node
n1 in tree
t. This effectively
makes
n2 the
predecessor of
n1.
Note that attempting to add an attached, non-singleton node to a tree will
raise a contract error.
> (define a-tree (new-tree)) |
|
> (define a-node (new-node "peach" 1)) |
|
> (insert-first! a-tree a-node) |
|
> (insert-before! a-tree a-node a-node) |
insert-before!: contract violation |
expected: singleton-node? |
given: #<node> |
in: the 3rd argument of |
(-> tree? non-nil-node? singleton-node? any) |
contract from: |
<pkgs>/data-red-black/data/red-black/positional.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/positional.rkt:83.11 |
Adds node
n2 after node
n1 in tree
t. This effectively
makes
n2 the
successor of
n1.
Note that attempting to add an attached, non-singleton node to a tree will
raise a contract error.
> (define a-tree (new-tree)) |
|
> (define a-node (new-node "grapefruit" 1)) |
|
> (insert-first! a-tree a-node) |
|
> (insert-after! a-tree a-node a-node) |
insert-after!: contract violation |
expected: singleton-node? |
given: #<node> |
in: the 3rd argument of |
(-> tree? non-nil-node? singleton-node? any) |
contract from: |
<pkgs>/data-red-black/data/red-black/positional.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/positional.rkt:84.11 |
Deletes node n from the tree t. After deletion, n
will become a singleton node.
Note that n must be attached to tree t or else will raise
a contract error:
> (define t1 (new-tree)) |
|
> (insert-first/data! t1 "tricky" 1) |
|
> (define n (new-node "tricky" 1)) |
|
; This should raise an error: |
> (delete! t1 n) |
delete!: contract violation |
expected: predicate-contract |
given: #<node> |
in: the n argument of |
(->i |
((t tree?) (n (t) (attached-in-tree/c t))) |
(result any/c)) |
contract from: |
<pkgs>/data-red-black/data/red-black/positional.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/positional.rkt:90.11 |
Destructively joins trees t1 and t2, returning a tree that
has the contents of both. Every element in t1 is treated less than
the elements in t2.
Note that
t1 should not be
eq? to
t2 or else will raise
a contract error.
> (define t1 (new-tree)) |
|
> (join! t1 t1) |
join!: contract violation |
expected: first-order-and/c |
given: #<tree> |
in: the t2 argument of |
(->i |
((t1 tree?) |
(t2 (t1) (and/c tree? (not-eq?/c t1)))) |
(result tree?)) |
contract from: |
<pkgs>/data-red-black/data/red-black/positional.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/positional.rkt:93.11 |
Destructively joins tree t1, singleton node n, and tree
t2, returning a tree that has the contents of both. Every element in
t1 is treated less than x, and x is treated smaller than all
the elements in t2.
Note that
t1 should not be
eq? to
t2 or else will raise
a contract error.
> (define t1 (new-tree)) |
|
> (define n (new-node "a-node" 1)) |
|
> (concat! t1 n t1) |
concat!: contract violation |
expected: first-order-and/c |
given: #<tree> |
in: the t2 argument of |
(->i |
((t1 tree?) |
(n singleton-node?) |
(t2 (t1) (and/c tree? (not-eq?/c t1)))) |
(result any/c)) |
contract from: |
<pkgs>/data-red-black/data/red-black/positional.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/positional.rkt:95.18 |
Destructively splits tree t into two trees, the first containing the
elements smaller than node n, and the second containing those larger.
Afterwards, n becomes a singleton node.
Note that n must be attached to tree t or else raise
a contract error.
> (define t (new-tree)) |
|
|
|
; This should raise an error: |
> (define t2 (new-tree)) |
|
> (insert-last! t2 (new-node "bob" 1)) |
|
> (split! t (tree-root t2)) |
split!: contract violation |
expected: predicate-contract |
given: #<node> |
in: the n argument of |
(->i |
((t tree?) (n (t) (attached-in-tree/c t))) |
(values (t1 tree?) (t2 tree?))) |
contract from: |
<pkgs>/data-red-black/data/red-black/positional.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/positional.rkt:98.11 |
Resets the contents of the tree to the empty state.
Searches for the node at or within the given position
p of the tree.
If the position is out of bounds, returns
nil.
Note: nodes with a self-width of zero are effectively invisible to
search, and will be skipped over.
|
t : tree? |
p : natural-number/c |
Searches for the node at or within the given position
p of the tree.
This is an extension of
search that returns a second value: the offset
into the element where the search has terminated. If the position is out of
bounds of any element, the first component of the returned value is
nil.
Given a node n, returns the minimum element of the subtree rooted at
n.
Note: to get the minimum of the whole tree, it’s faster to use
tree-first.
Given a node n, returns the maximum element of the subtree rooted at
n.
Note: to get the maximum of the whole tree, it’s faster to use
tree-last.
Given a node n contained in some tree, returns the immediate
successor of n in an inorder traversal of that tree.
Given a node n contained in some tree, returns the immediate
predecessor of n in an inorder traversal of that tree.
Given a node n contained in some tree, returns the immediate
position of n in that tree.
Given a tree, returns a list of its data and width pairs.
Iterates a function f across the nodes of the tree, in inorder, preorder,
and postorder respectively.
1.2 Uncontracted library
This library uses contracts extensively to prevent the user from messing up;
however, the contract checking may be prohibitively
expensive for certain applications.
The uncontracted bindings of this library can be accessed through:
(require (submod data/red-black/positional uncontracted))
This provides the same bindings as the regular API, but with no contract
checks. Use this with extreme care: Improper use of the uncontracted form of
this library may lead to breaking the red-black invariants, or (even worse)
introducing cycles in the structure. If you don’t know whether you should be
using the uncontracted forms or not, you probably should not.
2 Augmented Red-Black Trees
Danny Yoo <dyoo@hashcollision.org>
This is an implementation of an augmented red-black tree that extends the nodes
of a basic red-black tree with attached metadata at every node. The metadata
at a node should be a function of the data of the current node and the left and
right children.
One intended usage case of this structure is to maintain an ordered sequence of
items, where each item has an internal length. Given such a sequence, we want
to support quick lookup by position and in-place insertions and deletions. We
also want to support the catenation and splitting of sequences.
For example:
; Here, the metadata represents the length of the contents |
; of the entire subtree: |
|
|
|
|
> (define a-tree (new-catenated-string-tree)) |
|
|
|
; Assuming the metadata is correct at every node, we can search |
; for a node by its "position" by using the metadata: |
|
|
; Now we can search: |
> (node-data (search a-tree 0)) |
"This" |
> (node-data (search a-tree 10)) |
"test" |
> (define at-test-node (search a-tree 10)) |
|
; We can also insert within the tree, |
> (insert-before/data! a-tree at-test-node "small") |
|
> (tree-items a-tree) |
'("This" " " "is" " " "a" " " "small" "test") |
; and split at the node holding "small". |
> (define at-small-node (search a-tree 10)) |
|
> (define-values (left-side right-side) (split! a-tree at-small-node)) |
|
> (tree-items left-side) |
'("This" " " "is" " " "a" " ") |
> (tree-items right-side) |
'("test") |
> (define joined-tree (join! left-side right-side)) |
|
> (tree-items joined-tree) |
'("This" " " "is" " " "a" " " "test") |
The interpretation of the metadata is up to clients. Another approprate
metadata may hold subtree size rather than string length, in which case
the tree acts as an container where items can be found through their index:
This augmented red-black tree implementation follows the basic outline in
[clrs2009] and incorporates a few extensions suggsted in [wein2005].
As a red-black tree, the structure ensures that the tree’s height is never
greater than 2*lg(#-of-nodes + 1), guaranteeing good worst-case behavior
for its operations.
The main types of values used in the library are trees and nodes.
A tree has a root node (tree-root), and each node has holds
arbitrary data (node-data) and metadata
(node-metadata), along with a reference to the elements smaller
(node-left) and larger (node-right). The tree holds first
and last pointers into the structure to allow for fast access to the beginning
and end of the sequence. A distinguished nil node lies at the leaves
of the tree.
2.1 API
2.1.1 Data types
Constructs a new tree. The tree’s root is initially
nil.
When provided a #:metadata-f, each node in the tree will
have an associated node-metadata that is computed through its
node-data, node-left and node-right.
The #:metadata-f must not mutate the tree as a side effect; contracts
currently do not enforce this requirement, but may in the future.
Returns #t if x is a tree.
Returns the root node of the tree
t.
If the tree is empty, returns the distinguished
nil node.
Returns the metadata-computing function for the tree t.
Returns the first node in the tree.
Returns the last node in the tree.
Returns #t if x is a node.
Returns
#t if
x is a
singleton node. A singleton node
is unattached to any tree, and is not the
nil node.
Returns #t if x is a non-nil node.
Returns #t if x is the nil node.
Returns the data associated to node n.
Assigns the data associated to node n. Note that this also may update
the metadata of the tree if the tree has been constructed with a
#:metadata-f.
Returns the width of the entire subtree at node n. This sums the
width of the left and right child subtrees, as well as its self-width.
Returns the parent of the node n.
Returns the left child of the node n.
Returns the right child of the node n.
Returns the color of the node n. The red-black tree structure uses
this value internally to maintain binary tree balance; most users will not need
to inspect this value.
Returns #t if node n is red.
Returns #t if node n is black.
2.1.2 Operations
Adds node n as the first element in tree t.
Note that attempting to add an attached, non-singleton node to a tree will
raise a contract error.
> (define a-tree (new-tree)) |
|
> (define a-node (new-node "persimmon")) |
|
> (insert-first! a-tree a-node) |
|
> (insert-first! a-tree a-node) |
insert-first!: contract violation |
expected: singleton-node? |
given: #<node> |
in: the 2nd argument of |
(-> tree? singleton-node? any) |
contract from: |
<pkgs>/data-red-black/data/red-black/augmented.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/augmented.rkt:71.11 |
Adds node n as the last element in tree t.
Note that attempting to add an attached, non-singleton node to a tree will
raise a contract error.
> (define a-tree (new-tree)) |
|
> (define a-node (new-node "orange")) |
|
> (insert-last! a-tree a-node) |
|
> (insert-last! a-tree a-node) |
insert-last!: contract violation |
expected: singleton-node? |
given: #<node> |
in: the 2nd argument of |
(-> tree? singleton-node? any) |
contract from: |
<pkgs>/data-red-black/data/red-black/augmented.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/augmented.rkt:72.11 |
Adds node
n2 before node
n1 in tree
t. This effectively
makes
n2 the
predecessor of
n1.
Note that attempting to add an attached, non-singleton node to a tree will
raise a contract error.
> (define a-tree (new-tree)) |
|
> (define a-node (new-node "peach")) |
|
> (insert-first! a-tree a-node) |
|
> (insert-before! a-tree a-node a-node) |
insert-before!: contract violation |
expected: singleton-node? |
given: #<node> |
in: the 3rd argument of |
(-> tree? non-nil-node? singleton-node? any) |
contract from: |
<pkgs>/data-red-black/data/red-black/augmented.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/augmented.rkt:73.11 |
Adds node
n2 after node
n1 in tree
t. This effectively
makes
n2 the
successor of
n1.
Note that attempting to add an attached, non-singleton node to a tree will
raise a contract error.
> (define a-tree (new-tree)) |
|
> (define a-node (new-node "grapefruit")) |
|
> (insert-first! a-tree a-node) |
|
> (insert-after! a-tree a-node a-node) |
insert-after!: contract violation |
expected: singleton-node? |
given: #<node> |
in: the 3rd argument of |
(-> tree? non-nil-node? singleton-node? any) |
contract from: |
<pkgs>/data-red-black/data/red-black/augmented.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/augmented.rkt:74.11 |
Deletes node n from the tree t. After deletion, n
will become a singleton node.
Note that n must be attached to tree t or else will raise
a contract error:
> (define t1 (new-tree)) |
|
> (insert-first/data! t1 "tricky") |
|
> (define n (new-node "tricky")) |
|
; This should raise an error: |
> (delete! t1 n) |
delete!: contract violation |
expected: predicate-contract |
given: #<node> |
in: the n argument of |
(->i |
((t tree?) (n (t) (attached-in-tree/c t))) |
(result any/c)) |
contract from: |
<pkgs>/data-red-black/data/red-black/augmented.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/augmented.rkt:80.11 |
Destructively joins trees t1 and t2, returning a tree that
has the contents of both. Every element in t1 is treated less than
the elements in t2.
Note that t1 and t2 should share the same
tree-metadata-f and neither tree should be eq? to the other.
Violations of either condition will raise a contract error.
> (define t1 (new-tree)) |
|
> (join! t1 t1) |
join!: contract violation |
expected: first-order-and/c |
given: #<tree> |
in: the t2 argument of |
(->i |
((t1 tree?) |
(t2 |
(t1) |
(and/c |
tree? |
(not-eq?/c t1) |
(share-metadata-f/c t1)))) |
(result tree?)) |
contract from: |
<pkgs>/data-red-black/data/red-black/augmented.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/augmented.rkt:83.11 |
Destructively joins tree t1, singleton node n, and tree
t2, returning a tree that has the contents of both. Every element in
t1 is treated less than x, and x is treated smaller than all
the elements in t2.
Note that t1 and t2 should share the same
tree-metadata-f and neither tree should be eq? to the other.
Violations of either condition will raise a contract error.
> (define (f1 data left right) 1) |
|
> (define (f2 data left right) 1) |
|
; f1 and f2 are distinct function values: they won't compare the same. |
> (define t1 (new-tree #:metadata-f f1)) |
|
> (define t2 (new-tree #:metadata-f f2)) |
|
> (define n (new-node "a-node")) |
|
> (concat! t1 n t2) |
concat!: contract violation |
expected: first-order-and/c |
given: #<tree> |
in: the t2 argument of |
(->i |
((t1 tree?) |
(n singleton-node?) |
(t2 |
(t1) |
(and/c |
tree? |
(not-eq?/c t1) |
(share-metadata-f/c t1)))) |
(result any/c)) |
contract from: |
<pkgs>/data-red-black/data/red-black/augmented.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/augmented.rkt:89.18 |
Destructively splits tree t into two trees, the first containing the
elements smaller than node n, and the second containing those larger.
Afterwards, n becomes a singleton node.
Note that n must be attached to tree t or else raise
a contract error.
> (define t (new-tree)) |
|
|
|
; This should raise an error: |
> (define t2 (new-tree)) |
|
> (insert-last! t2 (new-node "bob")) |
|
> (split! t (tree-root t2)) |
split!: contract violation |
expected: predicate-contract |
given: #<node> |
in: the n argument of |
(->i |
((t tree?) (n (t) (attached-in-tree/c t))) |
(values (t1 tree?) (t2 tree?))) |
contract from: |
<pkgs>/data-red-black/data/red-black/augmented.rkt |
blaming: top-level |
(assuming the contract is correct) |
at: |
<pkgs>/data-red-black/data/red-black/augmented.rkt:97.11 |
Resets the contents of the tree to the empty state.
Given a node n, returns the minimum element of the subtree rooted at
n.
Note: to get the minimum of a whole tree, it’s faster to use
tree-first.
Given a node n, returns the maximum element of the subtree rooted at
n.
Note: to get the maximum of a whole tree, it’s faster to use
tree-last.
Given a node n contained in some tree, returns the immediate
successor of n in an inorder traversal of that tree.
Given a node n contained in some tree, returns the immediate
predecessor of n in an inorder traversal of that tree.
Given a tree, returns a list of its data and width pairs.
Iterates a function f across the nodes of the tree, in inorder, preorder,
and postorder respectively.
2.2 Uncontracted library
This library uses contracts extensively to prevent the user from messing up;
however, the contract checking may be prohibitively
expensive for certain applications.
The uncontracted bindings of this library can be accessed through:
(require (submod data/red-black/augmented uncontracted))
This provides the same bindings as the regular API, but with no contract
checks. Use this with extreme care: Improper use of the uncontracted form of
this library may lead to breaking the red-black invariants, or (even worse)
introducing cycles in the structure. If you don’t know whether you should be
using the uncontracted forms or not, you probably should not.
3 Ordered sets: mutable sets with a total order
Danny Yoo <dyoo@hashcollision.org>
This module provides a mutable, set-like container of totally-ordered elements.
As a quick example:
For convenience, these ordered sets use the notion of the total-order defined
by the datum-order function in data/order. The
ordered-set constructor can take an optional #:order
comparision function to customize how its elements compare.
But be careful about defining your own ordering function. The following
example shows where it might go wrong:
; order-strings-by-length: string string -> (or/c '< '= '>) |
|
|
> (define a-set (ordered-set #:order order-strings-by-length)) |
|
|
|
; Note that we know that "of" will be missing from the list! |
; That's because the comparison function makes "of" and "we" |
; look the same: |
> (ordered-set->list a-set) |
'("we" "few" "band" "happy" "brothers") |
The ordered set trusts the order provided by #:order for all
comparisons, including equality. In the example above, "of" and
"we" compare the same, and ordered-set-add! ignores
operations that insert a value that already exists in the set.
On the implementation side: an ordered set hold onto its elements with a
red-black tree, so that most operations work in time logarithmic to the set’s
ordered-set-count.
3.1 API
(ordered-set [#:order order] initial-elt ...) → ordered-set/c
|
order : (any/c any/c . -> . (or/c '< '= '>)) = datum-order |
initial-elt : any/c |
Constructs a new ordered set.
By default, this uses datum-order
to compare its elements; this default can be overridden by providing
an #:order that can compare two elements:
; Overriding #:order for descending sort: |
> (define (cmp x y) | (cond [(< x y) '>] | [(= x y) '=] | [(> x y) '<])) |
|
|
|
|
> (for/list ([x yet-another-set]) x) |
'(9 5 4 3 1) |
Returns true if x is an ordered set.
A flat contract for ordered sets.
Returns the total-ordering function used by ordered set a-set.
Returns true if the ordered set a-set is empty.
Returns the number of elements in the ordered set a-set.
Returns true if x is an element in ordered set a-set.
Adds x into ordered set a-set. If x is already
an element, this has no effect.
Be aware that this operation depends on the correctness of the total-ordering
function of a-set. If the order doesn’t distinguish between
unequal elements, unexpected results may follow:
Removes x from ordered set a-set. If x is not
an element of a-set, this has no effect.
Just as with ordered-set-add!, ordered-set-remove!’s
behavior depends on the correctness of the set’s total ordering function.
Returns the elements of ordered set a-set as a list, where
the elements are sorted according to a-set’s total-order.
Explicitly constructs a sequence of the elements in ordered set a-set.
Note that ordered sets are already treated implicitly as sequences.
4 Bibliography
Bibliography