Immutable Hash Array Mapped Tries
(require data/hamt) | package: hamt |
This package defines immutable hash array mapped tries (or HAMTs, for short). A HAMT is a dictionary, and its interface mimics that of an immutable hash table.
Hash array mapped tries are described in [Bagwell2000].
Caveat concerning mutable keys: If a key in an equal?-based HAMT is mutated (e.g., a key string is modified with string-set!), then the HAMT’s behavior for insertion, lookup, and remove operations becomes unpredictable.
procedure
(hamt-equal? hamt) → boolean?
hamt : hamt?
procedure
hamt : hamt?
procedure
hamt : hamt?
procedure
(hamt key val ... ...) → (and/c hamt? hamt-equal?)
key : any/c val : any/c
procedure
key : any/c val : any/c
procedure
key : any/c val : any/c
The hamt procedure creates a HAMT where keys are compared with equal?, hamteqv creates a HAMT where keys are compared with eqv?, and hamteq creates a HAMT where keys are compared with eq?.
The key to val mappings are added to the table in the order they appear in the argument list, so later mappings can hide earlier ones if the keys are equal.
procedure
(make-hamt [assocs]) → (and/c hamt? hamt-equal?)
assocs : (listof pair?) = null
procedure
(make-hamteqv [assocs]) → (and/c hamt? hamt-eqv?)
assocs : (listof pair?) = null
procedure
(make-hamteq [assocs]) → (and/c hamt? hamt-eq?)
assocs : (listof pair?) = null
make-hamt creates a HAMT where the keys are compared with equal?, make-hamteqv creates a HAMT where the keys are compared with eqv?, and make-hamteq creates a HAMT where the keys are compared with eq?.
See also the caveat concerning mutable keys above.
procedure
hamt : hamt? key : any/c
failure-result :
(λ () (raise (exn:fail:contract ....)))
If failure-result is a procedure, it is called (through a tail call) with no arguments to produce the result.
Otherwise, failure-result is returned as the result.
See also the caveat concerning mutable keys above.
procedure
(hamt-has-key? hamt key) → boolean?
hamt : hamt? key : any/c
procedure
(hamt-remove hamt key) → hamt?
hamt : hamt? key : any/c
See also the caveat concerning mutable keys above.
procedure
(hamt-count hamt) → exact-nonnegative-integer?
hamt : hamt?
procedure
(hamt-empty? hamt) → boolean?
hamt : hamt?
procedure
(hamt-values hamt) → (listof any/c)
hamt : hamt?
1 Performance
(require data/hamt/fast) | package: hamt |
This package provides exactly the same interface as data/hamt, but the procedures that it exports are not wrapped in contracts. Therefore, passing unexpected kinds of data to these procedures will likely result in error messages that aren’t especially helpful. On the other hand, they will run much faster than than their counterparts with contracts.
Because data/hamt provides essentially the same functionality as Racket’s built-in hash data type, there would be no point in using the former unless it provided some advantage over the latter. With contracts on, a hamt is usually slower than a hash, but with contracts off, it is usually faster. (You can validate this claim using the perf.rkt script included in the test directory of this package.) Therefore, I recommend using data/hamt/fast for production use.
A hamt is a tree with a branching factor of 16, so, while Racket’s built-in hash data type provides O(log2 N) access and update, a hamt provides the same operations at O(log16 N). That said, hash has lower constant-time overhead, and it’s implemented in C. My tests indicate that hash tends to have slightly better access performance, and hamt tends to be slightly faster at insertion and removal. (Rather perplexingly, hash seems to perform best on all operations when given sequential fixnums as keys.) You should do your own performance testing before concluding what kind of immutable dictionary to use in your program.
Bibliography
[Bagwell2000] | Phil Bagwell, “Ideal Hash Trees,” (Report). Infoscience Department, École Polytechnique Fédérale de Lausanne, 2000. http://lampwww.epfl.ch/papers/idealhashtrees.pdf |