The new data type is a red-black tree. This is an efficient form of lookup table. It can do everything that a TCL array can do. It adds the ability to iterate through an ordered list quickly. It can find the key closest to a given key. Most operations are O(log(n)). It can provide an alternative to lsort when the data is constantly changing. The red-black tree works correctly with keys that contain embedded nulls; built-in TCL arrays do not.
This data type is implemented as a new TCL data type. It may be stored in a variable, passed to or from a procedure, converted to a string, etc.
New procedures are added to access this data type. They can give the appearance of a map (i.e. a function, array, or keyed list), a multimap (i.e. a relation), a set, or a multiset (i.e. a bag).
Currently I deliver this extension only as source, and I have tested it only under Linux. More platforms are expected soon, as is a binary distribution for MS Windows.
Most commands take a variable name, where the variable contains the tree, rather than taking the tree directly as an argument. This was done for efficiency. If exactly one variable contains the tree, the tree will be modified in place. If a command took a tree as an argument and returned a tree as its result, but the tree was stored in a variable, a copy of the tree would be made unnecessarily. This is analogous to saying incr x rather than set x [expr {$x + 1}], except that the penalty for copying a tree is worse than the penalty for copying an integer. It is perfectly acceptable to copy trees into variables, procedure arguments, procedure return values, list elements, array elements, etc.
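The two calling styles contrasted above can be sketched with core TCL commands only. This is an illustration of the general idiom, not the extension's code: push_by_name and push_by_value are hypothetical procedures, and a plain list stands in for the tree.

```tcl
# Style 1: the command takes a variable NAME and updates it in place,
# the way incr and lappend do (and, per the text, the tree commands).
proc push_by_name {var_name item} {
    upvar 1 $var_name items      ;# alias the caller's variable
    lappend items $item          ;# modifies the caller's value directly
    return
}

# Style 2: the command takes a VALUE and returns a new value.
# The caller has to store the result back into its variable.
proc push_by_value {items item} {
    lappend items $item          ;# the value is shared with the caller, so it is copied first
    return $items
}

set a {1 2}
push_by_name a 3

set b {1 2}
set b [push_by_value $b 3]

puts $a    ;# 1 2 3
puts $b    ;# 1 2 3
```

Both styles give the same result. The difference is that the second one works on a value that is still referenced by the caller's variable, so copy-on-write has to duplicate it before the change.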
Except as noted below, all actions require O(log (n)) time where n is the number of elements in the tree.
Implicit commands:
Copy:
Copying a tree takes O(n log(n)). TCL's normal copy-on-write semantics apply.
Convert to string:
Converting a tree to a string takes O(n log(n)) time. The format will definitely change in the near future. Currently the format is set up to match the arguments of array set and array get. Eventually there will be separate commands for efficiently transferring between arrays and trees.
Convert from string:
Converting a string to a tree takes O(n log(n)) time. The format will change in the future, but will always be the same as the format for convert to string.
General commands:
The following commands apply to all trees, no matter how they are used.
tree::create
The create command creates and returns a new tree. At the moment that isn't very interesting. This is a placeholder and will take more options in the future. In particular, the create command will need to know the sort order of the keys.
tcl>set my_tree [tree::create]
tree::previous var key ?error_value?
The previous command tries to find the key that comes immediately before the given key. The given key does not have to be part of the tree. If the given key is the first key in the tree, or comes before all keys in the tree, error_value is returned. If error_value is needed but not supplied, the operation will fail.
tcl>tree::set insert my_tree E
tcl>tree::set insert my_tree G
tcl>tree::set insert my_tree B
tcl>tree::set insert my_tree D
tcl>tree::set insert my_tree F
tcl>foreach key {A B C D E F G H I} {puts "$key -> [tree::previous my_tree $key {NONE}]"}
A -> NONE
B -> NONE
C -> B
D -> B
E -> D
F -> E
G -> F
H -> G
I -> G
tree::next var key ?error_value?
The next command is the same as the previous command, except that it goes in the other direction.
tcl>foreach key {A B C D E F G H I} {puts "$key -> [tree::next my_tree $key {NONE}]"}
A -> B
B -> D
C -> D
D -> E
E -> F
F -> G
G -> NONE
H -> NONE
I -> NONE
tree::closest var position key ?error_value?
The closest command will look for a given key in a tree. If the key exists, it will be returned. Otherwise this command is the same as the previous or next command. position must be "before" or "after" to specify what to do if the original key does not exist.
tcl>foreach key {A B C D E F G H I} {puts "$key -> ([tree::closest my_tree before $key NONE], [tree::closest my_tree after $key NONE])"}
A -> (NONE, B)
B -> (B, B)
C -> (B, D)
D -> (D, D)
E -> (E, E)
F -> (F, F)
G -> (G, G)
H -> (G, NONE)
I -> (G, NONE)
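For comparison, the same lookups can be modelled on a plain sorted list with core TCL. This is only a sketch of the semantics, not how the extension works: it assumes Tcl 8.6 or later (for lsearch -bisect), assumes keys compare as ASCII strings, and the model_* procedures are hypothetical names. Unlike the tree commands, the model defaults error_value to NONE, and the list has to be sorted again whenever the keys change.

```tcl
# lsearch -bisect returns the index of the last element <= key, or -1 if none.
proc model_closest_before {sorted key {error_value NONE}} {
    set i [lsearch -bisect -ascii $sorted $key]
    if {$i < 0} { return $error_value }
    return [lindex $sorted $i]
}

proc model_previous {sorted key {error_value NONE}} {
    set i [lsearch -bisect -ascii $sorted $key]
    # Step back one place when the key itself is present.
    if {$i >= 0 && [lindex $sorted $i] eq $key} { incr i -1 }
    if {$i < 0} { return $error_value }
    return [lindex $sorted $i]
}

proc model_next {sorted key {error_value NONE}} {
    # The element after the last one <= key is the first one > key.
    set i [expr {[lsearch -bisect -ascii $sorted $key] + 1}]
    if {$i >= [llength $sorted]} { return $error_value }
    return [lindex $sorted $i]
}

set keys [lsort {E G B D F}]    ;# B D E F G
foreach key {A C E I} {
    puts "$key -> [model_previous $keys $key] [model_closest_before $keys $key] [model_next $keys $key]"
}
# A -> NONE NONE B
# C -> B B D
# E -> D E F
# I -> G G NONE
```

The three columns agree with the previous, closest before, and next results shown in the examples for the same keys.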
View commands:
There are four different ways to view a tree. There is only one underlying tree data type, so you may mix and match these views on the same tree. But be careful; not all cases are defined for all views. One view may put a tree into an unexpected state which may confuse another view. That may cause an immediate error, or that may cause results which vary from one version to the next.
tree::map
A map looks just like the underlying tree structure, and a lot like a TCL array. A map can be used to simulate / inspect all the other views, but it may be slower in some cases.
tree::map exists var key
This function returns the number of instances of key in the map. Like an array, each key can only exist 0 or 1 times in a map.
tcl>set my_map [tree::create]
tcl>tree::map insert my_map 1 one
tcl>tree::map insert my_map 2 two
tcl>tree::map insert my_map 2 II
tcl>tree::map exists my_map 1
1
tcl>tree::map exists my_map 2
1
tcl>tree::map exists my_map 3
0
tree::map insert var key data
This procedure inserts a mapping between the given key and the given data into the tree. If there is already a mapping from the given key, that mapping is deleted before the new mapping is added.
tree::map delete var key
This procedure deletes the mapping from the given key, if one exists. It returns the number of mappings deleted. For a map, this will always be 0 or 1.
tcl>tree::map delete my_map 2
1
tcl>tree::map delete my_map 3
0
tcl>tree::map exists my_map 1
1
tcl>tree::map exists my_map 2
0
tcl>tree::map exists my_map 3
0
tree::map foreach var key_var data_var ?from to? body
This procedure iterates over the elements of the mapping in order. For each pair, it assigns the key to key_var and the data to data_var, then executes the body. Either key_var or data_var may be the empty string, in which case that variable is ignored.
tcl>set new_map [tree::create]
tcl>tree::map insert new_map 1 one
tcl>tree::map insert new_map 2 two
tcl>tree::map insert new_map 3 three
tcl>tree::map insert new_map 5 five
tcl>tree::map insert new_map 6 six
tcl>tree::map foreach new_map key data {puts "$key -> $data"}
1 -> one
2 -> two
3 -> three
5 -> five
6 -> six
If from and to are specified, the loop will efficiently skip all keys before from and after to. From and to do not have to be in the map. If to comes before from, the body will not be executed at all. If from is the same as to, and that element is a key, then the body will be executed exactly once.
tcl>tree::map foreach new_map key data 4 6 {puts "$key -> $data"}
5 -> five
6 -> six
break, continue, error, and return act just like they would with the built-in foreach command.
This command copies the tree before executing the body for the first time. The body is free to modify the tree variable as it chooses. That will not affect the number of times that the loop is executed or the values of key_var or data_var.
tcl>tree::map foreach new_map key data {puts "$key -> $data";tree::map insert new_map $data $key}
1 -> one
2 -> two
3 -> three
5 -> five
6 -> six
tcl>tree::map foreach new_map key data {puts "$key -> $data";tree::map insert new_map $data $key}
1 -> one
2 -> two
3 -> three
5 -> five
6 -> six
five -> 5
one -> 1
six -> 6
three -> 3
two -> 2
foreach takes O(m log(n)) time where m is the number of iterations, and n is the total number of elements in the map. If from and to are selected such that nothing is executed, there is still a cost of O(log(n)) time. Note the performance penalty for modifying the tree while iterating over it.
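The snapshot behaviour described for tree::map foreach resembles how TCL's built-in foreach treats a list value. As a rough analogy using only core TCL (no tree commands), the loop below iterates over the value the list had when the loop started, even though the body appends to the same variable:

```tcl
set items {1 2 3}
set seen {}
foreach x $items {
    lappend items [expr {$x * 10}]   ;# changes the variable, not the value being iterated
    lappend seen $x
}
puts $seen     ;# 1 2 3
puts $items    ;# 1 2 3 10 20 30
```

The loop runs three times, once per element of the original value, which mirrors the first tree::map foreach transcript, where five pairs are printed although the body inserts five more.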
tree::map keys var
This returns a list of all the keys in the map, in order.
tcl>tree::map keys new_map
1 2 3 5 6 five one six three two
tree::map data var
This returns a list of all the data associated with the keys in the map. The data items are ordered by their keys.
tcl>tree::map data new_map
one two three five six 5 1 6 3 2
tree::map value var key ?error_value?
Given a key, this looks up and returns the corresponding data. If the key is not in the map, error_value is returned. If error_value is required, but not specified, an error is signaled.
tcl>tree::map value new_map one {I don't know!}
1
tcl>tree::map value new_map five {I don't know!}
5
tcl>tree::map value new_map {five and a half} {I don't know!}
I don't know!
tree::set
A set is a simplified version of a map. Use a set when the data doesn't matter, only the keys. A set is implemented as a map, where each key maps to itself.
tree::set exists var key
This returns the number of times that the given key exists in the given set, which will always be 0 or 1.
tree::set insert var key
This inserts the given key into the given set.
tree::set delete var key
This deletes the given key from the given set, if it exists. It returns the number of elements deleted, which will always be 0 or 1.
tree::set foreach var key_var ?from to? body
This works just like tree::map foreach, except that the data_var is omitted because it is meaningless for a set.
tree::set keys
This returns a list of all the keys in the set, in order.
tree::multiset
A multiset is similar to a set, except that each key can appear more than once. It is implemented by a map which maps members of the multiset to the number of times they occur in the multiset. The multiset type was introduced to this extension for performance reasons. It is analogous to adding incr to TCL. It is slightly better, because all unknown keys are assumed to be zero, whereas using incr on an uninitialized variable is an error in TCL versions before 8.5.
tree::multiset exists var key
This returns the number of times that the given key exists in the given multiset, which will always be a non-negative integer.
tree::multiset insert var key
This inserts the given key into the given multiset.
tree::multiset delete var key
This deletes one copy of the given key from the given multiset, if it exists. It returns the number of elements deleted, which will always be 0 or 1.
tree::multiset foreach var key_var ?from to? body
This works just like tree::set foreach, except that a key is visited once for each copy of it in the multiset.
tcl>set my_bag [tree::create]
tcl>tree::multiset insert my_bag 1
tcl>tree::multiset insert my_bag 2
tcl>tree::multiset insert my_bag 2
tcl>tree::multiset insert my_bag 3
tcl>tree::multiset insert my_bag 3
tcl>tree::multiset insert my_bag 3
tcl>tree::multiset foreach my_bag item {puts $item}
1
2
2
3
3
3
tree::multiset keys var
This returns a list of all the keys in the multiset, in order.
tcl>tree::multiset keys my_bag
1 2 2 3 3 3
tree::multimap
A multimap is a set of mappings where keys may be repeated. Just as a multiset is really a mapping from keys to integers, where the default value is 0 and insert does an incr, multimap is really a mapping from keys to lists, where the default value is the empty list, and insert does an lappend. Again, this could be done directly with a map, but this is more efficient.
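The two analogies in this paragraph can be sketched with TCL's built-in dict command (Tcl 8.5 or later). This is a model of the idea only, not the extension's implementation: the variable names and the model_count helper are hypothetical, and a dict keeps keys in insertion order rather than in sorted order as the tree does.

```tcl
# Multiset modelled as key -> count; dict incr treats a missing key as 0.
set bag [dict create]
foreach item {1 2 2 3 3 3} {
    dict incr bag $item
}

# Multimap modelled as key -> list of data; dict lappend treats a missing key as {}.
set mm [dict create]
dict lappend mm colour red
dict lappend mm colour blue
dict lappend mm shape round

# Counterpart of "exists": the number of copies, 0 for unknown keys.
proc model_count {bag key} {
    expr {[dict exists $bag $key] ? [dict get $bag $key] : 0}
}

puts $bag                    ;# 1 1 2 2 3 3
puts $mm                     ;# colour {red blue} shape round
puts [model_count $bag 3]    ;# 3
puts [model_count $bag 9]    ;# 0
```

The printed bag is a flat list of key and count pairs, so key 3 with count 3 appears as the last pair.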
tree::multimap exists var key
This returns the number of mappings from the given key in the given multimap. This will be a non-negative integer.
tree::multimap insert var key data
This inserts a mapping from the given key to the given data in the given multimap. This will not remove any mappings.
tree::multimap delete var key
This will try to delete one mapping from the given key in the given multimap. This will return the number of deletions made, which will be 0 if the key does not exist, or 1 if it does. It is not specified which mapping will be deleted if more than one exists.
tree::multimap foreach var key_var ?from to? body
This works just like tree::map foreach.
tree::multimap keys var
This returns a list of all keys in the given mapping. They are sorted. Each key appears once for every mapping.
tree::multimap data var
This returns a list of the data in each mapping. The mappings are processed in the same order as in tree::multimap keys.
tree::multimap value var key ?error_value?
This works like tree::map value. If there is more than one mapping from the given key, one is chosen arbitrarily. Multiple identical calls to this function will return the same value. Furthermore, a call to delete will delete the value which this function returned. Once the tree has been modified, there is no guarantee of what value will be reported next. Also, the selection of values may be implementation dependent and may change in the next version.
Version | Comments |
1.0 | Initial release |
If you like and/or use this extension, please let me know.