Object Icon inherits builtin lists, tables and sets from Icon. These are suitable for almost all purposes, but occasionally something different is needed. The datastruct
package provides some object-oriented abstractions which can help.
The three most important classes are List, Table and Set. Each is an abstract class with methods which reflect the operations of its builtin counterpart.
Each data structure instance has zero or more of several capabilities, reflected by an integer flag value returned by a get_mode()
method. This is the same technique used by io.Stream
to report whether a Stream is readable, writable, etc. But in this case the capabilities reflect how (if at all) the particular data structure may be altered. Sets and tables (but not lists) additionally have an ORDER
capability, meaning entries are stored in order, and thus will be generated ready-sorted during iteration.
Each of the three types of builtin datastructure has two corresponding datastruct
wrapper classes. One is a general purpose wrapper providing full access to the underlying data via the particular class’s methods. The other is rather more useful, and provides just a read-only view of the underlying data. For lists, the classes are datastruct.BuiltinList
and datastruct.UnmodifiableBuiltinList
respectively, with the other classes being similarly named.
To see how the read-only wrapper could be useful, consider a class which has a private builtin list member variable, and wishes to expose its content to callers. Exposing the private variable would of course allow the list’s content to be changed by the caller. Providing a copy of the list might be expensive. Returning a read-only wrapper avoids these problems. For example :-
class Something()
private l # A list.
...
public get_data()
return UnmodifiableBuiltinList(l)
end
...
end
# A user of the above
x := Something()
...
# Examine the elements
l := x.get_data()
every e := l.gen() do {
...
}
l.put("Junk") # Runtime error
One might legitimately object to the overhead of creating many instances of UnmodifiableBuiltinList
. At the cost of an extra instance variable this could be avoided :-
class Something()
private const
l, # A list.
ul # Its wrapper
...
public get_data()
return ul
end
public new()
ul := UnmodifiableBuiltinList(l := [])
...
return
end
end
Note that the use of const
ensures that ul
and l
cannot get out of synch after the instance is instantiated.
datastruct.LinkedList
, implements a Lisp-style singly linked list, so that distinct lists can share tail portions of their data :-
l1 := LinkedList(5, 4, 57, 12, 14)
# The tail list from element 4
l2 := l1.from(4)
# Now l2 is LinkedList(12,14)
# Add an element on the front and back
l2.push(1000)
l2.put(99)
# Now l2 is LinkedList(1000,12,14,99)
# and l1 is LinkedList(5,4,57,12,14,99)
More useful is datastruct.DoublyLinkedList
, with several operations possible in constant time, which would take linear time using other types of list. For example, given two lists of arbitrary length, the following splices all of the nodes of the second list onto the end of the first, in constant time, leaving the second list empty :-
l1 := DoublyLinkedList( ... )
l2 := DoublyLinkedList( ... )
l1.splice(0, l2)
The opposite of splicing is “extracting”, which removes part of a list, returning it as a new list.
l1 := DoublyLinkedList( ... )
n1 := l1.node(100); n2 := l1.node(-100)
l2 := DoublyLinkedList.unsafe_extract_nodes(n1, n2)
The last operation is again constant time. Note that the caller has to ensure that the nodes form a valid sub-list, with n2
coming after n1
, otherwise both source and output lists will be nonsensical; hence the name of the method. Note also that this is a static method (rather than an instance method of the list); list nodes can in fact be manipulated independently of their enclosing list structure.
One final point to mention about DoublyLinkedList
is that, like LinkedList
, getting the size is an “O(n)” operation.
datastruct.SubList
is a view of part of another list; it stores no data itself :-
l1 := BuiltinList( [5,4,57,12,14,99,200] )
# A view of l1[4:7]
l2 := SubList(l1, 4, 7)
# Change l1
l1.push(1)
l1.at(5) := 101
# l1 is now BuiltinList(1,5,4,57,101,14,99,200)
# l2 is now SubList(57,101,14)
# Now modify l2 to change l1
l2.at(1) := 300
# l1 is now BuiltinList(1,5,4,300,101,14,99,200)
# l2 is now SubList(300,101,14)
datastruct.SortTable
is a Table
which implements the ORDER
capability described above, using a red-black binary tree :-
t := SortTable()
t.insert("Zebra", 120)
t.insert("Human", 10)
t.insert("Aardvark", 34)
# t is now SortTable("Aardvark"->34;"Human"->10;"Zebra"->120)
A SortTable
can also quickly obtain the “rank” of a particular key, which means its position in the sorted key order. For example
t.rank("Human")
gives 2
, since “Human” is the second key in order. Conversely, the select
method gives the key for a particular rank :-
t.select(2)
gives “Human”. select
uses standard Icon indexing, so t.select(-1)
will give the highest ordered key (“Zebra”).
SortTable
has several methods which return references to the nodes in the red-black tree. Each node contains a (read-only) key and a value, which may be freely edited at any time to update the table entry’s value.
A node may be obtained when a new entry is added to the table (datastruct.SortTable.new_node()
), by key datastruct.SortTable.find_node()
, or by rank datastruct.SortTable.node()
.
The node may be removed from the tree with datastruct.SortTableNode.unlink()
.
SortTable
and SortTableNode
provide methods to iterate forward or backward over keys, values and nodes. These iterators are robust against any changes to the underlying tree structure may change, with one exception: the last entry returned must not be deleted before the next entry is generated. For example, this is not allowed :-
t := SortTable(,, "Cat", 12, "Mouse", 2, "Gerbil", 1)
# Generate the nodes (in ascending sort order)
e := create t.nodes()
n := @e
# n is now the SortTableNode(key="Cat", val=12)
n.unlink()
# t now has two remaining elements
n := @e # Runtime error: Node has been unlinked
Since it is often useful to iterate and remove elements in the same loop, the method datastruct.SortTableNode.move_unlink()
may be used instead, together with a while
loop. For example, the following procedure will iterate over a table and delete even-numbered keys :-
procedure del(t)
local n
# Get the first node by rank
n := t.node(1)
while
n := if n.key % 2 = 0 then
n.move_unlink(1)
else
n.get_next()
end
datastruct.EqTable
uses the lang.equals
procedure (and its cousin, lang.hash
) to determine table membership rather than the usual identity operator (===
), which is quite often useful. For example :-
# A table with two entries.
t := EqTable(,
[1], 1,
[2], 2)
# Modify the second entry's value.
t.insert([2], 200)
# Now we still have two entries ( [1]->1 and [2]->200 )
# (with a regular table we would have 3).
datastruct.WeakrefTable
wraps its keys in weak references. Thus on a garbage collection, the keys may “disappear”, and the size of the table reduce :-
# Set up some keys.
k1 := []; k2 := []; k3 := []; k4 := []
# Create a table with four entries.
t := WeakrefTable(, k1, 1,
k2, 2,
k3, 3,
k4, 4)
# Make two of the keys only referenced by the table.
k2 := k4 := &null
# Do a garbage collection
collect()
# Now t is reduced
t.size() # 2 (k1->1 and k3->3)
The functionality of any Table
can easily be extended into Set
form using datastruct.TableSet
, which provides a set view of the keys. datastruct.EqSet
is a particularly convenient subclass of TableSet
which wraps the EqTable
described above :-
# A three-element set
s := EqSet( [1], [2], [3] )
# Insert an element
s.insert([2])
# The size is still 3 (not 4, as a regular set would have).
Contents