Here is a seemingly simple programming problem that actually throws up several interesting issues: write a library procedure that recursively generates the entries in a given directory.
This is a simple first attempt. The enclosing main program takes two optional parameters; namely the directory to list, and the maximum number of results to print.
import io
procedure dir_recurse(f)
local p, f2, l, s
write("Opening ", f)
s := DirStream(f) | {
write("Failed to open ", f, ": ", &why)
fail
}
suspend f
p := FilePath(f)
repeat {
l := s.read_line() | break
if /l then
break
if Files.is_relative_dir(l) then
next
f2 := p.child(l).str()
if Files.is_directory(f2) then
suspend dir_recurse(f2)
else
suspend f2
}
write("Closing ", f)
s.close()
end
procedure main(a)
local f, d, n
d := a[1] | "/tmp"
n := integer(a[2]) | 100
every f := dir_recurse(d) \ n do
write(" Got ", f)
write("Exiting")
end
This procedure has three problems. Firstly, unless the generator is run to exhaustion, not all of the io.DirStream
objects used to read the directories will be closed. This is because the last line of dir_recurse
, which closes the DirStream
, s
, won’t be reached if dir_recurse
suspends, but isn’t resumed.
To illustrate this problem, try creating a test directory with some entries as follows :-
mkdir -p /tmp/here/is/a/deep/directory
touch /tmp/here/entry1 /tmp/here/is/a/entry2 /tmp/here/is/a/entry3 /tmp/here/is/a/deep/entry4
chmod -r /tmp/here/is/a/deep/directory
Note that the last command makes that directory unreadable. If we run the command
oit -s ./dir1.icn -x /tmp/here 5
we get the output :-
Opening /tmp/here
Got /tmp/here
Got /tmp/here/entry1
Opening /tmp/here/is
Got /tmp/here/is
Opening /tmp/here/is/a
Got /tmp/here/is/a
Got /tmp/here/is/a/entry3
Exiting
Note how none of the three opened DirStream
s has a corresponding close.
The second problem is that dir_recurse
doesn’t handle errors well. If it can’t open a directory, it just prints a message, and if it encounters an error reading a directory (indicated by s.read_line()
failing), it exits its loop. In either case, there is no way for the caller of dir_recurse
to tell that an error occurred.
To illustrate this, try running :-
oit -s ./dir1.icn -x /tmp/here
Here we see that the error opening the unreadable directory goes unnoticed by the higher level invocations, which just carry on regardless :-
Opening /tmp/here
Got /tmp/here
Got /tmp/here/entry1
Opening /tmp/here/is
Got /tmp/here/is
Opening /tmp/here/is/a
Got /tmp/here/is/a
Got /tmp/here/is/a/entry3
Opening /tmp/here/is/a/deep
Got /tmp/here/is/a/deep
Opening /tmp/here/is/a/deep/directory
Failed to open /tmp/here/is/a/deep/directory: Permission denied (errno=13)
Got /tmp/here/is/a/deep/entry4
Closing /tmp/here/is/a/deep
Got /tmp/here/is/a/entry2
Closing /tmp/here/is/a
Closing /tmp/here/is
Closing /tmp/here
Exiting
The third problem is that the procedure is potentially inefficient if it is handling a very deep directory structure. This is because every result from the deepest directories has to “bubble up” through all the higher level calls. Then, after it is received by the top-level caller, the whole chain of calls has to be resumed to get the next result. This is illustrated in the following diagram, showing what happens when a directory under “/usr/lib” is being iterated over. Every result generated involves a complete circuit of the loop shown by the green and orange arrows.
The second version tries to address the last of the above problems, by using coroutines, instead of nested suspension.
import io, ipl.pdco
procedure dir_recurse1(f)
local s, p, f2, l
write("Opening ", f)
s := DirStream(f) | {
write("Failed to open ", f, ": ", &why)
fail
}
coact(f)
p := FilePath(f)
repeat {
l := s.read_line() | fail
if /l then
break
if Files.is_relative_dir(l) then
next
f2 := p.child(l).str()
if Files.is_directory(f2) then
dir_recurse1(f2)
else
coact(f2)
}
write("Closing ", f)
s.close()
end
procedure dir_recurse(f)
suspend Seq{ dir_recurse1(f) }
end
procedure main(a)
local f, d, n
d := a[1] | "/tmp"
n := integer(a[2]) | 100
every f := dir_recurse(d) \ n do
write(" Got ", f)
write("Exiting")
end
Now dir_recurse()
starts off a new coroutine, which recursively descends the directory tree in dir_recurse1()
. As each result is encountered, it is sent back to dir_recurse()
via a co-expression activation. dir_recurse()
then suspends it to the main procedure. It makes use of the ipl.pdco.Seq
library procedure, which is a very simple one-liner which turns a co-expression’s activation results back into a results sequence. The result is that the main procedure sees the same results sequence as in the first version.
Here is a diagram showing what is happening. The results now flow along the blue lines, bypassing the lengthy green and orange loop.
Having said all this, a directory would have to be very deep indeed, and have a large number of entries, before the relatively small cost of procedure resumption becomes noticeable against the backdrop of the expensive I/O calls. In fact, for shallow directories, the above technique will probably be slightly slower than the first version (although in this case, the cost difference will almost certainly be irrelevant). For more on the performance advantage of the technique, please see this page.
The next version addresses the problem of ensuring that the DirStream
instances are properly closed, even if the results aren’t generated until exhaustion.
import io, util
record Msg(file, revert)
procedure dir_recurse2(s, f, revert)
local p, f2, l
coact(Msg(f, ¤t), revert)
p := FilePath(f)
repeat {
l := s.read_line() | fail
if /l then
return
if Files.is_relative_dir(l) then
next
f2 := p.child(l).str()
if Files.is_directory(f2) then
dir_recurse1(f2, revert)
else
coact(Msg(f2, ¤t), revert)
}
end
procedure dir_recurse1(f, revert)
local s
use {
{
write("Opening ", f)
s := DirStream(f)
},
dir_recurse2(s, f, revert),
{
write("Closing ", f)
s.close()
}
} | write("Problem accessing ", f, ": ", &why)
end
procedure dir_recurse(f)
local v, here, e
here := ¤t
e := create dir_recurse1(f, here)
v := @e | fail
repeat {
suspend v.file
v := coact(, v.revert) | fail
}
end
procedure main(a)
local f, d, n
d := a[1] | "/tmp"
n := integer(a[2]) | 100
clean {
every f := dir_recurse(d) \ n do
write(" Got ", f)
}
write("Exiting")
end
It achieves this by using the util.use
library procedure. This procedure takes three co-expressions: a setup expression which opens a particular resource (such as a file); an action expression which uses that resource, and finally a cleanup expression which closes the resource. Normally use
invokes the cleanup expression itself, after the action expression completes. But if the action expression never does this, as may happen in this case if we don’t generate all the directory entries to exhaustion, then we can use another procedure, util.clean
. This works in conjunction with use
, by managing a global list of cleanup expressions. use
adds each cleanup expression to the list, and removes them if and when they are run. Finally, clean
will run any cleanup expressions that use
hasn’t.
The call to use
is in dir_recurse1
which then calls dir_recurse2
to make use of the DirStream
resource. The procedure dir_recurse
is still the top-level interface, suspending the sequence of file names to the client program. You will see that it is rather more involved than in the previous version. This is because unfortunately use
also happens to introduce a complication. Before, we only had two co-expressions, namely the main co-expression and the one descending the directory tree. This means we didn’t have to keep track of which co-expression each one should activate or revert to; the descending co-routine just reverted to its activator when it had a result (this is what coact
doees by default). But now each time we call use
we are creating three co-expressions. One of those is activated and calls dir_recurse2
, so that procedure can no longer simply revert to its activator to send a result. So instead we have to pass a parameter (revert
) down to dir_recurse2
, giving it the co-expression we wish to activate when we have a result. Similarly, with each result we have to include the co-expression we wish the top level to resume when we want another result. So rather than just giving a file as a result, we give a record instance, Msg
, containing a file and the co-expression to revert to for the next result.
Try running this version on the test directory mentioned before, producing just the first few results :-
oit -s ./dir3.icn -x /tmp/here 5
which produces :-
Opening /tmp/here
Got /tmp/here
Got /tmp/here/entry1
Opening /tmp/here/is
Got /tmp/here/is
Opening /tmp/here/is/a
Got /tmp/here/is/a
Got /tmp/here/is/a/entry3
Closing /tmp/here/is/a
Closing /tmp/here/is
Closing /tmp/here
Exiting
In contrast to the first version, all the DirStream
s are closed properly.
This just leaves the problem of error notification to be addressed. The problem is that, since dir_recurse
generates a sequence of results, it can’t use failure to indicate an error. One possibility would be to use some kind of flag to indicate an error. But a better way is to use the library procedure exception.throw
to raise an exception. This has the added advantage that it is easy to abandon a deeply nested series of calls, which is what we have in this case.
Here is the final version of the program.
import io, util, exception
record Msg(file, revert)
procedure dir_recurse2(s, f, revert)
local p, f2, l
coact(Msg(f, ¤t), revert)
p := FilePath(f)
repeat {
l := s.read_line() | fail
if /l then
return
if Files.is_relative_dir(l) then
next
f2 := p.child(l).str()
if Files.is_directory(f2) then
dir_recurse1(f2, revert)
else
coact(Msg(f2, ¤t), revert)
}
end
procedure dir_recurse1(f, revert)
local s
use {
{
write("Opening ", f)
s := DirStream(f)
},
dir_recurse2(s, f, revert),
{
write("Closing ", f)
s.close()
}
} | throw("Problem accessing " || f || ": " || &why)
end
procedure dir_recurse(f)
local v, here, e
here := ¤t
e := create dir_recurse1(f, here)
v := @e | fail
repeat {
suspend v.file
v := coact(, v.revert) | fail
}
end
procedure main(a)
local f, d, n
d := a[1] | "/tmp"
n := integer(a[2]) | 100
clean { try1 {
every f := dir_recurse(d) \ n do
write(" Got ", f)
}} | write(&why)
write("Exiting")
end
The exception mechanism is actually very simple. At the top level, a procedure exception.try1
is used to wrap the call to dir_recurse
. All this does in effect is save the current co-expression (ie, ¤t
) in a global variable, throw_handler
. The expression passed to try1
is now invoked. If that expression wants to raise an exception, all it does is call throw
with an arbitrary exception message (an error message string in this case). throw
simply sets some global variables to save the message and other information, and invokes the handler.
Here is a diagram showing what happens when an exception is raised. try1
invokes the body coroutine passed to it. It then descends several levels of calls to dir_recurse1
. Finally, on encountering an error, throw
is called, and the handler (set by try1
, and stored in throw_handler
) is invoked.
This diagram rather simplifies the activities of the descending body coroutine. dir_recurse2
isn’t shown, and, since we are now using use
to invoke dir_recurse2
, each level of recursion actually invovles a co-expression activation, as well as an ordinary procedure invocation.
There is a similarity between this diagram and the one above showing how the traversal coroutine sends its results to the main coroutine, bypassing the call chain. Indeed both are using a similar technique, and it would be possible to avoid using the exception library and instead send an error notification as part of the Msg
record.
The call to try1
is in the main procedure, inside the call to clean
, ensuring that all the DirStream
s opened by the calls to use
are closed in the case of an exception.
To test this final version of the program, try running :-
oit -s ./dir4.icn -x /tmp/here
The output should be :-
Opening /tmp/here
Got /tmp/here
Opening /tmp/here/is
Got /tmp/here/is
Opening /tmp/here/is/a
Got /tmp/here/is/a
Opening /tmp/here/is/a/deep
Got /tmp/here/is/a/deep
Opening /tmp/here/is/a/deep/directory
Closing /tmp/here/is/a/deep
Closing /tmp/here/is/a
Closing /tmp/here/is
Closing /tmp/here
Problem accessing /tmp/here/is/a/deep/directory: Permission denied (errno=13)
Exiting
Note how the program stops when an error is encountered, and that all DirStream
s are properly closed.