List Reversal

List Reversal

A first experiment

We first tried to analyse the following list reversal program:



type cell = record
  n: ^cell;
end
type list = ^cell;

procedure main(list) is
var res;
begin
main_entry: res := reverse(list);
main_end:
end







  



function reverse(x) : y is
var z,t;
begin
reverse_entry:  z := x->n;
reverse1:       x->n := null;
reverse2:       if (z != 0) then begin
reverse3:          y := reverse(z);
reverse4:          z->n = x;
                end
                else begin
reverse5:         y := x;
                end
reverse_end:
reverse_summary:
end

The specification is split in several files.

File Description
ReverseSimple.tvp main file, including auxiliary files and defining the equation system
predicate.tvp standard instrumentation predicates and constraint rules
functionInterprocedural.tvp standard semantic functions for procedure calls and returns
function.tvp semantic functions for various intraprocedural instructions
reverse_input.tvs set of initial states for the analysis, namely the set of list with a least two elements

The result of the analysis is described by the PostScript file ReverseSimple.ps. It gives:

  1. the equation system depicted as a graph (with circle giving nodes/control points and boxes operations between nodes);
  2. the set of reachable structures for each control point of the program. The shape graphs associated to the control points are given in the following order: main_entry, reverse_entry, reverse1, reverse2, reverse3, reverse4, reverse5, reverse_end, reverse_summary, main_end.

It is particularly interesting to look at what is reachable at the exit point of the reverse procedure (node reverse_summary). This represents the semantic predicate transformer associated to the reverse function. The structures associated to the node/control point main_entry shows that the analysis models accurately that the list has been reversed.

A more complex experiment

We next experimented some more operations, by taking the new main procedure



procedure main(list) is
var res,res1,res2,res3
begin
main_entry:     res := reverse(list);
main1:          res1 := reverse(res);
main2:          res2 := list->n;
main3:          res3 := reverse(res2)
main_end:
end

Only the file ReverseSimple.tvp is changed (we added a new instrumentation predicate and related constraint rules, and of course we modified the equation system). We obtained the result Reverse.ps. Notice that we obtained the expected result at node main_end. Notice also that we show at node main_1 that applying twice the procedure reverse gives the identity.

Other experiments and comparison with [RS01] can be found here.

Statistics

They are given in the following table.

In the "Number of Structures" field, we indicate the number of reachable structures at the following control points:

Input Program Nb of predicates (abs/total) and constraints Time Nb of Structures File
reverse_input.tvs ReverseSimple.tvp 40/56 & 392 11s 3/4/8/7/3 ReverseSimple.ps
reverse_input.tvs Reverse.tvp 58/80 & 596 55s 9/8/14/10/3 Reverse.ps

The reason why the number of reachable structures at point reverse_summary is not the same in the two case is the following one: our analysis compute the summary function of a procedure specialized on the reachable input. Here, in the second case there are more particular cases (list with a single cell, only two cells, etc...) because reverse is not called only with a list of at least two elements (node main_entry) but also with such a list minus one element (node main_3). Moreover, there is the problem of the context of the input given to the procedure detailed here (properties of memory cells not reachable from the input are irrelevant but induces a combinatorial explosion).

The sensibly higher running time in the second case is due to:

More detailed description

Core predicates.

The core predicates we will use are the unary predicates corresponding to variables of reference type, and the binary predicates n corresponding to the link of single linked list.

Standard instrumentation predicates.

We detail below rather "standard" instrumentation predicates, common to most TVLA specifications. f[m] denotes any field predicate (binary core predicate representing the field of a record of pointer type), m a mode (either inp, out or ext), and z any local variable (unary core predicate).

The predicates downStar[f,m](v1,v2) does not appear in the graphs, in order to make them more readable.

Invariant satisfied by a list node

Our algorithm will manipulate lists of unbounded sizes, so we should be able to characterize properties by the value of instrumentation predicates instead of relying only on the (abstract) universe and the values of core predicates. iA proper list node of a list pointed by a variable z should satisfy the following properties

  1. !is[n,m](v) (the cell is not pointed by more than one cell)
  2. !c[n,m](v) (a cell cannot belong to a cycle)
  3. r[n,m,z](v) (the cell is reachable from the head of the list)

We should at least prove that the list reversal procedure returns a proper single linked list.

Cross instrumentation predicates

We will also need so-called "cross" instrumentation predicates, to represent relationship between the input memory heap and the current/output memory heap.

Predicate id_succ, standard in interprocedural TVLA specifications, allows to model the identity relationship of a cell between input and output state, and the predicates reverse_n_succ the fact that the outgoing n link has been reversed.

The dual predicates id_pred and reverse_n_pred are useful only for the more complex Reverse.tvp example, where one has to deduce that reversing twice a link gives the initial link.


List Reversal