Compilation of rewriting rules to decision trees.
This module handles the compilation of rewriting rules to “decision trees” based on the method described by Luc Maranget.
Types for decision trees
The types involved in the definition of decision trees are given in module Tree_type
(they could not be defined here as this would lead to a cyclic dependency).
Example: let us consider the rewrite system for symbol f
defined as:
f Z (S m) ↪ S m
,f n Z ↪ n
andf (S n) (S m) ↪ S (S m)
.
A possible decision tree might be
├─?─∘─Z─∘ ↪ n
├─Z─∘─Z─∘ ↪ n
│ └─S─∘─?─∘ ↪ S m
├─S─∘─Z─∘ ↪ n
└─S─∘─?─∘ ↪ S (S m)
with ∘
being a node (with an omitted label) and ─u─
being an edge with a matching on symbol u
or a variable or wildcard when ?
. Typically the portion S─∘─Z
is made possible by a swap.
Conditions for decision trees
The decision trees used for pattern matching include binary nodes carrying conditions (see constructor Tree_type.tree.Cond
) that must be tested during evaluation to select which of the two subsequent branches to follow. There are two forms of conditions:
Convertibility conditions are used whenever we try to apply a rule that is not left-linear, for example f $x $x (s $y) ↪ r
. In this case we need to test whether the two terms at position {0}
and {1}
(that correspond to pattern variable $x
) are convertible: the rule may only apply if that is the case. Note that in general there may be more than two occurrences of a variable, so we may need to check convertibility several times.
Free variable constraints are used to verify which variables are free in a term. If there is a rule of the form f (λ x y, $Y[y]) ↪ $Y
, then we need to check that the term at position {0.0}
does not depend on x
(or, put in other words, that among x
and y
, only y
is allowed).
Module providing a representation for pools of conditions.
Clause matrix and pattern matching problem
NOTE We ideally need the type stack
of terms used during evaluation (argument stk
of Eval.tree_walk
) to have fast element access (for swaps) as well as fast destruction and reconstruction operations (for more information on these operations have a look at Lplib.List.destruct
and Lplib.List.reconstruct
). These operations are intuitively used to inspect a particular element, reduce it, and then reinsert it. It seems that in practice, the naive representation based on lists performs better than more elaborate solutions, unless there are rules with many arguments. Alternatives to a list-based implementation would be cat-enable lists (or deques), finger trees (Paterson & Hinze) or random access lists. With the current implementation, destruct e i
has a time complexity of Θ(i)
and reconstruct l m r
has a time complexity of Θ(length l + length m)
.
Clause matrices encode pattern matching problems. The clause matrix C can be denoted C = P ↪ A where P is a pattern matrix and A is a column of RHS. Each line of a matrix is a pattern to which a RHS is attached. When reducing a term, if a line filters the term (i.e., the term matches the pattern) then term is rewritten to the RHS.
harvest lhs rhs subst vi slot
exhausts linearly the LHS lhs
composed only of pattern variables with no constraints, to yield a leaf with RHS rhs
and the substitution subst
(which is completed). Mapping vi
contains variables that may appear free in patterns. slot
is the number of subterms that must be memorised.
NOTE compile
produces a decision tree from a set of rewriting rules (in practice, they all belong to a same symbol). This tree is designed to be used in the reduction process, in function Eval.tree_walk
. The purpose of the trees is to
- declare efficiently whether the input term (during evaluation) matches some LHS from the orginal rules (the one from which the tree is built);
- build a substitution mapping some (sub) terms of the filtered term to the rewritten term.
The first bullet is handled by the definition of the trees, e.g. if a switch node contains in its mapping a tree constructor s
, then terms having (as head structure) symbol s
will be accepted.
The second bullet is managed by the substitution of type Tree_type.rhs_substit
. If a LHS contains a named pattern variable, then it is used in the RHS. Sub-terms that are filtered by named variables that are bound in the RHS are saved into an array during evaluation. When a leaf is reached, the substitution is applied on the RHS, copying terms from that array to the RHS. Subterms are saved into the array when field store
of nodes is true.
compile mstrat m
translates the pattern matching problem encoded by the matrix m
into a decision tree following strategy mstrat
.
update_dtree s rs
updates the decision tree of the symbol s
by adding rules rs
.