Characterizing Minimal Semantics-preserving Slices of predicate-linear, Free, Liberal Program Schemas
Sebastian Danicic, Robert M Hierons, Michael R Laurence

TL;DR
This paper characterizes the minimal predicate-minimal slices of certain program schemas, showing that Weiser's static slicing algorithm is effective for predicate-linear, free, liberal schemas under specific conditions, with some limitations.
Contribution
It extends previous work by proving the correctness of static slicing for predicate-linear schemas and clarifies conditions under which the algorithm produces minimal slices.
Findings
Weiser's slicing algorithm yields predicate-minimal slices for specified schemas.
The result applies to schemas with predicate-linearity, freedom, and liberal conditions.
The approach does not hold for termination-based slicing criteria.
Abstract
A program schema defines a class of programs, all of which have identical statement structure, but whose functions and predicates may differ. A schema thus defines an entire class of programs according to how its symbols are interpreted. A subschema of a schema is obtained from a schema by deleting some of its statements. We prove that given a schema which is predicate-linear, free and liberal, such that the true and false parts of every if predicate satisfy a simple additional condition, and a slicing criterion defined by the final value of a given variable after execution of any program defined by , the minimal subschema of which respects this slicing criterion contains all the function and predicate symbols `needed' by the variable according to the data dependence and control dependence relations used in program slicing, which is the symbol set given by Weiser's staticâŠ
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSoftware Engineering Research · Software Testing and Debugging Techniques · Logic, programming, and type systems
â â thanks: Corresponding author: Michael Laurence, email [email protected], tel+44 (0) 114 222 1800, fax +44 (0) 114 222 1810, address Department of Computer Science, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK.
Characterizing Minimal Semantics-preserving Slices of predicate-linear, Free, Liberal Program Schemas
Sebastian Danicic
Robert M Hierons
Michael R Laurence
Department of Computer Science, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK.
Department of Computing, Goldsmiths College, University of London, London SE14 6NW, UK.
Department of Information Systems and Computing, Brunel University, Uxbridge, Middlesex, UB8 3PH.
Abstract
A program schema defines a class of programs, all of which have identical statement structure, but whose functions and predicates may differ. A schema thus defines an entire class of programs according to how its symbols are interpreted. A subschema of a schema is obtained from a schema by deleting some of its statements. We prove that given a schema which is predicate-linear, free and liberal, such that the true and false parts of every if predicate satisfy a simple additional condition, and a slicing criterion defined by the final value of a given variable after execution of any program defined by , the minimal subschema of which respects this slicing criterion contains all the function and predicate symbols âneededâ by the variable according to the data dependence and control dependence relations used in program slicing, which is the symbol set given by Weiserâs static slicing algorithm. Thus this algorithm gives predicate-minimal slices for classes of programs represented by schemas satisfying our set of conditions. We also give an example to show that the corresponding result with respect to the slicing criterion defined by termination behaviour is incorrect. This complements a result by the authors in which was required to be function-linear, instead of predicate-linear.
keywords:
program schemas , Herbrand domain , program slicing , Weiserâs algorithm , free and liberal schemas , linear schemas
1 Introduction
A schema represents the statement structure of a program by replacing real functions and predicates by symbols representing them. A schema, , thus defines a whole class of programs which all have the same structure. Each program can be obtained from via a domain and an interpretation which defines a function for each function symbol of arity , and a predicate function for each predicate symbol of arity . As an example, Figure 1 gives a schema , and the program of Figure 2 is defined from by interpreting the function symbols and the predicate symbol as given by , with being the set of integers.
The subject of schema theory is connected with that of program transformation and was originally motivated by the wish to compile programs effectively[1]. Schema theory is also relevant to program slicing. Since program slicing algorithms do not normally take into account the meanings of the functions and predicates of a program, a schema encodes all the information about any program which it defines that is available to such algorithms.
A subschema of a schema is defined to be any schema obtained by deleting statements from . Given a schema and a variable , we wish to find a subschema of which satisfies the following condition; given any interpretation and any initial state such that the program defined by terminates, that defined by also does, and defines the same final value for . In this case we say that is a -slice of . We are particularly interested in finding minimal -slices of (with slices of ordered according to their sets of symbols111A symbol in this paper means a function or predicate symbol in a schema.).
The main theorem of this paper requires that given any path through a schema , there is an interpretation and an initial state such that the program thus defined follows this path when executed (the freeness condition) and the same term is not generated more than once as it does so (the liberality condition). These conditions were first defined by Paterson [2]. We also require that the same predicate symbol does not occur more than once in (the predicate-linearity condition), and that if the same function symbol occurs in both the true and false parts of any if predicate222If the statement occurs in a predicate-linear schema , then we say that and are respectively the true and false parts of in , then it assigns to distinct variables in each case. We call schemas satisfying all these conditions special schemas. We prove that given a schema which satisfies these conditions and a variable , the -slice of given by Weiserâs static slicing algorithm[3] has the unique minimal set of predicate and function symbols of all -slices of . Given a schema, Weiserâs algorithm computes the subschema containing only those symbols defined by the transitive closure of the control and backward data dependence relations. We also define an -slice of a schema in which termination behaviour defines the slicing criterion, and give an example to show that Weiserâs algorithm, modified in a natural way with respect to this slicing criterion, need not give a minimal -slice. This is in contrast to the situation for function-linear, free, liberal schemas [4].
Our theorem is a strengthening of the result in [5] in which no symbol was allowed to occur more than once in the schema (that is, had to be linear, as opposed to just predicate-linear in this paper).
1.1 Organisation of the paper
In the remainder of this section, we explain how the field of program slicing provides motivation for our results, and we also discuss the history of the study of schemas. In Section 2, we give formally our basic schema definitions. In Section 3 we define formally free and liberal schemas, and also give a simple characterisation of schemas that are both free and liberal, which shows that Weiserâs algorithm preserves the property of being both free and liberal for slices. In Section 4 we formally define a subschema of a schema. In Section 5 we formally define the data dependence relations and for a schema and define Weiserâs labelled symbol set for a schema. We also give examples of cases in which the subschema of a schema containing only the symbols in Weiserâs set is not the minimal subschema satisfying the required conditions. In Section 6, we define the notion of a -couple for a predicate ; that is, a pair of interpretations which differ only on one -predicate term. In Section 7 we introduce formally the class of special schemas to which our results apply. In Section 8, we prove our main theorems. In Section 9, we give an example to show that the subschema of a special schema given by Weiserâs algorithm with respect to termination need not be minimal of all subschemas preserving termination behaviour. In Section 10, we discuss our conclusions.
1.2 Relevance of Schema Theory to Program Slicing
The field of (static) program slicing is largely concerned with the design of algorithms which given a program and a variable , eliminate as much code as possible from the program, such that the subprogram consisting of the remaining code, when executed from the same initial state, will still give the same final value for as the original program, and preserve termination. One algorithm is thus better than another if it constructs a smaller slice.
Most program slicing algorithms are based on the program dependence graph (PDG) of a program. This includes Weiserâs algorithm[3], which was, however, expressed in different language. (For a fuller discussion of program slicing algorithms see [6, 7].) The PDG of a program is a graph whose vertices are the labelled statements of the program and whose directed edges indicate control or data dependence of one statement upon another.
Data dependence is defined as follows. We say that in a schema , a function or predicate symbol is data dependent upon a function symbol , written , if references the variable to which assigns, and there is a path through passing through before passing through without passing through an intermediate assignment to the same variable as . The relation statement is defined analogously for a function symbol and variable using terminal path-segments. This definition of the relations is purely syntactic; feasability of any path is not required for it to hold. Thus , and hold for the schema of Figure 1; means that there is a path through passing through , and not subsequently passing through a later assignment to the variable before reaching the end of the schema.
Slicing algorithms do not take account of the meanings of the functions and predicates occurring in a program, nor do they exploit the knowledge that the same function or predicate occurs in two different places in a program. This reflects the fact that it is undecidable whether the deletion of a particular line of code from a program can affect the final value of a given variable after execution[8]. On the other hand a schema likewise encapsulates the data and control dependence relations of the programs that it represents, but whereas it also does not encode the meanings of its function and predicate symbols, it does record any multiple occurrences of these symbols, and this extra information may sometimes lead to a proof of the existence of smaller slices. As an example, it is obvious that the predicate symbol and the assignment that it controls may be deleted from the schema of Figure 3 without preventing termination or changing the final value of (that is, the resulting subschema is a -slice in our terminology), but most program slicing algorithms will treat the two occurrences of as if they were two distinct functions, and therefore will not make any deletion.
However, slicing algorithms taking linear schemas as input may yield more information about a program than algorithms that merely use Weiserâs algorithm. As an example, in the schema of Figure 4, which will be discussed in further sections, it can be seen that the subschema of obtained by deleting the assignment with symbol is a -slice of , since the removal of this assignment cannot prevent termination (which is determined solely by the value of when referenced by ), nor can it prevent the path of execution from passing through at least once, though it may affect the number of times this happens. However, if the assignment with symbol is replaced by an assignment to give a schema , then the assignment may not similarly be deleted from , since this deletion may change the value of after execution. As an example of an interpretation under which this occurs, suppose that and are all interpreted as the function in the domain of integers and and map to true, whereas and map to false if or respectively. Execution of from the initial state in which all variables are set to zero results in a final value of for , whereas if the assignment is deleted, then the execution path will pass through on both occasions that it enters the body of giving a final value of for . However Weiserâs algorithm will treat these two cases identically, and will require to be in a -slice in both cases. This is because for and , Weiserâs set with respect to the variable must contain or respectively, since and , and controls or respectively, hence Weiserâs set must contain , and (and similarly for ), thus Weiserâs set contains . Danicic [8] gives other examples of cases of linear schemas for which program slicing algorithms will not give minimal correct subschemas. If the linearity assumption is discarded, then non-minimality can be demonstrated even for loop-free schemas, such as the one in Figure 3, in which and both occurrences of lie in the Weiser symbol set defined by , but the statement containing can clearly be deleted without changing the final value of . These examples motivate the mathematical study of schemas, which may lead to the computation of smaller subschemas than conventional program slicing techniques can achieve.
1.3 Different classes of schemas
Many subclasses of schemas have been defined:
Structured schemas,
in which goto statements are forbidden, and thus loops must be constructed using while statements. All schemas considered in this paper are structured.
Linear schemas,
in which each function and predicate symbol occurs at most once.
Predicate-linear schemas,
which we introduce in this paper, in which each predicate symbol occurs at most once, but which may have more than one occurrence of the same function symbol.
Free schemas,
where all paths are executable under some interpretation.
Liberal schemas,
in which two assignments along any legal path can always be made to assign distinct values to their respective variables.
Near-liberal schemas,
which the authors introduced in [9], in which this non-repeating condition applies only to terms not having the form for a function symbol of zero arity.
We now give examples of schemas satisfying these definitions, and first show that the freeness and liberality conditions on schemas are incomparable. To see this, consider the following two examples of linear schemas. The schema
[TABLE]
contains no assignments and is therefore liberal, but it is not free, since there is no choice of interpretation and initial state under which the executed path thus defined passes exactly once through the body of , since the value of , and hence the boolean value defined at cannot change during execution. On the other hand the schema
\begin{array}[]{llll}\,\mathit{while}\,q(w)\,\mathit{do}&\{\\ &w\,{\tt{:=}}\,f(w);\\ &x\,{\tt{:=}}\,g();\\ &\}\end{array}
is free, since if defines the function over the domain of integers, then never defines a repeated value when referenced by , and so can be interpreted so as to define an executed path that passes any desired number of times through , but it is not liberal, since the variable is always assigned the same value at occurrences of along any executed path. The subschema obtained from it by deleting the assignment (that is, ) is both free and liberal, on the other hand.
The schema in Figure 4 can also be seen to be free, owing to the self-referencing assignments with symbols , which can be interpreted as the function over the domain of integers, thus ensuring that the variables referenced by and respectively never repeat in value. It is not liberal however, since it has a path passing more than once through , along which this assignment defines the same value to on each occasion. More generally, it is easy to see that no schema having a constant assignment in the body of a while predicate can be both free and liberal, since if it is free, then there is an executable path passing twice through this assignment, which clearly assigns the same value to its variable on each occasion.
Two schemas are said to be equivalent if they have the same termination behaviour, and give the same final value for every variable, given every symbol interpretation and initial state. The authors have shown [10, 11] that it is decidable whether linear, free, liberal schemas are equivalent.
Paterson [2] gave a proof that it is decidable whether a schema is both liberal and free (which we give in Section 3); and since he also gave an algorithm transforming a schema into a schema such that is both liberal and free if and only if is liberal, it is clearly decidable whether a schema is liberal. It is an open problem whether freeness is decidable for the class of linear schemas. However he also proved, using a reduction from the Post Correspondence Problem, that it is not decidable whether an arbitrary schema is free.
1.4 Previous results on the decidability of schema equivalence
Most previous research on schemas has focused on schema equivalence, as defined in Section 1.3. All results on the decidability of equivalence of schemas are either negative or confined to very restrictive classes of schemas. In particular Paterson [2] proved that equivalence is undecidable for the class of all (unstructured) schemas. He proved this by showing that the halting problem for Turing machines (which is, of course, undecidable) is reducible to the equivalence problem for the class of all schemas. Ashcroft and Manna showed [12] that an arbitrary schema can be effectively transformed into an equivalent structured schema, provided that statements such as are permitted; hence Patersonâs result shows that any class of schemas for which equivalence can be decided must not contain this class of schemas. Thus in order to get positive results on this problem, it is plainly necessary to define the relevant classes of schema with great care.
Positive results on the decidability of equivalence of schemas include the following; in an early result in schema theory, Ianov [13] introduced a restrictive class of schemas, the Ianov schemas, for which equivalence is decidable. This problem was later shown to be NP-complete [14, 15].
Paterson [2] proved that equivalence is decidable for a class of schemas called progressive schemas, in which every assignment references the variable assigned by the previous assignment along every legal path.
Sabelfeld [16] proved that equivalence is decidable for another class of schemas called through schemas. A through schema satisfies two conditions: firstly, that on every path from an accessible predicate to a predicate which does not pass through another predicate, and every variable referenced by , there is a variable referenced by which defines a term containing the term defined by and secondly, distinct variables referenced by a predicate can be made to define distinct terms under some interpretation.
In view of the evident difficulty of obtaining positive results on this problem, and the importance of program slicing, it seems sensible to concentrate on trying to decide equivalence for classes of schema pairs in which one schema is a subschema of the other, as was done for a class of near-liberal schemas in [9].
2 Basic definitions for schemas
Throughout this paper, , , and denote fixed infinite sets of function symbols, predicate symbols, variables and labels respectively. We assume a function
[TABLE]
The arity of a symbol is the number of arguments referenced by . Note that in the case when the arity of a function symbol is zero, may be thought of as a constant.
The set of terms is defined as follows:
- âą
each variable is a term,
- âą
if is of arity and are terms then is a term.
We refer to a tuple , where each is a term, as a vector term. We call a predicate term if and the number of components of the vector term is .
We also define -terms and -terms recursively for and . Any term is an -term, and the term is a -term. If and at least one of the terms is an -term or -term, then the term is an -term, or -term, respectively. Thus any -term is also an -term.
Definition 1** (schemas)**
We define the set of all schemas recursively as follows. is a schema. An assignment where , , and is a vector of variables, is a schema. From these all schemas may be âbuilt upâ from the following constructs on schemas.
sequences;
is a schema provided that each for is a schema.
if** schemas;**
is a schema whenever , , is a vector of variables, and are schemas. We call the schemas and the true and false parts of .
while** schemas;**
is a schema whenever , , is a vector of variables, and is a schema. We call the body of the while predicate in . If is a labelled symbol in , and there is no labelled while predicate in which also contains in its body, then we say that lies immediately above .
Thus a schema is a word in a language over an infinite alphabet. We normally omit the braces and if this causes no ambiguity. Also, we may write instead of if .
If no symbol (that is, no element of ) appears more than once in a schema , then is said to be linear. If no element of appears more than once in a schema , then is said to be predicate-linear. We define function-linear schemas analogously using the set .
The labels on function and predicate symbols do not affect the semantics of a schema; they are merely included in order to distinguish different occurrences of the same symbol in a schema; we always assume that distinct occurrences of a symbol in a schema have distinct labels. We will often omit labels on symbols in contexts where they need not be referred to, as in Figure 3, or where a symbol only occurs once in a schema. In particular, our main theorems assume predicate-linear schemas, hence we do not label predicate symbols in Section 8.
We define , and to be the sets of symbols, function symbols and predicate symbols occurring in a schema . Their labelled counterparts are , and . Also and are the sets of all labelled if predicates and while predicates in . A schema without predicates (that is, a schema which consists of a sequence of assignments and s) is called predicate-free.
If a schema contains an assignment then we define and . If then is defined similarly.
Definition 2** (the relation)**
Let be a schema. If is a labelled predicate in and is any (possibly labelled) symbol, we say that holds if lies in the body of (if is a while predicate in ) or lies in the true or false part of (if is an if predicate). We may strengthen this by writing for to indicate the additional condition that lies in the -part of if , or (if ).
The relation is the transitive closure of the relation âcontrolsâ in program analysis terminology.
2.1 Paths through a Schema
The execution of a program defines a possibly infinite sequence of assignments and predicates. Each such sequence will correspond to a path through the associated schema. The set of paths through is now given.
Definition 3** (the set of paths through , path-segments of )**
If is any set, then we write for the set of finite words over and for the set containing both finite and infinite words over . If is a word, or a set of words over an alphabet, then is the set of all finite prefixes of (elements of) .
For each schema the alphabet of , written is the set
[TABLE]
[TABLE]
[TABLE]
We define and .
The words in are formed by concatenation from the words of subschemas of as follows:
For ,
[TABLE]
is the set containing only the empty word.
For assignments,
[TABLE]
For sequences,
.
For if schemas,
is the set of all concatenations of with a word in and all concatenations of with a word in .
For while schemas,
.
We define . Elements of are called paths through . Any is a path-segment (in ) if there are words such that . A terminal path-segment of is a path-segment such that for some .
2.2 Semantics of schemas
The symbols upon which schemas are built are given meaning by defining the notions of a state and of an interpretation. It will be assumed that âvaluesâ are given in a single set , which will be called the domain. We are mainly interested in the case in which (the Herbrand domain) and the function symbols represent the ânaturalâ functions with respect to .
Definition 4** (states, (Herbrand) interpretations and the natural state )**
Given a domain , a state is either (denoting non-termination) or a function . The set of all such states will be denoted by . An interpretation defines, for each function symbol of arity , a function , and for each predicate symbol of arity , a function . The set of all interpretations with domain will be denoted .
We call the set of terms the Herbrand domain, and we say that a function from to is a Herbrand state. An interpretation for the Herbrand domain is said to be Herbrand if the functions for each are defined as
for all -tuples of terms .
We define the natural state by for all
Note that an interpretation being Herbrand places no restriction on the mappings
defined by for each .
Given a schema and a domain , an initial state with and an interpretation we now define the final state and the associated path . In order to do this, we need to define the predicate-free schema associated with the prefix of a path by considering the sequence of assignments and s through which it passes.
Definition 5** (the schema )**
Given a word for a schema , we recursively define the predicate-free schema by the following rules; if is the empty word, and
.
Lemma 6
Let be a schema. If , the set is one of the following; a singleton containing an underlined assignment, a pair where , or the empty set, and if then the last case holds.
Lemma 6, which was proved in [4, Lemma 6], reflects the fact that at any point in the execution of a program, there is never more than one ânext stepâ which may be taken, and an element of cannot be a strict prefix of another.
Definition 7** (semantics of predicate-free schemas)**
Given a state , the final state and associated path of a schema are defined as follows:
For ,
[TABLE]
and
[TABLE]
For assignments,
[TABLE]
(where the vector term for )
and
[TABLE]
and for sequences of predicate-free schemas,
[TABLE]
and
[TABLE]
This uniquely defines and if is predicate-free.
In order to give the semantics of a general schema , first the path, , of with respect to interpretation, , and initial state is defined.
Definition 8** (the path )**
Given a schema , an interpretation , and a state, , the path is defined by the following condition; for all , the equality holds.
In other words, the path has the following property; if a predicate expression along is evaluated with respect to the predicate-free schema consisting of the sequence of assignments preceding that predicate in , then the value of the resulting predicate term given by âagreesâ with the value given in .
By Lemma 6, this defines the path uniquely.
Definition 9** (the semantics of arbitrary schemas)**
If is finite, we define
[TABLE]
(which is already defined, since is predicate-free) otherwise is infinite and we define . In this last case we may say that is not terminating. Also, for schemas and interpretations and we write to mean . For convenience, if is predicate-free and is a state then we define unambiguously ; that is, we assume that the interpretation is Herbrand if is a Herbrand state; and we will write to mean for any .
Observe that and
[TABLE]
hold for all schemas (not just predicate-free ones).
Given a schema , let . We say that passes through a predicate term if has a prefix ending in for such that holds. We say that is a consequence of in this case.
3 Free and liberal schemas
Given an initial state and an interpretation, a path through a schema defines a term or a predicate term at each symbol that it encounters. For this paper, we wish to consider the class of schemas for which no term or predicate term is defined more than once along any path, given as the initial state and assuming that all interpretations are Herbrand.
Definition 10** (free and liberal schemas)**
Let be a schema.
- âą
If for every there is a Herbrand interpretation such that , then is said to be free.
- âą
If for every Herbrand interpretation and any prefix , we have
[TABLE]
then is said to be liberal. (If then of course this condition is trivially satisfied.)
Thus a schema is said to be free if for every path through , there is a Herbrand interpretation which follows it with the natural state as the initial state, and a schema is said to be liberal if given any path through passing through two assignments and a Herbrand interpretation which follows it with as the initial state, the assignments give distinct values to the variables to which they assign. The definitions of freeness and liberality were first given in [2].
Observe that if a schema is free, then given a Herbrand interpretation ,
[TABLE]
implies that
[TABLE]
holds, since otherwise there would be no Herbrand interpretation whose path (for initial state ) has the prefix . Thus a path through a free schema cannot pass more than once (for initial state ) through the same predicate term. Hence if a Herbrand interpretation maps only finitely many predicate terms to , and is a free schema, then the path terminates. Similarly, if a schema is free and predicate-linear, and a Herbrand interpretation maps finitely many while predicate terms in to , then the path terminates.
Proposition 11 demonstrates the use of requiring our schemas to be liberal.
Proposition 11
Let be predicate-free schemas and assume that each schema is liberal. Let . If , then holds.
*Proof. *Assume holds. We will prove by induction on the number of assignments in . We may assume that each schema contains an assignment to , since if this holds for exactly one value of , then a contradiction is obtained, and if it is false for both values of , then the conclusion follows immediately. Write
[TABLE]
and let be the last assignment to in for each . Clearly .
- âą
Suppose that in the case of , this last assignment to occurs in . Thus this assignment sets the variable to . Since is liberal and holds, no assignment in can set a variable to along , hence is also the last assignment to in , and so follows, thus proving the desired result.
- âą
Thus we may assume that the last assignment to in occurs in . Similarly, we may assume that the last assignment to in occurs in . Let and be the first components of and respectively, and write for each . By the inductive hypothesis applied to and each , the term is the same for each ; the Proposition then follows from the analogous result for the other components of each .
Proposition 11 need not hold for non-liberal schemas; for example, if and are both (so is not liberal), and .
As mentioned in the introduction, it was proved in [2] that it is not decidable whether an (unstructured) schema is free, but it is decidable whether it is liberal, or liberal and free. Theorem 12 proves the latter result for structured schemas. It is an open question as to whether freeness of a linear or function-linear schema is decidable.
Theorem 12** (it is decidable whether a schema is liberal and free)**
Let be a schema. Then is both liberal and free if and only if for every path-segment in with , and such that the same labelled symbol does not occur more than once in or in , then either and reference a different vector of variables, or the path-segment contains an assignment to a variable referenced by .
In particular, it is decidable whether a schema is both liberal and free.*
*Proof * [2]. Assume that is both liberal and free. Then for any path-segment satisfying the conditions given, there is a prefix and a Herbrand interpretation such that , and distinct (predicate) terms are defined when and are reached, thus proving the necessity of the condition.
To prove sufficiency, first observe that the ânon-repeatingâ condition on the letters of the path-segment may be ignored, since path-segments that begin and end with letters having the same labelled symbol can be removed from within and until it is satisfied. Consider the set of prefixes of of the form with such that satisfies the condition given. By induction on the length of such prefixes, it can be shown that every assignment encountered along such a prefix defines a different term (for initial state ), and the result follows immediately from this.
Since there are finitely many path-segments in satisfying the conditions given for and these can be enumerated, the decidability of liberality and freeness for the set of schemas follows easily.
4 Subschemas and Slicing Conditions
Definition 13** (Subschemas of a schema)**
The set of subschemas of a schema is the minimal set of schemas which satisfies the following rules;
- âą
is a subschema of any schema.
- âą
and are both subschemas of any schema .
- âą
If is a subschema of , then and are subschemas of and respectively.
- âą
if is a subschema of then is a subschema of ;
- âą
if is a subschema of then the if schema is a subschema of
(the true and false parts may be interchanged in this example);
- âą
a subschema of a subschema of is itself a subschema of .
Definition 14** (the semantic -slice condition for )**
Let be a subschema of a schema . Then given , we say that is a -slice of if given any domain , any state and any , holds. We also say that is an -slice of if given any domain , any state and any , holds.
Thus the -slice condition is given in terms of every conceivable domain and initial state; however it is well known that the Herbrand domain is the only one that needs to be considered when considering many schema problems. Theorem 15, which is virtually a restatement of [17, Theorem 4-1], ensures that for slicing purposes, we only need to consider Herbrand interpretations and the natural state .
Theorem 15
Let be a set of schemas, let be a domain, let be a function from the set of variables into and let be an interpretation using this domain. Then there is a Herbrand interpretation such that the following hold.
- (1)
For all , the path . 2. (2)
If and are variables and for and , then also holds.
Throughout the remainder of the paper, all interpretations will be assumed to be Herbrand.
5 The data dependence relations and and Weiserâs labelled symbol set
Definition 16 formalises the and relations introduced in Section 1.2.
Definition 16** (the and relations and parameterised path-segments)**
Let be a schema and let be a path-segment in .
We call an -path-segment, or -path-segment for and if for some is an -term, or -term, respectively. We also call these path-segments an -path-segment or -path-segment respectively.
We call an -path-segment or -path-segment in if is an -term for some referenced by in . We define -path-segments analogously.
We sometimes strengthen these definitions by using labelled function symbols in the word to indicate which labelled assignment in creates the appropriate subterm of . We write if contains an -path-segment for and , and write if contains a terminal path-segment such that is an -term.
The relations and correspond to the data dependence relation in program slicing. We now give examples of these relations. If is the schema of Fig. 4, the path-segment in is both an -path-segment and a -path-segment, and the relation holds. Similarly, the path-segment is a -path-segment and an -path-segment, and holds. Since is a terminal path-segment in , holds.
Definition 17** (Weiserâs labelled symbol set)**
Let be a schema and let . Then we define to be the minimal set satisfying the following conditions.
- (1)
If , then holds. 2. (2)
If then . 3. (3)
If and , then holds. 4. (4)
If and then .
The set (traditionally only defined for the case in which , and for programs rather than schemas) is fundamental to most slicing algorithms. It contains all symbols which might conceivably affect the final value of (if is a variable) or termination (if ). This assertion is formalised in Theorem 18.
Given a schema and a set satisfying , there is a subschema of such that , obtained from by deleting all elements of from . This subschema is easily shown to be unique. In particular, for any , every schema has a unique subschema satisfying . By Theorem 12, if is both free and liberal, then so is .
Theorem 18
Let be any schema, let and let be a subschema of . If , then is a -slice of .
*Proof. *Proved in [4, Theorem 18].
If is liberal, free, and function-linear, then a subschema of is the -slice of with the minimal number of labelled symbols if and only if holds, as was proved in [4]; but in general this is false. To see this, consider the schema in Figure 5. It is clearly irrelevant whether maps to or , and hence the assignment may be deleted to give a -slice.
Even if a schema is both free and linear, Weiserâs algorithm need not give minimal slices. To see this, consider the linear schema of Figure 4 which can easily be seen to be free. Owing to the constant -assignment, is not liberal; any path entering the true part of more than once would assign the same value, , to each time.
Since contains the -path-segment , and and hold, follows; but the subschema of in which the assignment is deleted is a -slice of , since any interpretation satisfying would have to define a path passing through the -assignment (since otherwise the deletion of from would make no difference to ), and so the value of would be thus fixed at .
6 Couples of interpretations
In order to establish which predicate symbols of a schema must be included in a slice in order to preserve our desired semantics, we define the notion of a -couple for a predicate .
Definition 19** (couples)**
Let be interpretations and let . We say that the set is a -couple if there is a vector term such that and differ only at the predicate term . In this case we may also say that is a -couple. If a component of is an -term for , then is an -couple. Given any and schema , we also say that is an -couple or -couple for if also and both sides terminate. Lastly, we may label (an -couple, or -couple for ) to indicate that the paths and diverge at (at which point the predicate term is defined).
We also make analogous definitions if instead ; that is, if a set is a -couple for a predicate symbol , then we say is a -couple for a schema if exactly one path in terminates.
Note that a -couple is simply an -couple with as the empty word. The existence of a -couple for a schema âwitnessesâ the fact that affects the semantics of , as defined by . As an example of a -couple, let be an interpretation that maps the predicate terms , and to , and maps and to and let the interpretation be identical except that it maps to . Then is a -couple. If is the schema in Fig. 4, then both paths pass twice through the body of , with passing through only on the first occasion, whereas passes twice through . Since both interpretations define the same final value for , is not a -couple for . However, if is the schema obtained from by replacing the assignment by , then is a -couple for .
Proposition 20 follows immediately from Definition 19.
Proposition 20
If and is a -slice of a schema , then a -couple for is also a -couple for .
Definition 21** (head and tails of a couple)**
Let be a schema. Let , and let . Let be a -couple for and write
[TABLE]
for each and ; then we define for each , and .
The motivation for Definition 21 is given by Lemma 22, which shows that given a -couple for a free liberal schema, a new -couple may be obtained from it by replacing its head by any prefix leading to , while keeping the same tails.
Lemma 22** (Changing the head of a couple)**
Let be a free liberal schema and let and . Suppose there is a -couple for and a prefix in , then there is a -couple for such that and . In particular, if there is a -couple for and contains an -path-segment for , then there exists an -couple for .
*Proof. *Write . Since is free, there exist interpretations defining paths and for , and by Proposition 11, the final value of after each path is still distinct. Thus it suffices to prove that need not differ on any predicate term except the -predicate term defined after . However, if this is false, then must be a consequence of one of the paths and must be a consequence of the other, for some predicate term and . Again, since is free, must occur on the tails of both paths, and by Proposition 11 applied to the variables referenced by the appropriate occurrences of on each path and the prefixes of the paths preceding these occurrences, the same incompatibility would contradict the existence of the -couple . Thus we may define .
For the remainder of this paper, we use the following terminology with interpretations. If is an interpretation, is a predicate term and , then is the interpretation which maps every predicate term to the same value as except , which it maps to .
Lemma 22 need not hold for schemas that are not both free and liberal. To see this, consider the free, linear, non-liberal schema of Figure 4.
Let the interpretation satisfy if and only if the term , and . If the interpretation , then is an -couple for , since whereas , but there is no -couple for , although contains an -path-segment, since any interpretation such that passes through the -assignment must satisfy .
7 Restriction to Special Schemas
In order to prove our main results, we need to exclude from consideration schemas such as the one in Figure 5. Therefore we will now only consider schemas such that if the same function symbol occurs in both parts of any if predicate, then the occurences assign to different variables. The utility of this assumption is demonstrated by Proposition 24.
Definition 23** (Special schemas)**
Let be a predicate-linear free liberal schema. We say that is special if given any and such that and hold, holds.
Figure 6 in Section 9 gives an example of a special schema.
Proposition 24
Let and let be predicate-free schemas such that either or contains an assignment to , each schema is liberal and for all , if and both contain assignments with function symbol , then they assign to different variables. Then holds.
*Proof. *If only one schema in the set contains an assignment to , then the result follows from the liberality condition. If both do, let be the function symbol of the last assignment to in each . By our hypotheses, , and each term has as the outermost function symbol, giving the result.
8 Main Theorems
We wish to prove that for any , every schema which is a -slice of a given special schema contains every symbol occurring in . Thus we need to refer to the recursive definition of . This motivates Lemmas 25, 28 and 29, and Definition 26. We first consider Condition (4) in Definition 17, and show that the property of defining a -couple is âbackward-preservedâ by the relation.
Lemma 25
Let be a free predicate-linear schema and assume for . Let . Assume that there exists a -couple for . Then there exists a -couple for .
*Proof. *Assume holds and for each , choose , subject to the provisos that and . Since if is a while predicate, this is possible.
Since there exists a -couple for , a pair of interpretations can be chosen such that is a -couple for and the number of predicate terms that maps to is minimal for all such pairs; clearly this number is finite, since the path terminates. The path must pass through and hence through , since holds, and hence there is a predicate term which maps to . Since , also holds. Define the interpretations to be identical to and respectively except that both map to . Thus maps fewer predicate terms to than does, and hence by the minimality assumption on , is not a -couple for . Hence either
[TABLE]
holds.
By the freeness of and the fact that and map finitely many predicate terms for to , the paths and are both terminating, and so either or is a -couple for , giving the result.
It is convenient to make the following definitions, which merely give an alternative way of expressing Weiserâs set.
Definition 26** (-links and -feeding path-segments)**
Let be a predicate-linear schema.
Let and . A -link in is a path-segment for some path in the -part of in .
If , then the path-segment is called a -link in ; and a path-segment in which passes at least once through is a -link.
Let and let . We say that a path-segment in -feeds to if there exists such that is a path-segment in for some -link and is a -term for some and references the variable .
Proposition 27
Let be predicate-free schemas and let be variables such that and assume that is a -term for some . Then holds.
*Proof. *This follows by induction on the total number of assignments and occurrences of in . If then and the result is straightforward. If or for , then for each and so the result follows from the inductive hypothesis applied to . Thus we may assume that . Hence we may write such that for some , is a -term. From the inductive hypothesis applied to , holds. Since for each , the result follows.
We can now prove that the property of defining a -couple is âbackward-preservedâ by the transitive closure of Conditions (3) and (4) of Definition 17.
Lemma 28
Let be a special schema. Let and . Assume that there exists a -couple for . Suppose that there exists an assignment to in the body or in one part of in and that there exists a path-segment in -feeding to . Then there exists a -couple for .
*Proof. *Given a fixed pair , we will assume that the conclusion of the Lemma is false, but that the hypotheses are true for some triple , where is a path-segment in -feeding to , and will show that this leads to a contradiction. We will assume that the triple is chosen such that the path-segment is of minimal length such that the hypotheses of the Lemma are satisfied.
For some , let be a -link passing through an assignment to and let . By Lemma 22, we can choose a -couple for such that . We may assume that and map finitely many while predicate terms to , since the interpretations define terminating paths. Let be the total number of -predicate terms which and both map to , where is the while predicate lying immediately above if , or itself if . If and does not lie in the body of a while predicate, then and are undefined. We assume that is chosen such that if defined, is minimal for the chosen values of , and .
Let be any -link and let be the set of all pairs such that is a consequence of the prefix , but is not a consequence of , and let the interpretations be obtained by altering and respectively in accordance with the pairs in ; thus, if then , otherwise , and similarly for . Thus the paths and both have as a prefix. By the freeness of , the set does not contain any subset of the form and so and are well-defined. We write . We now show that a contradiction is obtained. The proof proceeds in stages.
- (1)
For any , we now show that there is no -couple for . Assume this is false for some . By the definition of , does not occur on , and by Lemma 25 and the fact that by the falsity of the conclusion of the Lemma, does not occur on either, and so has a prefix such that defines after and since is not a consequence of , replacing by in changes the -predicate term defined after . Hence for some variable in the body or in one part of , -feeds to , contradicting the minimality of . 2. (2)
We now show that is a -couple for . Suppose this is false. Since is a -couple for , either or the analogous assertion holds for and . However, since is free, changing or at finitely many predicate terms still results in an interpretation defining a terminating path through , and by (1), does not change the final value of if the predicate terms have the form for some , thus contradicting the definitions of and immediately. 3. (3)
Hence is a -couple for . Let ; thus and differ only at . Clearly and also differ only at and so their paths diverge at . Since is free, is not a consequence of for either , and so by (1) and the definition of , does not occur on either. Also, holds for at least one variable referenced by , by the assumptions on and and Proposition 24 applied to , and , and Proposition 27 applied to , and , and so does not define after . Thus and pass at least twice through after , and and are defined and for some path-segment passing at least once through . 4. (4)
Thus by Lemma 22, there exists a -couple for which has the same pair of tails as and such that . We may assume that each -predicate term which is not a consequence of either path or is mapped to by both interpretations in . We now show that this âcutting outâ of the path-segment passing through from contradicts the minimality of . By (1) and Lemma 25, the elements of map the same number of -predicate terms to as those in do. Thus it suffices to prove that the interpretations in map fewer -predicate terms to than those in . By the freeness of and our assumption on , the number of -predicate terms mapped to by both interpretations in is obtained by adding up the number of occurrences of on to those on either tail of , and subtracting the number of -predicate terms mapped to occurring on both tails of . The analogous assertion holds for . Clearly has fewer occurrences of than has. Since and have the same tails, it thus remains only to prove that the same number of -predicate terms mapping to occur on both and after as after , and this follows from Proposition 11, since replacing the prefix by preserves equalities between predicate terms occurring along and .
We now use Lemma 28 to prove the existence of a -couple where membership of the predicate in is witnessed by iteration of Conditions (1) and (4) of Definition 17.
Lemma 29
Let be a special schema. Let and . Suppose that there exists an assignment to in the body or in one part of in and that there exists a terminal path-segment in such that for some , is a -term. Then there exists a -couple for .
*Proof. *Let be the schema , where are distinct symbols not occurring in . Clearly is special and the path-segment -feeds to in . The result follows from Lemma 28 applied to .
We now use the preceding two Lemmas to show that every symbol of for a special schema can affect the semantics of .
Theorem 30
Let be a special schema. Let .
- (1)
For all there exists a -couple for . 2. (2)
For all , either there exists an interpretation such that the term contains the symbol , or there exists such that there exists a -couple for for some vector term containing .
Proof.
Let be the set of all predicates in such that there exists a -couple for and let .
- (1)
Observe that from Conditions (1,3,4) of Definition 17, is the minimal subset of satisfying the following two conditions.
- âą
If and for a labelled function symbol and there exists a terminal -path-segment for some , then holds.
- âą
If and for a labelled function symbol and and contains an -path-segment for some , then .
By Lemmas 29 and 28 respectively, also satisfies both these conditions; hence , as required. 2. (2)
If , then from Definition 17, one of the following two possibilities must occur.
- âą
There exists an -path-segment for some , in which case by the freeness of there exists an interpretation such that the term contains the symbol , as required.
- âą
The schema contains an -path-segment for some and holds by Part (1) of this Theorem, in which case by Lemma 22, there exists a -couple for for some vector term one of whose components is an -term, proving the result.
The main theorem of the paper follows.
Theorem 31
Let be a special schema. Let and let be a subschema of .
- (1)
If then is a -slice of . 2. (2)
If is a -slice of , then contains at least one occurrence of every symbol in . In particular, if , then no subschema of satisfying is a -slice of unless there exists such that contains at least two occurrences of and contains at least one, but not all occurrences of lying in .
*Proof. *Part (1) is a restatement of Theorem 18 for the subclass of special schemas. Part (2) follows immediately from Theorem 30 and Proposition 20, and the definition of a -slice.
9 Weiserâs algorithm does not give minimal -slices for Special Schemas
Theorems 30 and Part (2) of Theorem 31 do not hold if the variable is replaced by . To see this, consider the special schema of Figure 6. By iterating Conditions (2,3,4) of Definition 17, it follows that contains both occurrences of each of and hence also contains and , but we now show that there is no -couple for . For suppose that is a -couple for , and so and define paths passing different ways through . Let . Observe that one path in defines the same predicate term on the second occasion that it passes through as the other does on the first occasion, and that if , the two paths in define the same predicate term on the th occasion that they pass through . Thus suppose that one path terminates after passing times through . If , then the other also terminates after passing not more than times through . If , then so does the other after passing not more than times through , giving a contradiction.
Thus Part (1) of Theorem 30 is false in this case, and hence it follows easily that the subschema of obtained by deleting the assignment is an -slice.
10 Conclusions and suggestions for further work
We have shown that for any variable and a special schema , the subschema of containing the set of predicate symbols and labelled function symbols in the âWeiser setâ , and no others, has the minimal set of predicate and function symbols of any -slice of .
This leaves open the possibility that there exists a subschema of that is a -slice of and has fewer, but still non-zero, occurrences of some of the function symbols occurring with labels in . It is not clear whether an example of a special schema exists with this property. Further research should investigate this problem. However if is not special, this can certainly happen, as the example of the predicate-linear but non-liberal schema in Figure 3 shows.
For , we have shown that the corresponding result fails, as the special schema in Figure 6 shows. The existence of this special schema does, however, show the strengthening of our main result compared to that of [5]. Further work will also concentrate on obtaining minimal -slices for larger classes of schemas. In particular, it would be of interest to be able to effectively characterise minimal slices for a reasonable class of schemas containing those in Figures 3 and 4, which are near-liberal but not liberal. [9] gives a related decidability result for schema equivalence. In addition, the main theorem of the paper can almost certainly be generalised to allow slicing criteria according to which the value of a given variable at a particular point within a program must be preserved by a slice, rather than at the end.
Acknowledgements
This work was supported by a grant from the Engineering and Physical Sciences Research Council, Grant EP/E002919/1.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] S. Greibach, Theory of program structures: schemes, semantics, verification, Vol. 36 of Lecture Notes in Computer Science, Springer-Verlag Inc., New York, NY, USA, 1975.
- 2[2] M. S. Paterson, Equivalence problems in a model of computation, Ph.D. thesis, University of Cambridge, UK (1967).
- 3[3] M. Weiser, Program slices: Formal, psychological, and practical investigations of an automatic program abstraction method, Ph D thesis, University of Michigan, Ann Arbor, MI (1979).
- 4[4] M. R. Laurence, Characterising minimal semantics-preserving slices of function-linear, free, liberal program schemas, Journal of Logic and Algebraic Programming 72 (2) (2005) 157â172.
- 5[5] S. Danicic, C. Fox, M. Harman, R. Hierons, J. Howroyd, M. R. Laurence, Static program slicing algorithms are minimal for free liberal program schemas, The Computer Journal 48 (6) (2005) 737â748.
- 6[6] F. Tip, A survey of program slicing techniques, Tech. Rep. CS-R 9438, Centrum voor Wiskunde en Informatica, Amsterdam (1994).
- 7[7] D. W. Binkley, K. B. Gallagher, Program slicing, in: M. Zelkowitz (Ed.), Advances in Computing, Volume 43, Academic Press, 1996, pp. 1â50.
- 8[8] S. Danicic, Dataflow minimal slicing, Ph D thesis, University of North London, UK, School of Informatics (Apr. 1999).
