A Transformational Approach to Resource Analysis with Typed-norms Inference
Elvira Albert, Samir Genaim, Ra\'ul Guti\'errez, Enrique Martin-Martin

TL;DR
This paper introduces a transformational, data-flow analysis-based method for automatically inferring typed-norms to estimate program resource consumption across various programming paradigms, demonstrating efficiency and accuracy on benchmarks.
Contribution
It presents a novel transformational approach to resource analysis using inferred typed-norms via data-flow analysis, applicable to multiple programming paradigms.
Findings
Efficient and accurate analysis on standard benchmarks.
Applicable across functional, logic, and imperative paradigms.
Formalized on a simple rule-based representation.
Abstract
In order to automatically infer the resource consumption of programs, analyzers track how data sizes change along program's execution. Typically, analyzers measure the sizes of data by applying norms which are mappings from data to natural numbers that represent the sizes of the corresponding data. When norms are defined by taking type information into account, they are named typed-norms. This article presents a transformational approach to resource analysis with typed-norms that are inferred by a data-flow analysis. The analysis is based on a transformation of the program into an intermediate abstract program in which each variable is abstracted with respect to all considered norms which are valid for its type. We also present the data-flow analysis to automatically infer the required, useful, typed-norms from programs. Our analysis is formalized on a simple rule-based representation…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
\nochangebars
A Transformational Approach to Resource Analysis with Typed-norms
Inference††thanks: This work was funded partially by the Spanish MICINN/FEDER, UE projects RTI2018-094403-B-C31 and RTI2018-094403-B-C32, MINECO projects TIN2015-69175-C4-2-R and TIN2015-69175-C4-1-R, by the CM project S2018/TCS-4314, the GV project PROMETEO/2019/098, and the UPV project SP20180225. Raúl Gutiérrez was also supported by INCIBE program “Ayudas para la excelencia de los equipos de investigación avanzada en ciberseguridad”.
Elvira Albert
Samir Genaim
Dep. Sistemas Informáticos y Computación
Universidad Complutense de Madrid
C/ Prof. José García Santesmases 9
28040 Madrid
Spain
[email protected], [email protected]
Raúl Gutiérrez
Dep. Sistemes Informàtics i Computació
Universitat Politècnica de València
Camino de Vera S/N
46022 València
Spain
Enrique Martin-Martin
Dep. Sistemas Informáticos y Computación
Universidad Complutense de Madrid
C/ Prof. José García Santesmases 9
28040 Madrid
Spain
Abstract
In order to automatically infer the resource consumption of programs, analyzers track how data sizes change along program’s execution. Typically, analyzers measure the sizes of data by applying norms which are mappings from data to natural numbers that represent the sizes of the corresponding data. When norms are defined by taking type information into account, they are named typed-norms. This article presents a transformational approach to resource analysis with typed-norms that are inferred by a data-flow analysis. The analysis is based on a transformation of the program into an intermediate abstract program in which each variable is abstracted with respect to all considered norms which are valid for its type. We also present the data-flow analysis to automatically infer the required, useful, typed-norms from programs. Our analysis is formalized on a simple rule-based representation to which programs written in different programming paradigms (e.g., functional, logic, imperative) can be automatically translated. Experimental results on standard benchmarks used by other type-based analyzers show that our approach is both efficient and accurate in practice.
Under consideration in Theory and Practice of Logic Programming (TPLP).
keywords:
resource analysis, typed-norms, data-flow analysis, program transformation
1 Introduction
Automated resource analysis [Wegbreit (1975)] needs to infer how the sizes of data are modified along program’s execution. Size is measured using so-called norms [Bossi et al. (1991)] which define how the size of a term is computed. Examples of norms are list-length which counts the number of elements of a list, tree-depth which counts the depth of a tree, term-size which counts the number of constructors, etc. Basically, in order to infer the resource consumption of executing a loop that traverses a data-structure, the analyzer tries to infer how the size of such data-structure decreases at each iteration w.r.t. the chosen norm. Given a tree t, using a term-size norm, we infer that a loop like “while (t!=leaf) t=t.right;” performs at most nodes(t) iterations, where function nodes returns the number of nodes in the tree. This is because size analysis infers that at each iteration the instruction t=t.right decreases nodes(t). However, by using the tree-depth norm, we will infer that depth(t) is an upper bound on the number of iterations. The latter is obviously more precise than the former bound as depth(t)nodes(t).
The last two decades have witnessed a wealth of research on using norms in termination analysis, especially in the context of logic programming [Bossi et al. (1991), Bruynooghe et al. (2007), Genaim et al. (2002)]. Early work pointed out that the choice of norm affects the precision such that the analyzer may only succeed to prove termination if a certain norm is used, while it cannot prove it with others. Later on, there has been further investigation on applying multiple norms, i.e., using two or more norms by applying them simultaneously [Bossi et al. (1991)]. This means that the same data in the original program is replaced by two or more abstract data each one specifying its size information w.r.t. the corresponding norm. Even a further step has been taken on using typed-norms which allow defining norms based on type information (namely on recursive types) [Bruynooghe et al. (2007)]. Inferring norms from type information makes sense as recursive types represent recursive data-structures and thus, in termination analysis, they identify some potential sources of infinite recursion and, in resource analysis, they might influence the number of iterations that the loops perform. Besides, typed-norms allow that the same term can be measured differently depending on its type. As pointed out in [Genaim et al. (2002)], this is particularly useful when the same function symbol may occur in different type contexts.
In the context of resource analysis, we found early work that already pointed out that the combination of norms affects the precision of lower-bound time analysis [King et al. (1997)]. Sized-types provide a way to consider more than one norm for each type. They have been used in the context of functional [Pedro Vasconcelos (2008), Vasconcelos and Hammond (2003)] and recently in logic programming [Serrano et al. (2013), Serrano et al. (2014)]. In the former case, they are inferred by a type analysis and in the latter via abstract interpretation. In contrast, we propose a transformational approach which provides a simple and accurate way to use multiple typed-norms in resource analysis as follows: (1) we first transform the program into an intermediate abstract program in which each variable is abstracted with respect to all considered norms valid for its type, (2) such intermediate program is then transformed into upper and lower resource bounds automatically. As regards the first phase, we formalize the transformation assuming that the input programs are given in a simple rule-based representation. The rule-based representation contains program rules, pattern matching and assignment using a compact syntax. Programs written in \cbstartfirst-order \cbendfunctional or imperative programming languages can be represented by means of this representation in a straightforward way (since this representation can model control-flow graphs with procedure calls). Logic programs can be represented as well by replacing matching (and assignment) by unification, without any further change in our analysis. As regards the second phase, \cbstartnote that we are interested in relying on existing techniques and using them as a black-box without modifying them. This is important since they receive abstract programs that come from different sources, and we do not want to make any change that is particular to our transformation that might break the functionality of other parts. \cbendThus, formalizing our framework focuses only on the first step.
While allowing multiple norms might lead to more accurate bounds than adopting one norm, the efficiency of the analysis can be degraded considerably. This is because the process of finding resource bounds from abstractions that have more arguments (due to the use of multiple norms) is more costly. Thus, an essential aspect for the practical applicability of our method is to obtain the smallest sets for the relevant typed-norms, i.e., eliminate those abstractions that will not lead to further precision. For this purpose, we present a new algorithm for the inference of typed-norms which, by inspecting the program, can detect which norms are useful to later infer the resource consumption, and discard norms that are useless for this purpose. Our inference is formalized as a data-flow analysis which is applied as a pre-process, such that once the relevant norms are inferred, the transformation into the abstract program is carried out w.r.t. the inferred norms.
1.1 Summary of Contributions
The main contributions of this article can be summarized as follows:
We introduce a transformation from the rule-based representation to an abstract representation in which each variable is abstracted with respect to all considered norms valid for its type, and prove soundness of the process. 2. 2.
We present to the best of our knowledge the first algorithm for the inference of typed-norms that are relevant to infer the resource consumption, and prove soundness of the type inference step. 3. 3.
We extend our approach to handle polymorphic types and context-sensitive norms. 4. 4.
We perform an experimental evaluation and compare the results with those obtained using other systems [Hoffmann et al. (2012), Serrano et al. (2014)].
This article is an extended and revised version of a conference paper that was published in the proceedings of LOPSTR 2013 [Albert et al. (2014)]. The main extensions w.r.t. the conference paper affect all points above. As regards (1), we now provide a semantics for the rule-based representation and for the abstract representation and prove soundness of the transformation process, while [Albert et al. (2014)] did not have soundness results. \cbstart(2) The formalization of the algorithm for the inference of typed-norms and its soundness are new contributions of this article. In [Albert et al. (2014)] the inference algorithm was informally presented without any theoretical result, but in this extended revised version we present a completely formalized data-flow algorithm for inferring typed-norms, prove its termination and also prove that the detected typed-norms cover those that may affect the program executions, i.e., the inference algorithm is correct. \cbend(3) Also, in [Albert et al. (2014)], we had considered only monomorphic types. (4) The experiments of [Albert et al. (2014)] have also been improved to deal with the same benchmarks as in related work [Hoffmann et al. (2012), Serrano et al. (2014)] and a comparison with these systems has been included. We also have analyzed an industrial case study to show the performance of our approach when handling larger programs.
1.2 Organization of the Article
The article is organized as follows. In Section 2 we describe the syntax and the semantics of the rule-based representation. Section 3 presents our transformational approach to resource analysis with typed-norms. We start by reviewing the concept of typed-norm in Section 3.1. It is then extended to symbolic typed-norm and used to define the program abstraction in Section 3.2. Soundness of the transformation is proven in Section 3.3. Section 4 presents a typed-norms inference algorithm that is essential for the scalability of our approach. It infers the smallest sets for the typed-norms that are relevant for the inference of upper bounds. Section 4.1 formalizes the inference process and Section 4.2 proves its soundness. In Section 5.1 we describe the extension of our approach to handle polymorphic types. Section 6 contains our experimental evaluation, Section 7 compares our approach to related work, and Section 8 concludes. \cbstartFinally, A contains the proofs of the theoretical results. \cbend
2 A Rule-based Language
To simplify the presentation, we formalize our approach on a compact program syntax called rule-based representation (RBR) that contains program rules, pattern matching, and assignment. It already incorporates static single assignment [Cytron et al. (1991)] (each variable is assigned exactly once). Recursion is the only iterative mechanism and rule guards are the only conditional constructions in the RBR. Although simple, the RBR syntax can represent programs from different programming languages by means of an intermediate translation. For example, the RBR can be obtained from Java programs [Albert et al. (2012)], from the functional part of Abstract Behavioral Specification [Johnsen et al. (2012)] (ABS) programs [Albert et al. (2015)], and from the imperative part of ABS [Albert et al. (2015)]. This RBR can handle core-Prolog programs as well simply by interpreting the pattern matching as unification. Interestingly this does not require any further change in our size abstraction since our abstract programs are actually constraint logic programs. However, we note that analyzing abstract programs that originate from logic programs for cost should be done by an analyzer that takes failure into account [Serrano et al. (2014)].
2.1 Syntax of the rule-based language
In order to present typed-norms and its impact on termination and resource analyses in a clear way, for now we will consider only monomorphic types, although in Section 5.1 we will present the extension to polymorphic types.
Definition 1** (Monomorphic types)**
A monomorphic type can be a built-in data type as Int or an algebraic data type defined as:
\begin{array}[]{l@{~~~~~}lll}\hfil~{}~{}~{}~{}~{}&\mathit{Dd}&::=&\mathit{data}~{}\mathit{D}=\mathit{Alt}~{}[\overline{\mid\mathit{Alt}{}}]\\ \hfil~{}~{}~{}~{}~{}&\mathit{Alt}&::=&\mathit{Co}[(\overline{T})]\\ \end{array}**
where represents a data constructor and the notation represents an optional sequence of elements . For simplicity, we assume that recursive types are in direct recursive form, otherwise, we could consider mutually recursive types to be the same type.
Example 1** (List of integer numbers)**
Using the syntax presented in Def. 1 we can define the data type of integer lists (IntList) as follows:
data IntList = Nil | Cons(Int, IntList)**
In this case, the type of the nullary data constructor Nil is IntList, and the type of the binary data constructor Cons is Int IntList IntList.
We define programs in rule-based representation (RBR programs in the sequel) as a set of data declarations followed by typed procedures:
Definition 2** (RBR syntax)**
[TABLE]
RBR programs P are formed by an optional set of data declarations () followed by a set of typed procedures (). A typed procedure begins with a type declaration stating the types of its input arguments () and the types of its output arguments (). After the type declarations there is a set of guarded rules (), where is the head of the rule, the guard specifies the conditions for the rule to be applicable and are the statements in the rule’s body. For clarity, we sometimes enclose input and output arguments with angles “” and “”, i.e., and . If a program P has rules, we say that and represents the i-th rule of . Guards and , where and and are of the same type, check if the value stored in variable matches with pattern . Patterns () are data constructors applied to variables. Terms () can be expressions (variables, integer numbers or arithmetic operations over expressions) or data constructors applied to properly typed terms (e.g., Cons, where has type Int and has type IntList). Terms not containing variables are called closed terms (a.k.a. ground terms). \cbstartWe assume that RBR programs are well-typed, i.e., every term and subterm in the program (including variables) have a type assigned that is coherent with procedure type declarations and data constructor types, considering a standard monomorphic type system [Pierce (2002)]\cbend.
Example 2** (RBR program)**
Figure 1 contains the RBR of a program with three principal procedures: fact computes the factorial of an integer number, factSum traverses a list of integer numbers and adds the factorial value for each element, and main is the entry point of the program that invokes factSum with the unitary list Cons(2,Nil). Both fact and factSum have only one rule each, which establishes the value of the accumulator and invokes while_n. Each loop is a procedure with two rules: one for finishing the loop and other for computing one iteration. For example, the first rule of while_0 has the guard 0 >= n (Line 9) to check that the loop has finished, therefore returning the input accumulator as output value. On the other hand, the second rule of while_0 (lines 11–14) contains the guard 0 < n that checks that the loop has not finished yet. In that case, updated values of prod and n are stored in prod1 and n1 (recall the use of static single assignment) and those values are used in the recursive call. Finally, main is a procedure with one rule that simply invokes factSum.
2.2 Semantics of the rule-based language
The rule-based language is evaluated using an operational semantics based on variable mappings and configurations, which are defined as:
Definition 3** (Variable mappings)**
A variable mapping is a mapping that associates values (namely integer numbers or ground constructed terms) to variables . We use the symbol for empty mappings and for the union of variable mappings with disjoint domain. The notation represents the extension of with the new mappings (this operation redefines the previous mappings for variables if they appear in the domain of ). The application returns the value associated to variable , and returns the term resulting of replacing every variable in by its value in (similarly for patterns ).
Example 3** (Variable mappings)**
Consider two variable mappings and . Since and have disjoint domains its union is defined, with result . On the other hand, because the mapping has been redefined in the extension of .
Definition 4** (Configurations)**
A configuration \cbstart(or call stack\cbend), denoted as , is a sequence of activation records. An activation record is a triple of the form where is a procedure name,111While procedure names are not really needed in the operational semantics, they increase clarity and simplify the proofs in A. is the next statement to execute, is the sequence of statements after , and is the variable mapping that stores the values of the procedure parameters and local variables. Elements in sequences are separated by dots, where the last element of the sequence can represent the rest of the sequence (for example or ), and we overload the symbol to denote empty sequences.
Figure 2 shows the rules of the operational semantics () that evaluates RBR programs. We consider a function that evaluates a term under a variable mapping , returning a value , and a function that checks if a guard is satisfied under a variable mapping , returning a new (possibly empty) mapping that performs pattern matching.222The straightforward definition of these functions can be found in A. Figure 2 contains 3 rules. Rule evaluates activation records where the next statement is an assignment. In that case the value obtained from the right-hand side is introduced in the variable mapping of the activation record. Rule evaluates an activation record where the next statement is a procedure call . The first step is to obtain a fresh version of a rule of with all its variables renamed to avoid any collision. Then a variable mapping is created for parameter passing and this mapping is used to evaluate the rule guard . If the guard is evaluated to , a new activation record with the rule body and the variable mapping333Static single assignment guarantees that does not appears in , so we could use mapping union instead of extension. (where is generated during the evaluation of the guard) is inserted in front of the configuration. Note that the activation record of the caller stores the relation between the output values of the rule and the output parameters of the call. Finally, rule handles empty activation records, which are removed after the output values of the callee are stored in the variable mapping of the caller.
When needed, we decorate steps with two values : the semantic rule applied—(1), (2) or (3)—and the program rule number used, considering the whole program. If the step is not a procedure call i.e., it uses semantic rules (1) or (3), the program rule number is set to . Examples of these decorations can be seen in the next example:
Example 4** (Evaluation of an RBR program)**
The evaluation of the main procedure in Figure 1 proceeds as shown below. For simplicity, when obtaining fresh names for variables—rule (2) in Figure 2—we simply use subscripts (n) with the same number as the current configuration , we have underlined the statement that controls each step, and we write for several -steps. Note that the program rule of factSum is the rule in the program, and the second rule of while_1 is the rule in the program.
[TABLE]
Definition 5** (Traces and sequences of steps)**
A trace is a sequence of -steps from an initial configuration . Given a trace , its steps are defined as , i.e., the sequence of step decorations of the -steps in . Finally, the set of trace steps combines the steps of all the possible traces starting from a given configuration. Formally, . \cbstartThese notions will play an important role when defining the soundness of the abstraction (Section 3.3) and the soundness of the typed-norms inference in Section 4.2. \cbend
\cbstart
Let us informally discuss how the operational semantics is typically instrumented to account for cost of traces. We first assume that our language includes a special instruction tick(), where is a number, which is used to simulate a consumption of resources. Note that can be negative to simulate release of resources as well. In practice, we also allow the use any arithmetic expression instead of , but for simplicity we assume it is a constant. Next, we instrument activation records with a global resource consumption counter called cost, and add a corresponding semantic rule for tick() that simply sets cost to . A trace is said to be valid if cost never takes negative values, i.e., cost is initialized with an amount of resources that is enough to carry out this particular execution. The cost of a trace is the minimal initial value for cost that makes it valid. Finally, a function that maps initial configurations to nonnegative values is called an upper bound on the worst-case cost if, for any initial configuration , setting the initial value of cost to guarantees generating valid traces only. It is called a lower bound on the best-case cost if, for any initial configuration , setting the initial value of cost to a value smaller than generates an invalid trace. \cbend
3 Size Abstraction Using Typed-Norms
The resource analysis framework we rely on [Albert et al. (2007), Albert et al. (2015)] performs two phases: (1) the program is transformed into an abstract representation where data are replaced by their sizes, and (2) such intermediate program is then analyzed to obtain upper/lower bounds on the resource consumption. In this section we will present how to abstract the size of program data using typed-norms and we will show that this abstraction is sound wrt. the original semantics of RBR programs. The second phase, performed by cost relation solvers like PUBS [Albert et al. (2013)], CoFloCo [Flores-Montoya and Hähnle (2014), Flores-Montoya (2016)] or the solver in CiaoPP [Serrano et al. (2014)], is independent of the technique applied to abstract data sizes and therefore will not be covered in this paper.
3.1 Preliminaries on Typed-Norms
In order to obtain the abstract representation of a program, we first replace data with numbers representing their sizes and then transform each instruction to linear constraints that reflect how these sizes change. The mapping from data structures to sizes is done by means of size functions (usually called norms). The most well-known norm used in the literature is term-size [Bossi et al. (1991), Bruynooghe et al. (2007)], which counts the number of constructors in a given data structure:
Definition 6** (Term-size norm)**
*The size of a term using the term-size norm is defined as: . *
Notice that term-size is defined for terms containing only data constructors and cannot handle integer numbers.
Example 5** (Term-size norm)**
Consider the following program that uses binary trees (BT), list of binary trees (BTL), and lists of lists of binary trees (BTLL).
The trees function counts the number of binary trees in a list of type BTL, and sumtrees counts the number of binary trees in all the lists inside a list of type BTLL. Using the term-size norm, an empty list N or NL has size (one data constructor), whereas the list C(E,C(E,N)) has size (three list constructors plus two empty binary tree constructors). The function sumtrees traverses every element of the list for computing its number of trees. Using the term-size norm, a static analysis obtains a complexity of \cbstartfor the function sumtrees\cbend, where is the total size of the original list (i.e., the number of list, list of lists, and binary tree constructors). The size of each inner BTL list is bounded by , so each call to trees will contribute to the overall complexity. This is an example where term-size is not very precise, as it does not keep separate the information about the length of the BTLL list (), the length of the inner BTL lists (), and the size of the binary trees () to obtain a more accurate complexity of . Note that the size of the binary trees does not play any role in the actual complexity of sumtrees because the binary trees are not traversed.
In order to overcome the mentioned imprecision we use typed-norms, which distinguish data constructors according to their types. Using this kind of norms we can measure the length of a list and the size of its elements separately, similarly to what has been done in the context of termination analysis of logic programs [Bruynooghe et al. (2007)]. Before introducing typed-norms, we present some notation about types that will be used throughout the paper:
Definition 7
Given two types and , we say that depends on , written , if the definition of type uses (either directly or transitively) type . Relying on this notion, we define the set of constituent types of T as , i.e., all the types involved in the definition of including itself. If we say that is a recursive type. \cbstartWe use to refer to the type of a term in the i-th rule of the program, or simply if the rule is clear from the context (note that the same variable can have different types in different rules). As the program is well-typed then every term has a monomorphic type assigned, so simply returns that type.\cbend
By definition of —Line 1) in Figure 1—we have that and , therefore is a recursive type and .
Definition 8** (Typed-norms for closed terms)**
We consider two typed-norms for computing the size of a closed term regarding a type : and . We will not contemplate integer values for the first norm but non-negative integer numbers (denoted as ) as we explain later.
[TABLE]
[TABLE]
Note that, in the last case of the definition of , the of an empty set is if is , and [math] otherwise. In principle, this depends on the domain of the elements whose we are taking. The intuition behind these typed-norms is to count the number of data constructors of type that appear in term . For integer numbers they simply return their value, whereas for constructed terms they check whether the term has type or not in order to count the constructor in the head. The difference is how to handle those nested subterms of type that occur in a bigger term of type different from : sums the sizes of those subterms, whereas just keeps the size of the maximal subterm. The reason to constrain to instead of is that adding the integers inside a data structure provides a size that is not sound when negative values are involved. For example, would be [math]. On the other hand, , so we know that any integer inside the list is smaller than or equal to . We claim that suits better in the static analysis framework we consider, as we explain in Example 6.
In our rule-based language we only consider the basic type of integer numbers, so the rest of values are constructed using data constructors . However, typed-norms could be easily extended to support more basic types by providing a suitable case in the definition of and . For example, if were a basic type, we could define the size of strings as their length: (similarly for ).
Example 6** (Typed-norms for closed terms)**
\cbstart
Consider the program involving lists of binary trees shown in Example 5. The program does not contain integer values but only constructed data, so in this case both typed-norms and are sound. \cbendRegarding the list types BTL and BTLL, both typed-norms obtain the same values: and because they contain exactly one list constructor. They also coincide in the list of type BTLL with two elements CL( C(E,C(E,N)), CL(N,NL) ) regarding the type BTLL——as both count the number of list constructors: two constructors CL and one NL. However, the two norms differ wrt. type BTL: because it is the sum of all the BTL constructors in the list, 3 in the first element and one in the second element, but as it is the maximum number of BTL constructors that appear in some element of the list. This difference has an impact on the concrete upper bounds obtained by static analysis. For example, the function sumtrees defined in Example 5 has an asymptotic complexity of in both cases, where is the size of the list of lists BTLL and is the size of its inner lists BTL. The size of any list is the same using both typed-norms (), however, the size of its elements of type BTL will differ ( ). Since the static analysis framework we are considering is compositional and assumes worst-case scenarios for each iteration, the upper bound obtained using \cbstartwill in general be tighter.\cbend
As a final comment, note that we could also define a typed-norm analogous to that estimates the minimum value, by replacing “” with “”. This would be useful in situations where the upper bounds depend on the minimum value that an inner element of type can take, for example in recursive definitions where the value increments in every invocation. Similarly, if we replace the sum in the definition of by “”, then we estimate the depth of terms instead of the number of their constructs. This is useful in cases like the example in Section 1. Note that all these norms can be used at the same time, so the size of some elements could be measured using minimum, maximum, depth, etc., if these values are relevant for the cost.
3.2 Our Transformational Approach
Next we describe how to use typed-norms to translate RBR programs to abstract programs that only contain procedure calls and constraints between sizes. From this information the resource analysis framework can produce cost relations [Albert et al. (2011)] to obtain the desired bounds. The main feature of our approach is that we allow the use of several abstractions for the same variable at the same time, as in [Bruynooghe et al. (2007)]. Thus, we can estimate the size of a term using different measures, and even relations between sizes of different measures which might be crucial for precision as claimed in [Bruynooghe et al. (2007)].
This is important since two different parts of the program might traverse two different parts of the same data structure, so having both measures allows us to provide tighter bounds. Since rules will usually contain variables, we need symbolic versions of typed-norms. These versions extend the typed-norms presented in Definition 8 with new cases for handling variables. In the following definition we only show the symbolic typed-norm , but it is analogous for . Notice that we use the same notation to represent typed-norms for closed terms and their symbolic counterparts because the version used is clear from the context.
Definition 9** (Symbolic typed-norms)**
The symbolic typed-norm to compute the size of a term (possibly with variables) regarding a type is defined as:
[TABLE]
Note that, as in the closed case, the max of an empty set is if , and [math] otherwise. When a variable is abstracted using a type , it generates a variable . On the other hand, abstracting variables not related to using the type generates the size . This special case is useful to obtain precise sizes in terms containing integer numbers and variables like pairs . For example, we get because , which is more precise than a size of [math] obtained if . Integer numbers are abstracted to themselves if is , and in integer expressions with both subexpressions are abstracted recursively. The cases for data constructors are similar to Definition 8. Finally, if the type is not valid wrt. the type of , i.e., cannot include any subterm of type , then the symbolic typed-norm simply returns [math]. Notice that the symbolic version is equivalent to the typed-norm defined in Definition 8 for closed terms.
As an example, consider the term where both and have type . The abstractions wrt. and are and . If we consider a more complex term like \cbstartCons(z,Cons(5,Nil)) \cbendwe have that and .
In some cases it is inevitable to measure the size of a variable wrt. all its dependent types——to bound the resource consumption of a program. However, \cbstartoften many types \cbenddo not play any role for termination or resource consumption and therefore can be safely ignored. In order to consider only those important types we use the notion of relevant types. For every variable and rule in the program, we write to refer to the set of types wrt. which we want to measure the size of that variable. In Section 4 we explain how to automatically infer these types. If the rule is clear from the context, we usually omit the subscript. Note that the set must be a subset of all dependent types of a variable in a program rule, i.e., .
Using the above symbolic typed-norms and the notion of relevant type-norms we can define our transformation of RBR programs. Figure 3 contains the definition of the function that abstracts guards, statements, rules, and procedures into \cbstartprocedure calls and conjunctions of arithmetic constraints between sizes444Relations () between linear expressions containing integer variables representing sizes. . \cbendIn all these cases we assume that given a type the variable is an integer valued variable representing the size of (the value of) w.r.t. the typed-norm . If , then we implicitly assume , as constructed terms cannot have negative size. For a sequence of variables , we consider is a sequence that results from replacing each variable by , where .
As shown in Figure 3, the trivial guard true is transformed into the truth value , the identity element of conjunction. Conjunction of guards () is abstracted into the conjunction of their size abstractions. Arithmetic guards are abstracted by replacing their variables with . For example, the expression is directly abstracted as , whereas is abstracted as . Match guards——generate a conjunction with as many elements as types in . Each element in the conjunction will be an equality between the variable (the size of wrt. ) and (the size of the term wrt. , computed using the symbolic typed-norm). For example, if , the abstraction of will be the conjunction . On the other hand, the guard is abstracted to the truth value , as it does not provide any information relating the sizes of and . Regarding statements, the abstraction of a procedure call simply replaces each variable with variables related to its different sizes555When needed, we use angles “” and “” to enclose input and output variables, as we do in Section 2. (for example, is abstracted to considering and ). Likewise, an assignment is abstracted by generating a conjunction of equalities between the different variables and their sizes. The abstraction of a rule proceeds compositionally by abstracting its head, the guard and all the statements. As mentioned before, non-integer variables in a rule are assumed to be non-negative. The abstraction inserts these constraints explicitly using function , where is the set of variables occurring in the rule. The definition of is the following:
[TABLE]
Finally, the abstraction of a procedure is the abstraction of all its rules.
Definition 10** (Program abstraction)**
Given a program , its size abstraction is obtained by abstracting each procedure, i.e., .
Figure 4 shows the abstraction for a fragment of the running example \cbstartgiven in Figure 1\cbend. When using the typed-norm in Definition 9, might include constraints of the form where is an arithmetic expression that involves “”. If the second phase of the resource analysis framework does not support this kind of constraints, they can be approximated using linear constraints as follows: (1) replace the sub-expression by a new auxiliary variable , (2) add the constraints ; and (3) if for all , then add the constraint as well. This process might be applied repeatedly in case of nested or multiple occurrences of . In the case of nested “” expression, we try to flatten them first. Notice that this approximation is not the only approach to handle constraints, as they could also be approximated as in [Alonso-Blas et al. (2011)]. Moreover, in practice, the constant (that we used in the definition of norms) can safely be replaced by any other constant — replacing it by the minimum value that syntactically appears in the program (or by [math]) works well in practice.
Example 7** (Program abstraction)**
Figure 4 shows the abstraction using the typed-norm of the procedures factSum, while_1 and main from Figure 1. Each line contains a comment indicating if it involves non-negativity constraints or is the abstraction of a particular line of the original RBR program. We assume that list variables ( l, l1) have , whereas integer variables ( sum, sum”, e, prod, sum1, r”) have . The abstraction proceeds rule by rule, abstracting rule heads, guards, and statements; \cbstartso rule numbers (gray circles in the left margin) does not change from the concrete program. \cbendNotice that the constraints from the guard are combined with the non-negativity of all the non-integer variables of each rule. The most interesting part is the abstraction of the guard match(l, Cons(e,l1)) in the second rule of while_1. For the type , the guard generates LInt = max(EInt, L1Int)—which could be approximated by linear constraints—and for the type the guard generates the constraint , stating that the new list l1 is one constructor smaller than l.
\cbstart
Note that deterministic programs can be abstracted to non-deterministic programs if different terms of the same type have the same size. For example, consider the simple data type data Dir = Up | Down, and the following price :: procedure:
Since and in both rules, and , the program abstraction results in the nondeterministic version:
However, the possible nondeterminism introduced by the abstraction is not a problem from the point of view of soundness, as any trace in the deterministic program can be performed in the nondeterministic abstraction. This result is presented in the next section. \cbend
3.3 Soundness
The abstraction presented in the previous section transforms RBR programs into abstract programs, which the resource analysis framework can take as input for computing bounds. Therefore, we ensure that the size of variables that are observed in traces of RBR programs can be observed in traces of their abstractions. In order to prove this soundness result, we need an abstract operational semantics for abstract programs, which is an adaptation of the operational semantics for RBR programs in Figure 2. This abstract operational semantics evaluates abstract configurations step by step:
Definition 11** (Abstract configurations)**
An abstract configuration, denoted as , is a sequence of abstract activation records followed by a conjunction of constraints. Similarly, an abstract activation record, denoted as , is a sequence of constraint conjunctions and abstract procedure calls . We use Greek letters (, , …) to denote constraint conjunctions, to denote one constraint conjunction or one abstract procedure call, and to denote a sequence .
Figure 5 contains the operational semantics that evaluates abstract configurations regarding an abstract program . Rule handles abstracted assignments, which are translated into conjunctive constraints . If is consistent wrt. the global set of constraints () then it is added to the global constraints and the evaluation continues with the next element of the abstract activation record. Rule handles procedure invocation. First, we obtain a fresh rule \cbstartfrom the abstract program \cbend. Then, if the parameter passing, the guard and the global constraints are consistent \cbstart(),666We use the notation to denote the conjunction of constraints . \cbenda new abstract activation record containing the body of the procedure is inserted in the abstract configuration.
Similarly to the semantics, the relation between the output variables and the parameters is stored as a mark. Finally, rule removes an empty abstract activation record if the output variables are consistent with the global constraints. Notice that global constraints accumulate variables from all the abstract activation records, even those whose execution has finished and therefore have been removed. \cbstartSimilarly to , we use the notion of abstract trace where its steps are defined as , and the set of abstract trace steps as the combination of the steps of all the possible abstract traces starting from a given configuration (. \cbend
Example 8** (Evaluation of an abstract program)**
The abstract evaluation of the main procedure in Figure 1 proceeds similarly to Example 4. When obtaining fresh names for variables—rule () in Figure 5—we use superscripts with the same number as the current abstract configuration . For simplicity, the statement or constraint that controls each step is underlined, the constraints in each abstract configuration are denoted as and is omitted. \cbstartEach step is decorated with the semantic rule and abstract program rule used, \cbendand refers to many -steps.
[TABLE]
In the first program rule of while_1 (Line 6 in Figure 4) cannot be used to evaluate the call because the guard is not compatible with the global set of constraints, namely: . Therefore, the evaluation can only proceed with the second rule of while_1 (Rule 6, Line 11 in Figure 4).
The soundness result relates -traces starting from a configuration with abstract -traces starting from the abstraction of . The following definitions present the abstraction of configurations, which is based on and the abstraction of statements defined in Figure 3.
Definition 12** (Variable mapping abstraction)**
The abstraction of a variable mapping is defined as . Notice that is a conjunction of equalities between distinct variables and integer values, since are concrete values and therefore generates integer numbers.
Definition 13** (Configuration abstraction)**
Let be a configuration. Its abstraction is defined as: where . Notice that is a conjunction of equalities between distinct variables and integer values, since every activation record uses fresh variables and is the conjunction of abstracted variable mappings.
The following \cbstarttheorem establishes the soundness of our translation using typed-norms: for any trace it is possible to create an abstract trace with the same steps.\cbend
\cbstart
Theorem 1** (Soundness)**
If then there is an abstract trace such that , and .
\cbend
Intuitively, the above theorem states that the sizes of the variables of the concrete configuration , w.r.t. the corresponding norm, define a model of the abstract state configuration.
\cbstart
Let us now informally explain how abstract programs preserve cost. The idea is to simulate the same process that we have described at the end of Section 2. First, during the abstraction phase the instruction tick() is kept in the abstract program. Next, we instrument abstract activation records with a cost counter , and add a corresponding abstract semantic rule for tick() that simply sets to . An abstract trace is said to be valid if never takes negative values, i.e., cost is initialized with an amount of resources that is enough to carry out this particular abstract execution. Finally, a function that maps initial abstract configurations to nonnegative values is called an upper bound on the (abstract) worst-case cost if, for and , setting the initial value of to guarantees generating valid abstract traces only. It is called a lower bound on the best-case cost if, for any , setting the initial value of to a value smaller than generates an invalid abstract trace. The function is typically given in terms of the input abstract variables, i.e., in terms of the sizes of the corresponding data. The black-box component that infer the cost of the abstract program, infers such functions.
Given the statement of Theorem 1, it is easy to see that if is is an upper (resp. lower) bound on the abstract worst-case (resp. best-case) cost, then it is also an upper (resp. lower) bound on the concrete cost (up to rewriting it in terms of typed-norms instead of corresponding abstract variables).
As a final remark, let us mention that although the paper and the experiments focus on upper bounds inference, our transformation is valid also to infer lower bounds as it ensures that the cost of every trace is preserved by the transformed program. \cbend
4 Inference of Relevant Types
As explained in the previous section, in order to abstract the size of a variable in the rule number we consider a set containing its relevant types. It is safe to assume that contains all the constituent types of . However, the \cbstartcomplexity \cbendof the solving phase that obtains bounds grows exponentially with the number of variables involved, so it is very important to obtain the smallest sets for the relevant typed-norms. In this section we will present the inference algorithm for relevant types as well as its soundness result.
As it was observed in [Albert et al. (2008)], variables that do not affect the cost can be removed from the abstract program and the bounds obtained do not change. In our setting, the cost of a program depends primarily on the number of recursive calls performed, which is affected by the guards in the rules. Therefore, any variable that does not affect directly or indirectly the value of a guard can be ignored. We push this idea further \cbstartand detect, from each variable, those types that do not affect \cbenda guard evaluation directly or indirectly. These types are useless from the point of view of resource analysis, so they should be discarded from the set of relevant types of the variable.
The main intuition behind the algorithm for inferring relevant types is detecting those constituent types of a guard variable that are involved in the guard evaluation. For example, in match(l,Cons(x,xs)) we say that the type IntList of variable l is involved in the guard evaluation because the pattern matched is of type IntList. On the other hand, the type Int of the same variable is not involved in the guard evaluation because it can succeed or fail regardless of the possible Int values stored in l. Once we have this information from the guards, we propagate it backwards to include those relevant types in the rest of variables that can affect the value of the guard variable. \cbstartRelevant types will be propagated to the rules formal parameters, where they will be combined \cbendwith the relevant types from the rest of rules of the same procedure. Similarly, when invoking a procedure, the relevant types of the formal parameters will be included in the variables of the actual arguments.
Example 9** (Intuition of relevant types inference)**
Consider the RBR program in Figure 1.
From Line 9 (Rule
2
) we discover that Int is involved in the arithmetic guard 0 >= n for procedure while_0, so n.
The procedure while_0 is invoked in
Line 6 of Rule
1
, so that
this relevant type is propagated to the fact rule, i.e., n. Similarly, fact is invoked in Line 27 of the Rule
6
corresponding to the while_1 predicate, so Int will be a relevant type for variable e. That rule contains a guard match(l, Cons(e,l1)) in Line 26, so Int will be propagated from variable e to l, i.e., l. After another step of propagation in Rule
4
(procedure factSum) we will obtain that . In summary, the relevant type Int detected in Rule
2
has been propagated to Rule
4
following the path
2
1
6
4
.
A similar but shorter process would produce that l by propagating it in the path
5
4
. In this case all the constituent types of IntList are relevant types for the parameter l in factSum (Rule
4
), as the resource usage depends both in the length of the list and its stored numbers.
4.1 Formalization of relevant type inference
We formalize the relevant types inference algorithms as a data-flow analysis that constructs a mapping for the complete program. This mapping relates a set of relevant types to every variable in the program, and is defined as follows:
Definition 14** (Rule and program mappings)**
A rule mapping maps variables in a rule to sets of types. The domain of is denoted as , and returns the set of relevant types related to variable . We use to denote an empty rule mapping.
A program mapping aggregates the mappings of all the rules of a program , considering that has rules. We use the notation to refer to the ith rule mapping in , and to refer to the set of relevant types of variable in the rule.
Rule and program mappings support the following set of standard operations (extension, ordering, combination, restriction and renaming) that we use in the formalization of the inference algorithm.
Definition 15** (Operations on rule and program mappings)**
A rule mapping can be extended to a program mapping by assigning it to the rule and considering the empty mapping for the rest of rules. The extension of a rule mapping is denoted as .
We consider the natural order of rule mappings based on subset inclusion. We say that iff. . This order is extended to program mappings as follows: iff. for all rule .
Rule mappings can be combined using the commutative and associative operator , defined as:
[TABLE]
If a variable appears in both rule mappings, its typed-norms are combined, otherwise it takes the typed-norms from the mapping where \cbstartit appears. \cbendNotice that variables not appearing in \cbstartnor \cbend are undefined in . We also consider the combination of program mappings of the same length by proceeding element-wise. We overload the symbol :
[TABLE]
When combining sequences of mappings we use the notation , and for combining sets of program mappings.
Rule mappings can be restricted to a set of variables , formally:
[TABLE]
Finally, we use to denote the renaming of a rule mapping , where the variables are renamed to (we consider that and ):
[TABLE]
Example 10** (Operations on rule and program mappings)**
Consider the following rule mappings:
[TABLE]
We have that ( for variable ) and ( for variable ) but () and ( for variable ). Regarding combination, we have that and . The restriction of rule mappings simply reduces the domain, so and \cbstart. \cbendFinally, the renaming changes the variables in the domain, for example because is renamed by and is renamed by .
Next we explain the inference algorithm, which is based on the definition of that is given in Figure 6. Let be a program with rules (). Then the relevant types inferred for the program is the least fixed point of the function , i.e., . \cbstartNote that is a monotone function and is a finite complete lattice, so the least fixed point can be computed exactly. \cbendConsidering this fixed point, the notation is defined as . The function takes a program mapping and extends it with the new information from the different rules of the program. For each program rule, it computes and combines them with the current program mapping.
The function takes the current program mapping and extends it using the information in rule number . This function processes a rule by combining the information from the guard (), the body () and also collecting the relevant types from the parameters of all the rules of the same procedure (mapping ). This last step, which requires a renaming of the parameters to , forces rules of the same predicate to have the same relevant types and therefore be abstracted to rules with the same name and number of abstracted parameters.
The function \cbstart \cbendtakes a guard and a program mapping and generates new relevant types in a rule mapping. This function proceeds by combining the new relevant types obtained in every fragment of the guard. A true guard always succeeds, so it does not impose any relevant type. Arithmetic guards requires Int as a relevant type for any variable in the expression. If there is a guard and has a recursive type then this type is included as a relevant type for . Non-recursive types are ignored because they cannot directly affect the number of recursions. However, if it contains an inner recursive type it will be detected and propagated when computing the fixed point. Finally, in and guards all relevant types already detected in the variables of the pattern are propagated to the matching variable . \cbstartNote that in guards, the recursive type of will be only relevant if the procedure is recursive, otherwise it will not affect the execution. Therefore, in practice we can infer relevant types for recursive procedures and then transfer that information to non-recursive ones. \cbend
The function is overloaded. When it takes a set of statements and a program mapping then it traverses all the statements, invoking for each one and combining the results. When is invoked with only one statement then it propagates the information from according to the statement processed. If then all the relevant types detected for that are constituents of the type of some variable of the right-hand side (set ) are propagated to that variable . Finally, if the statement is a procedure call then the function propagates the relevant types of the parameters. The set propagates all the relevant types for the input parameters of the rules for the same procedure to the input parameters of the current rule . Similarly, the set propagates the relevant types of the output parameters of the current rule to the output parameters of the rules of the same predicate. Sets and contain program mappings that are combined.
The result of the relevant types inference always exists, and it can be effectively computed by iterating starting from the empty program typed-norms mapping, as we state below.
Theorem 2
Consider a program such that . Then exists and is the supremum of the ascending Kleene chain starting from .
Example 11** (Relevant types inference)**
Here we explain how the functions in Figure 6 produce the result intuitively explained in Example 9. We consider as the RBR program in Figure 1.
The process starts with an empty program mapping . In the first iteration , when processing Rule
2
we obtain that because of the call 0 >= n.
In the next iteration, , so that information is propagated to Rule
1
, i.e., , thanks to the invocation
\cbstartwhile_0(n, prod, prod’)’. \cbendThis information is collected in the set from Rule
2
.
Similarly, that information is propagated to \cbstart from the procedure call fact(e, prod) \cbendin Line 27 by invoking fact(e, prod).
In the next step, the relevant type of \cbstarte \cbendis propagated to l——when processing the guard with match(l, Cons(e,l1)).
Finally, the relevant type in the input parameter \cbstartl \cbendof Rule
6
is propagated to Rule
4
when processing the procedure call with \cbstartwhile_1(l, sum, sum’)’, \cbendso . As the function is monotone, all this information will be kept in the least fixed point.
4.2 Soundness of relevant types inference
The inference of relevant types previously presented obtains a set of interesting types for every variable in every rule in the program. In order to state that those inferred types are enough to guide the traces, we need to introduce a new notion: value variation.
Definition 16** (Value variation)**
We say that is a variation of wrt. a type (written ) if results from replacing some components of with type for other components of the same type. For example, —replacing by —and —replacing the list completely.
Using the notion of variation, we can define when a type is useful for a variable: will be useful for a variable in the i-th rule of a program if changing the value of that variable changes the set of trace steps starting from a configuration of rule . Formally:
Definition 17** (Useful type)**
We say that a type is useful for variable in the i-th rule of a program —written —if there are configurations and such that:
, where are statements from the i-th rule of 2. 2.
** 3. 3.
* with and * 4. 4.
**
Finally, we can state the soundness result of the inference of relevant types: if changing some components of type of a variable in rule affects the possible trace steps—i.e. —then that type will be inferred by our process—i.e. .
Theorem 3** (Soundness)**
If then .
5 Extensions
The purpose of this section is to propose two extensions, which are very useful in practice, to the previous analysis. This, moreover, demonstrates that extending our framework is very easy.
5.1 Extension to Polymorphic Types
Let us now describe the extension of our approach to handle polymorphic types. First, we extend the data type definition in Section 2 by adding polymorphic types.
Definition 18** (Polymorphic types)**
Given a countable set of type variables , a polymorphic type can be either a monomorphic type , a type variable , or an algebraic data type defined as:
\begin{array}[]{lll}\mathit{pDd}&::=&\mathit{data}~{}\mathit{D}\langle\Gamma\rangle=p\mathit{Alt}~{}[\overline{\mid p\mathit{Alt}{}}]\\ p\mathit{Alt}&::=&\mathit{Co}[(\overline{T(\Gamma)})]\\ \end{array}**
where is a finite subset of variables and any type variable in the right-hand side of a data type definition must be in the finite subset of variables in the left-hand side.
Example 12** (Polymorphic list)**
Using the syntax presented in Definition 18 we can define the data type of a polymorphic list (ListA) as follows:
data ListA = Nil | Cons(A, ListA)**
The type IntList in Definition 1 can be represented by the type ListInt, where the parametric type A is instantiated to Int.
The RBR program syntax in Definition 2 can also be extended by adding polymorphic typed procedures in the following way:
[TABLE]
In this context, properly typed terms must be understood as the natural extension to polymorphic types. Figure 7 contains a RBR program with polymorphic types. In order to avoid confusion with the angles used to represent polymorphic types, in the examples of this section we use and to separate input and output arguments.
Analyzing RBR programs with polymorphic types can be done by transforming them into equivalent programs with monomorphic types only. This basically creates monomorphic versions of procedures (and types) for every use of the polymorphic types. Note that all possible uses (or instantiations) of the polymorphic types can be inferred using standard type inference algorithms. The following list of steps describes how such a transformation can be carried out:
- •
For every polymorphic type definition and every needed instantiation of the polymorphic type, we replace the polymorphic type definition by new copies using fresh monomorphic types and constructors.
- •
For every polymorphic procedure and every needed instantiation of its polymorphic type declarations, we replace the polymorphic procedure by new copies using fresh monomorphic types. Calls to procedures are also replaced by their properly typed monomorphic translations.
Note that, in addition, in some situations the user might be interested in analyzing a polymorphic procedure without providing any calling context, i.e., the polymorphic type cannot be instantiated to a monomorphic one in such cases. To handle these cases, for every type variable A in the program, we create a corresponding fresh monomorphic type A together with a fresh constant a, and then use those types to instantiate the corresponding procedures.
Example 13
Consider the RBR program in Figure 7. We can analyze the program and extract that head and tail are called in the body of while_1 with the type A instantiated to . Therefore, we obtain the transformed program presented in Figure 8 and its abstraction in Figure 9. Note also that we have instantiated procedures head and tail with respect to the auxiliary monomorphic type A. This allows analyzing these procedures directly without considering the calling context of while_1.
\cbstart
Notice that, in this example, the number of arguments of head and tail procedures in the different instantiated versions of Figure 9 are the same, but they can differ depending on the instantiations of the type variables. In this way, we can keep the size relations between input and output instantiated abstractions (this is not possible if we have only one version of head and tail and a fixed number of arguments). \cbend
It is easy to check that any evaluation step given in the original RBR program is translated into an evaluation step in the transformed RBR program. The rest of the analysis follows from previous sections.
Another possibility for handling polymorphism could be to analyze every method once and instantiate that information in every particular method invocation. However, this is not an straightforward approach because different uses of the same polymorphic method can involve a different number of norms whose sizes must be related. Our approach is not completely modular and cannot be applied in every scenario (for example, polymorphic recursion cannot be handled), but provides a simple and effective way of supporting standard parametric polymorphism in our setting.
5.2 Context-Sensitive Norms
\cbstart
In this section, we propose a way to improve the precision of typed-norms using annotations. When we define a data type, we can use the same type in different positions and for different purposes. \cbstart\cbendIn such cases, if the type is not part of the recursive data structure, we can distinguish the different uses of the same type. The corresponding upper bounds are typically more precise and provide insights on the complexity of processing each part of the data. Technically, to achieve this, we \cbstart\cbendneed to annotate non-recursive types with their positions in the data type structures.
Definition 19** (Annotated Monomorphic types)**
An annotated monomorphic type can be a built-in data type as Int or an algebraic data type defined as:
\begin{array}[]{l@{~~~~~}lll}\hfil~{}~{}~{}~{}~{}&\mathit{Dd}&::=&\mathit{data}~{}\mathit{D}=\mathit{Alt}_{D}~{}[\overline{\mid\mathit{Alt}_{D}}]\\ \hfil~{}~{}~{}~{}~{}&\mathit{Alt}_{D}&::=&\mathit{Co}[(T_{D,\mathit{Co},1},\ldots,T_{D,\mathit{Co},n})]\\ \hfil~{}~{}~{}~{}~{}&T_{D,\mathit{Co},1}&::=&\mbox{\begin{cases}T&\mbox{ if is recursive in }\ T_{\mathit{Co},i}&\mbox{ if is non-recursive in }\end{cases}}\end{array}**
By unrolling annotated type definitions as trees we can extend to consider the annotations in the path from the root to the types. Figure 10 shows the type definition IntListPair = Pair(IntList, IntList) as a tree. From these annotations we obtain , where each element of the set is formed by a pair (type, chain of positions) that represents its annotations in the tree, and the empty chain is represented by . Notice that, if we do not have duplicated non-recursive types, we obtain the same number of norms (but annotated).
Symbolic typed norms are extended to deal with annotated types. The intuition in is that we traverse the annotated data structure using and when the annotation is empty (), we have reached the type-norm we want to compute and we can use a definition similar to Definition 9.
Definition 20** (Symbolic annotated typed-norms)**
The symbolic typed-norm to compute the size of a term (possibly with variables) regarding a type is defined as where:
[TABLE]
[TABLE]
By using , we obtain a sound abstraction in the sense of Theorem 1, and the relevant type inference presented in Section 4 could be easily adapted to consider annotated type-norms. In the following, we present an example that shows how this technique is applied in practice.
Example 14
Consider the program in Figure 11 (left) where the rule lengthP takes a non-recursive IntListPair structure and computes the length of the second list if the first list is empty; otherwise, returns the length of the first list. Since both lists are of the same type (IntList), we are interested in considering their norms separately to obtain a better upper bound. Figure 11 \cbstart(right) \cbendshows this abstraction777For simplicity, we use and instead of and , and and instead of and .. It basically distinguishes the uses of IntList by their position in the definition of IntListPair—note the use of and . Using this improvement, we can obtain the upper bound for the function instead of . Note that in this case .
\cbend
The proposed approach using annotated types to handle context-sensitive norms generates abstract programs without any type information where every parameter represents an integer value. Therefore, these abstract programs can be solved automatically to upper and lower resource bounds using the existing techniques with no additional adaptations.
6 Experiments
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Agha and Callsen (1993) Agha, G. and Callsen, C. J. 1993. Actorspace: An open distributed programming paradigm. In P roceedings 4th ACM Conference on Principles and Practice of Parallel Programming, ACM SIGPLAN Notices. 23–32.
- 2Albert et al . (2015) Albert, E. , Arenas, P. , Correas, J. , Genaim, S. , Gómez-Zamalloa, M. , and an d Guillermo Román-Díez, G. P. 2015. Object-Sensitive Cost Analysis for Concurrent Objects. S oftware Testing, Verification and Reliability 2 5, 3, 218–271.
- 3Albert et al . (2014) Albert, E. , Arenas, P. , Flores-Montoya, A. , Genaim, S. , Gómez-Zamalloa, M. , Martin-Martin, E. , Puebla, G. , and Román-Díez, G. 2014. SACO: Static Analyzer for Concurrent Objects. In T ools and Algorithms for the Construction and Analysis of Systems - 20th International Conference, TACAS 2014, E. Ábrahám and K. Havelund, Eds. Lecture Notes in Computer Science, vol. 8413. Springer, 562–567.
- 4Albert et al . (2011) Albert, E. , Arenas, P. , Genaim, S. , Gómez-Zamalloa, M. , and Puebla, G. 2011. Cost Analysis of Concurrent OO programs. In P rogramming Languages and Systems - 9th Asian Symposium, APLAS 2011, Kenting, Taiwan, December 5-7, 2011. Proceedings, H. Yang, Ed. Lecture Notes in Computer Science, vol. 7078. Springer, 238–254.
- 5Albert et al . (2011) Albert, E. , Arenas, P. , Genaim, S. , and Puebla, G. 2011. Closed-Form Upper Bounds in Static Cost Analysis. J ournal of Automated Reasoning 4 6, 2, 161–203.
- 6Albert et al . (2007) Albert, E. , Arenas, P. , Genaim, S. , Puebla, G. , and Zanardini, D. 2007. Cost Analysis of Java Bytecode. In P rogramming Languages and Systems, 16th European Symposium on Programming, ESOP 2007, Held as Part of the Joint European Conferences on Theory and Practics of Software, ETAPS 2007, Braga, Portugal, March 24 - April 1, 2007, Proceedings, R. D. Nicola, Ed. Lecture Notes in Computer Science, vol. 4421. Springer-Verlag, 157–172.
- 7Albert et al . (2008) Albert, E. , Arenas, P. , Genaim, S. , Puebla, G. , and Zanardini, D. 2008. Removing Useless Variables in Cost Analysis of Java Bytecode. In P roceedings of the 2008 ACM Symposium on Applied Computing (SAC), Fortaleza, Ceara, Brazil, March 16-20, 2008, R. L. Wainwright and H. Haddad, Eds. ACM, 368–375.
- 8Albert et al . (2012) Albert, E. , Arenas, P. , Genaim, S. , Puebla, G. , and Zanardini, D. 2012. Cost Analysis of Object-Oriented Bytecode Programs. T heoretical Computer Science (Special Issue on Quantitative Aspects of Programming Languages) 4 13, 1, 142–159.
