Optimization Modulo the Theories of Signed Bit-Vectors and Floating-Point Numbers
Patrick Trentin, Roberto Sebastiani

TL;DR
This paper introduces a new Optimization Modulo Theories approach for handling signed Bit-Vectors and Floating-Point Numbers, extending previous work and demonstrating its effectiveness through implementation and testing.
Contribution
It presents a novel OMT method for signed BV and FP, introducing the concepts of attractor and dynamic attractor, filling a gap in existing SMT/OMT techniques.
Findings
Validated the approach with empirical tests on SMT-LIB problems.
Extended OMT techniques to signed BV and FP.
Demonstrated feasibility and effectiveness of the new method.
Abstract
Optimization Modulo Theories (OMT) is an important extension of SMT which allows for finding models that optimize given objective functions, typically consisting in linear-arithmetic or pseudo-Boolean terms. However, many SMT and OMT applications, in particular from SW and HW verification, require handling bit-precise representations of numbers, which in SMT are handled by means of the theory of Bit-Vectors (BV) for the integers and that of Floating-Point Numbers (FP) for the reals respectively. Whereas an approach for OMT with (unsigned) BV has been proposed by Nadel & Ryvchin, unfortunately we are not aware of any existing approach for OMT with FP. In this paper we fill this gap. We present a novel OMT approach, based on the novel concept of attractor and dynamic attractor, which extends the work of Nadel & Ryvchin to signed BV and, most importantly, to FP. We have implemented some…
| sign | exp | sig | value | |
| 1 | #b0 | #b111 | #b1111 | NaN |
| … | … | … | NaN | |
| 2 | #b0 | #b111 | #b0000 | |
| 3 | #b0 | #b110 | #b1111 | |
| … | … | … | … | |
| 4 | #b0 | #b000 | #b0001 | |
| 5 | #b0 | #b000 | #b0000 | |
| 6 | #b1 | #b000 | #b0000 | |
| 7 | #b1 | #b000 | #b0001 | |
| … | … | … | … | |
| 8 | #b1 | #b110 | #b1111 | |
| 9 | #b1 | #b111 | #b0000 | |
| … | … | … | NaN | |
| 10 | #b1 | #b111 | #b1111 | NaN |
| tool, configuration & encoding | inst. | term. | t.o. | u | bt | st | time (s.) |
| OptiMathSAT(eager+omt+lin) | 1120 | 1003 | 117 | 0 | 5 | 73 | 76375 |
| OptiMathSAT(eager+omt+lin+pi) | 1120 | 1003 | 117 | 0 | 5 | 71 | 76785 |
| OptiMathSAT(eager+omt+lin+bp) | 1120 | 956 | 164 | 0 | 6 | 105 | 77480 |
| OptiMathSAT(eager+omt+lin+bp+pi) | 1120 | 873 | 247 | 0 | 77 | 217 | 54859 |
| OptiMathSAT(lazy+omt+lin) | 1120 | 868 | 252 | 0 | 93 | 203 | 29832 |
| OptiMathSAT(eager+omt+bin) | 1120 | 1014 | 106 | 0 | 11 | 281 | 67834 |
| OptiMathSAT(eager+omt+bin+pi) | 1120 | 970 | 150 | 0 | 8 | 285 | 69765 |
| OptiMathSAT(eager+omt+bin+bp) | 1120 | 1016 | 104 | 0 | 14 | 205 | 68255 |
| OptiMathSAT(eager+omt+bin+bp+pi) | 1120 | 991 | 129 | 0 | 65 | 321 | 56941 |
| OptiMathSAT(lazy+omt+bin) | 1120 | 900 | 220 | 0 | 90 | 243 | 33260 |
| OptiMathSAT(eager+obvbs) [reduction] | 1120 | 1013 | 107 | 0 | 14 | 141 | 65954 |
| OptiMathSAT(eager+ofpbs) | 1120 | 1017 | 103 | 0 | 9 | 171 | 70732 |
| OptiMathSAT(eager+ofpbs+pi) | 1120 | 1019 | 101 | 0 | 34 | 280 | 64896 |
| OptiMathSAT(eager+ofpbs+pi+so) | 1120 | 1018 | 102 | 0 | 7 | 179 | 71430 |
| OptiMathSAT(eager+ofpbs+bp) | 1120 | 975 | 145 | 0 | 2 | 145 | 65543 |
| OptiMathSAT(eager+ofpbs+bp+so) | 1120 | 1000 | 120 | 0 | 3 | 124 | 68390 |
| OptiMathSAT(eager+ofpbs+bp+pi) | 1120 | 1001 | 119 | 0 | 77 | 273 | 60365 |
| OptiMathSAT(eager+ofpbs+bp+pi+so) | 1120 | 1006 | 114 | 19 | 32 | 245 | 59463 |
| virtual best | 1120 | 1074 | 46 | - | 559 | 1074 | 27788 |
| OptiMathSAT(eager+smt) [no optimization] | 1120 | 1048 | 72 | - | - | - | 9259 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
11institutetext: DISI, University of Trento, Italy
Optimization Modulo the Theories of Signed Bit-Vectors and Floating-Point Numbers
††thanks: We would like to thank the anonymous reviewers for their insightful comments and suggestions, and we thank Alberto Griggio for support with MathSAT code.
Patrick Trentin
Roberto Sebastiani
Abstract
Optimization Modulo Theories (OMT) is an important extension of SMT which allows for finding models that optimize given objective functions, typically consisting in linear-arithmetic or pseudo-Boolean terms. However, many SMT and OMT applications, in particular from SW and HW verification, require handling bit-precise representations of numbers, which in SMT are handled by means of the theory of Bit-Vectors () for the integers and that of Floating-Point Numbers () for the reals respectively. Whereas an approach for OMT with (unsigned) has been proposed by Nadel & Ryvchin, unfortunately we are not aware of any existing approach for OMT with .
In this paper we fill this gap. We present a novel OMT approach, based on the novel concept of attractor and dynamic attractor, which extends the work of Nadel & Ryvchin to signed and, most importantly, to . We have implemented some and procedures on top of OptiMathSAT and tested the latter ones on modified problems from the SMT-LIB repository. The empirical results support the validity and feasibility of the novel approach.
1 Introduction
Optimization Modulo Theories (OMT) [34, 19, 35, 37, 21, 20, 30, 29, 8, 28, 38, 39, 40, 7, 31, 41, 5, 42, 22, 27, 6]
is an important extension to Satisfiability Modulo Theories which allows for finding models that optimize one or more objectives, which typically consist in some linear-arithmetic or Pseudo-Boolean function application.
However, many SMT and OMT applications, in particular from SW and HW verification, require handling bit-precise representations of numbers, which in SMT are handled by means of the theory of Bit-Vectors () for the integers and that of Floating-Point Numbers () for the reals respectively. (For instance, during the verification process of a piece of software, one may look for the minimum/maximum value of some int [resp. float] parameter causing an [resp. ] call to return sat—which typically corresponds to the presence of some bug— so that to guarantee a safe range for such parameter. )
OMT for the theory of (unsigned) bit-vectors () was proposed by Nadel and Ryvchin [31], although a reduction to the problem to MaxSAT was already implemented in the SMT/OMT solver Z3 [9]. The work in [31] was based on the observation that OMT on unsigned can be seen as lexicographic optimization over the bits in the bitwise representation of the objective, ordered from the most-significant bit (MSB) to the least-significant bit (LSB).
In this paper we address —for the first time to the best of our knowledge— OMT for the theory of signed Bit-Vectors and, most importantly, for the theory of Floating-Point Arithmetic (), by exploiting some properties of the two’s complement encoding for signed and of the IEEE 754-2008 encoding for respectively.
We start from introducing the notion of attractor, which represent (the bitwise encoding of) the target value for the objective which the optimization process aims at. This allows us for easily leverage the procedure of [31] to work with both signed and unsigned Bit-Vectors, by minimizing lexicographically the bitwise distance between the objective and the attractor, that is, by minimizing lexicographically the bitwise-xor between the objective and the attractor.
Unfortunately there is no such notion of (fixed) attractor for numbers, because the target value moves as long as the bits of the objective are updated from the MSB to the LSB, and the optimization process may have to change dynamically its aim, even at the opposite direction. (For instance, as soon as the minimization process realizes there is no solution with a negative value for the objective and thus sets its MSB to 0, the target value is switched from to , and the search switches direction, from the maximization of the exponent and the significand to their minimization.)
To cope with this fact, we introduce the notions of dynamic attractor and attractor trajectory, representing the dynamics of the moving target value, which are progressively updated as soon as the bits of the objective are updated from the MSB to the LSB. Based on these ideas, we present novel procedures, which require at most , incremental calls to an solver, being the number of bits in the representation of the objective. Notice that these procedures do not depend on the underlying procedure used, provided the latter allows for accessing and setting the single bits of the objective.
We have implemented these and procedures on top of the OptiMathSAT OMT solver [42]. We have run an experimental evaluation of the procedures on modified problems from the SMT-LIB library. The empirical results support the validity and feasibility of the novel approach.
The rest of the paper is organized as follows. In §2 we provide the necessary background on and theories and reasoning. In §3 we provide the novel theoretical definitions and results. In §4 we describe our novel procedures. In §5 we present the empirical evaluation. In §6 we conclude, hinting some future directions.
2 Background
We assume some basic knowledge on SAT and SMT and briefly introduce the reader to the Bit-Vector and Floating-Point theories.
Bit-Vectors.
A bit is a Boolean variable that can be interpreted as [math] or . A Bit-Vector () variable is a vector of bits, where is the Most Significant Bit (MSB) and is the Least Significant Bit (LSB).111 Although most often in the literature the indexes use to grow from the LSB to the MSB, in this paper we use the opposite notation because we always reason from the MSB down to the LSB, so that to much simplify the explanation.
A constant of width is an interpreted vector of values in . We a bit value or a value to denote its complement (e.g., is ). A variable/constant of width can be unsigned, in which case its domain is , or signed, which we assume to comply with the Two’s complement representation, so that its domain is . Therefore, the vector can be interpreted either as the unsigned constant or as the signed constant . Following the SMT-LIBv2 standard [3], we may also represent a constant in binary (e.g. is written ) or in hexadecimal (e.g. is written ) form. A term is built from constants, variables and interpreted functions which represent standard RTL operators: word concatenation (e.g. ), sub-word selection (e.g. ), modulo-n sum and multiplication (e.g. and ), bit-wise operators (like, e.g., , , , , ), left and right shift , . A atom can be built by combining terms with interpreted predicates like , (e.g. ) and equality. We refer the reader to [3, 24]
for further details on the syntax and semantics of Bit-Vector theory.
There are two main techniques for satisfiability, the “eager” and the “lazy” approach, which are substantially complementary to one another [25]. In the eager approach, terms and constraints are encoded into SAT via bit-blasting [23, 17, 16, 24, 33, 32].
In the lazy approach, terms are not immediately expanded –so to avoid any scalability issue– and the solver is comprised by a layered set of techniques, each of which deals with a sub-portion of the theory [15, 10, 18, 24].
Floating-Point.
The theory of Floating-Point Numbers (), [3, 36, 13], is based on the IEEE standard 754-2008 [4] for floating-point arithmetic, restricted to the binary case. A sort is an indexed nullary sort identifier of the form (_ FP <> <>) s.t. both and are positive integers greater than one, defines the number of bits in the exponent and defines the number of bits in the significand, including the hidden bit. A variable with sort (_ FP <> <>) can be indifferently viewed as a vector of bits, where is the Most Significant Bit (MSB) and is the Least Significant Bit (LSB), or as a triplet of Bit-Vectors s.t. is a of size , is a of size and is a of size . A constant is a triplet of constants. Given a fixed floating-point sort, i.e. a pair , the following constants are implicitly defined:
[TABLE]
where t is either [math] or and s is a which contains at least a .
Setting aside special constants, the remaining values can be classified to be either normal or subnormal (a.k.a. denormal) [4]. A number is said to be subnormal when every bit in its exponent is equal to zero, and normal otherwise. The significand of a normal number is always interpreted as if the leading binary digit is equal , while for denormalized values the leading binary digit is always [math]. This allows for the representation of numbers that are closer to zero, although with reduced precision.
Example 1
Let be the normal constant (_ FP #b0 #b1100 #b0101000), and be the subnormal constant (_ FP #b0 #b0000 #b0101000), so that their corresponding sort is (_ FP <4> <8>). Then, according to the semantics defined in the IEEE standard 754-2008 [4], the floating-point value of and in decimal notation is given by:
[TABLE]
The theory of provides a variety of built-in floating-point operations as defined in the IEEE standard 754-2008. This includes binary arithmetic operations (e.g. ), basic unary operations (e.g. ), binary comparison operations (e.g. ), the remainder operation, the square root operation and more. Importantly, arithmetic operations are performed as if with infinite precision, but the result is then rounded to the “nearest” representable number according to the specified rounding mode. Five rounding modes are made available, as in [4].
The most common approach for -satisfiability is to encode expressions into formulas based on the circuits used to implement floating-point operations, using appropriate under- and over-approximation schemes –or a mixture of both– to improve performance [14, 44, 45, 43].
Then, the -Solver is used to deal with the formula, using either the eager or the lazy approach. An alternative approach, based on abstract interpretation, is presented in [11, 12, 26].
With this technique, called Abstract CDCL (ACDCL), the set of feasible solutions is over-approximated with floating-point intervals, so that intervals-based conflict analysis is performed to decide -satisfiability.
3 Theoretical Framework
We present our generalization of [31] to the case of signed/unsigned Bit-Vector Optimization, and then move on to deal with Floating-Point Optimization.
3.1 Bit-Vector Optimization
Without any loss of generality, we assume that every objective function is replaced by a variable of the same type by conjoining “” to the input formula. We use the symbol to denote the bit-width of , and to denote the -th bit of , where and are the Most Significant Bit (MSB) and the Least Significant Bit (LSB) of respectively.\footreffootnote:msbtolsb
We define the Bit-Vector Optimization problem as follows.
Definition 1
(). Let be a formula and obj be a –signed or unsigned– variable occurring in . We call an Optimization Modulo problem, , the problem of finding a model for (if any) whose value of obj, denoted with , is minimum wrt. the total order relation for signed s if obj is signed, and the one for unsigned s otherwise. (The dual definition where we look for the maximum follows straightforwardly)
Hereafter, we generalize the unsigned maximization procedures described in [31] to the case of signed and unsigned optimization. To this extent, we introduce the novel notion of attractor.
Definition 2
(Attractor, attractor equalities).
When minimizing [resp. maximizing], we call attractor for obj the smallest [resp. greatest] -value of the sort of obj. We call vector of attractor equalities the vector s.t. , .
Example 2
If is an unsigned objective of width , then its corresponding attractor is , i.e. , when is minimized and it is , i.e. , when is maximized. When is instead a signed objective, following the two’s complement encoding, the corresponding is , i.e. , for minimization and , i.e. , for maximization.
In essence, the attractor can be seen as the target value of the optimization search and therefore it can be used to determine the desired improvement direction and to guide the decisions taken by the optimization search. By construction, if a model satisfies all equalities , then .
More in general, if is a model of , then the value of in , denoted with , is given by
[TABLE]
when is an unsigned objective, and by
[TABLE]
when is a signed objective, using the two’s complement representation. The function ite, appearing in both previous equations, returns if the attractor equality is true in and otherwise.
We use the symbol to denote a generic (possibly partial) assignment which assigns at least the most-significant bits of . We use the symbol to denote an assignment to all and only the most-significant bits of . Given , we denote by [resp. ] the value in assigned to by [resp. ]. Moreover, we use the expression where to denote the restriction of to all and only the most-significant bits of , . Given a model of and a variable , we denote by the evaluation of in . With a little abuse of notation, and when this does not cause ambiguities, we sometimes use an attractor equality to denote the single-bit assignment and its negation to denote the assignment to the complement value .
Definition 3
(lexicographic maximization)
Consider an OMT instance and the vector of attractor equalities . We say that an assignment to obj lexicographically maximizes wrt. iff, for every ,
- •
if is unsatisfiable,
- •
otherwise.
where is the attractor equality . (The dual definition of “lexicographically minimizes” comes by switching with .) Given a model for , we say that lexicographically maximizes wrt. iff its restriction to obj lexicographically maximizes wrt. .
Starting from the MSB to the LSB, [resp. ] in Definition 3 assigns to each the value unless it is inconsistent wrt. and the assignments to the previous s, . Notice that this corresponds to minimize [resp. maximize] the value [resp. ], —where is the bitwise-xor operator and is its complement— because .
The following fact derives from the above definitions and the properties of two’s complement representation adopted by the SMT-LIBv2 standard222If the standard adopted were the sign-and-magnitude binary encoding, then Theorem 3.1 would not hold. Nevertheless, in such a case we could adopt a simplified version of the technique for optimization described in §3.2.
for signed .
Theorem 3.1
An optimal solution of an problem is any model of which lexicographically maximizes the vector of attractor equalities .
Proof
(We investigate the minimization case, since the maximization case is dual.)
In the case of minimization with unsigned , is , so that the lexicographic optimization corresponds to minimize which is the standard minimization for unsigned .
In the case of minimization with signed , is , so that the lexicographic optimization corresponds to minimize which —by means of subtracting the constant value — is equivalent to minimize , which is the standard minimization for two’s complement .
Definitions 2 and 3 with Theorem 3.1 suggest thus a direct extension to the minimization/maximization of signed of the algorithm for unsigned in [31]: apply the unsigned- maximization [resp. minimization] algorithm of [31] to the objective [resp. ] instead than simply to obj [resp. ].
Example 3
Let be a signed goal of bits to be minimized and be its attractor, so that the corresponding vector of attractor equalities is equal to .
An assignment (for which ) is lexicographically better than (for which ), because the former satisfies the attractor equality corresponding to the MSB while the latter does not. Moreover, the assignment is lexicographically worse than the assignment (for which ), because –all the rest being equal– the latter assignment makes the attractor equality true.
3.2 Floating-Point Optimization
We define the Floating-Point Optimization problem as follows.
Definition 4
().
Let be a formula and obj be a variable occurring in . We call an Optimization Modulo problem, the problem of finding a model for (if any) whose value of obj, denoted with , is either
- •
minimum wrt. the usual total order relation for numbers, if is satisfied by at least one model s.t. is not NaN,
- •
some binary representation of NaN, otherwise.
(The dual definition where we look for the maximum follows straightforwardly.)
Definition 4 is made necessarily convoluted by the fact that obj can be NaN. In fact, in the SMT-LIBv2 standard the comparisons between NaN and any other value are always evaluated false because NaN has multiple representations at the binary level (see Table 1). Also, requiring the optimal solution to be always different from NaN makes the resulting problem unsatisfiable when is satisfied only by models s.t. is NaN. For these reasons, we admit NaN as the optimal solution value for obj if and only if is satisfied only by models s.t. is NaN.
In the rest of this section we assume that we have already checked, in sequence, that
the input formula is satisfiable —by invoking an solver on . If the solver returns unsat, then there is no need to proceed;
is satisfied by at least one model s.t. is not NaN —by invoking an solver on if the model returned by the previous SMT call is s.t. is NaN. If the solver returns unsat, then we conclude that the minimum is NaN.
After that, we can safely focus our investigation on the restricted problem , where , knowing it is satisfiable.
In Section §3.1, we have introduced the concept of a objective attractor, and we have shown how this value can be used to drive the optimization search towards the optimum value, when minimizing or maximizing a signed or unsigned goal. However, in the case of floating-point optimization, it is not possible to statically determine the attractor value in advance, before the search is even started. This is due to the more complex representation of variables, which uses three separate Bit-Vectors (i.e. sign, exponent and significand), and the presence of various classes of special values (i.e. zeros, infinity, NaN), which make definition 2 ambiguous for optimization. We illustrate this problem with the following example.
Example 4
Let be an problem where obj is a objective, of sort (_ FP 3 5), to be minimized. To make our explanation easier to follow, we show in Table 1 a short list of sample values for an variable of the same sort as obj. Each value is represented as a triplet of Bit-Vectors –following the SMT-LIBv2 conventions described in Section §2– and also in decimal notation.
From Table 1, we immediately notice that the binary representation of both the exponent and the significant of a Floating-Point number grows in opposite directions in the positive and in the negative domains. In addition, by sorting the values according to their binary representation, we observe that [resp. ] is not the smallest [resp. greatest] representable value in the negative [resp. positive] domain. In fact, both extreme ends of the table are occupied by NaN, which has multiple binary representations.
In what follows, we temporarily disregard the effects of unit-propagation, which might assign some (or all) bits of as a result of some constraints in , and pick some values as candidate attractors for an goal to be minimized.
Suppose that the attractor is chosen to be equal to the value listed at row in Table 1, which is the smallest value wrt. total order relation for numbers. Assume that the optimal value of the goal is the sub-normal value (fp #b1 #b000 #b1111) (i.e. ). Then, it can be seen that after both the sign and the exponent bits have been decided to be equal #b1 and #b000 respectively, the remaining bits of the attractor pull the search in the wrong direction, that is, towards .
Selecting a different value as candidate attractor does not really solve the problem, or rather, it results in a different set of issues.
For instance, an attractor equal to the NaN value listed at row in Table 1, which is the smallest representable value according to the binary ordering, would solve the problem for the previous case in which the optimum value is (fp #b1 #b000 #b1111). However, this attractor would remain an unsuitable choice for an instance where the goal is forced to be positive, because after the sign bit of the objective function has been decided to be equal #b0 the remaining bits of the attractor drive the search in the wrong direction, that is, towards .
Since there is no statically-determined value that can be used as an attractor when dealing with floating-point optimization, we introduce the new concept of dynamic attractor.
Definition 5
(Dynamic Attractor.)
Let be a restricted problem, where is a satisfiable formula and obj is a objective to be minimized [resp. maximized]. Let and be an assignment to the most-significant bits of obj.
Then, we say that an -value for obj is a dynamic attractor for obj wrt. iff it is the smallest [resp. largest] value different from NaN s.t. the most-significant bits of have the same value of the most-significant bits of obj in . We call vector of attractor equalities the vector s.t. , .
The following fact derives from the above definitions and the properties of IEEE 754-2008 standard representation adopted by SMT-LIBv2 standard for .
Lemma 1
Let be a restricted minimization [resp. maximization] problem, let be an assignment to and be its corresponding dynamic attractor, for some . Let and , and let , two models for which extend and respectively.
Then [resp. ].
Proof
(We prove the case of minimization, since that of maximization is dual wrt. the value of the sign bit.) We distinguish three cases based on the value of .
Case (sign bit). Then , and , where is the MSB of and represents the sign of the floating-point value. Then is smaller or equal zero in every model and larger or equal zero in every model of , so that is verified.
Case (exponent bits), where is the number of bits in the exponent of . Then, is if and [math] otherwise.
In the first case, can only be negative-valued in both and . More precisely, can be either or a normal negative value, whereas can be either a normal or a sub-normal negative value. Hereafter, we consider only the case in which both have a normal negative value, because the case in which or is sub-normal are both trivial, given that the absolute value of any sub-normal number is smaller than the absolute value of any normal number. Furthermore, we disregard the significand bits in and because their contribution to the value of is always less significant than that of the bits in the exponent. Given these premises, the exponent value of in every possible is larger than the exponent of in every possible by a value equal to and therefore, given that both and are negative-valued, .
The case in which , that is when can only be positive-valued in both and , is dual.
Case (significand bits). Then there are three sub-cases.
If for every the value of is equal , then the only possible value of for every possible is , and therefore . On the other hand, there exists no possible model of , because the assignment would imply being equal to NaN, so the statement is vacuously true.
If instead there is some s.t. , then is if (i.e. obj is negative-valued) and [math] otherwise (i.e. obj is positive-valued). In both cases, we can disregard the exponent bits in and because their contribution to the value of is the same in either model. For the same reasons, since and can only be either both normal or both sub-normal, we can ignore the contribution of the leading hidden bit and focus on the bits of the significand.
When and obj must be negative-valued, the decimal value of the significand in is larger than the decimal value of every possible significand in by exactly . Given that both and are negative-valued, we have that .
The case in which , that is when obj can only be positive-valued in both and , is dual.
Lemma 1 states that, given the current assignment to the most-significant-bits of obj, is always the best extension of to the next bit (when consistent). A dynamic attractor can thus be used by the optimization search to guide the assignment of the -th bit of towards the direction of maximum gain which is allowed by , so that to obtain the “best” extension of . Once the (new) assignment is found, the OMT solver can compute the dynamic attractor for wrt. and then use it to assign the -th bit of , and so on.
Let be an instance, s.t. is a variable of bits, and be an initially empty assignment. If at each step of the optimization search the assignment of the -th bit of is guided by the dynamic attractor for wrt. , then the corresponding sequence of dynamic attractors (of increasing order ) is unique and depends exclusively on . Intuitively, this is the case because the (current) dynamic attractor always points in the direction of maximum gain. We illustrate this in the following example.
Example 5
Let be an problem where obj is a objective, of sort (_ FP 3 5), to be minimized, as in Example 4. At the beginning of the search, nothing is known about the structure of the solution. Therefore, and, since obj is being minimized, the dynamic attractor for wrt. (i.e. ) is equal to (fp #b1 #b111 #b0000) (i.e. ), which gives a preference to any feasible value of obj in the negative domain.
If at some point of the optimization search we discover that the domain of the objective function can only be positive, so that the first bit of is permanently set to [math] in , then the new dynamic attractor for obj wrt. (i.e. ) is equal to (fp #b0 #b000 #b0000) (i.e. ).
Furthermore, if later on we also find out that at least one bit in the exponent of obj can be assigned to [math] in a feasible solution of the problem that extends , for some , then we can remove from the optimization search interval.
Definition 6
(Attractor Trajectory ).
Consider the restricted problem s.t. as in Definition 5, a triplet of inductively-defined sequences —where each is an assignment to the first most-significant bits of s.t. , is its corresponding dynamic attractor and is its corresponding vector of attractor equalities— so that, for every :
- (i)
if is unsatisfiable,
- (ii)
otherwise.
Then we define the attractor trajectory as the vector .
The attractor trajectory contains those attractor equalities which are of critical importance for the decisions taken by the optimization search. Intuitively, this is the case because the value of the -th bit of obj (i.e. ) is still undecided in .
Example 6
Let be a restricted problem where obj is a objective, of sort (_ FP 3 5), to be minimized, as in Example 4. We consider the case in which the input formula requires to be larger or equal and it does not impose any other constraint on the value of . Given the sequence of (partial) assignments in Figure 1, the corresponding list of dynamic attractors and the corresponding vectors of attractor equalities, then the attractor trajectory is equal to the vector .
Lemma 2
Consider , , , , and as in definition 6. Then lexicographically maximizes wrt. .
Proof
By Definition 6, we have that, for each ,
if is unsatisfiable,
otherwise.
By construction, . Therefore, we can replace with so that
if is unsatisfiable,
otherwise.
We notice the following facts. For each , . Furthermore, for each , because by the definition of attractor trajectory, and by the equality . Thus, we can replace with and with , as follows. For each ,
if is unsatisfiable,
otherwise.
Hence, lexicographically maximizes wrt. .
Finally, we make the following two observations. The first is that the sequence in definition 6 can be iteratively constructed using its list of requirements, for instance, by means of a sequence of incremental calls to an SMT solver. The second, more important, observation is that corresponds to the assignment of values which makes optimal in .
Using the above definitions, we show that the following fact holds.
Theorem 3.2
Let , , , , and be as in definition 6. Then, any model of which lexicographically maximizes the attractor trajectory is an optimal solution for the problem .
Proof
(We prove the case of minimization, since that of maximizations is dual.)
By Lemma 2 we have that lexicographically maximize . Let be a model of which lexicographically maximizes , and let be its restriction to obj. Since both and lexicographically maximize , for the uniqueness of , we immediately notice that , so that for each and lexicographically maximize .
By definition, is an optimal solution for iff there exists no other model for it s.t. . Hence, we show by contradiction that no such can exist.
Assume (for the sake of contradiction), that there exists a model for , s.t. , and let be the restriction of to obj. Then there must be at least one index for which . Let be the smallest such index. Recalling that and , we set . Then, , , . In particular, and therefore if , and vice versa.
Then, we distinguish two cases.
In the first case, and . From and the fact that lexicographically maximizes , we derive that is unsatisfiable, where . Since and , we conclude that , so that cannot be a model of , contradicting the initial assumption.
In the second case, and . Therefore, by Lemma 1, for every pair of models , for which extend respectively and we have that . Since and , it follows that , contradicting the initial assumption.
4 Procedures
In this paper, we consider two approaches for dealing with : a basic linear/binary search, based on the inline OMT schema for presented in [38], and Floating-Point Optimization with Binary Search (ofp-bs), a brand-new engine inspired by the obv-bs algorithm for unsigned Bit-Vectors in [31] and by Theorem 3.2 and relative definitions in §3.2.
4.1 OMT-based Approach
The OMT-based approach for adapts the linear- and binary-search schemata for presented in [38] to deal with objectives.
In the basic linear-search schema, the optimization search is advanced by means of a sequence of linear cuts, each of which forces the OMT solver to look for a new model which improves the value of wrt. the most recent model . In the binary-search schema, instead, the OMT solver learns an incremental sequence of cuts which bisect the current domain of the objective function. For clarity, we recap here the essential elements of the binary-search schema presented in [37, 38]. At the beginning of the optimization search and following each update of the lower- () and upper- () bounds of , the OMT solver computes a pivoting value , for some value of (e.g. ). If lies inside the range , a cut of the form is learned. Otherwise, if –due to rounding side-effects of operations– lies outside the range , a cut of the form is learned instead. If the cut is satisfiable, the upper-bound of is updated with a new model value of . Otherwise, the lower-bound is made equal to [resp. ]. The algorithm terminates when the search interval becomes empty. In general, it is reasonable to expect the binary-search schema to converge towards the optimal solution faster than the linear-search schema, because the feasible domain of a goal can be comprised by an exponentially large number of values (wrt. the bit-width of the cost function).
In either schema, whenever the optimization engine encounters for the first time a solution s.t. , the OMT solver learns a unit-clause of the form so as to look for an optimal solution different from NaN (if any).
When dealing with objectives, differently from the case of in [38], it is not necessary to implement a specialized optimization procedure within the -Solver in order to guarantee the termination of the optimization search. Indeed, such procedure is not available when Floating-Point terms are bit-blasted into Bit-Vectors eagerly, or when the acdcl -Solver is used, because by the time the optimization procedure is called the domain interval of any term contains a singleton value. Conversely, such a minimization procedure could be envisaged when the OMT solver uses a lazy -Solver as back-end, so as to speed-up the convergence towards the optimal solution333 Currently, there is no such specialized optimization procedure embedded within the lazy -Solver of OptiMathSAT, so we won’t describe this approach any further. .
4.2 Floating-Point Optimization with Binary Search
The Floating-Point Optimization with Binary Search algorithm is a new engine for which is inspired by the obv-bs algorithm for [31] and is a direct implementation of Definition 6 and Theorem 3.2.
The optimization search tries to lexicographically maximize an implicit attractor trajectory vector , which is incrementally derived from the current value of the dynamic attractor. The raw value of the dynamic attractor’s bits drive the optimization search towards the direction of maximum gain at any given point in time, without disrupting any decision that has been already made. The dynamic attractor is incrementally updated along the search, based on the outcome of the previous rounds of the optimization search. At each round, one bit of the objective function is assigned its final value. The first round decides the sign, the next batch of rounds decides the exponent and the remaining rounds decide the fine-grained details of the significand.
The pseudo-code of ofp-bs is shown in Figure 2. The arguments of the algorithm are the input formula and the objective , where is a variable with bits in the exponent, in the significand and bits overall.
The procedure starts by checking whether the input formula is satisfiable and immediately terminates if that is not the case (lines -). If in then the procedure checks whether there exists a model for (lines -). If this is not the case, the procedure terminates immediately and returns the pair (line ). Otherwise, the model is updated with the new model , and is permanently extended with the constraint (lines -).
At this point, the procedure initializes the value of the dynamic attractor by invoking an external function update_dynamic_attractor() with the empty assignment as parameter, so that the returned value is equal to when minimizing and when maximizing (lines -). Then, the execution moves to the section of code implementing the core part of the ofp-bs algorithm (lines -), which consists of a loop over the bits of obj, starting from the MSB down to the LSB .
Inside this loop, ofp-bs first checks whether the value of in matches the -th bit of the (current) dynamic attractor . If this is the case, then the -th bit is already set to its “best” value in . Thus, the assignment is extended so as to permanently set (line ), and the optimization search moves to the next iteration of the loop. If instead in , we need to verify whether the value of the objective function in can be improved by forcing the -th bit of equal to the -th bit of the dynamic attractor. To do so, we incrementally invoke the underlying SMT solver, this time checking the satisfiability of under the list of assumptions (line ). If the SMT solver returns sat, then the value of the objective function has been successfully improved. Hence, is extended with an assignment setting equal to , and is replaced with the new model (lines -). Otherwise, it is not possible to improve the objective function by toggling the value of , and is extended so as to permanently set (line ). At this point, there is a mismatch between the value of the first bits of obj in , corresponding to the assignment , and those of the current dynamic attractor. This mismatch is resolved by calling the function update_dynamic_attractor() with the updated assignment as parameter (line ). In either case, the execution moves to the next iteration of loop.
After exactly iterations of the loop, the optimization search terminates with the pair , where is the optimum model of the given instance. The ofp-bs algorithm requires at most incremental calls to an underlying solver. The test in rows 17-18 allows for saving lots of such SMT calls when the current model already assigns to its corresponding value in the attractor.
The function update_dynamic_attractor() takes as input , a (partial) assignment over the most-significant bits of and, when is minimized 444 The implementation of update_dynamic_attractor() is dual when is maximized. , and it essentially works as follows. If , then nothing is known about the solution of the problem, so is returned. Otherwise, the procedure must compute the smallest value different from NaN (if any) which extends . Since then we know that the sign of the objective function has been permanently decided in . If in , i.e. must be positive, the procedure must return the smallest positive value admitted by . Hence, we extend with and return the corresponding value. If in , i.e. can be negative values, the procedure must return the largest negative value admitted by . We first check whether there exists a bit in the exponent of which is assigned to [math] in . If that is the case, we extend with and return the corresponding value. Otherwise, the procedure returns the value , which is still a viable extension of .
4.3 Search Enhancements
Given a value and a goal , (a combination of) the following techniques can be used to adjust the behavior of the optimization search, similarly what has been proposed for the case of by Nadel et al. in [31].
- •
branching preference: the bits of the objective obj are marked, inside the OMT solver, as preferred variables for branching starting from the MSB down to the LSB. This ensures that conflicts involving the value of the objective function are handled as early as possible, possibly reducing the amount of work that needs to be redone after each back-jump.
- •
polarity initialization: the phase-saving value of each is initialized with the value of . This encourages the OMT solver to assign the bits of so as to reassemble the bits of , thus possibly speeding-up the convergence towards the optimal value.
In the case of the basic OMT schema described in Section §4.1, the effectiveness of either technique depends on the initial choice for . In the lucky case, the value of pulls the optimization search in the right direction and speeds up the search. In the unlucky case, when pulls in the wrong direction, there is no visible effect or an overall slow down. For instance, in the case of the linear-search optimization schema, enabling both options with an unlucky choice of can cause the OMT solver to start the search from the furthest possible point from the optional solution, and thus enumerate an exponential number of intermediate solutions. Naturally, the OMT-based optimization search algorithm is still guaranteed to terminate even in the worst-case scenario, but the unpredictable performance makes using either technique a generally unsuitable option in practice.
In the case of the ofp-bs algorithm described in Section §4.2, we use the latest value of the dynamic attractor for both the branching preference (lines and of Figure 2) and the polarity initialization (rows and of Figure 2) techniques. We observe that the value of every bit in the dynamic attractor can change after the sign of the objective function has been decided. Furthermore, the value of all the significand’s bits in the dynamic attractor can also change during the process of determining the optimal exponent value of the objective function (see, e.g., Example 4). As a consequence, if the OMT solver applies either enhancement before the correct improving direction is known, this may cause the underlying OMT engine to advance the search starting from a sub-optimal set of initial decisions. Enabling both enhancements at the same time could make things even worse. In order to mitigate this issue, we have designed a variant of our optimization-search approach which does not apply either enhancement on those bits of the objective function for which the best improving direction is not yet known. We have called this variant safe bits restriction.
5 Experimental Evaluation
We assess the performance of OptiMathSAT (v. 1.6.2) on a set of formulas that have been automatically generated using the benchmark-set of [3]. The formulas, the results and the scripts necessary to reproduce these results are made publicly available and can be downloaded from [1].
Experiment Setup.
This experiment has been performed on an i7-6500U 2.50GHz Intel Quad-Core machine with of ram and running Ubuntu Linux . For each formula being tested we used a timeout of seconds. The instances used in this experiment have been automatically generated starting from the satisfiable formulas included in the benchmark-set of [3]. We did not consider any of the unsatisfiable instances that are present in the remote repository.
For each of the original formulas we applied the following transformations. First, we either relaxed or removed some of the constraints in the original problem, so as to broaden the set of feasible solutions. This step is necessary because the majority of the original formulas admits only one solution. However, this is not necessarily the ideal situation when comparing different optimization approaches. Second, for each variable appearing inside a problem we generated a pair of instances, one for the minimization and another for the maximization of . At the end of this step, we obtained formulas. Third, we randomly selected up to instances from each of the five groups of problems in the benchmark-set. This filtering step yielded a total of SMT-LIBv2 formulas.
We consider two OMT-based baseline configurations, OptiMathSAT(omt+lin) and OptiMathSAT(omt+bin), that run the linear- and the binary-search respectively. These configurations have been tested using both the eager and the lazy approaches. The third baseline approach, named OptiMathSAT(eager+obv-bs), is based on a reduction of the problem to and it uses OptiMathSAT’s implementation of the obv-bs engine555 The binaries of the original tools presented in [31] are not publicly available. presented by Nadel et al. in [31]. For this test, we have generated an benchmark-set using a encoding that mimics the essential aspects of the ofp-bs algorithm described Section §4.2.
We compared these baseline approaches with a configuration using the ofp-bs algorithm and the eager approach, namely OptiMathSAT(eager+ofp-bs).
We have separately tested the effect of enabling the branching preference (bp), the polarity initialization (pi) and the safe bits restriction (so) enhancements described in Section §3.2, whenever these options were supported by the given configuration.
Last, in order to assess the significance of the optimization problems used in this experiment, we have collected the run-time statistics of OptiMathSAT on the SMT formulas obtained by stripping the objective function from each OMT instance. We named this configuration OptiMathSAT(eager+smt).
We have not included other tools in our experiment because we are not aware of any other solver. For all problem instances, we verified the correctness of the optimal solution found by each configuration with an SMT solver (MathSAT5). When terminating, all tools returned the same optimum value. In order to perform this cross-check as efficiently as possible, we enabled model generation on every configuration so that the optimum model could be extracted and verified.
Experiment Results.
The results of this experiment are listed in Table 2. Figure 4 depicts the loc-scale cactus plot of the same data, for a visual comparison among the different configurations. In addition, Figures 5, 6 and 7 show a selection of relevant pairwise comparisons among various OptiMathSAT configurations. Figure 5 focuses on variants of the OMT-based linear-search approach, Figure 6 depicts variants of the OMT-based binary-search approach, whereas Figure 7 focuses on the ofp-bs engine.
For what concerns OMT-based linear-search optimization, we observe that OptiMathSAT performs the best when no enhancement is enabled. In particular, the empirical evidence suggests that enabling branching preference significantly increases the number of timeouts, generally deteriorating the performance (plot in Fig. 5). Enabling only polarity initialization does not result in an appreciable change on the running time of the solver (plot in Fig. 5). In contrast, enabling both enhancements at the same time has a small chance to result in a small improvement of the search time (plot in Fig. 5), but it generally worsens the performance and results in a drastic increase in the number of timeouts (Table 2). We justify these results as follows. First, when only polarity initialization is used, the phase-saving value that is being set by OptiMathSAT does not really matter because the optimization search is dominated by the structure of the formula itself rather than by the bits of the objective. Second, when polarity initialization is used on top of branching preference, there is an even more drastic decrease in performance due to the fact that the initial phase-saving value that is statically assigned by the OMT solver to the bits of the objective cannot be expected to be “good enough” for any situation.
In fact, as illustrated in example 4, the initial phase-saving can be misleading and force the OMT solver –when running in linear-search– to explore an exponential number of intermediate satisfiable solutions.
In the case of the OMT-based binary-search optimization approach, we observe that it solves more formulas than linear-search and it generally appears to be faster (plot in Fig. 5). Overall, polarity initialization does not seem to be beneficial, whereas enabling branching preference increases the number of formulas solved within the timeout. This behavior is different from the linear-search approach, and we conjecture that it is due to the fact that, with the OMT-based binary-search approach, branching over the bits of the objective function can reveal in advance any (partial) assignment to the bits of the objective function that it is inconsistent wrt. the pivoting cuts learned by the optimization engine.
Using the lazy engine results in fewer formulas being solved, although a significant number of these benchmarks is solved faster than with any other configuration (over instances, for both configurations).
The OptiMathSAT(eager+obv-bs) configuration is able to solve formulas within the timeout, showing that can be reduced to effectively, and that –on the given benchmark-set– the performance of this approach are comparable with the best configurations being tested.
Overall, the best performance is obtained by using the ofp-bs engine, with up to benchmark-set instances being solved in correspondence to the OptiMathSAT(eager+ofp-bs+pi) configuration.
In plot of Figures 5 and 6, we show the pairwise comparison of the best ofp-bs configuration with the best OMT-based run.
Similarly to the case of OMT-based optimization with linear-search, we observe that enabling branching preference generally makes the performance worse (plot in Fig. 7). Instead, when polarity initialization is used we observe a general performance improvement that does not only result in an increase in the number of formulas being solved within the timeout, but also a noticeable reduction of the solving time as a whole. This is in contrast with the case of OMT-based optimization, and it can be explained by the fact that ofp-bs uses an internal heuristic function to dynamically determine and update the most appropriate phase-saving value for the bits of the objective function. An equally important role is played by the safe bits restriction, that limits the effects of branching preference and polarity initialization to only certain bits of the dynamic attractor.
As illustrated by the plots in the second and third rows of Figure 7 and by the data in Table 2, this feature is particularly effective when used in combination with branching preference.
The results of OptiMathSAT over the SMT-only version of the benchmark-set are reported in Table 2 and in the scatter-plot in Fig. 6, and show that for a large number of instances the OMT problem is considerably harder than its SMT-only version There are a few exceptions to this rule, that we ascribe to the fact that the removal of the objective function alters the internal stack of formulas, and this can have unpredictable consequences on the behavior of various internal heuristics that depend on it. A solution can be found in a shorter amount of time when the sequence of (heuristic) choices is compatible with its assignment and it requires little back-tracking effort.
6 Conclusions and Future Work
We have presented for the first time OMT procedures (for signed Bit-Vectors and) Floating-Point numbers, based on the novel notions of attractor, dynamic attractor and attractor trajectory, which we have implemented in OptiMathSAT and tested on modified problems from SMT-LIB.
Ongoing research involves implementing our ofp-bs procedure on top of the ACDCL procedure —which is not immediate to do efficiently because the latter approach does not allow directly accessing and setting the single bits of the objective (since and are not signature-disjoint). Future research involves experimenting the new OMT procedure directly on problems coming from bit-precise SW and HW verification, produced, e.g., by the NuXmv model checker [2].
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] http://disi.unitn.it/trentin/resources/floatingpoint_test.tar.gz .
- 2[2] nu Xmv . https://nuxmv.fbk.eu .
- 3[3] Smt Libv 2. www.smtlib.cs.uiowa.edu/ .
- 4[4] IEEE standard 754, 2008. http://grouper.ieee.org/groups/754/ .
- 5[5] H. F. Albuquerque, R. F. Araujo, I. V. de Bessa, L. C. Cordeiro, and E. B. de Lima Filho. Opt CE: A Counterexample-Guided Inductive Optimization Solver. In SBMF , volume 10623 of Lecture Notes in Computer Science , pages 125–141. Springer, 2017.
- 6[6] R. F. Araujo, H. F. Albuquerque, I. V. de Bessa, L. C. Cordeiro, and J. E. C. Filho. Counterexample guided inductive optimization based on satisfiability modulo theories. Sci. Comput. Program. , 165:3–23, 2018.
- 7[7] R. Araújo, I. Bessa, L. C. Cordeiro, and J. E. C. Filho. SMT-based Verification Applied to Non-convex Optimization Problems. In 2016 VI Brazilian Symposium on Computing Systems Engineering (SBESC) , pages 1–8, Nov 2016.
- 8[8] N. Bjorner and A.-D. Phan. ν Z 𝜈 𝑍 \nu{}Z - Maximal Satisfaction with Z 3. In Proc International Symposium on Symbolic Computation in Software Science , Gammart, Tunisia, December 2014. Easy Chair Proceedings in Computing (E Pi C).
