Optimization Modulo the Theories of Signed Bit-Vectors and Floating-Point Numbers

Patrick Trentin; Roberto Sebastiani

arXiv:1905.02838·cs.LO·May 9, 2019

TL;DR

This paper introduces a new Optimization Modulo Theories approach for handling signed Bit-Vectors and Floating-Point Numbers, extending previous work and demonstrating its effectiveness through implementation and testing.

Contribution

It presents a novel OMT method for signed BV and FP, introducing the concepts of attractor and dynamic attractor, filling a gap in existing SMT/OMT techniques.

Findings

01. Validated the approach with empirical tests on SMT-LIB problems.

02. Extended OMT techniques to signed BV and FP.

03. Demonstrated feasibility and effectiveness of the new method.

Tables

Table 1: Sample values for an $\mathcal{FP}$ variable with sort (_ FP 3 5).

row	sign	exp	sig	value
1	#b0	#b111	#b1111	NaN
	…	…	…	NaN
2	#b0	#b111	#b0000	$+ \infty$
3	#b0	#b110	#b1111	$\frac{31}{2}$
	…	…	…	…
4	#b0	#b000	#b0001	$\frac{1}{64}$
5	#b0	#b000	#b0000	$+ 0$
6	#b1	#b000	#b0000	$- 0$
7	#b1	#b000	#b0001	$- \frac{1}{64}$
	…	…	…	…
8	#b1	#b110	#b1111	$- \frac{31}{2}$
9	#b1	#b111	#b0000	$- \infty$
	…	…	…	NaN
10	#b1	#b111	#b1111	NaN

Table 2: Comparison among various OptiMathSAT configurations on the $\text{OMT}(\mathcal{FP})$ benchmark-set. The columns list the total number of instances (inst.), the number of instances solved (term.), the number of timeouts (t.o.), the number of instances uniquely solved by the given configuration (u), the number of instances solved faster than any other configuration (bt), the total number of instances solved in the shortest amount of time (st) and the total solving time for all solved instances (time).

tool, configuration & encoding	inst.	term.	t.o.	u	bt	st	time (s.)
OptiMathSAT(eager+omt+lin)	1120	1003	117	0	5	73	76375
OptiMathSAT(eager+omt+lin+pi)	1120	1003	117	0	5	71	76785
OptiMathSAT(eager+omt+lin+bp)	1120	956	164	0	6	105	77480
OptiMathSAT(eager+omt+lin+bp+pi)	1120	873	247	0	77	217	54859
OptiMathSAT(lazy+omt+lin)	1120	868	252	0	93	203	29832
OptiMathSAT(eager+omt+bin)	1120	1014	106	0	11	281	67834
OptiMathSAT(eager+omt+bin+pi)	1120	970	150	0	8	285	69765
OptiMathSAT(eager+omt+bin+bp)	1120	1016	104	0	14	205	68255
OptiMathSAT(eager+omt+bin+bp+pi)	1120	991	129	0	65	321	56941
OptiMathSAT(lazy+omt+bin)	1120	900	220	0	90	243	33260
OptiMathSAT(eager+obvbs) [reduction]	1120	1013	107	0	14	141	65954
OptiMathSAT(eager+ofpbs)	1120	1017	103	0	9	171	70732
OptiMathSAT(eager+ofpbs+pi)	1120	1019	101	0	34	280	64896
OptiMathSAT(eager+ofpbs+pi+so)	1120	1018	102	0	7	179	71430
OptiMathSAT(eager+ofpbs+bp)	1120	975	145	0	2	145	65543
OptiMathSAT(eager+ofpbs+bp+so)	1120	1000	120	0	3	124	68390
OptiMathSAT(eager+ofpbs+bp+pi)	1120	1001	119	0	77	273	60365
OptiMathSAT(eager+ofpbs+bp+pi+so)	1120	1006	114	19	32	245	59463
virtual best	1120	1074	46	-	559	1074	27788
OptiMathSAT(eager+smt) [no optimization]	1120	1048	72	-	-	-	9259

Equations

$\mathcal{M}({\sf obj})=\sum_{i=0}^{n-1}2^{n-1-i}\cdot\textsc{ite}(\mathcal{M}(A[i]),\,attr[i],\,\overline{attr[i]})$ (unsigned objective)

$\mathcal{M}({\sf obj})=-(2^{n-1})\cdot\textsc{ite}(\mathcal{M}(A[0]),\,attr[0],\,\overline{attr[0]})+\sum_{i=1}^{n-1}2^{n-1-i}\cdot\textsc{ite}(\mathcal{M}(A[i]),\,attr[i],\,\overline{attr[i]})$ (signed objective)

Assignments: $\tau_{0},\ldots,\tau_{8}$. Vectors of attractor equalities: $A_{\tau_{0}},\ldots,A_{\tau_{8}}$.

Full text

DISI, University of Trento, Italy

Optimization Modulo the Theories of Signed Bit-Vectors and Floating-Point Numbers

Acknowledgments: We would like to thank the anonymous reviewers for their insightful comments and suggestions, and we thank Alberto Griggio for support with MathSAT code.

Patrick Trentin

Roberto Sebastiani

Abstract

Optimization Modulo Theories (OMT) is an important extension of SMT which allows for finding models that optimize given objective functions, typically consisting of linear-arithmetic or pseudo-Boolean terms. However, many SMT and OMT applications, in particular from SW and HW verification, require handling bit-precise representations of numbers, which in SMT are handled by means of the theory of Bit-Vectors ($\mathcal{BV}$) for the integers and that of Floating-Point Numbers ($\mathcal{FP}$) for the reals. Whereas an approach for OMT with (unsigned) $\mathcal{BV}$ has been proposed by Nadel & Ryvchin, we are not aware of any existing approach for OMT with $\mathcal{FP}$.

In this paper we fill this gap. We present a novel OMT approach, based on the novel concept of attractor and dynamic attractor, which extends the work of Nadel & Ryvchin to signed $\mathcal{BV}$ and, most importantly, to $\mathcal{FP}$ . We have implemented some $\text{OMT}(\mathcal{BV})$ and $\text{OMT}(\mathcal{FP})$ procedures on top of OptiMathSAT and tested the latter ones on modified problems from the SMT-LIB repository. The empirical results support the validity and feasibility of the novel approach.

1 Introduction

Optimization Modulo Theories (OMT) [34, 19, 35, 37, 21, 20, 30, 29, 8, 28, 38, 39, 40, 7, 31, 41, 5, 42, 22, 27, 6] is an important extension of Satisfiability Modulo Theories which allows for finding models that optimize one or more objectives, which typically consist of some linear-arithmetic or Pseudo-Boolean function application.

However, many SMT and OMT applications, in particular from SW and HW verification, require handling bit-precise representations of numbers, which in SMT are handled by means of the theory of Bit-Vectors ($\mathcal{BV}$) for the integers and that of Floating-Point Numbers ($\mathcal{FP}$) for the reals. (For instance, during the verification process of a piece of software, one may look for the minimum/maximum value of some int [resp. float] parameter causing an $\text{SMT}(\mathcal{BV})$ [resp. $\text{SMT}(\mathcal{FP})$] call to return sat, which typically corresponds to the presence of some bug, so as to guarantee a safe range for that parameter.)

OMT for the theory of (unsigned) bit-vectors ($\text{OMT}(\mathcal{BV})$) was proposed by Nadel and Ryvchin [31], although a reduction of the problem to MaxSAT was already implemented in the SMT/OMT solver Z3 [9]. The work in [31] was based on the observation that OMT on unsigned $\mathcal{BV}$ can be seen as lexicographic optimization over the bits in the bitwise representation of the objective, ordered from the most-significant bit (MSB) to the least-significant bit (LSB).

In this paper we address —for the first time to the best of our knowledge— OMT for the theory of signed Bit-Vectors and, most importantly, for the theory of Floating-Point Arithmetic ( $\text{OMT}(\mathcal{FP})$ ), by exploiting some properties of the two’s complement encoding for signed $\mathcal{BV}$ and of the IEEE 754-2008 encoding for $\mathcal{FP}$ respectively.

We start by introducing the notion of attractor, which represents (the bitwise encoding of) the target value for the objective which the optimization process aims at. This allows us to easily leverage the procedure of [31] to work with both signed and unsigned Bit-Vectors, by minimizing lexicographically the bitwise distance between the objective and the attractor, that is, by minimizing lexicographically the bitwise-xor between the objective and the attractor.

Unfortunately there is no such notion of (fixed) attractor for $\mathcal{FP}$ numbers, because the target value moves as the bits of the objective are updated from the MSB to the LSB, and the optimization process may have to change its aim dynamically, even to the opposite direction. (For instance, as soon as the minimization process realizes there is no solution with a negative value for the objective and thus sets its MSB to 0, the target value is switched from $-\infty$ to $+0$, and the search switches direction, from the maximization of the exponent and the significand to their minimization.)

To cope with this fact, we introduce the notions of dynamic attractor and attractor trajectory, representing the dynamics of the moving target value, which are progressively updated as the bits of the objective are updated from the MSB to the LSB. Based on these ideas, we present novel $\text{OMT}(\mathcal{FP})$ procedures, which require at most $n+2$ incremental calls to an $\text{SMT}(\mathcal{FP})$ solver, $n$ being the number of bits in the representation of the objective. Notice that these procedures do not depend on the underlying $\text{SMT}(\mathcal{FP})$ procedure used, provided the latter allows for accessing and setting the single bits of the objective.

We have implemented these $\text{OMT}(\mathcal{BV})$ and $\text{OMT}(\mathcal{FP})$ procedures on top of the OptiMathSAT OMT solver [42]. We have run an experimental evaluation of the $\text{OMT}(\mathcal{FP})$ procedures on modified $\text{SMT}(\mathcal{FP})$ problems from the SMT-LIB library. The empirical results support the validity and feasibility of the novel approach.

The rest of the paper is organized as follows. In §2 we provide the necessary background on $\mathcal{BV}$ and $\mathcal{FP}$ theories and reasoning. In §3 we provide the novel theoretical definitions and results. In §4 we describe our novel $\text{OMT}(\mathcal{FP})$ procedures. In §5 we present the empirical evaluation. In §6 we conclude, hinting at some future directions.

2 Background

We assume some basic knowledge on SAT and SMT and briefly introduce the reader to the Bit-Vector and Floating-Point theories.

Bit-Vectors.

A bit is a Boolean variable that can be interpreted as $0$ or $1$. A Bit-Vector ($\mathcal{BV}$) variable $\mathbf{v}^{[n]}$ is a vector of $n$ bits, where $v[0]$ is the Most Significant Bit (MSB) and $v[n-1]$ is the Least Significant Bit (LSB). (Footnote 1: Although in the literature the indexes $i\in[0,...,n-1]$ most often grow from the LSB to the MSB, in this paper we use the opposite notation because we always reason from the MSB down to the LSB, which greatly simplifies the explanation.)

A $\mathcal{BV}$ constant of width $n$ is an interpreted vector of $n$ values in $\{0,1\}$. We overline a bit value or a $\mathcal{BV}$ value to denote its complement (e.g., $\overline{[11010010]}$ is $[00101101]$). A $\mathcal{BV}$ variable/constant of width $n$ can be unsigned, in which case its domain is $[0,2^{n}-1]$, or signed, which we assume to comply with the two’s complement representation, so that its domain is $[-2^{(n-1)},2^{(n-1)}-1]$. Therefore, the vector $[11111111]$ can be interpreted either as the unsigned $\mathcal{BV}$ constant $\mathbf{255}^{[8]}$ or as the signed $\mathcal{BV}$ constant $\mathbf{-1}^{[8]}$. Following the SMT-LIBv2 standard [3], we may also represent a $\mathcal{BV}$ constant in binary (e.g. $\mathbf{28}^{[8]}$ is written $\#b00011100$) or in hexadecimal (e.g. $\mathbf{28}^{[8]}$ is written $\#x1C$) form. A $\mathcal{BV}$ term is built from $\mathcal{BV}$ constants, variables and interpreted $\mathcal{BV}$ functions which represent standard RTL operators: word concatenation (e.g. $\mathbf{3}^{[8]}\circ\mathbf{x}^{[8]}$), sub-word selection (e.g. $(\mathbf{3}^{[8]}[6:3])^{[4]}$), modulo-n sum and multiplication (e.g. $\mathbf{x}^{[8]}+_{8}\mathbf{y}^{[8]}$ and $\mathbf{x}^{[8]}\cdot_{8}\mathbf{y}^{[8]}$), bit-wise operators (e.g. $\textbf{and}_{n}$, $\textbf{or}_{n}$, $\textbf{xor}_{n}$, $\textbf{nxor}_{n}$, $\textbf{not}_{n}$), and left and right shift ${<<}_{n}$, ${>>}_{n}$. A $\mathcal{BV}$ atom can be built by combining $\mathcal{BV}$ terms with interpreted predicates like $\geq_{n}$, $<_{n}$ (e.g. $\mathbf{0}^{[8]}\geq_{8}\mathbf{x}^{[8]}$) and equality. We refer the reader to [3, 24] for further details on the syntax and semantics of Bit-Vector theory.

There are two main techniques for $\mathcal{BV}$ satisfiability, the “eager” and the “lazy” approach, which are substantially complementary to one another [25]. In the eager approach, $\mathcal{BV}$ terms and constraints are encoded into SAT via bit-blasting [23, 17, 16, 24, 33, 32]. In the lazy approach, $\mathcal{BV}$ terms are not immediately expanded, so as to avoid scalability issues, and the $\mathcal{BV}$ solver is composed of a layered set of techniques, each of which deals with a sub-portion of the $\mathcal{BV}$ theory [15, 10, 18, 24].

Floating-Point.

The theory of Floating-Point Numbers ( $\mathcal{FP}$ ), [3, 36, 13], is based on the IEEE standard 754-2008 [4] for floating-point arithmetic, restricted to the binary case. A $\mathcal{FP}$ sort is an indexed nullary sort identifier of the form (_ FP < $ebits$ > < $sbits$ >) s.t. both $ebits$ and $sbits$ are positive integers greater than one, $ebits$ defines the number of bits in the exponent and $sbits$ defines the number of bits in the significand, including the hidden bit. A $\mathcal{FP}$ variable $\mathbf{v}^{[n]}$ with sort (_ FP < $ebits$ > < $sbits$ >) can be indifferently viewed as a vector of $n\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}ebits+sbits$ bits, where $v[0]$ is the Most Significant Bit (MSB) and $v[n-1]$ is the Least Significant Bit (LSB), or as a triplet of Bit-Vectors $\langle\mathbf{sign},\mathbf{exp},\mathbf{sig}\rangle$ s.t. $\mathbf{sign}$ is a $\mathcal{BV}$ of size $1$ , $\mathbf{exp}$ is a $\mathcal{BV}$ of size $ebits$ and $\mathbf{sig}$ is a $\mathcal{BV}$ of size $sbits-1$ . A $\mathcal{FP}$ constant is a triplet of $\mathcal{BV}$ constants. Given a fixed floating-point sort, i.e. a pair $\langle{ebits},{sbits}\rangle$ , the following $\mathcal{FP}$ constants are implicitly defined:

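value	symbol	bit-level triplet
plus infinity	(_ +oo ebits sbits)	(fp #b0 #b1…1 #b0…0)
minus infinity	(_ -oo ebits sbits)	(fp #b1 #b1…1 #b0…0)
plus zero	(_ +zero ebits sbits)	(fp #b0 #b0…0 #b0…0)
minus zero	(_ -zero ebits sbits)	(fp #b1 #b0…0 #b0…0)
not-a-number	(_ NaN ebits sbits)	(fp t #b1…1 s)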

where t is either $0$ or $1$ and s is a $\mathcal{BV}$ which contains at least a $1$.

Setting aside special $\mathcal{FP}$ constants, the remaining $\mathcal{FP}$ values can be classified as either normal or subnormal (a.k.a. denormal) [4]. A $\mathcal{FP}$ number is said to be subnormal when every bit in its exponent is equal to zero, and normal otherwise. The significand of a normal $\mathcal{FP}$ number is always interpreted as if the leading binary digit is equal to $1$, while for subnormal $\mathcal{FP}$ values the leading binary digit is always $0$. This allows for the representation of numbers that are closer to zero, although with reduced precision.

Example 1

Let $x$ be the normal $\mathcal{FP}$ constant (fp #b0 #b1100 #b0101000), and $y$ be the subnormal $\mathcal{FP}$ constant (fp #b0 #b0000 #b0101000), so that their corresponding sort is (_ FP 4 8). Then, according to the semantics defined in the IEEE standard 754-2008 [4], the floating-point value of $x$ and $y$ in decimal notation is given by:

[TABLE]
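The decoding used in Example 1 can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the paper: the function name `fp_value` and the bit-string interface are assumptions made here, and the arithmetic follows the usual IEEE 754 binary interpretation (bias $2^{ebits-1}-1$, hidden bit $1$ for normal values and $0$ for subnormal ones).

```python
from fractions import Fraction

def fp_value(sign: str, exp: str, sig: str):
    """Decode a <sign, exp, sig> triplet given as bit strings (MSB first).

    Returns a Fraction for finite values, or one of the strings
    "+oo", "-oo", "NaN" for the special values. Note that -0 and +0
    both decode to Fraction(0).
    """
    ebits = len(exp)
    bias = 2 ** (ebits - 1) - 1
    e = int(exp, 2)
    frac = Fraction(int(sig, 2), 2 ** len(sig))
    negative = sign == "1"
    if e == 2 ** ebits - 1:            # all-ones exponent: infinity or NaN
        if frac != 0:
            return "NaN"
        return "-oo" if negative else "+oo"
    if e == 0:                         # zero or subnormal: hidden bit is 0
        magnitude = frac * Fraction(2) ** (1 - bias)
    else:                              # normal: hidden bit is 1
        magnitude = (1 + frac) * Fraction(2) ** (e - bias)
    return -magnitude if negative else magnitude

# The two constants of Example 1, sort (_ FP 4 8).
x = fp_value("0", "1100", "0101000")
y = fp_value("0", "0000", "0101000")
print(x, y)
```

Under these assumptions $x$ decodes to $42$ and $y$ to $\frac{5}{1024}$; the same function reproduces the finite entries of Table 1, e.g. `fp_value("0", "110", "1111")` is $\frac{31}{2}$.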

The theory of $\mathcal{FP}$ provides a variety of built-in floating-point operations as defined in the IEEE standard 754-2008. This includes binary arithmetic operations (e.g. $+,-,\star,\div$ ), basic unary operations (e.g. $abs,-$ ), binary comparison operations (e.g. $\leq,<,\neq,=,>,\geq$ ), the remainder operation, the square root operation and more. Importantly, arithmetic operations are performed as if with infinite precision, but the result is then rounded to the “nearest” representable $\mathcal{FP}$ number according to the specified rounding mode. Five rounding modes are made available, as in [4].

The most common approach for $\mathcal{FP}$-satisfiability is to encode $\mathcal{FP}$ expressions into $\mathcal{BV}$ formulas based on the circuits used to implement floating-point operations, using appropriate under- and over-approximation schemes (or a mixture of both) to improve performance [14, 44, 45, 43]. Then, the $\mathcal{BV}$-Solver is used to deal with the $\mathcal{FP}$ formula, using either the eager or the lazy $\mathcal{BV}$ approach. An alternative approach, based on abstract interpretation, is presented in [11, 12, 26]. With this technique, called Abstract CDCL (ACDCL), the set of feasible solutions is over-approximated with floating-point intervals, so that interval-based conflict analysis is performed to decide $\mathcal{FP}$-satisfiability.

3 Theoretical Framework

We present our generalization of [31] to the case of signed/unsigned Bit-Vector Optimization, and then move on to deal with Floating-Point Optimization.

3.1 Bit-Vector Optimization

Without any loss of generality, we assume that every objective function $f(...)$ is replaced by a variable ${\sf obj}$ of the same type by conjoining “${\sf obj}=f(...)$” to the input formula. We use the symbol $n$ to denote the bit-width of ${\sf obj}$, and ${\sf obj}[i]$ to denote the $i$-th bit of ${\sf obj}$, where ${\sf obj}[0]$ and ${\sf obj}[n-1]$ are the Most Significant Bit (MSB) and the Least Significant Bit (LSB) of ${\sf obj}$ respectively (see footnote 1).

We define the Bit-Vector Optimization problem as follows.

Definition 1

($\text{OMT}(\mathcal{BV})$). Let $\varphi$ be a $\text{SMT}(\mathcal{BV})$ formula and obj be a (signed or unsigned) $\mathcal{BV}$ variable occurring in $\varphi$. We call an Optimization Modulo $\mathcal{BV}$ problem, $\text{OMT}(\mathcal{BV})$, the problem of finding a model $\mathcal{M}$ for $\varphi$ (if any) whose value of obj, denoted with $\mathsf{min}_{\sf obj}(\varphi)$, is minimum wrt. the total order relation $\leq_{n}$ for signed $\mathcal{BV}$s if obj is signed, and the one for unsigned $\mathcal{BV}$s otherwise. (The dual definition where we look for the maximum follows straightforwardly.)

Hereafter, we generalize the unsigned $\mathcal{BV}$ maximization procedures described in [31] to the case of signed and unsigned $\mathcal{BV}$ optimization. To this end, we introduce the novel notion of $\mathcal{BV}$ attractor.

Definition 2

(Attractor, attractor equalities).

When minimizing [resp. maximizing], we call attractor for obj the smallest [resp. greatest] $\mathcal{BV}$ -value $attr$ of the sort of obj. We call vector of attractor equalities the vector $A$ s.t. $A[k]\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}({\sf obj}[k]=attr[k])$ , $k\in[0..n-1]$ .

Example 2

If ${\sf obj}^{[8]}$ is an unsigned $\mathcal{BV}$ objective of width $8$ , then its corresponding attractor $attr$ is $\mathbf{0}^{[8]}$ , i.e. $[00000000]$ , when ${\sf obj}^{[8]}$ is minimized and it is $\mathbf{255}^{[8]}$ , i.e. $[11111111]$ , when ${\sf obj}^{[8]}$ is maximized. When ${\sf obj}^{[8]}$ is instead a signed $\mathcal{BV}$ objective, following the two’s complement encoding, the corresponding $attr$ is $\mathbf{-128}^{[8]}$ , i.e. $[10000000]$ , for minimization and $\mathbf{127}^{[8]}$ , i.e. $[01111111]$ , for maximization. $\diamond$
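The four cases of Example 2 can be reproduced with a small helper. This is a minimal sketch: the function name `attractor` and its boolean parameters are illustrative choices made here, not part of the paper or of any solver API.

```python
def attractor(n: int, signed: bool, minimize: bool) -> str:
    """Bit string (MSB first) of the attractor of an n-bit BV objective."""
    if minimize:
        value = -(2 ** (n - 1)) if signed else 0
    else:
        value = 2 ** (n - 1) - 1 if signed else 2 ** n - 1
    # Reducing modulo 2^n yields the two's complement bit pattern.
    return format(value % (2 ** n), f"0{n}b")

for signed in (False, True):
    for minimize in (True, False):
        print(signed, minimize, attractor(8, signed, minimize))
```

For width 8 the helper returns the bit patterns listed in Example 2: `00000000` and `11111111` for the unsigned case, `10000000` and `01111111` for the signed case.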

In essence, the attractor can be seen as the target value of the optimization search and therefore it can be used to determine the desired improvement direction and to guide the decisions taken by the optimization search. By construction, if a model $\mathcal{M}$ satisfies all equalities $A[i]$ , then $\mathcal{M}({\sf obj})=attr$ .

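More generally, if $\mathcal{M}$ is a model of $\varphi$, then the value of ${\sf obj}$ in $\mathcal{M}$, denoted with $\mathcal{M}({\sf obj})$, is given by

$\mathcal{M}({\sf obj})=\sum_{i=0}^{n-1}2^{n-1-i}\cdot\textsc{ite}(\mathcal{M}(A[i]),\,attr[i],\,\overline{attr[i]})$

when ${\sf obj}$ is an unsigned $\mathcal{BV}$ objective, and by

$\mathcal{M}({\sf obj})=-(2^{n-1})\cdot\textsc{ite}(\mathcal{M}(A[0]),\,attr[0],\,\overline{attr[0]})+\sum_{i=1}^{n-1}2^{n-1-i}\cdot\textsc{ite}(\mathcal{M}(A[i]),\,attr[i],\,\overline{attr[i]})$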

when ${\sf obj}$ is a signed $\mathcal{BV}$ objective, using the two’s complement representation. The function ite, appearing in both previous equations, returns $attr[i]$ if the attractor equality $A[i]$ is true in $\mathcal{M}$ and $\overline{attr[i]}$ otherwise.
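As a worked instance of the signed case (the concrete values are chosen here for illustration), take $n=3$, $attr=[100]$ and a model $\mathcal{M}$ in which $A[0]$ holds while $A[1]$ and $A[2]$ do not, so that ${\sf obj}=[111]$. Weighting the MSB by $-2^{n-1}$, as the two’s complement representation prescribes, gives:

```latex
\begin{align*}
\mathcal{M}({\sf obj})
  &= -2^{2}\cdot attr[0] \;+\; 2^{1}\cdot\overline{attr[1]} \;+\; 2^{0}\cdot\overline{attr[2]} \\
  &= -4\cdot 1 \;+\; 2\cdot 1 \;+\; 1\cdot 1 \\
  &= -1
\end{align*}
```

Read as an unsigned objective, the same bits would instead evaluate to $4+2+1=7$.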

We use the symbol $\mu_{k}$ to denote a generic (possibly partial) assignment which assigns at least the $k$ most-significant bits of ${\sf obj}$ . We use the symbol $\tau_{k}$ to denote an assignment to all and only the $k$ most-significant bits of ${\sf obj}$ . Given $i<k$ , we denote by $\mu_{k}[i]$ [resp. $\tau_{k}[i]$ ] the value in $\{{0,1}\}$ assigned to ${\sf obj}[i]$ by $\mu_{k}$ [resp. $\tau_{k}$ ]. Moreover, we use the expression $[\![\mu_{k}]\!]_{i}$ where $i\leq k$ to denote the restriction of $\mu_{k}$ to all and only the $i$ most-significant bits of ${\sf obj}$ , ${\sf obj}[0],...,{\sf obj}[i-1]$ . Given a model $\mathcal{M}$ of $\varphi$ and a variable $v$ , we denote by $\mathcal{M}(v)$ the evaluation of $v$ in $\mathcal{M}$ . With a little abuse of notation, and when this does not cause ambiguities, we sometimes use an attractor equality $A[i]\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}({\sf obj}[i]=attr[i])$ to denote the single-bit assignment ${\sf obj}[i]:=attr[i]$ and its negation $\neg A[i]$ to denote the assignment to the complement value ${\sf obj}[i]:=\overline{attr[i]}$ .

Definition 3

(lexicographic maximization)

Consider an OMT instance $\langle{\varphi},{{\sf obj}}\rangle$ and the vector of attractor equalities $A$ . We say that an assignment $\tau_{n}$ to obj lexicographically maximizes $A$ wrt. $\varphi$ iff, for every $k\in[0..{n-1}]$ ,

• $\tau_{n}[k]=\overline{attr[k]}$ if $\varphi\wedge[\![\tau_{n}]\!]_{k}\wedge A[k]$ is unsatisfiable,

• $\tau_{n}[k]=attr[k]$ otherwise,

where $A[k]$ is the attractor equality $({\sf obj}[k]=attr[k])$. (The dual definition of “lexicographically minimizes” comes by switching $attr[k]$ with $\overline{attr[k]}$.) Given a model $\mathcal{M}$ for $\varphi$, we say that $\mathcal{M}$ lexicographically maximizes $A$ wrt. $\varphi$ iff its restriction to obj lexicographically maximizes $A$ wrt. $\varphi$.

Starting from the MSB to the LSB, $\tau_{n}$ [resp. $\mathcal{M}$] in Definition 3 assigns to each ${\sf obj}[k]$ the value $attr[k]$ unless it is inconsistent wrt. $\varphi$ and the assignments to the previous ${\sf obj}[i]$s, $i\in[0..k-1]$. Notice that this corresponds to minimizing [resp. maximizing] the value $\sum_{k=0}^{n-1}2^{n-1-k}\cdot({\sf obj}[k]\>\mathbf{xor}_{1}\>attr[k])$ [resp. $\sum_{k=0}^{n-1}2^{n-1-k}\cdot{({\sf obj}[k]\>\mathbf{nxor}_{1}\>attr[k])}$], where $\mathbf{xor}_{n}$ is the bitwise-xor operator and $\mathbf{nxor}_{n}$ is its complement, because $2^{n-1-i}>\sum_{k=i+1}^{n-1}2^{n-1-k}$ for every $i$.
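The bit-by-bit reading of Definition 3 can be sketched as follows. This is a toy illustration under stated assumptions: the finite set `models` of bit strings stands in for the values of obj allowed by $\varphi$ (assumed non-empty), and the `any(...)` test replaces what would be a satisfiability call on $\varphi\wedge[\![\tau_{n}]\!]_{k}\wedge A[k]$ in a real solver. The names `lex_maximize` and `xor_distance` are hypothetical.

```python
def lex_maximize(models, attr):
    """Fix the bits of obj from MSB to LSB, preferring attr[k] whenever
    some element of `models` agrees with the prefix chosen so far."""
    prefix = ""
    for k in range(len(attr)):
        wanted = prefix + attr[k]
        if any(m.startswith(wanted) for m in models):   # A[k] is consistent
            prefix = wanted
        else:                                           # forced to the complement
            prefix += "1" if attr[k] == "0" else "0"
    return prefix

def xor_distance(bits, attr):
    """Value of the bitwise xor between obj and the attractor."""
    return int(bits, 2) ^ int(attr, 2)

# Signed minimization on 3 bits: the attractor is [100].
models = {"011", "101", "110"}
best = lex_maximize(models, "100")
closest = min(models, key=lambda b: xor_distance(b, "100"))
print(best, closest)
```

On this toy instance both computations select `101`, i.e. $-3$ in two’s complement, which is the smallest of the three admissible values $3$, $-3$ and $-2$.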

The following fact derives from the above definitions and the properties of two’s complement representation adopted by the SMT-LIBv2 standard for signed $\mathcal{BV}$. (Footnote 2: If the standard adopted were the sign-and-magnitude binary encoding, then Theorem 3.1 would not hold. Nevertheless, in such a case we could adopt a simplified version of the technique for $\mathcal{FP}$ optimization described in §3.2.)

Theorem 3.1

An optimal solution of an $\text{OMT}(\mathcal{BV})$ problem $\langle{\varphi},{{\sf obj}}\rangle$ is any model $\mathcal{M}$ of $\varphi$ which lexicographically maximizes the vector of attractor equalities $A$ .

Proof

(We investigate the minimization case, since the maximization case is dual.)

In the case of minimization with unsigned $\mathcal{BV}$, $attr$ is $[00...00]$, so that the lexicographic optimization corresponds to minimizing $\sum_{k=0}^{n-1}2^{n-1-k}\cdot{\sf obj}[k]$, which is the standard minimization for unsigned $\mathcal{BV}$.

In the case of minimization with signed $\mathcal{BV}$, $attr$ is $[10...00]$, so that the lexicographic optimization corresponds to minimizing $2^{n-1}\cdot\overline{{\sf obj}[0]}+\sum_{k=1}^{n-1}2^{n-1-k}\cdot{\sf obj}[k]$. Since $\overline{{\sf obj}[0]}=1-{\sf obj}[0]$, subtracting the constant value $2^{n-1}$ shows that this is equivalent to minimizing $-2^{n-1}\cdot{\sf obj}[0]+\sum_{k=1}^{n-1}2^{n-1-k}\cdot{\sf obj}[k]$, which is the standard minimization for two’s complement $\mathcal{BV}$. $\Box$
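The signed case of the proof can also be checked exhaustively for small widths. The sketch below (helper names are illustrative, not from the paper) sorts all $n$-bit patterns once by their xor with the attractor $[10...00]$ and once by their two’s complement value, and compares the two orderings.

```python
def to_signed(bits: str) -> int:
    """Two's complement value of a bit string (MSB first)."""
    unsigned = int(bits, 2)
    return unsigned - 2 ** len(bits) if bits[0] == "1" else unsigned

def check_signed_min(n: int) -> bool:
    """True iff ordering by xor-distance from [10...0] equals signed ordering."""
    attr = int("1" + "0" * (n - 1), 2)       # attractor for signed minimization
    patterns = [format(v, f"0{n}b") for v in range(2 ** n)]
    by_distance = sorted(patterns, key=lambda b: int(b, 2) ^ attr)
    by_value = sorted(patterns, key=to_signed)
    return by_distance == by_value

print(all(check_signed_min(n) for n in range(1, 9)))
```

The two orderings coincide because the xor distance of a pattern equals its signed value shifted by the constant $2^{n-1}$, which is exactly the step used in the proof.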

Definitions 2 and 3, together with Theorem 3.1, thus suggest a direct extension of the algorithm for unsigned $\mathcal{BV}$ in [31] to the minimization/maximization of signed $\mathcal{BV}$: apply the unsigned-$\mathcal{BV}$ maximization [resp. minimization] algorithm of [31] to the objective ${\sf obj}^{\prime}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}{({\sf obj}\>\mathbf{nxor}_{n}\>attr)}$ [resp. ${\sf obj}^{\prime}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}({\sf obj}\>\mathbf{xor}_{n}\>attr)$] rather than simply to obj [resp. $\overline{{\sf obj}}$].

Example 3

Let ${\sf obj}^{[3]}$ be a signed $\mathcal{BV}$ goal of $3$ bits to be minimized and $attr\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}[100]$ be its attractor, so that the corresponding vector of attractor equalities $A$ is equal to $[{\sf obj}[0]=1,{\sf obj}[1]=0,{\sf obj}[2]=0]$ .

An assignment $\tau_{3}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}\{{A[0],\neg A[1],\neg A[2]}\}$ (for which ${\sf obj}^{[3]}=\mathbf{-1}^{[3]}$ ) is lexicographically better than $\tau_{3}^{\prime}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}\{{\neg A[0],A[1],A[2]}\}$ (for which ${\sf obj}^{[3]}=\mathbf{0}^{[3]}$ ), because the former satisfies the attractor equality corresponding to the MSB while the latter does not. Moreover, the assignment $\tau_{3}$ is lexicographically worse than the assignment $\tau_{3}^{\prime\prime}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}\{{A[0],\neg A[1],A[2]}\}$ (for which ${\sf obj}^{[3]}=\mathbf{-2}^{[3]}$ ), because –all the rest being equal– the latter assignment makes the attractor equality $({\sf obj}[2]=0)$ true. $\diamond$

3.2 Floating-Point Optimization

We define the Floating-Point Optimization problem as follows.

Definition 4

( $\text{OMT}(\mathcal{FP})$ ).

Let $\varphi$ be a $\text{SMT}(\mathcal{FP})$ formula and obj be a $\mathcal{FP}$ variable occurring in $\varphi$ . We call an Optimization Modulo $\mathcal{FP}$ problem, the problem of finding a model $\mathcal{M}$ for $\varphi$ (if any) whose value of obj, denoted with $\mathsf{min}_{\sf obj}(\varphi)$ , is either

• minimum wrt. the usual total order relation $\leq$ for $\mathcal{FP}$ numbers, if $\varphi$ is satisfied by at least one model $\mathcal{M}^{\prime}$ s.t. $\mathcal{M}^{\prime}({\sf obj})$ is not NaN,

• some binary representation of NaN, otherwise.

(The dual definition where we look for the maximum follows straightforwardly.)

Definition 4 is made necessarily convoluted by the fact that obj can be NaN. In fact, in the SMT-LIBv2 standard the comparisons $\{{\leq,<,\geq,>}\}$ between NaN and any other $\mathcal{FP}$ value are always evaluated false because NaN has multiple representations at the binary level (see Table 1). Also, requiring the optimal solution to be always different from NaN makes the resulting $\text{OMT}(\mathcal{FP})$ problem $\langle{\varphi\wedge\neg\mathsf{IsNaN({{\sf obj}})}},{{\sf obj}}\rangle$ unsatisfiable when $\varphi$ is satisfied only by models $\mathcal{M}$ s.t. $\mathcal{M}({\sf obj})$ is NaN. For these reasons, we admit NaN as the optimal solution value for obj if and only if $\varphi$ is satisfied only by models $\mathcal{M}$ s.t. $\mathcal{M}({\sf obj})$ is NaN.

In the rest of this section we assume that we have already checked, in sequence, that

$i)$ the input formula $\varphi$ is satisfiable, by invoking an $\text{SMT}(\mathcal{FP})$ solver on $\varphi$. If the solver returns unsat, then there is no need to proceed;

$ii)$ $\varphi$ is satisfied by at least one model $\mathcal{M}^{\prime}$ s.t. $\mathcal{M}^{\prime}({\sf obj})$ is not NaN, by invoking an $\text{SMT}(\mathcal{FP})$ solver on $\varphi\wedge\neg\mathsf{IsNaN({{\sf obj}})}$ if the model $\mathcal{M}$ returned by the previous SMT call is s.t. $\mathcal{M}({\sf obj})$ is NaN. If the solver returns unsat, then we conclude that the minimum is NaN.

After that, we can safely focus our investigation on the restricted $\text{OMT}(\mathcal{FP})$ problem $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ , where ${\varphi_{\mathsf{noNaN}}}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}\varphi\wedge\neg\mathsf{IsNaN({{\sf obj}})}$ , knowing it is satisfiable.
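The two preliminary checks can be mirrored on a toy model set. In this sketch the finite set `models` of bit patterns for obj is a hypothetical stand-in for the models of $\varphi$, and each set test below plays the role of one $\text{SMT}(\mathcal{FP})$ call; the function names and the returned labels are assumptions made for illustration.

```python
def is_nan(bits: str, ebits: int) -> bool:
    """True iff sign|exp|sig has an all-ones exponent and a non-zero significand."""
    exp, sig = bits[1:1 + ebits], bits[1 + ebits:]
    return set(exp) == {"1"} and "1" in sig

def precheck(models, ebits):
    """Mirror steps i) and ii) on a finite set of obj bit patterns."""
    if not models:                              # step i): phi is unsatisfiable
        return "unsat", None
    no_nan = {m for m in models if not is_nan(m, ebits)}
    if not no_nan:                              # step ii): only NaN models exist
        return "nan", None
    return "optimize", no_nan                   # restricted problem on phi_noNaN

print(precheck({"01111111", "00000001"}, ebits=3))
```

Only the third outcome leads to the restricted problem discussed next; the first two terminate immediately, with unsat and with NaN as the optimum respectively.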

In §3.1, we have introduced the concept of a $\mathcal{BV}$ objective attractor, and we have shown how this value can be used to drive the optimization search towards the optimum value, when minimizing or maximizing a signed or unsigned $\mathcal{BV}$ goal. However, in the case of floating-point optimization, it is not possible to determine the attractor value statically, before the search is started. This is due to the more complex representation of $\mathcal{FP}$ variables, which uses three separate Bit-Vectors (i.e. sign, exponent and significand), and the presence of various classes of special values (i.e. zeros, infinity, NaN), which make Definition 2 ambiguous for $\mathcal{FP}$ optimization. We illustrate this problem with the following example.

Example 4

Let $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ be an $\text{OMT}(\mathcal{FP})$ problem where obj is a $\mathcal{FP}$ objective, of sort (_ FP 3 5), to be minimized. To make our explanation easier to follow, we show in Table 1 a short list of sample values for an $\mathcal{FP}$ variable of the same sort as obj. Each $\mathcal{FP}$ value is represented as a triplet of Bit-Vectors $\langle\mathbf{sign},\mathbf{exp},\mathbf{sig}\rangle$ –following the SMT-LIBv2 conventions described in Section §2– and also in decimal notation.

From Table 1, we immediately notice that the binary representations of both the exponent and the significand of a Floating-Point number grow in opposite directions in the positive and in the negative domains. In addition, by sorting the values according to their binary representation, we observe that $\mathtt{-\infty}$ [resp. $\mathtt{+\infty}$] is not the smallest [resp. greatest] representable $\mathcal{FP}$ value in the negative [resp. positive] domain. In fact, both extreme ends of the table are occupied by NaN, which has multiple binary representations.

In what follows, we temporarily disregard the effects of unit-propagation, which might assign some (or all) bits of ${\sf obj}$ as a result of some constraints in ${\varphi_{\mathsf{noNaN}}}$ , and pick some values as candidate attractors for an $\mathcal{FP}$ goal to be minimized.

Suppose that the attractor is chosen to be equal to the value $\mathtt{-\infty}$ listed at row $9$ in Table 1, which is the smallest $\mathcal{FP}$ value wrt. the total order relation $\leq$ for $\mathcal{FP}$ numbers. Assume that the optimal value of the $\mathcal{FP}$ goal is the subnormal $\mathcal{FP}$ value (fp #b1 #b000 #b1111) (i.e. $\frac{-15}{64}$). Then, it can be seen that after both the sign and the exponent bits have been decided to be equal to #b1 and #b000 respectively, the remaining bits of the attractor pull the search in the wrong direction, that is, towards $-0$.

Selecting a different $\mathcal{FP}$ value as candidate attractor does not really solve the problem, or rather, it results in a different set of issues.

For instance, an attractor equal to the NaN value listed at row $10$ in Table 1, which is the smallest representable $\mathcal{FP}$ value according to the binary ordering, would solve the problem for the previous case in which the optimum $\mathcal{FP}$ value is (fp #b1 #b000 #b1111). However, this attractor would remain an unsuitable choice for an $\text{OMT}(\mathcal{FP})$ instance where the $\mathcal{FP}$ goal is forced to be positive, because after the sign bit of the objective function has been decided to be equal to #b0 the remaining bits of the attractor drive the search in the wrong direction, that is, towards $\mathtt{+\infty}$ . $\diamond$

Since there is no statically-determined $\mathcal{FP}$ value that can be used as an attractor when dealing with floating-point optimization, we introduce the new concept of dynamic attractor.

Definition 5

(Dynamic Attractor.)

Let $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ be a restricted $\text{OMT}(\mathcal{FP})$ problem, where ${\varphi_{\mathsf{noNaN}}}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}\varphi\wedge\neg\mathsf{IsNaN({{\sf obj}})}$ is a satisfiable $\text{SMT}(\mathcal{FP})$ formula and obj is a $\mathcal{FP}$ objective to be minimized [resp. maximized]. Let $k\in[0..n]$ and $\tau_{k}$ be an assignment to the $k$ most-significant bits of obj.

Then, we say that an $\mathcal{FP}$ -value $attr_{\tau_{k}}$ for obj is a dynamic attractor for obj wrt. $\tau_{k}$ iff it is the smallest [resp. largest] $\mathcal{FP}$ value different from NaN s.t. the $k$ most-significant bits of $attr_{\tau_{k}}$ have the same value as the $k$ most-significant bits of obj in $\tau_{k}$ . We call vector of attractor equalities the vector $A_{\tau_{k}}$ s.t. $A_{\tau_{k}}[i]\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}({\sf obj}[i]=attr_{\tau_{k}}[i])$ , $i\in[0..n-1]$ .
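For a sort as small as (_ FP 3 5), Definition 5 can be evaluated by plain enumeration. The following Python sketch is an illustration under stated assumptions: the names `value`, `completions` and `dynamic_attractor` are hypothetical, $\tau_{k}$ is encoded as an MSB-first bit string, and NaN is represented by `None`. It is not meant as an efficient procedure.

```python
import math

EBITS, MBITS = 3, 4               # sort (_ FP 3 5)
N = 1 + EBITS + MBITS             # total number of bits of obj
BIAS = 2 ** (EBITS - 1) - 1

def value(bits):
    """Numeric value of an N-bit string (MSB first); None stands for NaN."""
    sign = -1.0 if bits[0] == "1" else 1.0
    exp = int(bits[1:1 + EBITS], 2)
    sig = int(bits[1 + EBITS:], 2)
    if exp == 2 ** EBITS - 1:
        return sign * math.inf if sig == 0 else None
    if exp == 0:
        return sign * (sig / 2 ** MBITS) * 2.0 ** (1 - BIAS)
    return sign * (1 + sig / 2 ** MBITS) * 2.0 ** (exp - BIAS)

def completions(prefix):
    """All N-bit strings that extend the partial assignment `prefix`."""
    free = N - len(prefix)
    if free == 0:
        return [prefix]
    return [prefix + format(i, f"0{free}b") for i in range(2 ** free)]

def dynamic_attractor(prefix, minimize=True):
    """Smallest [resp. largest] non-NaN value extending `prefix`, or None if there is none."""
    valid = [b for b in completions(prefix) if value(b) is not None]
    if not valid:
        return None
    return min(valid, key=value) if minimize else max(valid, key=value)

print(dynamic_attractor(""))      # 11110000, i.e. -oo
print(dynamic_attractor("0"))     # 00000000, i.e. +0
print(dynamic_attractor("1000"))  # 10001111, i.e. -15/64
```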

The following fact derives from the above definitions and the properties of IEEE 754-2008 standard representation adopted by SMT-LIBv2 standard for $\mathcal{FP}$ .

Lemma 1

Let $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ be a restricted minimization [resp. maximization] $\text{OMT}(\mathcal{FP})$ problem, let $\tau_{k}$ be an assignment to ${\sf obj}[0]...{\sf obj}[k-1]$ and $attr_{\tau_{k}}$ be its corresponding dynamic attractor, for some $k\in[0..n-1]$ . Let $\tau_{k+1}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}\tau_{k}\cup\{{{\sf obj}[k]:=attr_{\tau_{k}}[k]}\}$ and $\tau^{\prime}_{k+1}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}\tau_{k}\cup\{{{\sf obj}[k]:=\overline{attr_{\tau_{k}}[k]}}\}$ , and let $\mathcal{M}$ , $\mathcal{M}^{\prime}$ be two models for ${\varphi_{\mathsf{noNaN}}}$ which extend $\tau_{k+1}$ and $\tau^{\prime}_{k+1}$ respectively.

Then $\mathcal{M}({\sf obj})\leq\mathcal{M}^{\prime}({\sf obj})$ [resp. $\mathcal{M}({\sf obj})\geq\mathcal{M}^{\prime}({\sf obj})$ ].

Proof

(We prove the case of minimization, since that of maximization is dual wrt. the value of the sign bit.) We distinguish three cases based on the value of $k$ .

Case $k=0$ (sign bit). Then $attr_{\tau_{0}}[0]=1$ , $\tau_{1}=\{{{\sf obj}[0]=1}\}$ and $\tau^{\prime}_{1}=\{{{\sf obj}[0]=0}\}$ , where ${\sf obj}[0]$ is the MSB of ${\sf obj}$ and represents the sign of the floating-point value. Then ${\sf obj}$ is smaller than or equal to zero in every model $\mathcal{M}$ and larger than or equal to zero in every model $\mathcal{M}^{\prime}$ of ${\varphi_{\mathsf{noNaN}}}$ , so that $\mathcal{M}({\sf obj})\leq\mathcal{M}^{\prime}({\sf obj})$ is verified.

Case $k\in[1..ebits]$ (exponent bits), where $ebits$ is the number of bits in the exponent of ${\sf obj}$ . Then, $attr_{\tau_{k}}[k]$ is $1$ if $\tau_{k}[0]=1$ and $0$ otherwise.

In the first case, ${\sf obj}$ can only be negative-valued in both $\mathcal{M}$ and $\mathcal{M}^{\prime}$ . More precisely, $\mathcal{M}({\sf obj})$ can be either $\mathtt{-\infty}$ or a normal negative value, whereas $\mathcal{M}^{\prime}({\sf obj})$ can be either a normal or a sub-normal negative value. Hereafter, we consider only the case in which both have a normal negative value, because the cases in which $\mathcal{M}({\sf obj})=\mathtt{-\infty}$ or $\mathcal{M}^{\prime}({\sf obj})$ is sub-normal are both trivial, given that the absolute value of any sub-normal $\mathcal{FP}$ number is smaller than the absolute value of any normal $\mathcal{FP}$ number. Furthermore, we disregard the significand bits in $\mathcal{M}$ and $\mathcal{M}^{\prime}$ because their contribution to the value of ${\sf obj}$ is always less significant than that of the bits in the exponent. Given these premises, the exponent value of ${\sf obj}$ in every possible $\mathcal{M}$ is larger than the exponent of ${\sf obj}$ in every possible $\mathcal{M}^{\prime}$ by a value equal to $2^{ebits-k}$ and therefore, given that both $\mathcal{M}({\sf obj})$ and $\mathcal{M}^{\prime}({\sf obj})$ are negative-valued, $\mathcal{M}({\sf obj})\leq\mathcal{M}^{\prime}({\sf obj})$ .

The case in which $\tau_{k}[0]=0$ , that is when ${\sf obj}$ can only be positive-valued in both $\mathcal{M}$ and $\mathcal{M}^{\prime}$ , is dual.

Case $k>ebits$ (significand bits). Then there are three sub-cases.

If for every $i\in[1..ebits]$ the value of $\tau_{k}[i]$ is equal to $1$ , then the only possible value of $\mathcal{M}({\sf obj})$ for every possible $\mathcal{M}$ is an infinity ( $\mathtt{-\infty}$ if $\tau_{k}[0]=1$ , $\mathtt{+\infty}$ otherwise), and therefore $attr_{\tau_{k}}[k]=0$ . On the other hand, there exists no possible model $\mathcal{M}^{\prime}$ of ${\varphi_{\mathsf{noNaN}}}$ , because the assignment ${\sf obj}[k]=1$ would imply ${\sf obj}$ being equal to NaN, so the statement $\mathcal{M}({\sf obj})\leq\mathcal{M}^{\prime}({\sf obj})$ is vacuously true.

If instead there is some $i\in[1..ebits]$ s.t. $\tau_{k}[i]=0$ , then $attr_{\tau_{k}}[k]$ is $1$ if $\tau_{k}[0]=1$ (i.e. obj is negative-valued) and $0$ otherwise (i.e. obj is positive-valued). In both cases, we can disregard the exponent bits in $\mathcal{M}$ and $\mathcal{M}^{\prime}$ because their contribution to the value of ${\sf obj}$ is the same in either model. For the same reasons, since $\mathcal{M}({\sf obj})$ and $\mathcal{M}^{\prime}({\sf obj})$ can only be either both normal or both sub-normal, we can ignore the contribution of the leading hidden bit and focus on the bits of the significand.

When $\tau_{k}[0]=1$ and obj must be negative-valued, the decimal value of the significand in $\mathcal{M}$ is larger than the decimal value of every possible significand in $\mathcal{M}^{\prime}$ by exactly $2^{{}-(k-ebits)}$ . Given that both $\mathcal{M}({\sf obj})$ and $\mathcal{M}^{\prime}({\sf obj})$ are negative-valued, we have that $\mathcal{M}({\sf obj})\leq\mathcal{M}^{\prime}({\sf obj})$ .

The case in which $\tau_{k}[0]=0$ , that is when obj can only be positive-valued in both $\mathcal{M}$ and $\mathcal{M}^{\prime}$ , is dual. $\Box$
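As a sanity check (not a substitute for the proof), the statement of Lemma 1 can be tested exhaustively on the sort (_ FP 3 5). The Python sketch below makes several simplifying assumptions: the formula imposes no constraint, so every non-NaN completion counts as a model; the dynamic attractor is obtained by enumeration; and all function names are hypothetical.

```python
import math

EBITS, MBITS = 3, 4
N = 1 + EBITS + MBITS
BIAS = 2 ** (EBITS - 1) - 1

def value(bits):
    """Numeric value of an N-bit string (MSB first); None stands for NaN."""
    sign = -1.0 if bits[0] == "1" else 1.0
    exp, sig = int(bits[1:1 + EBITS], 2), int(bits[1 + EBITS:], 2)
    if exp == 2 ** EBITS - 1:
        return sign * math.inf if sig == 0 else None
    if exp == 0:
        return sign * (sig / 2 ** MBITS) * 2.0 ** (1 - BIAS)
    return sign * (1 + sig / 2 ** MBITS) * 2.0 ** (exp - BIAS)

def completions(prefix):
    free = N - len(prefix)
    if free == 0:
        return [prefix]
    return [prefix + format(i, f"0{free}b") for i in range(2 ** free)]

def dynamic_attractor(prefix, minimize=True):
    valid = [b for b in completions(prefix) if value(b) is not None]
    if not valid:
        return None
    return min(valid, key=value) if minimize else max(valid, key=value)

def lemma1_holds(minimize=True):
    """Check M(obj) <= M'(obj) [resp. >=] for every prefix and every pair of completions."""
    for k in range(N):
        for i in range(2 ** k):
            prefix = format(i, f"0{k}b") if k else ""
            attr = dynamic_attractor(prefix, minimize)
            if attr is None:            # prefix admits only NaN: nothing to check
                continue
            flipped = "1" if attr[k] == "0" else "0"
            with_attr = [value(b) for b in completions(prefix + attr[k]) if value(b) is not None]
            against = [value(b) for b in completions(prefix + flipped) if value(b) is not None]
            for m in with_attr:
                for m_prime in against:
                    violated = m > m_prime if minimize else m < m_prime
                    if violated:
                        return False
    return True

print(lemma1_holds(True), lemma1_holds(False))
```

When the flipped bit admits only NaN completions, the inner loop is empty, which mirrors the vacuous sub-case of the proof.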

Lemma 1 states that, given the current assignment $\tau_{k}$ to the $k$ most-significant-bits of obj, ${\sf obj}[k]=attr_{\tau_{k}}[k]$ is always the best extension of $\tau_{k}$ to the next bit (when consistent). A dynamic attractor $attr_{\tau_{k}}$ can thus be used by the optimization search to guide the assignment of the $k+1$ -th bit of ${\sf obj}$ towards the direction of maximum gain which is allowed by $\tau_{k}$ , so as to obtain the “best” extension $\tau_{k+1}$ of $\tau_{k}$ . Once the (new) assignment $\tau_{k+1}$ is found, the OMT solver can compute the dynamic attractor $attr_{\tau_{k+1}}$ for ${\sf obj}$ wrt. $\tau_{k+1}$ and then use it to assign the $k+2$ -th bit of ${\sf obj}$ , and so on.

Let $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ be an $\text{OMT}(\mathcal{FP})$ instance, s.t. ${\sf obj}$ is a $\mathcal{FP}$ variable of $n$ bits, and $\tau_{0}$ be an initially empty assignment. If at each step of the optimization search the assignment of the $k$ -th bit of ${\sf obj}$ is guided by the dynamic attractor for ${\sf obj}$ wrt. $\tau_{k}$ , then the corresponding sequence of $n$ dynamic attractors (of increasing order $k$ ) is unique and depends exclusively on ${\varphi_{\mathsf{noNaN}}}$ . Intuitively, this is the case because the (current) dynamic attractor always points in the direction of maximum gain. We illustrate this in the following example.

Example 5

Let $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ be an $\text{OMT}(\mathcal{FP})$ problem where obj is a $\mathcal{FP}$ objective, of sort (_ FP 3 5), to be minimized, as in Example 4. At the beginning of the search, nothing is known about the structure of the solution. Therefore, $\tau_{0}=\emptyset$ and, since obj is being minimized, the dynamic attractor for ${\sf obj}$ wrt. $\tau_{0}$ (i.e. $attr_{\tau_{0}}$ ) is equal to (fp #b1 #b111 #b0000) (i.e. $\mathtt{-\infty}$ ), which gives a preference to any feasible value of obj in the negative domain.

If at some point of the optimization search we discover that the domain of the objective function can only be positive, so that the first bit of ${\sf obj}$ is permanently set to $0$ in $\tau_{1}$ , then the new dynamic attractor for obj wrt. $\tau_{1}$ (i.e. $attr_{\tau_{1}}$ ) is equal to (fp #b0 #b000 #b0000) (i.e. $+0$ ).

Furthermore, if later on we also find out that at least one bit in the exponent of obj can be assigned to $0$ in a feasible solution of the problem that extends $\tau_{i}$ , for some $i$ , then we can remove $\mathtt{+\infty}$ from the optimization search interval. $\diamond$

Definition 6

(Attractor Trajectory $\mathcal{A}_{\varphi}$ ).

Consider the restricted $\text{OMT}(\mathcal{FP})$ problem $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ s.t. ${\varphi_{\mathsf{noNaN}}}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}\varphi\wedge\neg\mathsf{IsNaN({{\sf obj}})}$ as in Definition 5, and a triplet of inductively-defined sequences $\langle{\{{\tau_{0},\tau_{1},...,\tau_{n}}\},\{{attr_{\tau_{0}},attr_{\tau_{1}},...,attr_{\tau_{n}}}\},\{{A_{\tau_{0}},A_{\tau_{1}},...,A_{\tau_{n}}}\}}\rangle$ , where each $\tau_{k}$ is an assignment to the first $k$ most-significant bits of ${\sf obj}$ s.t. $\tau_{k}\subset\tau_{k+1}$ , $attr_{\tau_{k}}$ is its corresponding dynamic attractor and $A_{\tau_{k}}$ is its corresponding vector of attractor equalities, such that, for every $k\in[0..n-1]$ :

(i)

$\tau_{k+1}[k]=\overline{attr_{\tau_{k}}[k]}$ if ${\varphi_{\mathsf{noNaN}}}\wedge\tau_{k}\wedge A_{\tau_{k}}[k]$ is unsatisfiable,

(ii)

$\tau_{k+1}[k]=attr_{\tau_{k}}[k]$ otherwise.

Then we define the attractor trajectory $\mathcal{A}_{\varphi}$ as the vector $[A_{\tau_{0}}[0],...,A_{\tau_{n-1}}[n-1]]$ .

The attractor trajectory $\mathcal{A}_{\varphi}$ contains those attractor equalities $({\sf obj}[k]=attr_{\tau_{k}}[k])$ which are of critical importance for the decisions taken by the optimization search. Intuitively, this is the case because the value of the $k$ -th bit of obj (i.e. ${\sf obj}[k]$ ) is still undecided in $\tau_{k}$ .

Example 6

Let $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ be a restricted $\text{OMT}(\mathcal{FP})$ problem where obj is a $\mathcal{FP}$ objective, of sort (_ FP 3 5), to be minimized, as in Example 4. We consider the case in which the input formula ${\varphi_{\mathsf{noNaN}}}$ requires ${\sf obj}$ to be larger than or equal to $\nicefrac{{29}}{{2}}$ and does not impose any other constraint on the value of ${\sf obj}$ . Given the sequence of (partial) assignments $\tau_{0},...,\tau_{8}$ in Figure 1, the corresponding list of dynamic attractors and the corresponding vectors of attractor equalities, the attractor trajectory $\mathcal{A}_{\varphi}$ is equal to the vector $[{\sf obj}[0]=1,{\sf obj}[1]=0,{\sf obj}[2]=0,{\sf obj}[3]=0,{\sf obj}[4]=0,{\sf obj}[5]=0,{\sf obj}[6]=0,{\sf obj}[7]=0]$ . $\diamond$
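The construction of Definition 6 for this example can be replayed with the Python sketch below. It is an illustration only: satisfiability checks are replaced by enumeration over the $2^{8}$ bit patterns, the predicate `feasible` stands for the constraint ${\sf obj}\geq\nicefrac{29}{2}$, and all names are hypothetical.

```python
import math

EBITS, MBITS = 3, 4
N = 1 + EBITS + MBITS
BIAS = 2 ** (EBITS - 1) - 1

def value(bits):
    """Numeric value of an N-bit string (MSB first); None stands for NaN."""
    sign = -1.0 if bits[0] == "1" else 1.0
    exp, sig = int(bits[1:1 + EBITS], 2), int(bits[1 + EBITS:], 2)
    if exp == 2 ** EBITS - 1:
        return sign * math.inf if sig == 0 else None
    if exp == 0:
        return sign * (sig / 2 ** MBITS) * 2.0 ** (1 - BIAS)
    return sign * (1 + sig / 2 ** MBITS) * 2.0 ** (exp - BIAS)

def completions(prefix):
    free = N - len(prefix)
    if free == 0:
        return [prefix]
    return [prefix + format(i, f"0{free}b") for i in range(2 ** free)]

def dynamic_attractor(prefix, minimize=True):
    valid = [b for b in completions(prefix) if value(b) is not None]
    return (min if minimize else max)(valid, key=value)

def trajectory(feasible, minimize=True):
    """Return (tau_n, attractor trajectory) where the trajectory lists (k, attr bit)."""
    tau, traj = "", []
    for k in range(N):
        attr = dynamic_attractor(tau, minimize)
        traj.append((k, attr[k]))                 # equality obj[k] = attr[k]
        # Stand-in for checking phi_noNaN /\ tau_k /\ A[k] with an SMT solver.
        sat = any(feasible(value(b)) for b in completions(tau + attr[k])
                  if value(b) is not None)
        tau += attr[k] if sat else ("1" if attr[k] == "0" else "0")
    return tau, traj

tau_n, traj = trajectory(lambda v: v >= 14.5)
print(tau_n, value(tau_n))   # 01101101 14.5
print(traj)                  # obj[0]=1 followed by obj[k]=0 for k = 1..7
```

The final assignment $\tau_{8}$ encodes $\nicefrac{29}{2}$, the smallest value admitted by the constraint, and the computed equalities agree with the trajectory listed in the example.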

Lemma 2

Consider $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ , $\tau_{0},...,\tau_{n}$ , $attr_{\tau_{0}},...,attr_{\tau_{n}}$ , $A_{\tau_{0}},...,A_{\tau_{n}}$ , and $\mathcal{A}_{\varphi}$ as in Definition 6. Then $\tau_{n}$ lexicographically maximizes $\mathcal{A}_{\varphi}$ wrt. ${\varphi_{\mathsf{noNaN}}}$ .

Proof

By Definition 6, we have that, for each $k\in[0..n-1]$ ,

$(i)$

$\tau_{k+1}[k]=\overline{attr_{\tau_{k}}[k]}$ if ${\varphi_{\mathsf{noNaN}}}\wedge\tau_{k}\wedge A_{\tau_{k}}[k]$ is unsatisfiable,

$(ii)$

$\tau_{k+1}[k]=attr_{\tau_{k}}[k]$ otherwise.

By construction, $\tau_{k}=[\![\tau_{n}]\!]_{k}$ . Therefore, we can replace $\tau_{k}$ with $[\![\tau_{n}]\!]_{k}$ so that

$(i)$

$[\![\tau_{n}]\!]_{k+1}[k]=\overline{attr_{[\![\tau_{n}]\!]_{k}}[k]}$ if ${\varphi_{\mathsf{noNaN}}}\wedge[\![\tau_{n}]\!]_{k}\wedge A_{[\![\tau_{n}]\!]_{k}}[k]$ is unsatisfiable,

$(ii)$

$[\![\tau_{n}]\!]_{k+1}[k]=attr_{[\![\tau_{n}]\!]_{k}}[k]$ otherwise.

We notice the following facts. For each $k\in[0..n-1]$ , $[\![\tau_{n}]\!]_{k}\subset\tau_{n}$ . Furthermore, for each $k\in[0..n-1]$ , $\mathcal{A}_{\varphi}[k]=A_{[\![\tau_{n}]\!]_{k}}[k]$ because $\mathcal{A}_{\varphi}[k]=A_{\tau_{k}}[k]$ by the definition of attractor trajectory, and $A_{\tau_{k}}[k]=A_{[\![\tau_{n}]\!]_{k}}[k]$ by the equality $\tau_{k}=[\![\tau_{n}]\!]_{k}$ . Thus, we can replace $[\![\tau_{n}]\!]_{k+1}$ with $\tau_{n}$ and $A_{[\![\tau_{n}]\!]_{k}}[k]$ with $\mathcal{A}_{\varphi}[k]$ , as follows. For each $k\in[0..n-1]$ ,

$(i)$

$\tau_{n}[k]=\overline{attr_{[\![\tau_{n}]\!]_{k}}[k]}$ if ${\varphi_{\mathsf{noNaN}}}\wedge[\![\tau_{n}]\!]_{k}\wedge\mathcal{A}_{\varphi}[k]$ is unsatisfiable,

$(ii)$

$\tau_{n}[k]=attr_{[\![\tau_{n}]\!]_{k}}[k]$ otherwise.

Hence, $\tau_{n}$ lexicographically maximizes $\mathcal{A}_{\varphi}$ wrt. ${\varphi_{\mathsf{noNaN}}}$ . $\Box$

Finally, we make the following two observations. The first is that the sequence $\tau_{0},\tau_{1},...,\tau_{n}$ in Definition 6 can be iteratively constructed using its list of requirements, for instance, by means of a sequence of incremental calls to an SMT solver. The second, more important, observation is that $\tau_{n}$ corresponds to the assignment of values which makes ${\sf obj}$ optimal in ${\varphi_{\mathsf{noNaN}}}$ .

Using the above definitions, we show that the following fact holds.

Theorem 3.2

Let $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ , $\tau_{0},...,\tau_{n}$ , $attr_{\tau_{0}},...,attr_{\tau_{n}}$ , $A_{\tau_{0}},...,A_{\tau_{n}}$ , and $\mathcal{A}_{\varphi}$ be as in Definition 6. Then, any model $\mathcal{M}$ of ${\varphi_{\mathsf{noNaN}}}$ which lexicographically maximizes the attractor trajectory $\mathcal{A}_{\varphi}$ is an optimal solution for the $\text{OMT}(\mathcal{FP})$ problem $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ .

Proof

(We prove the case of minimization, since that of maximization is dual.)

By Lemma 2 we have that $\tau_{n}$ lexicographically maximizes $\mathcal{A}_{\varphi}$ . Let $\mathcal{M}$ be a model of ${\varphi_{\mathsf{noNaN}}}$ which lexicographically maximizes $\mathcal{A}_{\varphi}$ , and let $\mu$ be its restriction to obj. Since both $\tau_{n}$ and $\mathcal{M}$ lexicographically maximize $\mathcal{A}_{\varphi}$ , by the uniqueness of $\tau_{n}$ , we immediately notice that $\mu=\tau_{n}$ , so that $\tau_{k}=[\![\mu]\!]_{k}$ for each $k\in[0..n]$ and $\mu$ lexicographically maximizes $\mathcal{A}_{\varphi}$ .

By definition, $\mathcal{M}$ is an optimal solution for $\langle{{\varphi_{\mathsf{noNaN}}}},{{\sf obj}}\rangle$ iff there exists no other model $\mathcal{M}^{\prime}$ for it s.t. $\mathcal{M}^{\prime}({\sf obj})<\mathcal{M}({\sf obj})$ . Hence, we show by contradiction that no such $\mathcal{M}^{\prime}$ can exist.

Assume (for the sake of contradiction), that there exists a model $\mathcal{M}^{\prime}$ for ${\varphi_{\mathsf{noNaN}}}$ , s.t. $\mathcal{M}^{\prime}({\sf obj})<\mathcal{M}({\sf obj})$ , and let $\mu^{\prime}$ be the restriction of $\mathcal{M}^{\prime}$ to obj. Then there must be at least one index $i$ for which $\mu[i]\neq\mu^{\prime}[i]$ . Let $m$ be the smallest such index. Recalling that $\tau_{m}=[\![\mu]\!]_{m}$ and $\tau_{m+1}=[\![\mu]\!]_{m+1}$ , we set $\tau_{m+1}^{\prime}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}[\![\mu^{\prime}]\!]_{m+1}$ . Then, $\tau_{m}\subset\tau_{m+1}$ , $\tau_{m}\subset\tau_{m+1}^{\prime}$ , $\tau_{m+1}\neq\tau_{m+1}^{\prime}$ . In particular, $\tau_{m+1}[m]=\overline{\tau_{m+1}^{\prime}[m]}$ and therefore $\tau_{m+1}[m]=attr_{\tau_{m}}[m]$ if $\tau_{m+1}^{\prime}[m]=\overline{attr_{\tau_{m}}[m]}$ , and vice versa.

Then, we distinguish two cases.

In the first case, $\tau_{m+1}[m]=\overline{attr_{\tau_{m}}[m]}$ and $\tau_{m+1}^{\prime}[m]=attr_{\tau_{m}}[m]$ . From $\tau_{m+1}[m]=\overline{attr_{\tau_{m}}[m]}$ and the fact that $\mu$ lexicographically maximizes $\mathcal{A}_{\varphi}$ , we derive that ${\varphi_{\mathsf{noNaN}}}\wedge\tau_{m}\wedge\mathcal{A}_{\varphi}[m]$ is unsatisfiable, where $\mathcal{A}_{\varphi}[m]\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}({\sf obj}[m]=attr_{\tau_{m}}[m])$ . Since $\tau_{m}\subset\tau_{m+1}^{\prime}\subseteq\mu^{\prime}$ and $\tau_{m+1}^{\prime}[m]=attr_{\tau_{m}}[m]$ , we conclude that ${\varphi_{\mathsf{noNaN}}}\wedge\mu^{\prime}\models\bot$ , so that $\mathcal{M}^{\prime}$ cannot be a model of ${\varphi_{\mathsf{noNaN}}}$ , contradicting the initial assumption.

In the second case, $\tau_{m+1}[m]=attr_{\tau_{m}}[m]$ and $\tau_{m+1}^{\prime}[m]=\overline{attr_{\tau_{m}}[m]}$ . Therefore, by Lemma 1, for every pair of models $\mathcal{M}_{1}$ , $\mathcal{M}_{2}$ for ${\varphi_{\mathsf{noNaN}}}$ which extend respectively $\tau_{m+1}$ and $\tau_{m+1}^{\prime}$ we have that $\mathcal{M}_{1}({\sf obj})\leq\mathcal{M}_{2}({\sf obj})$ . Since $\tau_{m+1}=[\![\mu]\!]_{m+1}$ and $\tau_{m+1}^{\prime}=[\![\mu^{\prime}]\!]_{m+1}$ , it follows that $\mathcal{M}^{\prime}({\sf obj})\not<\mathcal{M}({\sf obj})$ , contradicting the initial assumption. $\Box$

4 $\text{OMT}(\mathcal{FP})$ Procedures

In this paper, we consider two approaches for dealing with $\text{OMT}(\mathcal{FP})$ : a basic linear/binary search, based on the inline OMT schema for $\text{OMT}(\mathcal{LRA}\,\cup\,\mathcal{T})$ presented in [38], and Floating-Point Optimization with Binary Search (ofp-bs), a brand-new engine inspired by the obv-bs algorithm for unsigned Bit-Vectors in [31] and by Theorem 3.2 and the related definitions in §3.2.

4.1 OMT-based Approach

The OMT-based approach for $\text{OMT}(\mathcal{FP})$ adapts the linear- and binary-search schemata for $\text{OMT}(\mathcal{LRA}\,\cup\,\mathcal{T})$ presented in [38] to deal with $\mathcal{FP}$ objectives.

In the basic linear-search schema, the optimization search is advanced by means of a sequence of linear cuts, each of which forces the OMT solver to look for a new model $\mathcal{M}^{\prime}$ which improves the value of ${\sf obj}$ wrt. the most recent model $\mathcal{M}$ . In the binary-search schema, instead, the OMT solver learns an incremental sequence of cuts which bisect the current domain of the objective function. For clarity, we recap here the essential elements of the binary-search schema presented in [37, 38]. At the beginning of the optimization search and following each update of the lower- ( $lb$ ) and upper- ( $ub$ ) bounds of ${\sf obj}$ , the OMT solver computes a pivoting value $\mathsf{pivot}\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}{\tt floor}(\rho\cdot ub+(1-\rho)\cdot lb)$ , for some value of $\rho$ (e.g. $\frac{1}{2}$ ). If $\mathsf{pivot}$ lies inside the range $]lb,ub]$ , a cut of the form $({\sf obj}<\mathsf{pivot})$ is learned. Otherwise, if $\mathsf{pivot}$ lies outside the range $]lb,ub]$ due to rounding side-effects of $\mathcal{FP}$ operations, a cut of the form $({\sf obj}<ub)$ is learned instead. If the cut is satisfiable, the upper-bound of ${\sf obj}$ is updated with the new model value of ${\sf obj}$ . Otherwise, the lower-bound is made equal to $\mathsf{pivot}$ [resp. $ub$ ]. The algorithm terminates when the search interval $[lb,ub[$ becomes empty. In general, it is reasonable to expect the binary-search schema to converge towards the optimal solution faster than the linear-search schema, because the feasible domain of a $\mathcal{FP}$ goal can comprise an exponentially large number of values (wrt. the bit-width of the cost function).
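The binary-search schema recapped above can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not in the original: the SMT call is replaced by a scan over an explicit, non-empty list of feasible values (`model_below`), Python double-precision floats play the role of $\mathcal{FP}$ values, the floor operation on the pivot is omitted, and the initial lower bound `lb` is assumed to be a finite value not larger than any feasible value.

```python
def minimize_binary(feasible, lb, rho=0.5):
    """Binary-search minimization over a finite list of feasible values."""

    def model_below(bound):
        # Stand-in for an incremental SMT call with the cut (obj < bound):
        # returns some feasible value below the bound, not necessarily the best.
        for v in feasible:
            if v < bound:
                return v
        return None

    ub = feasible[0]                      # value of obj in the first model
    while lb < ub:                        # the interval [lb, ub[ is not empty
        pivot = rho * ub + (1 - rho) * lb
        # Fall back to the cut (obj < ub) when rounding pushes the pivot outside ]lb, ub].
        cut = pivot if lb < pivot <= ub else ub
        model = model_below(cut)
        if model is not None:
            ub = model                    # satisfiable cut: tighten the upper bound
        else:
            lb = cut                      # unsatisfiable cut: raise the lower bound
    return ub

print(minimize_binary([15.5, 14.5, 20.0, 30.0], lb=-16.0))  # 14.5
```

Each iteration either strictly lowers `ub` or strictly raises `lb`, so the loop terminates on any finite floating-point domain.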

In either schema, whenever the optimization engine encounters for the first time a solution s.t. ${\sf obj}=\textsc{NaN}$ , the OMT solver learns a unit-clause of the form $\neg(\textsc{isNaN}({\sf obj}))$ so as to look for an optimal solution different from NaN (if any).

When dealing with $\mathcal{FP}$ objectives, differently from the case of $\mathcal{LRA}$ in [38], it is not necessary to implement a specialized optimization procedure within the $\mathcal{FP}$ -Solver in order to guarantee the termination of the optimization search. Indeed, such a procedure is not available when Floating-Point terms are bit-blasted into Bit-Vectors eagerly, or when the acdcl $\mathcal{FP}$ -Solver is used, because by the time the optimization procedure is called the domain interval of any $\mathcal{FP}$ term contains a singleton value. Conversely, such a minimization procedure could be envisaged when the OMT solver uses a lazy $\mathcal{FP}$ -Solver as back-end, so as to speed up the convergence towards the optimal solution. (Footnote 3: Currently, there is no such specialized optimization procedure embedded within the lazy $\mathcal{FP}$ -Solver of OptiMathSAT, so we won’t describe this approach any further.)

4.2 Floating-Point Optimization with Binary Search

The Floating-Point Optimization with Binary Search algorithm is a new engine for $\text{OMT}(\mathcal{FP})$ which is inspired by the obv-bs algorithm for $\text{OMT}(\mathcal{BV})$ [31] and is a direct implementation of Definition 6 and Theorem 3.2.

The optimization search tries to lexicographically maximize an implicit attractor trajectory vector $\mathcal{A}_{\varphi}$ , which is incrementally derived from the current value of the dynamic attractor. The raw value of the dynamic attractor’s bits drive the optimization search towards the direction of maximum gain at any given point in time, without disrupting any decision that has been already made. The dynamic attractor is incrementally updated along the search, based on the outcome of the previous rounds of the optimization search. At each round, one bit of the objective function is assigned its final value. The first round decides the sign, the next batch of rounds decides the exponent and the remaining rounds decide the fine-grained details of the significand.

The pseudo-code of ofp-bs is shown in Figure 2. The arguments of the algorithm are the input formula $\varphi$ and the $\mathcal{FP}$ objective ${\sf obj}$ , where ${\sf obj}$ is a $\mathcal{FP}$ variable with $ebits$ bits in the exponent, $sbits-1$ in the significand and $n\stackrel{{\scriptstyle\text{\tiny def}}}{{=}}ebits+sbits$ bits overall.

The procedure starts by checking whether the input formula $\varphi$ is satisfiable and immediately terminates if that is not the case (lines $1$ - $3$ ). If ${\sf obj}=\textsc{NaN}$ in $\mathcal{M}$ then the procedure checks whether there exists a model $\mathcal{M}^{\prime}$ for $\varphi\wedge\neg\mathsf{IsNaN({{\sf obj}})}$ (lines $4$ - $5$ ). If this is not the case, the procedure terminates immediately and returns the pair $\langle{\textsc{sat}},{\mathcal{M}}\rangle$ (line $7$ ). Otherwise, the model $\mathcal{M}$ is updated with the new model $\mathcal{M}^{\prime}$ , and $\varphi$ is permanently extended with the constraint $\neg\mathsf{IsNaN({{\sf obj}})}$ (lines $9$ - $10$ ).

At this point, the procedure initializes the value of the dynamic attractor by invoking an external function update_dynamic_attractor() with the empty assignment $\tau$ as parameter, so that the returned value is equal to $\mathtt{-\infty}$ when minimizing and $\mathtt{+\infty}$ when maximizing (lines $11$ - $12$ ). Then, the execution moves to the section of code implementing the core part of the ofp-bs algorithm (lines $15$ - $28$ ), which consists of a loop over the bits of obj, starting from the MSB ${\sf obj}[0]$ down to the LSB ${\sf obj}[n-1]$ .

Inside this loop, ofp-bs first checks whether the value of ${\sf obj}[i]$ in $\mathcal{M}$ matches the $i$ -th bit of the (current) dynamic attractor $attr_{\tau}$ . If this is the case, then the $i$ -th bit is already set to its “best” value in $\mathcal{M}$ . Thus, the assignment $\tau$ is extended so as to permanently set ${\sf obj}[i]=attr_{\tau}[i]$ (line $16$ ), and the optimization search moves to the next iteration of the loop. If instead ${\sf obj}[i]\neq attr_{\tau}[i]$ in $\mathcal{M}$ , we need to verify whether the value of the objective function in $\mathcal{M}$ can be improved by forcing the $i$ -th bit of ${\sf obj}$ equal to the $i$ -th bit of the dynamic attractor. To do so, we incrementally invoke the underlying SMT solver, this time checking the satisfiability of $\varphi$ under the list of assumptions $\tau\cup\{{\sf obj}[i]=attr_{\tau}[i]\}$ (line $22$ ). If the SMT solver returns sat, then the value of the objective function has been successfully improved. Hence, $\tau$ is extended with an assignment setting ${\sf obj}[i]$ equal to $attr_{\tau}[i]$ , and $\mathcal{M}$ is replaced with the new model $\mathcal{M}^{\prime}$ (lines $23$ - $25$ ). Otherwise, it is not possible to improve the objective function by toggling the value of ${\sf obj}[i]$ , and $\tau$ is extended so as to permanently set ${\sf obj}[i]\neq attr_{\tau}[i]$ (line $27$ ). At this point, there is a mismatch between the value of the first $i+1$ bits of obj in $\mathcal{M}$ , corresponding to the assignment $\tau$ , and those of the current dynamic attractor. This mismatch is resolved by calling the function update_dynamic_attractor() with the updated assignment $\tau$ as parameter (line $28$ ). In either case, the execution moves to the next iteration of the loop.

After exactly $n$ iterations of the loop, the optimization search terminates with the pair $\langle{\textsc{sat}},{\mathcal{M}}\rangle$ , where $\mathcal{M}$ is the optimum model of the given $\text{OMT}(\mathcal{FP}\,\cup\,\mathcal{T})$ instance. The ofp-bs algorithm requires at most $n+2$ incremental calls to an underlying $\text{SMT}(\mathcal{FP})$ solver. The test in rows 17-18 allows for saving lots of such SMT calls when the current model already assigns ${\sf obj}[i]$ to its corresponding value in the attractor.
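The main loop described above can be mimicked on the sort (_ FP 3 5) with the following Python sketch. It is not the pseudo-code of Figure 2 but a simplified illustration: the incremental SMT call is replaced by a scan over an explicit list of feasible bit patterns (`solve`), the dynamic attractor is recomputed by enumeration rather than by update_dynamic_attractor(), NaN patterns are simply filtered out instead of being handled by a separate check, and all names are hypothetical.

```python
import math

EBITS, MBITS = 3, 4
N = 1 + EBITS + MBITS
BIAS = 2 ** (EBITS - 1) - 1

def value(bits):
    """Numeric value of an N-bit string (MSB first); None stands for NaN."""
    sign = -1.0 if bits[0] == "1" else 1.0
    exp, sig = int(bits[1:1 + EBITS], 2), int(bits[1 + EBITS:], 2)
    if exp == 2 ** EBITS - 1:
        return sign * math.inf if sig == 0 else None
    if exp == 0:
        return sign * (sig / 2 ** MBITS) * 2.0 ** (1 - BIAS)
    return sign * (1 + sig / 2 ** MBITS) * 2.0 ** (exp - BIAS)

def dynamic_attractor(prefix, minimize=True):
    free = N - len(prefix)
    cands = [prefix + format(i, f"0{free}b") for i in range(2 ** free)] if free else [prefix]
    valid = [b for b in cands if value(b) is not None]
    return (min if minimize else max)(valid, key=value)

def ofp_bs(feasible_patterns, minimize=True):
    """Return (optimal bit pattern or None, number of solver calls)."""
    calls = 0

    def solve(prefix):
        # Stand-in for an incremental SMT call under the assumptions in `prefix`.
        nonlocal calls
        calls += 1
        for b in feasible_patterns:
            if b.startswith(prefix) and value(b) is not None:
                return b
        return None

    model = solve("")
    if model is None:
        return None, calls
    tau = ""
    attr = dynamic_attractor(tau, minimize)
    for i in range(N):
        if model[i] == attr[i]:           # bit already at its best value
            tau += attr[i]
            continue
        better = solve(tau + attr[i])     # try to force obj[i] = attr[i]
        if better is not None:
            tau += attr[i]
            model = better
        else:
            tau += model[i]               # obj[i] != attr[i] is now permanent
            attr = dynamic_attractor(tau, minimize)
    return model, calls

# Feasible set: obj >= 29/2, listed so that the first model found is +oo.
feasible = ["01110000", "01101111", "01101110", "01101101"]
best, calls = ofp_bs(feasible)
print(best, value(best), calls)   # 01101101 14.5 9
```

On this instance the sketch performs 9 solver calls, within the bound of $n+2=10$ stated above.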

The function update_dynamic_attractor() takes as input $\tau$ , a (partial) assignment over the $k$ most-significant bits of ${\sf obj}$ . When ${\sf obj}$ is minimized, it essentially works as follows. (Footnote 4: The implementation of update_dynamic_attractor() is dual when ${\sf obj}$ is maximized.) If $\tau=\emptyset$ , then nothing is known about the solution of the problem, so $\mathtt{-\infty}$ is returned. Otherwise, the procedure must compute the smallest $\mathcal{FP}$ value different from NaN (if any) which extends $\tau$ . Since $\tau\neq\emptyset$ then we know that the sign of the objective function has been permanently decided in $\tau$ . If ${\sf obj}[0]=0$ in $\tau$ , i.e. ${\sf obj}$ must be positive, the procedure must return the smallest positive $\mathcal{FP}$ value admitted by $\tau$ . Hence, we extend $\tau$ with $\bigcup_{i=|\tau|}^{i=n-1}{\sf obj}[i]=0$ and return the corresponding $\mathcal{FP}$ value. If ${\sf obj}[0]=1$ in $\tau$ , i.e. ${\sf obj}$ must be negative, the procedure must return the negative $\mathcal{FP}$ value of largest magnitude admitted by $\tau$ . We first check whether there exists a bit in the exponent of ${\sf obj}$ which is assigned to $0$ in $\tau$ . If that is the case, we extend $\tau$ with $\bigcup_{i=|\tau|}^{i=n-1}{\sf obj}[i]=1$ and return the corresponding $\mathcal{FP}$ value. Otherwise, the procedure returns the value $\mathtt{-\infty}$ , which is still a viable extension of $\tau$ .
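The rule just described for the minimization case can be written down directly on bit strings. The Python sketch below is an illustration for the sort (_ FP 3 5); the MSB-first string encoding of $\tau$ and the helper `brute_force_attractor`, which is used only to cross-check the rule, are assumptions of this example.

```python
import math

EBITS, MBITS = 3, 4
N = 1 + EBITS + MBITS
BIAS = 2 ** (EBITS - 1) - 1
MINUS_INF = "1" + "1" * EBITS + "0" * MBITS

def update_dynamic_attractor(tau):
    """Minimization case; tau is an MSB-first bit string admitting a non-NaN extension."""
    if tau == "":
        return MINUS_INF                       # nothing is known yet
    if tau[0] == "0":
        return tau + "0" * (N - len(tau))      # smallest positive extension
    if "0" in tau[1:1 + EBITS]:
        return tau + "1" * (N - len(tau))      # exponent cannot be all ones: no NaN risk
    return MINUS_INF                           # -oo is still a viable extension

def value(bits):
    """Numeric value of an N-bit string (MSB first); None stands for NaN."""
    sign = -1.0 if bits[0] == "1" else 1.0
    exp, sig = int(bits[1:1 + EBITS], 2), int(bits[1 + EBITS:], 2)
    if exp == 2 ** EBITS - 1:
        return sign * math.inf if sig == 0 else None
    if exp == 0:
        return sign * (sig / 2 ** MBITS) * 2.0 ** (1 - BIAS)
    return sign * (1 + sig / 2 ** MBITS) * 2.0 ** (exp - BIAS)

def brute_force_attractor(tau):
    """Smallest non-NaN extension of tau by enumeration, or None if there is none."""
    free = N - len(tau)
    cands = [tau + format(i, f"0{free}b") for i in range(2 ** free)] if free else [tau]
    valid = [b for b in cands if value(b) is not None]
    return min(valid, key=value) if valid else None

# Cross-check the rule against enumeration on every prefix with a non-NaN extension.
agree = True
for k in range(N + 1):
    for i in range(2 ** k):
        tau = format(i, f"0{k}b") if k else ""
        expected = brute_force_attractor(tau)
        if expected is not None and update_dynamic_attractor(tau) != expected:
            agree = False
print(agree)
```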

4.3 Search Enhancements

Given a $\mathcal{FP}$ value $attr$ and a $\mathcal{FP}$ goal ${\sf obj}$ , (a combination of) the following techniques can be used to adjust the behavior of the optimization search, similarly to what has been proposed for the case of $\text{OMT}(\mathcal{BV})$ by Nadel et al. in [31].

•

branching preference: the bits of the $\mathcal{FP}$ objective obj are marked, inside the OMT solver, as preferred variables for branching starting from the MSB down to the LSB. This ensures that conflicts involving the value of the objective function are handled as early as possible, possibly reducing the amount of work that needs to be redone after each back-jump.

•

polarity initialization: the phase-saving value of each ${\sf obj}[i]$ is initialized with the value of $attr[i]$ . This encourages the OMT solver to assign the bits of ${\sf obj}$ so as to resemble the bits of $attr$ , thus possibly speeding up the convergence towards the optimal value.

In the case of the basic OMT schema described in Section §4.1, the effectiveness of either technique depends on the initial choice for $attr$ . In the lucky case, the value of $attr$ pulls the optimization search in the right direction and speeds up the search. In the unlucky case, when $attr$ pulls in the wrong direction, there is no visible effect or an overall slowdown. For instance, in the case of the linear-search optimization schema, enabling both options with an unlucky choice of $attr$ can cause the OMT solver to start the search from the furthest possible point from the optimal solution, and thus enumerate an exponential number of intermediate solutions. Naturally, the OMT-based optimization search algorithm is still guaranteed to terminate even in the worst-case scenario, but the unpredictable performance makes using either technique a generally unsuitable option in practice.

In the case of the ofp-bs algorithm described in Section §4.2, we use the latest value of the dynamic attractor $attr_{\tau}$ for both the branching preference (lines $11$ and $18$ of Figure 2) and the polarity initialization (rows $12$ and $19$ of Figure 2) techniques. We observe that the value of every bit in the dynamic attractor can change after the sign of the objective function has been decided. Furthermore, the value of all the significand’s bits in the dynamic attractor can also change during the process of determining the optimal exponent value of the objective function (see, e.g., Example 4). As a consequence, if the OMT solver applies either enhancement before the correct improving direction is known, this may cause the underlying OMT engine to advance the search starting from a sub-optimal set of initial decisions. Enabling both enhancements at the same time could make things even worse. In order to mitigate this issue, we have designed a variant of our optimization-search approach which does not apply either enhancement on those bits of the objective function for which the best improving direction is not yet known. We have called this variant safe bits restriction.

5 Experimental Evaluation

We assess the performance of OptiMathSAT (v. 1.6.2) on a set of $\text{OMT}(\mathcal{FP})$ formulas that have been automatically generated using the $\text{SMT}(\mathcal{FP})$ benchmark-set of [3]. The formulas, the results and the scripts necessary to reproduce these results are made publicly available and can be downloaded from [1].

Experiment Setup.

This experiment has been performed on an i7-6500U 2.50GHz Intel Quad-Core machine with $16$ GB of RAM, running Ubuntu Linux $17.10$ . For each formula being tested we used a timeout of $600$ seconds. The $\text{OMT}(\mathcal{FP})$ instances used in this experiment have been automatically generated starting from the satisfiable formulas included in the $\text{SMT}(\mathcal{FP})$ benchmark-set of [3]. We did not consider any of the unsatisfiable instances that are present in the remote repository.

For each of the original $\text{SMT}(\mathcal{FP})$ formulas we applied the following transformations. First, we either relaxed or removed some of the constraints in the original problem, so as to broaden the set of feasible solutions. This step is necessary because the majority of the original $\text{SMT}(\mathcal{FP})$ formulas admits only one solution. However, this is not necessarily the ideal situation when comparing different optimization approaches. Second, for each $\mathcal{FP}$ variable $v$ appearing inside a $\text{SMT}(\mathcal{FP})$ problem we generated a pair of $\text{OMT}(\mathcal{FP})$ instances, one for the minimization and another for the maximization of $v$ . At the end of this step, we obtained $39536$ $\text{OMT}(\mathcal{FP})$ formulas. Third, we randomly selected up to $300$ $\text{OMT}(\mathcal{FP})$ instances from each of the five groups of problems in the $\text{OMT}(\mathcal{FP})$ benchmark-set. This filtering step yielded a total of $1120$ SMT-LIBv2 formulas.

We consider two OMT-based baseline configurations, OptiMathSAT(omt+lin) and OptiMathSAT(omt+bin), that run the linear- and the binary-search respectively. These configurations have been tested using both the eager and the lazy $\mathcal{FP}$ approaches. The third baseline approach, named OptiMathSAT(eager+obv-bs), is based on a reduction of the $\text{OMT}(\mathcal{FP})$ problem to $\text{OMT}(\mathcal{BV})$ and it uses OptiMathSAT’s implementation of the obv-bs engine presented by Nadel et al. in [31] (the binaries of the original $\text{OMT}(\mathcal{BV})$ tools presented in [31] are not publicly available). For this test, we have generated an $\text{OMT}(\mathcal{BV})$ benchmark-set using a $\mathcal{BV}$ encoding that mimics the essential aspects of the ofp-bs algorithm described in Section 4.2.

We compared these baseline approaches with a configuration using the ofp-bs algorithm and the eager $\mathcal{FP}$ approach, namely OptiMathSAT(eager+ofp-bs).

We have separately tested the effect of enabling the branching preference (bp), the polarity initialization (pi) and the safe bits restriction (so) enhancements described in Section 4.3, whenever these options were supported by the given configuration.

Last, in order to assess the significance of the optimization problems used in this experiment, we have collected the run-time statistics of OptiMathSAT on the SMT formulas obtained by stripping the objective function from each OMT instance. We named this configuration OptiMathSAT(eager+smt).

We have not included other tools in our experiment because we are not aware of any other $\text{OMT}(\mathcal{FP})$ solver. For all problem instances, we verified the correctness of the optimal solution found by each configuration with an SMT solver (MathSAT5). When terminating, all tools returned the same optimum value. In order to perform this cross-check as efficiently as possible, we enabled model generation on every configuration so that the optimum model could be extracted and verified.
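One plausible shape for such a cross-check, not necessarily the one used in the experiment, is a pair of SMT queries: the optimum must be reachable, and no strictly better value may exist. The sketch below only builds the two scripts as text. The function name is hypothetical, and the check is deliberately simplified: it ignores NaN, and because `fp.lt` and `fp.gt` treat $+0$ and $-0$ as equal it does not distinguish the two zeros.

```python
def verification_queries(assertions, var, opt_term, minimize=True):
    """Build two SMT-LIB scripts checking a claimed optimum.

    The first script is expected to be sat (the optimum is reachable),
    the second is expected to be unsat (no better value exists).
    """
    better = "fp.lt" if minimize else "fp.gt"
    body = assertions.rstrip()
    reachable = f"{body}\n(assert (= {var} {opt_term}))\n(check-sat)\n"
    improvable = f"{body}\n(assert ({better} {var} {opt_term}))\n(check-sat)\n"
    return reachable, improvable
```

Here `assertions` is the original formula without its objective, and `opt_term` is the optimum written as an SMT-LIB term, for example `(fp #b0 #b110 #b1111)`.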

Experiment Results.

The results of this experiment are listed in Table 2. Figure 4 depicts the log-scale cactus plot of the same data, for a visual comparison among the different configurations. In addition, Figures 5, 6 and 7 show a selection of relevant pairwise comparisons among various OptiMathSAT configurations. Figure 5 focuses on variants of the OMT-based linear-search approach, Figure 6 depicts variants of the OMT-based binary-search approach, whereas Figure 7 focuses on the ofp-bs engine.
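For readers unfamiliar with cactus plots, the data points for one configuration can be derived from its raw run-times as in the sketch below. The function name is hypothetical, and the convention assumed here is that the $k$-th point reports the run-time of the $k$-th fastest solved instance (some authors plot cumulative time instead). The default timeout matches the $600$ seconds of the experiment setup.

```python
def cactus_points(runtimes, timeout=600.0):
    """Return [(number_of_solved_instances, time)] for one configuration.

    runtimes: per-instance run-times in seconds; None marks a run that
    did not terminate, and values >= timeout count as unsolved.
    """
    solved = sorted(t for t in runtimes if t is not None and t < timeout)
    return [(rank + 1, t) for rank, t in enumerate(solved)]
```

Plotting these points with a logarithmic time axis, one curve per configuration, shows how many instances each configuration solves within a given time budget.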

As far as OMT-based linear-search optimization is concerned, we observe that OptiMathSAT performs best when no enhancement is enabled. In particular, the empirical evidence suggests that enabling branching preference significantly increases the number of timeouts, generally deteriorating the performance (plot $1A$ in Fig. 5). Enabling only polarity initialization does not result in an appreciable change in the running time of the solver (plot $1B$ in Fig. 5). In contrast, enabling both enhancements at the same time occasionally yields a small improvement of the search time (plot $2A$ in Fig. 5), but it generally worsens the performance and results in a drastic increase in the number of timeouts (Table 2). We justify these results as follows. First, when only polarity initialization is used, the phase-saving value that is being set by OptiMathSAT does not really matter because the optimization search is dominated by the structure of the formula itself rather than by the bits of the $\mathcal{FP}$ objective. Second, when polarity initialization is used on top of branching preference, there is an even more drastic decrease in performance because the initial phase-saving value that is statically assigned by the OMT solver to the bits of the $\mathcal{FP}$ objective cannot be expected to be “good enough” for every situation.

In fact, as illustrated in Example 4, the initial phase-saving can be misleading and force the OMT solver, when running in linear-search, to explore an exponential number of intermediate satisfiable solutions.

In the case of the OMT-based binary-search optimization approach, we observe that it solves more formulas than linear-search and it generally appears to be faster (plot $3B$ in Fig. 5). Overall, polarity initialization does not seem to be beneficial, whereas enabling branching preference increases the number of formulas solved within the timeout. This behavior is different from the linear-search approach, and we conjecture that it is due to the fact that, with the OMT-based binary-search approach, branching over the bits of the objective function can reveal in advance any (partial) assignment to the bits of the objective function that is inconsistent with respect to the pivoting cuts learned by the optimization engine.

Using the lazy $\mathcal{FP}$ engine results in fewer formulas being solved, although a significant number of these benchmarks is solved faster than with any other configuration (over $90$ instances, for both configurations).

The OptiMathSAT(eager+obv-bs) configuration is able to solve $1013$ formulas within the timeout, showing that $\text{OMT}(\mathcal{FP})$ can be reduced to $\text{OMT}(\mathcal{BV})$ effectively, and that, on the given benchmark-set, the performance of this approach is comparable with that of the best $\text{OMT}(\mathcal{FP})$ configurations being tested.

Overall, the best performance is obtained by using the ofp-bs engine, with up to $1019$ benchmark-set instances being solved by the OptiMathSAT(eager+ofp-bs+pi) configuration.

In plot $2B$ of Figures 5 and 6, we show the pairwise comparison of the best ofp-bs configuration with the best OMT-based run.

Similarly to the case of OMT-based optimization with linear-search, we observe that enabling branching preference generally makes the performance worse (plot $1A$ in Fig. 7). Instead, when polarity initialization is used we observe a general performance improvement that results not only in an increase in the number of formulas being solved within the timeout, but also in a noticeable reduction of the solving time as a whole. This is in contrast with the case of OMT-based optimization, and it can be explained by the fact that ofp-bs uses an internal heuristic function to dynamically determine and update the most appropriate phase-saving value for the bits of the objective function. An equally important role is played by the safe bits restriction, which limits the effects of branching preference and polarity initialization to only certain bits of the dynamic attractor.

As illustrated by the plots in the second and third rows of Figure 7 and by the data in Table 2, this feature is particularly effective when used in combination with branching preference.

The results of OptiMathSAT over the SMT-only version of the benchmark-set are reported in Table 2 and in the scatter-plot $3B$ in Fig. 6, and show that for a large number of instances the OMT problem is considerably harder than its SMT-only version. There are a few exceptions to this rule, which we ascribe to the fact that the removal of the objective function alters the internal stack of formulas, and this can have unpredictable consequences on the behavior of various internal heuristics that depend on it. A solution can be found in a shorter amount of time when the sequence of (heuristic) choices is compatible with its assignment and it requires little back-tracking effort.

6 Conclusions and Future Work

We have presented for the first time OMT procedures (for signed Bit-Vectors and) Floating-Point numbers, based on the novel notions of attractor, dynamic attractor and attractor trajectory, which we have implemented in OptiMathSAT and tested on modified problems from SMT-LIB.

Ongoing research involves implementing our ofp-bs procedure on top of the ACDCL $\text{SMT}(\mathcal{FP})$ procedure, which is not immediate to do efficiently because the latter approach does not allow directly accessing and setting the single bits of the objective (since $\mathcal{BV}$ and $\mathcal{FP}$ are not signature-disjoint). Future research involves experimenting with the new OMT procedure directly on problems coming from bit-precise SW and HW verification, produced, e.g., by the nuXmv model checker [2].

Bibliography

[1] http://disi.unitn.it/trentin/resources/floatingpoint_test.tar.gz.
[2] nuXmv. https://nuxmv.fbk.eu.
[3] SMT-LIBv2. www.smtlib.cs.uiowa.edu/.
[4] IEEE standard 754, 2008. http://grouper.ieee.org/groups/754/.
[5] H. F. Albuquerque, R. F. Araujo, I. V. de Bessa, L. C. Cordeiro, and E. B. de Lima Filho. OptCE: A Counterexample-Guided Inductive Optimization Solver. In SBMF, volume 10623 of Lecture Notes in Computer Science, pages 125–141. Springer, 2017.
[6] R. F. Araujo, H. F. Albuquerque, I. V. de Bessa, L. C. Cordeiro, and J. E. C. Filho. Counterexample guided inductive optimization based on satisfiability modulo theories. Sci. Comput. Program., 165:3–23, 2018.
[7] R. Araújo, I. Bessa, L. C. Cordeiro, and J. E. C. Filho. SMT-based Verification Applied to Non-convex Optimization Problems. In 2016 VI Brazilian Symposium on Computing Systems Engineering (SBESC), pages 1–8, Nov 2016.
[8] N. Bjorner and A.-D. Phan. $\nu Z$ - Maximal Satisfaction with Z3. In Proc International Symposium on Symbolic Computation in Software Science, Gammart, Tunisia, December 2014. EasyChair Proceedings in Computing (EPiC).


4 $\text{OMT}(\mathcal{FP})$ Procedures