TL;DR
This paper introduces SynNNF, a new normal form for Boolean formulas that enables polynomial-time synthesis and quantification, improving the efficiency of Boolean functional synthesis and solving benchmarks beyond current tools.
Contribution
The paper presents SynNNF, a novel normal form that guarantees efficient synthesis and quantification, and proposes an algorithm to convert CNF formulas into SynNNF for practical synthesis.
Findings
SynNNF can be more succinct than existing normal forms.
The conversion algorithm enables solving complex benchmarks.
Prototype implementation outperforms state-of-the-art tools.
Abstract
Given a Boolean formula F(X,Y), where X is a vector of outputs and Y is a vector of inputs, the Boolean functional synthesis problem requires us to compute a Skolem function vector G(Y)for X such that F(G(Y),Y) holds whenever \exists X F(X,Y) holds. In this paper, we investigate the relation between the representation of the specification F(X,Y) and the complexity of synthesis. We introduce a new normal form for Boolean formulas, called SynNNF, that guarantees polynomial-time synthesis and also polynomial-time existential quantification for some order of quantification of variables. We show that several normal forms studied in the knowledge compilation literature are subsumed by SynNNF, although SynNNFcan be super-polynomially more succinct than them. Motivated by these results, we propose an algorithm to convert a specification in CNF to SynNNF, with the intent of solving the Boolean…
| Benchmarks | Compiled By | BDD | Total | ||
|---|---|---|---|---|---|
| (Total) | Stage I | Stage II | Total | compilation | in SynNNF |
| QBFEval (402) | 103 | 82 | 185 | 153 | 283 |
| FA.QD (6) | 0 | 6 | 6 | 6 | 6 |
| Bench | vs Cadet | vs bfss | |||
| ( | |||||
| mark | Cadet | bfss | bfss) | ||
| QBFEval | 77 | 105 | 83 | 78 | 74 |
| FA.QD | 2 | 0 | 3 | 0 | 2 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Knowledge Compilation for Boolean Functional Synthesis· youtube
Knowledge Compilation for Boolean Functional Synthesis
S. Akshay, Jatin Arora, Supratik Chakraborty, S. Krishna, Divya Raghunathan and Shetal Shah
Indian Institute of Technology Bombay, Mumbai, India
Abstract
Given a Boolean formula , where is a vector of outputs and is a vector of inputs, the Boolean functional synthesis problem requires us to compute a Skolem function vector for such that holds whenever holds. In this paper, we investigate the relation between the representation of the specification and the complexity of synthesis. We introduce a new normal form for Boolean formulas, called SynNNF, that guarantees polynomial-time synthesis and also polynomial-time existential quantification for some order of quantification of variables. We show that several normal forms studied in the knowledge compilation literature are subsumed by SynNNF, although SynNNF can be super-polynomially more succinct than them. Motivated by these results, we propose an algorithm to convert a specification in to SynNNF, with the intent of solving the Boolean functional synthesis problem. Experiments with a prototype implementation show that this approach solves several benchmarks beyond the reach of state-of-the-art tools.
I Introduction
Boolean functional synthesis is the problem of synthesizing outputs as Boolean functions of inputs, while satisfying a declarative relational specification between inputs and outputs. Also called Skolem function synthesis, this problem has numerous applications including certified QBF solving, reactive control synthesis, circuit and program repair and the like. While variants of the problem have been studied since long [17, 3], there has been significant recent interest in designing practically efficient algorithms for Boolean functional synthesis. The resulting breed of algorithms [14, 23, 22, 11, 25, 18, 13, 2, 1, 15, 7, 24] have been empirically shown to work well on large collections of benchmarks. Nevertheless, there are not-so-large examples that are currently not solvable within reasonable resources by any known algorithm. To make matters worse, it is not even fully understood what properties of a Boolean relational specification or of its representation make it amenable to efficient synthesis. In this paper, we take a step towards answering this question. Specifically, we propose a new sub-class of negation normal form called SynNNF, such that every Boolean relational specification in SynNNF admits polynomial-time synthesis. Furthermore, a Boolean relational specification admits polynomial-time synthesis (by any algorithm) if and only if there exists a polynomial-sized refinement of the specification in SynNNF.
To illustrate the hardness of Boolean functional synthesis, consider the specification , where , and denotes multiplication of -bit unsigned integers. This specification asserts that , viewed as a -bit unsigned integer, is the product of and , each viewed as an -bit unsigned integer different from . The specification can be easily represented as a circuit of AND, OR, NOT gates with gates. However, synthesizing and as functions of requires us to obtain a circuit that factorizes a -bit unsigned integer into factors different from , whenever possible. It is a long-standing open question whether such a circuit of size polynomial in exists. Thus, although the relational specification is succinctly representable, the outputs expressed as functions of the inputs may not have any known succinct representation.
It was recently shown [1] that unless some long-standing complexity theoretic conjectures are falsified, Boolean functional synthesis must necessarily require super-polynomial (or even exponential) space and time. In the same work [1], it was also shown that if a specification is represented in weak decomposable negation normal form wDNNF, synthesis can be accomplished in time polynomial in the size of the specification. While this was a first step towards identifying a normal form with the explicit objective of polynomial-time synthesis, experimental results in [1] indicate that wDNNF doesn’t really characterize specifications that admit efficient synthesis. Specifically, experiments in [1] showed that a polynomial-time algorithm intended for synthesis from wDNNF specifications ends up solving the synthesis problem for a large class of specifications not in wDNNF. This motivates us to ask if there exists a weaker (than wDNNF) sub-class of Boolean relational specifications that admit polynomial-time synthesis.
We answer the above question affirmatively in this paper, the polynomial dependence being quadratic in the number of outputs and the size of the specification. En route, we also show that the weaker normal form, viz. SynNNF, admits polynomial-time existential quantifier elimination of a set of variables for some (not all) order of quantification of variables. Applications of such quantifier elimination abound in practice, viz. image computation in symbolic model checking, synthesis of QBF certificates, computation of interpolants etc. Note that ensuring efficient quantifier elimination for some ordering of variables is simpler than ensuring efficient quantifier elimination for all orderings of variables – the latter having been addressed by normal forms like DNNF [9].
Our primary contributions can be summarized as follows:
- •
We present a new sub-class of negation normal form, called SynNNF, that admits polynomial-time synthesis and quantifier elimination for a set of variables.
- •
We show that SynNNF is super-polynomially (in some cases, exponentially) more succinct than several other sub-classes studied in the literature (viz. wDNNF, dDNNF, DNNF, FBDD, ROBDD), unless some long-standing complexity theoretic conjectures are falsified.
- •
We show that by suitably weakening SynNNF, we can precisely characterize the class of Boolean specifications that admit polynomial-time synthesis by a simple algorithm originally proposed in [1].
- •
We define a natural notion of refinement of specifications w.r.t synthesis and show that every specification that admits polynomial-time synthesis necessarily has a polynomial-sized refinement that is in SynNNF.
- •
We present a novel algorithm for compiling a Boolean relational specification in to a refined specification in SynNNF. We call this knowledge compilation for synthesis and quantifier elimination.
- •
Finally, we present experimental results that show that synthesis by compiling to SynNNF solves a large set of benchmarks, including several benchmarks beyond the reach of existing tools.
Related Work: The literature on knowledge compilation of Boolean functions is rich and extensive [6, 9, 20, 10]. While existential quantification or forgetting of propositions has been studied in [16, 10], neither Boolean functional synthesis nor existential quantification for some (not all) ordering of variables has received attention in earlier work on knowledge compilation. Sub-classes of negation normal forms like DNNF and other variants [10] admit efficient existential quantification for all orders in which variables are quantified. However, if we are interested in only the result of existentially quantifying a given set of variables, these forms can be unnecessarily restrictive and exponentially larger. Recent work on Boolean functional synthesis [13, 14, 18, 24, 11, 2, 1, 8] has focused more on algorithms to directly synthesize outputs as functions of inputs. Some of these algorithms (viz. [11, 1, 8]) exploit properties of specific input representations for optimizing the synthesis process. This has led to the articulation of sufficient conditions on representation of specifications for efficient synthesis. For example, [15] suggested using input-first ROBDDs for efficient synthesis, and a quadratic-time algorithm for synthesis from input-first ROBDDs was presented in [11]. This result was subsequently generalized in [1], where it was shown that specifications in wDNNF (which strictly subsumes ROBDDs) suffice to give a quadratic-time algorithm for synthesis. As we show later, wDNNF can itself be generalized to SynNNF. In another line of investigation, it was shown [8] that if a specification is decomposed into an input-part and an output-part, then synthesis can be achieved in time linear in the size of the specification and , where is the smaller of the count of maximal falsifiable subsets (MFS) of the input-part and the count of maximal satisfiable subsets (MSS) of the output-part. However, this does not yield an algorithm whose running time is polynomial in the size of the representation of .
The paper is organized as follows. After preliminaries, we present the new normal form SynNNF and its properties in Section III. In Section IV, we introduce the idea of refinement, which allows us to simplify the specification. In Section V, we describe an algorithm to compile any function into our normal form, followed in Section VI by experimental results, before ending with a conclusion. Proofs of lemmas and theorems are mostly deferred to the appendix.
II Preliminaries and notations
A Boolean formula on variables is a mapping . The set of variables is called the support of the formula, and denoted . We normally use to denote the sequence . For notational convenience, we will also use to denote a set of variables, when there is no confusion. A satisfying assignment or model of is a mapping of variables in to such that evaluates to under this assignment. If is a model of , we write and use to denote the value assigned to by . If is a subsequence of , we use to denote the projection of on , i.e. , where . We use to denote the conjunction of literals (i.e. variables or their negation) corresponding to . For example, if assigns to and [math] to and , then .
II-1 Negation normal form ()
This is the class of Boolean formulas in which (i) the only operators used are conjunction (), disjunction () and negation (), and (ii) negation is applied only to variables. Every Boolean formula can be converted to a semantically equivalent formula. Moreover, this conversion can be done in linear time for representations like s, ROBDDs, Boolean circuits etc.
II-2 Unate formulas
Let (resp. ) denote the positive (resp. negative) cofactor of with respect to . Then, is positive unate in iff . Similarly, is negative unate in iff . A literal is said to be pure in an formula iff has at least one instance of but no instance of . If (resp. ) is pure in , then is positive (resp. negative) unate in .
II-3 Independent support and functionally defined variables
A subsequence of is said to be an independent support of iff every pair of satisfying assignments of that agree on the assignment of variables in also agree on the assignment of all variables in . Variables not in are said to be functionally defined by the independent support. Effectively, the assignment of variables in uniquely determine that of functionally defined variables, when satisfying . encodings of Boolean functions originally specified as circuits, ROBDDs, s etc. often use Tseitin encoding [26], which introduces a large number of functionally defined variables.
II-4 Boolean functional synthesis
Unless mentioned otherwise, we use to denote a sequence of Boolean outputs, and to denote a sequence of Boolean inputs. The Boolean functional synthesis problem, henceforth denoted , asks: given a Boolean formula specifying a relation between inputs and outputs , determine functions such that holds whenever holds. Thus, must be a tautology. The function is called a Skolem function for in , and is called a Skolem function vector for in .
For , we use to denote the subsequence . If , we sometimes use interchangeably with for notational convenience. Let denote . It has been argued in [14, 11, 2, 12] that the problem for can be solved by first ordering the outputs, say as , and then synthesizing a function for each . This ensures that . Once all such ’s are obtained, one can substitute through for through respectively, in to obtain a Skolem function for as a function of . The primary problem of using this approach as-is is the exponential blow-up incurred in the size of the Skolem functions.
II-5 DAG representations
For an formula , its DAG representation is naturally induced by the structure of . Specifically, if is simply a literal , its DAG representation is a leaf labeled . If is where , its DAG representation is a node labeled with two children, viz. the DAG representations of and . W.l.o.g. we assume that a DAG representation of is always in a simplified form, where , , and are replaced by , is replaced by 0 and is replaced by for every node . We use for the node count in the DAG representation of .
and are well-known representations of Boolean formulas and we skip their definitions. We briefly recall the definitions of DNNF, dDNNF and wDNNF below. Let be the subformula represented by an internal node (labeled by or ) in a DAG representation of an NNF formula . We use to denote the set of literals labeling leaves that have a path to the node representing in the DAG representation of . We also use to denote the underlying set of variables in that appear in . For each -labeled internal node in the DAG of with being the subformula represented by , if for all distinct indices , , then is said to be in DNNF [9]. If, instead, for all distinct indices , , then is said to be in wDNNF [1]. Finally is said to be in deterministic DNNF(or dDNNF) [10] if is in DNNF and for each -labeled internal node in the DAG of with being the subformula represented by , is a contradiction for all distinct indices .
II-6 Positive form of input specification
Given a specification in , we denote by the formula obtained by replacing every occurrence of in with a fresh variable . This is also called the positive form of the specification and has been used earlier in [2]. Observe that for any in , is positive unate (or monotone) in all variables in and . For , we sometimes split into two parts, and , and represent as . For , let (resp. ) denote a vector of ’s (resp. ’s). For notational convenience, we use to denote .
III A New Normal Form for Efficient Synthesis
In [1], it was shown that if is represented as a ROBDD/FBDD or in DNNF or in wDNNF form, Skolem functions can be synthesized in time polynomial in . In this section, we define a new normal form called SynNNF that subsumes and is more succinct than these other normal forms, and yet guarantees efficient synthesis of Skolem functions.
Definition 1**.**
Given a specification , for every we define the -reduct of , denoted , to be . We also define to be .
Note that is the same as , and for .
Example 1**.**
Consider the formula . Then . Thus, we have and .
Next, we define a useful property for the -reduct, which will be crucial for efficient synthesis of Skolem functions.
Definition 2**.**
Given , let denote , where . We say that is -unrealizable if is unsatisfiable.
Intuitively, we wish to say that there is no assignment to and such that is equivalent to . The formula captures this semantic condition. Indeed, if an assignment makes true, then it also makes equivalent to (i.e., for having values , but not for , , ). Note that since is positive unate in and , is satisfiable iff is satisfiable; we need not conjoin in the definition of .
A sufficient condition for to be -unrealizable is that in the DAG representation of , there is no pair of paths – one from and the other from – which meet for the first time at an -labeled node. In Example 1, is -unrealizable since there is no leaf labeled in its DAG representation. Similarly, is -unrealizable as there is no leaf labeled in the DAG representation of (although such a leaf exists in the DAG representation of ).
Example 2**.**
Let . Then . Using the notation in Definition 2, , and . There is an assignment ( such that is satisfiable. Hence is not -unrealizable (equivalently, it is -realizable). However, ; hence it is vacuously -unrealizable.
Definition 3**.**
A formula is said to be in synthesizable (or SynNNF ) wrt the sequence if is in , and for all , is -unrealizable.
In Examples 1, 2, is in SynNNF, while is not. Also neither of them are in DNNF or wDNNF. Additionally, the functions as presented do not correspond to ROBDD/FBDD representations either. We now show three important properties of SynNNF which motivate our proposal of SynNNF as a normal form for synthesis and existential quantification.
III-1 SynNNF leads to efficient quantification and synthesis
Our first result is that existentially quantifying and synthesizing are easy for SynNNF.
Theorem 1**.**
Suppose is in SynNNF. Then,
- (i)
* for ,* 2. (ii)
Skolem function vector for can be computed in time and space, where .
Proof.
The proof of Part (i) is similar to that of Theorem 2(a) in [1], and follows by induction on . For , (by positive unateness of in ). Conversely, as is in SynNNF, is -unrealizable, which implies that with notation as in Definition 2, , i.e., . This give us the proof in the reverse direction, i.e., .
Suppose the statement holds for . We will show that it holds for as well. By inductive hypothesis and definition of existential quantification, . Again, using unateness of in and in one direction, and using the defining property of SynNNF () in the other direction, we obtain .
Part(ii): For , let denote . Further, from to , we recursively define and . We can now show that is indeed a correct Skolem function for in . Starting from to , we know from the preliminaries that gives a correct Skolem function for in . From part (i) above, . Hence gives a correct Skolem function for in . For any , assuming that gives a correct Skolem function vector for in , the same argument shows that is a correct Skolem function for in .
Finally, note that is at most , which is in . A DAG representation of requires a fresh copy of , but can re-use the DAG representations of for as sub-DAGs. Thus, is in . Hence, if we use a multi-rooted DAG to represent all Skolem functions together, we need only nodes. The time required is in since the resulting DAG has edges (root of connects to a leaf of every for ). ∎
The above polynomial-time strategy based on was used in [1] for computing over-approximations of Skolem functions for each . Specifically, it was shown that over-approximates and over-approximates a Skolem function for in . In the remainder of this paper, we refer to the functions used in the proof of Part (ii) above as functions (after the author names of [1]). We use to denote the (Skolem) function vector .
III-2 Succinctness of SynNNF
SynNNF strictly subsumes many known representations used for efficient analysis of Boolean functions. In the following theorem, sizes and times are in terms of the number of input and output variables.
Theorem 2**.**
- (i)
Every specification in ROBDD/FBDD, dDNNF, DNNF or wDNNF form is either already in SynNNF or can be compiled in linear time to SynNNF. 2. (ii)
There exist poly-sized SynNNF specifications that only admit
- (a)
exponential sized FBDD representations. 2. (b)
super-polynomial sized dDNNF representations, unless 3. (c)
super-polynomial sized wDNNF and DNNF representations, unless . 3. (iii)
There exist poly-sized NNF-representations that only admit super-polynomial sized SynNNF representations, unless the polynomial hierarchy collapses.
In the above, is the algebraic analogue of [27]. Also, (iii) shows that we cannot always hope to obtain a succinct SynNNF representation.
III-3 SynNNF “almost” characterizes efficient synthesis using functions
We now show that SynNNF precisely characterizes specifications that admit linear-time existential quantification of output variables strengthening Theorem 1(i). Further, a slight weakening of SynNNF condition by restricting assignments on gives us a necessary and sufficient condition for poly-time synthesis using functions.
Theorem 3**.**
Given a relational specification ,
- (i)
* is in SynNNF iff * 2. (ii)
The -function vector is a Skolem function vector for in iff is -unrealizable for all .
In [14], it was shown that an error formula for , defined as is unsatisfiable iff is a Skolem function vector for . Therefore, an (un)satisfiability check for serves to check if is -unrealizable for all . Further, in [1], it was observed experimentally, that functions give correct Skolem functions, even when the specifications are not in wDNNF. This surprising behavior, which was left unexplained in [1], can now be explained using SynNNF, thanks to Theorem 3(ii).
Note that Theorem 3(ii) weakens the requirement of SynNNF since are constrained to take only the values defined by . For an example of a specification not in SynNNF for which functions are correct Skolem functions, consider again from Example 2, which we saw was not in SynNNF. In this case, and . Therefore, . It can be verified that is indeed a correct Skolem function vector for in . Also, satisfies the condition of Theorem 3(ii) since [\widehat{{H}}]_{1}[x_{2}\mapsto\psi_{2},\overline{x}_{2}\mapsto\neg\psi_{2}]=\overline{{x}}_{1}\mathrel{{\ooalign{\not\phantom{"}\Leftrightarrow}}}(x_{1}\wedge\overline{{x}}_{1}), and .
IV Refinement for Synthesis
Given a specification , sometimes it is easier to solve the problem for a “simpler” specification such that a solution for also serves as a solution for . While “simplifications” of this nature have been used in earlier work [14, 1, 22, 7], we formalize this notion below as one of refinement.
Definition 4**.**
Let and be Boolean relational specifications on the same input and output vectors. We say that refines w.r.t. synthesis, denoted , iff the following conditions hold: (a) , and (b) .
Informally, condition (a) specifies that doesn’t restrict the set of input valuations (i.e. ) over which the specification can be satisfied, and condition (b) specifies that for all such input valuations , any that satisfies also satisfies .
Lemma 4**.**
If , every Skolem function vector for in is also a Skolem function vector for in .
We say refines w.r.t. synthesis because the set of all Skolem function vectors for in is a subset of that for in . Note that Definition 4 provides a direct 2QBF-SAT based check of whether refines without referring to the details of how is obtained from .
Example 3**.**
Let and . Although , both conditions (a) and (b) of Definition 4 are satisfied; hence .
The following are easy consequences of Definition 4.
Proposition 5**.**
* is a reflexive and transitive relation on all Boolean relational specifications on .* 2. 2.
If and , then . 3. 3.
If , then . 4. 4.
If is positive (resp. negative) unate in , then (resp. ) . 5. 5.
If and , then
- (a)
. 2. (b)
* if the output supports of and , and similarly of and , are disjoint.*
Propositions 5(2) and 5(3) effectively require to be semantically (but not necessarily syntactically) independent of and respectively. While these may appear to be degenerate cases, we will soon see that both these propositions turn out to be useful when recursively compiling a specification into refined SynNNF specification. Interestingly, a version of Proposition 5(4) was used in a pre-processing step of BFSS [1], although the precise notion of refinement w.r.t. synthesis was not defined there. Thanks to Definition 4, we can now generalize Proposition 5(4) to refine a specification even when is not unate in any output variable. We discuss below how this can be done.
Suppose the specification uniquely defines an output variable as a function of other input and output variables. For example, if , then . Such specifications arise naturally when a non- Boolean formula is converted to via Tseitin encoding [26]. Variables like above are said to be functionally determined (henceforth called ) in , and implied functional dependencies like are called functional definitions (henceforth called f-defs) of variables in .
Let be a set of output variables in , and let be the conjunction of f-defs of all variables in . We say that is an acyclic system of f-defs if no variable in transitively depends on itself via the functional definitions in . In other words, induces an acyclic system of functional dependencies between variables in . For , define to be the formula , where and is a sequence of fresh variables . Informally, asserts that if the specification can be satisfied by setting a non- output to , then it can also be satisfied by setting to the complement value (), while preserving the values of all other non- outputs. The outputs in must of course be set as per the functional definitions in .
Lemma 6**.**
Let be an acyclic system of f-defs in .
If , then . 2. 2.
If , then for every , we have:
If is a tautology, then . Similarly, if is a tautology, then .
If , Lemma 6(2) simply reduces to Proposition 5(4). However, if (as is often the case), Lemma 6(2) shows that (resp. ) can refine even if is not positive (resp. negative) unate in . As an illustration, the specification in Example 3 is not unate in either or . However, with and , we have . Hence, . When is refined by an application of Lemma 6(2), we say that is refined by pivoting on .
Lemma 7**.**
Let and be acyclic systems of f-defs in , where and . For , if is a tautology, then so is .
Lemma 7, along with Lemma 6(2), shows that if , the system of acyclic f-defs potentially provides more opportunities for refinement compared to . Hence, it is advantageous to augment the set of outputs (and correspondingly ) whenever possible.
The following theorem suggests that compiling a given specification to a refined SynNNF specification (as opposed to an equivalent SynNNF specification) holds promise for Boolean functional synthesis.
Theorem 8**.**
*For every relational specification , there exists a polynomial-sized Skolem function vector for in iff there exists a SynNNF specification such that and is polynomial-sized in . *
Theorem 8 guarantees that whenever a polynomial-sized Skolem function vector exists for a specification , there is also a polynomial-sized refined specification in SynNNF. It is therefore interesting to ask if we can compile to a “small enough” SynNNF specification that refines . In the next two sections, we present such a compilation algorithm and results of our preliminary experiments using this algorithm. Note that as shown in [1], there exist problem instances for which there are no polynomial-sized Skolem function vectors, unless the Polynomial Hierarchy () collapses. Thus, any algorithm for compilation to SynNNF must incur super-polynomial blow-up (unless collapses). Nevertheless, as our experiments show, the compilation-based approach works reasonably well in practice, even solving benchmarks beyond the reach of existing state-of-the-art tools.
V A Refining to SynNNF Compiler
We now describe – an algorithm that takes as input a specification given as a set of clauses, and outputs a DAG representation of a SynNNF specification that refines w.r.t. synthesis. Given a set of clauses, we use to denote the formula .
Let be a set of clauses. Abusing notation introduced in Section II, let . We define an undirected graph , where and iff and . Thus, there exists an edge iff and share an output atom. Let be the set of maximally connected components (henceforth called s) of . It is easy to see that ; moreover, the output supports of for are mutually disjoint. We use to denote that clauses and are in the same of . We will soon see how factoring based on s of allows us to decompose the -to-SynNNF compilation problem into independent sub-problems, thanks to Proposition 5(5)b. Note that factoring based on s has also been used in DSharp [20] for converting a formula to dDNNF. However, unlike above, the underlying graph in DSharp has an edge between every pair of clauses that shares any atom, including input variables. Thus, has potentially fewer edges, and hence smaller s, than the corresponding graph constructed by DSharp.
Before delving into Algorithm , we first discuss some important sub-routines used in the algorithm. Sub-routine FDRefine takes as inputs a set of clauses and a (possibly empty) acyclic system of f-defs in . It returns a (possibly augmented) acyclic system of f-defs and a set of clauses such that and . Sub-routine FDRefine works by iteratively finding new ouptut variables and refining the specification using Lemma 6(2) whenever possible. In the pseudo-code of FDRefine (see Algorithm 1), sub-routine FindFD matches a pre-defined set of clause-patterns in to identify new output variables not already in . The patterns currently matched correspond to encodings of the input-output relation of common Boolean functions, viz. , , , , , , and . For example, we match the pattern , where are place-holders, to identify the functional definition . Each new output variable thus identified is added to and the corresponding functional definition is added to unless this introduces a cyclic dependency among the f-defs already in . Assuming all patterns used by FindFD to determine functional dependencies are sound, the (possibly augmented) computed by FindFD is a system of acyclic f-defs in . In lines - of Algorithm 1, we next check if Lemma 6(2) can be applied to refine by pivoting on some variable . The refinement, if applicable, is easily done by replacing each clause by (resp. ) and by adding the unit clause (resp. ) to . The pivot is also added to and the corresponding functional definition ( or as the case may be) is added to .
In general, identifying an acyclic system of f-defs in potentially enables refinement of via Lemma 6(2), which in turn, can lead to augmenting the acyclic system of f-defs further. Therefore, the loop in lines - of Algorithm 1 is iterated until no new outputs or additional refinements are obtained. Once this happens, subroutine FDRefine returns the resulting acyclic system of f-defs and the resulting set of refined clauses .
Two other important sub-routines used in are GetCkt and GetDefCkt. Sub-routine GetCkt takes as input an formula and returns the DAG representation of . Sub-routine GetDefCkt takes as input a system of acyclic f-defs , where (i.e. is the entire output support of ). It returns a DAG representation of a SynNNF specification equivalent to . Without loss of generality, let be a linear ordering of the output variables in such that the functional definition of in does not depend on any for . Such an ordering always exists since is an acyclic system of f-defs. Let be the functional definition of in , where is a Boolean function identified via clause-pattern matching in sub-routine FindFD. For each in -order in , we now construct a DAG representing in . While constructing , we ensure that every that is also an argument of is replaced by the root, say , of the DAG . Since is an acyclic system of f-defs, this is always possible. Finally, we construct the overall DAG, say , representing . It is easy to see that for every , there are no paths from and that meet for the first time at an -labeled node in . Abusing notation and using to denote the specification represented by the above DAG, we therefore have is -unrealizable for all ; hence is in SynNNF.
We are now in a position to describe Algorithm . The algorithm is recursive and takes as inputs a set of clauses, a (possibly empty) system of acyclic f-defs in , and the recursion level . Initially, is invoked with given set of clauses, , and . The pseudocode of , shown in Algorithm 2, first computes the output support of , and then checks a few degenerate cases (lines -) to determine if a refined SynNNF specification can be easily obtained. In case these checks fail, sub-routine FDRefine is invoked to augment the set of functionally dependent outputs and their corresponding acyclic f-defs , and also to obtain a (possibly) refined set of clauses. If all outputs in get functionally determined by this, Lemma 6(1) guarantees that ; hence an invocation of GetDefCkt() gives the desired result in line . Otherwise, we check in lines - if Theorem 3(ii) can be applied. Recall that Theorem 3(ii) relaxes the requirements of the SynNNF definition by requiring -unrealizability only when functions are substituted for the variables. As discussed in Section III-3, the relaxed requirement can be checked by testing the unsatisfiability of the error formula for the function vector . If is indeed unsatisfiable, is a Skolem function vector for in , and hence refines .
If is satisfiable, we use a sub-routine ChooseOutputVar that heuristically chooses an output variable on which to branch. Currently, we use a [19] score based heuristic, similar to that used in DSharp [20], to rank variables in , and then choose the variable with the highest score. This allows us to represent as , so that we can refine the two disjuncts independently, thanks to Proposition 5(5)a. However, this may lead to some duplicate processing of clauses. We can avoid this by factoring out the subset of clauses whose satisfiability is independent of whether is set to or [math]. Let (resp. ) be the subset of clauses in that are in the same of as some that has (resp. ) as a literal. Let be the set of all clauses in that are neither in nor . By definition of , the sub-specifications and (and similarly, and ) do not share any output variable in their supports, and can be refined independently. This is exactly what algorthm does in lines -. The roots of the DAGs resulting from the recursive calls in lines , and are finally combined as in line to yield the desired DAG representation.
Theorem 9**.**
For every set of clauses, \mathsf{C2Syn}$$\left(\mathcal{S},\emptyset,1,0\right) always terminates and returns a DAG representation of a SynNNF specification such that .
VI Experimental results
We ran Algorithm on a suite of specifications comprised of benchmarks from the Prenex 2QBF track of QBFEval 2018 [21], and the .qdimacs version of Factorization benchmarks [1], which we will refer to as FA.QD. By Theorem 2(i), a ROBDD/FBDD specification can be compiled to an equivalent SynNNF specification in linear time. Therefore, any algorithm that compiles a specification to an ROBDD can be viewed as an alternative to for compiling a specification to SynNNF (albeit without refinement). We compare the performance of with that of a BDD compiler and two state-of-the-art boolean function synthesis tools, namely, the AIG-NNF pipeline of bfss [1] with ABC’s MiniSat as the SAT solver and Cadet [22, 24]. For the BDD Compiler, the .qdimacs input was converted to an AIG using simple Tseitin variable detection; this AIG was then simplified and ROBDDs built using dynamic variable ordering (of all input and output variables) – this is part of the BDD pipeline of bfss [1], henceforth called . We also ran DSharp [20] which compiles a formula into dDNNF (and hence SynNNF by Theorem 2(i)), but it was successful on very few of our benchmarks; hence we do not present its performance. Each tool took as input the same .qdimacs file. Experiments were performed on a cluster with cores and GB memory per node, each core being a GHz Intel Xeon processor running CentOS6.5. Each run was performed on a single core, with timeout of hour and main memory limited to GB.
For , several benchmarks were solved in the initial part of the Algorithm 2 before line 17, i.e., before any recursive calls are made. Table I presents the results for , divided into those that succeeded at recursion level zero (Stage-I) and those that required recursions (Stage-II), as well as the comparison with . Since BDDs are also in SynNNF, the total number of benchmarks in QBFEval which could be compiled into SynNNF (by either compiler) is a whopping .
Figure 1 compares the run-times of and : for most QBFEval benchmarks that were solved by both, took less time, while for FA.QD, took more time. There were QBFEval benchmarks that solved by couldn’t, whereas were solved by but not . This indicates that the two approaches to SynNNF compilation have orthogonal strengths.
We next compare with Cadet and bfss. Cadet (resp. bfss) solved (resp. ) benchmarks in QBFEval and (resp. ) in FA.QD. Table II gives a comparison in terms of number of benchmarks solved by each tool but not by others. Figure 2 (left, right) compares the run-times of and those of Cadet and bfss, respectively. As expected, since does complete compilation, it takes more time than Cadet and marginally more than bfss on many benchmarks, though for most of these, the time taken is less than a minute. In fact for FA.QD, takes less time than bfss on all benchmarks. Overall, appears to have strengths orthogonal to , bfss and Cadet, and adds to the repertoire of state-of-the-art tools for Boolean functional synthesis.
To validate our experimental results, we also developed an independent approach to verify if the output of is (i) in SynNNF and (ii) a refinement of the original specification (which by Theorem 1 and Lemma 4 suffices to efficiently generate Skolem functions). For (i), we check a stronger than required, syntactic condition for being in SynNNF, namely, for every output variable , there is no pair of paths from and in the DAG output by that meet at an -node. Note that this is the sufficient that was described just after Definition 2 in Section III. While this requirement is stronger than the semantic requirement for SynNNF, we choose to use this because of the efficient manner in which this can be checked.
For (ii), we just check the two semantic conditions in Definition 4. Checking condition (a) requires the use of a 2QBF solver, while condition (b) can be checked using a propositional (un)satisfiability solver. Of the 185 benchmarks on which C2Syn was successful, our verifier successfully verified 183 benchmarks, ran out of memory on 1 and out of time on another benchmark (time limit: 1 hour, main memory limit : 16GB).
Finally, we note that pre-processing techniques are known to effectively simplify several QBF problem instances. Stage-I of can be seen as subsuming several simple QBF preprocessing techniques, e.g., unit clause and pure literal detection, semantic unateness and identifying Tseitin variables. Using more aggressive QBF preprocessing could further improve the performance of our tool, and we leave this for future work.
VII Conclusion
We presented a new sub-class of called SynNNF that admits quadratic-time synthesis and linear-time existential quantification of a set of variables. Our prototype compiler is able to handle several benchmarks that cannot be handled by other state-of-the-art tools. Since representations like ROBDDs, DNNF and the like are either already in or easily transformable to SynNNF, our work is widely applicable and can be used in tandem with other techniques. As future work, we intend to work on optimizing our SynNNF compiler.
Appendix A Material from Section III
A-A Proof of Theorem 2 of Section III
This section is dedicated to the proof of Theorem 2. We show that SynNNF is a space-efficient DAG-based representation of boolean functions, when compared with other representations using , DNNF and dDNNF.
First, observe that Part(i) is easy. That is, it has been shown, e.g., in [9] that can be converted to DNNF with a linear complexity blowup. Now, focussing on dDNNF, DNNF, wDNNF, an examination of their definitions immediately gives us that each of these forms is already in SynNNF. Further, from the definition again it is clear that dDNNFis subsumed by DNNF, which is further subsumed by wDNNF, as depicted in Figure 3. To show strictness, it suffices to consider Example 1, which is in SynNNF but not in wDNNF since and indeed meet up at an -node in . This completes Part (i).
For part (ii), we start by noting that it has been shown in [9] that the DNNF representation is exponentially more succinct than . We now show that SynNNF is super-polynomially (resp. exponentially) more succinct than dDNNF, DNNF and wDNNF (resp. ) representations, unless some long-standing complexity conjectures are falsified. To do this, we describe a family of specifications having a polynomial sized SynNNF representation, but for which the other representations are necessarily super-polynomially larger, unless these complexity conjectures are falsfied.
Consider the family of specifications defined as follows. Let , and let , be arbitrary boolean functions in over . We define the family , parametrized by , and as
[TABLE]
Lemma 10**.**
Let be a function in the family of specifications . Then
If , then is in SynNNF. 2. 2.
If , then is in SynNNF.
Proof.
Let be any function in the family with all the . It is easy to see that is in SynNNF, using the sufficient condition in Section III. That is, in , there is no , so we never have a and meeting at the root. Further, after replacing with 1, the leftmost subtree rooted at having children is no longer there after constant propagation. In the rest of the tree, we have no occurrences of , hence no way for and to meet at the root. Thus, for each , the argument is similar, since on replacing with 1 and doing constant propagation, the remaining DAG will not have and together, which shows that is in SynNNF. 2. 2.
Let be any function in the family with all the . Note that in this case, we cannot use the sufficient condition as above, clearly, meet at a in . Nevertheless is in SynNNF, if we consider for , and consider the root node with children and , after substituting to 1, to , and constant propagation, it is easy to see that , and for and some . Thus is unsatisfiable. Thus, is unrealizable for any .
∎
Theorem 11** (Restatement of Theorem 2(ii)).**
- (a)
There are functions which admit polynomial sized SynNNF representations, yet admit only exponential sized representations. 2. (b)
Unless , there are functions which admit polynomial sized SynNNF representations, yet admit only super-polynomial sized dDNNF representations. 3. (c)
Unless , there are functions which admit polynomial sized SynNNF representations, yet admit only super-polynomial sized wDNNF and DNNF representations.
Proof.
We use the family of specifications defined above, with different instantiations to obtain all three results. Set , , for , obtaining . Let . As seen in Lemma 10, is in SynNNF. In each of the subparts below, we define appropriately.
Item (a): Succinctness w.r.t . Let . We use the -bit multiplier function over in the construction of . The two bit arguments to the multiplier are respectively, and with being the most significant bits, and being the least significant bits. Let be the boolean function representing the th bit of the -bit multiplier function. The size of is quadratic in , since the size of any multiplier circuit consisting of gates is quadratic in (sum of partial products). For this , the size of is .
Let be a representation of using , by fixing a certain variable order. Set . This assignment makes . Indeed, the representation obtained as a restriction of with respect to this truth assignment is simpler [4]. It is known [5] that any , representations for is exponential in . This establishes the exponential succinctness of SynNNF over .
Item (b): **Succinctness w.r.t **dDNNF. We use a CNF encoding of the perfect matchings of a bipartite graph (denoted ) in the construction of . Given a bipartite graph with two parts and , we can define a 0-1 matrix such that iff there is an edge between and . It is easy to see from the definition of the permanent of (denoted ) that . Likewise, the number of perfect matchings of a bipartite graph is the permanent of its incidence matrix. Set as the function which encodes .
Let be the dDNNF representation of . As in the first case, choose an assignment obtaining . Then it can be seen that the number of solutions of is exactly the number of perfect matchings of the bipartite graph . Fixing the assignment results in a simpler dDNNF representation (say ) for (now ). Counting the models of can be done in time polynomial in the size of [10]. This implies that we can find the number of perfect matchings of the underlying bipartite graph in time polynomial in the size of . Unless , cannot have a polynomial representation, since otherwise, we would obtain a polynomial time solution for computing . This shows the super-polynomial succinctness of SynNNF over dDNNF, unless .
Item(c): **Succinctness w.r.t wDNNF and DNNF ** Let , for , obtaining . As shown in Lemma 10, is in SynNNF, where is an arbitrary SAT formula. If we can obtain a poly-sized DNNF representation for the function , then using the assignment in , we obtain a DNNF representation for . But it is known [10] that consistency checking is poly-time for DNNF representations. A polynomial sized DNNF representation for would imply a polynomial time solution for the satisfiability checking of an arbitrary SAT formula. Thus, unless , any DNNF representation for will necessarily be super polynomial. ∎
This completes the proof of Part (ii) of Theorem 2.
Part(iii). By Theorem 1 of [1], we know that there exist instances of poly-sized NNF formulas whose Skolem functions are necessarily super-polynomial size (resp. exponential) unless the polynomial hierarchy collapses (resp. the non-uniform exponential hypothesis is falsified). For any such instance, suppose we were able to obtain a poly-sized SynNNF representation, then by Theorem 1, we will be able to synthesize polynomial-sized Skolem functions, which contradicts the above.
To see a concrete example where SynNNF is not likely to be succinct, we refer to Theorem 1 of [1], where a constructive reduction of the parameterized clique problem to was given. The specification, in this case, has a polynomial-sized representation, but unless some long-standing complexity-theory conjectures are violated, it was shown that any Skolem function must have exponential/super-polynomial size. Thus, unless these conjectures are violated, the same specification in SynNNF must also be exponential/super-polynomial sized.
This proves Part (iii) and completes the proof of this theorem.
Essentially this means that though we obtain succinctness with respect to several known forms (using classical complexity-theoretic results), it is not the case that SynNNF will always be able to produce a poly-sized representation.
A-B Proof of Theorem 3
Let us recall the characterization theorem from Section III. See 3
Proof.
Part 1): The forward direction follows from Theorem 1. For the reverse direction, we will prove the contrapositive: if is not in SynNNF, i.e., if is not -unrealizable for some , we will show that for some , . Fix any , i.e., it is a realizable valuation of inputs. Consider to be the largest index such that is not -unrealizable, i.e., the corresponding is satisfiable. As a result, we have , i.e., . On the other hand and . By monotonicity, every assignment of will also result in 0 in , which implies that . Thus for this , , which completes the proof.
Part 2): Forward direction: We will prove the contrapositive, i.e., if is not -unrealizable for some , we will show that is not a correct Skolem function vector for in . Fix any , i.e., it is a realizable valuation of inputs. Consider to be the largest index such that is not -unrealizable, i.e., the corresponding is satisfiable.
We claim that one of the must be an incorrect skolem function for this . Suppose not, i.e., suppose all of them are correct. Then we have
[TABLE]
However, because at , is satisfiable, we have and . By monotonicity, every assignment of will also result in 0 in . But this contradicts (1). Hence all the skolem functions cannot be correct for this , proving the forward direction.
Reverse direction: Again, we prove by taking the contrapositive. Suppose, is not a correct Skolem function vector. In [14], it was shown that for any function vector , it is a Skolem function vector for iff the error formula is unsatisfiable. We will use this characterization now, i.e., since is not a correct Skolem function vector, the error formula must be satisfiable.
Hence, we start by considering which gives a satisfying assignment for the error formula . That is,
[TABLE]
Let be the highest such such that the above statement holds. That is, after , the Skolem functions given by are correct, and at they are incorrect. Then, we observe that the value at must be 1, i.e.,
[TABLE]
To see this, observe that along with maximality of implies that , which in turn implies that
[TABLE]
Now, if , this implies . But then, setting is indeed correct, which would imply that there is no error at , which violates the assumption on . Thus we must have .
Now, we know that this is an incorrect assignment to , which implies that the correct assignment is a [math] and we know that is a correct assignment to . Hence, we must have
[TABLE]
The fact that equations (3), (4) hold together imply that the Skolem function is wrong at level , since it gives value 1, but fixing , there is no way to set lower variables to get 1. The rest of the proof is a careful case-analysis, where we either show that (with Skolem functions assigned according to ) at level is satisfiable, i.e., is not -unrealizable and hence the proof terminates, or we show that these equations are satisfied at a lower level (i.e., there is an error at a lower level). Since number of levels is finite this procedure will terminate. We describe the different cases now:
- Case 1:
The first case is if
[TABLE]
then, behaves as an AND gate, i.e.,
[TABLE]
which implies that (with the Skolem functions assigned according to ) will be satisfiable at and hence this terminates the proof.
- Case 2:
This case is if
[TABLE]
In this case, note that and from Equation (4), we have
[TABLE]
Thus the Skolem function is wrong at level , since it gives 1 but fixing , there is no way to set lower variables to 1. In other words, we have reduced the problem by one level and can recursively apply this argument at level .
- Case 3:
[TABLE]
Note that this case is possible only if . But if this is not the case, i.e., if , and , this implies that there exists no counter-example which contradicts Equation (2). Now we have three subcases:
- Case 3(a):
[TABLE]
But this case reduces to Case 1 above, i.e., we can see that behaves as an AND gate (i.e., it is not -unrealizable), and so it terminates.
- Case 3(b):
[TABLE]
which as in Case 2, reduces the problem by two levels.
- Case 3(c):
[TABLE]
But reduces to Case 3 at level , thus ensuring strict progress in this case as well.
Together this completes the proof.
∎
A-C Proofs from Section IV
See 4
Proof.
Let be a Skolem function vector for in . From condition (a) of Definition 4, we know that . Further, from condition (b) of Definition 4 and using for , we have . This shows that is a Skolem function vector for in . ∎
See 5
Proof.
The reflexivity of follows trivially from Definition 4. To see why is transitive, suppose and . It follows from transitivity of that . This proves condition (a) of . To prove condition (b) of , notice that by condition (a) of . Additionally, by condition (b) of . Finally, by condition (b) of , it follows that . Putting all the parts together and by transitivity of , we have . This proves condition (b) of . 2. 2.
Suppose and . Then is semantically independent of and holds. Therefore, . Since trivially, it follows that . Therefore condition (a) of Definition 4 is satisfied. Condition (b) of Definition 4 follows from the observation that since and is semantically independent of , we have and . 3. 3.
Suppose . Then is semantically independent of . Substituting for in condition (a) of Definition 4, we get a tautology. Similarly, substituting for in condition (b) of Definition 4, we get . Since is semantically independent of , the above formula is also a tautology. Hence condition (b) of Definition 4 is also satisfied. 4. 4.
If is positive unate in , then . It follows that . Therefore, . This proves condition (a) of Definition 4. To show that condition (b) of the definition also holds, note that is trivially a tautology. The proof for the case when is negative unate in is analogous to the one above. 5. 5.
Suppose and .
- (a)
Since and , condition (a) of Definition 4 follows immediately. To see why condition (b) of the definition holds, notice that . By condition (b) for and , it follows that and . Hence, by transitivity of , we get . Since this holds for all and , condition (b) of Definition 4 is satisfied. 2. (b)
Since the output supports of and , and similarly of and , are disjoint, we have , and . Therefore, condition (a) of Definition 4 follows immediately.
To see why condition (b) of the definition holds, notice that . By condition (b) for and , it follows that and . Hence, by transitivity of , we get . Since this holds for all and , condition (b) of Definition 4 is satisfied.
∎
See 6
Proof.
To prove part (1), notice that . Hence, whenever is satisfied, each of the functional definitions in are also satisfied. Therefore, condition (a) of Definition 4 is satisfied. For condition (b) of Definition 4, notice that for every value of , only when the value of is as given by , does evaluate to . For these values of , if is such that holds, then must also hold since and .
To prove part (2), consider to be a tautology; the proof for the case of being a tautology is analogous. We show below that (a) , and (b) . Let be an arbitrary element in . To see why (a) holds, suppose . If , we set and it follows that . If , we set for every , set and set the value of every for according its functional definition in . Since is a tautology, it follows that . To see why (b) holds, suppose . It follows trivially that must be set to , and . ∎
See 7
Proof.
Observe that for any system of acyclic f-defs in , since , the formula is a tautology iff is a tautology. It is now easy to see that if and is valid, then is valid as well. ∎
See 8
Proof of Theorem 8.
The reverse direction is proved by first applying Theorem 1(ii) to , and then noting that since , every Skolem function vector for in is also a Skolem function vector for in . For the forward direction, let be a Skolem function vector for in such that the size of an AND/OR/NOT gate circuit representation of (denoted ) is polynomial in . As mentioned in Section II, every such circuit can be converted to NNF in time . Hence the NNF representation of is of size at most polynomial in . Therefore, w.l.o.g we consider to be in NNF. Now consider the specification . Since no paths from and () meet at an -labeled node in the circuit representation of , it follows that is in SynNNF. Furthermore, by construction of , every Skolem function vector for in is necessarily component-wise semantically equivalent to , which is itself a Skolem function vector for in . Therefore, conditions (a) and (b) in Definition 4 are satisfied by , and hence . ∎
A-D Proof from Section V
See 9
Proof.
To see that always terminates, notice that every time the recursion level in Algorithm 2 increases, the set of output variables in the remaining set of clauses reduces by . Hence, the maximum value of can only be , and the recursion always terminates. To see why FDRefine (Algorithm 1) terminates, notice that every time changes, its size increases by at least , and hence can change at most times. Similarly, every time changes, at least one variable is added to , and hence cannot change more than times.
To see that the returned specification refines , notice that each of the return statements in Algorithm 2 (i.e., lines , , , , and ) uses one of the properties of refinement already discussed in Section IV. Specifically, the correctness of line is trivial. The correctness of lines and use Propositions 5(2) and 5(2). The correctness of lines and use Lemma 6(1). The correctness of line uses Proposition 5(5). ∎
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] S. Akshay, Supratik Chakraborty, Shubham Goel, Sumith Kulal, and Shetal Shah. What’s Hard About Boolean Functional Synthesis? In Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, Flo C 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part I , pages 251–269, 2018.
- 2[2] S. Akshay, Supratik Chakraborty, Ajith K. John, and Shetal Shah. Towards Parallel Boolean Functional Synthesis. In TACAS 2017 Proceedings, Part I , pages 337–353, 2017.
- 3[3] G. Boole. The Mathematical Analysis of Logic . Philosophical Library, 1847.
- 4[4] R. E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Trans. Comput. , 35(8):677–691, August 1986.
- 5[5] Randal E. Bryant. On the complexity of VLSI implementations and graph representations of boolean functions with application to integer multiplication. IEEE Trans. Computers , 40(2):205–213, 1991.
- 6[6] Marco Cadoli and Francesco M. Donini. A survey on knowledge compilation. AI Commun. , 10(3-4):137–150, 1997.
- 7[7] Supratik Chakraborty, Dror Fried, Lucas M. Tabajara, and Moshe Y. Vardi. Functional synthesis via input-output separation. In 2018 Formal Methods in Computer Aided Design, FMCAD 2018, Austin, TX, USA, October 30 - November 2, 2018 , pages 1–9, 2018.
- 8[8] Supratik Chakraborty, Dror Fried, Lucas M. Tabajara, and Moshe Y. Vardi. Functional synthesis via input-output separation. In 2018 Formal Methods in Computer Aided Design, FMCAD 2018, Austin, TX, USA, October 30 - November 2, 2018 , pages 1–9, 2018.
