Compression with wildcards: Abstract simplicial complexes

Marcel Wild

arXiv:1812.02570·cs.DS·March 10, 2021


TL;DR

This paper introduces a new algorithm called Facets-To-Faces for efficiently compressing and representing abstract simplicial complexes, with applications in various computational fields.

Contribution

The paper presents a novel algorithm for compressing simplicial complexes using wildcards, improving efficiency over existing methods, and introduces a new way to compute face numbers from facets.

Findings

01 Facets-To-Faces outperforms Mathematica's BooleanConvert and Python BDDs in compression.

02 The algorithms can be parallelized for enhanced performance.

03 Applications include reliability analysis, combinatorial topology, and frequent set mining.

Abstract

Despite the more handy terminology of abstract simplicial complexes SC, in its core this article is about antitone Boolean functions. Given the maximal faces (=facets) of SC, our main algorithm, called Facets-To-Faces, outputs SC in a compressed format. The degree of compression of Facets-To-Faces, which is programmed in high-level Mathematica code, compares favorably to both the Mathematica command BooleanConvert, and to the BDD's provided by Python. A novel way to calculate the face-numbers from the facets is also presented. Both algorithms can be parallelized and are applicable (e.g.) to reliability analysis, combinatorial topology, and frequent set mining.



Taxonomy

Topics: Formal Methods in Verification · Polynomial and algebraic computation · Constraint Satisfaction and Optimization

Full text


Abstract. Despite the more handy terminology of abstract simplicial complexes ${\cal S}{\cal C}$ , in its core this article is about antitone Boolean functions. Given the maximal faces (=facets) of ${\cal S}{\cal C}$ , our main algorithm, called Facets-To-Faces, outputs ${\cal S}{\cal C}$ in a compressed format. The degree of compression of Facets-To-Faces, which is programmed in high-level Mathematica code, compares favorably to both the hardwired Mathematica command BooleanConvert, and to the hardwired BDD’s provided by Python. A novel way to calculate the face-numbers from the facets is also presented. Both algorithms can be parallelized and are applicable (e.g.) to reliability analysis, combinatorial topology, and frequent set mining.

**Key words:** Abstract simplicial complex, face-numbers, antitone Boolean function, exclusive sum of products, binary decision diagram, compressed enumeration, wildcards, reliability polynomial, partitionability conjecture

1 Introduction

While the present article focuses on bare algorithmics, four application areas are outlined at the end of this introduction, in Subsection 1.3. We start with a broad (1.1), and then more detailed (1.2) outline of the article.

1.1 An abstract simplicial complex (also called set ideal) based on a set $W$ is a family ${\cal S}{\cal C}$ of subsets $X\subseteq W$ (called faces) such that from $X\in{\cal S}{\cal C}$ , $Y\subseteq X$ , follows $Y\in{\cal S}{\cal C}$ . [Footnote 1: The adjective ’abstract’ is sometimes added to make a distinction from the simplicial complexes considered in topological combinatorics. For the sake of brevity we henceforth drop ’abstract’.] Without further mention, in this article all structures will be finite. In particular all simplicial complexes ${\cal S}{\cal C}$ contain maximal faces, called the facets of ${\cal S}{\cal C}$ . Henceforth we mostly stick to $W=[w]:=\{1,2,\cdots,w\}$ . A face of cardinality $k$ is called a $k$ -face, and the set of all $k$ -faces is denoted as ${\cal S}{\cal C}[k]$ . The numbers $N_{k}:=|{\cal S}{\cal C}[k]|$ are the face-numbers of the simplicial complex. The purpose of this article is to retrieve the following data from the facets:

$(E)$ an enumeration of ${\cal S}{\cal C}$ ;

$(E_{k})$ an enumeration of ${\cal S}{\cal C}[k]$ for one arbitrary $k\in[w]$ ;

$(C)$ the cardinality $N:=|{\cal S}{\cal C}|$ ;

$(C_{\forall k})$ the face-numbers $N_{k}$ for all $k\in[w]$ .

Although our four tasks can be phrased in terms of Boolean functions, speaking of simplicial complexes is, for the most part, more illuminating. While task $(E)$ matches $(C)$ , there is a mismatch between $(E_{k})$ and $(C_{\forall k})$ . Here is why: If we change $(C_{\forall k})$ to the calculation of one $N_{k}$ , then this (essentially) is just as hard. Throughout the article the simplicial complex ${\cal S}{\cal C}_{1}\subseteq{\cal P}[9]$ whose facets are

(1) $F_{1}=\{1,2,3,9\},\ F_{2}=\{3,5,7,9\},\ F_{3}=\{2,7,9\},\ F_{4}=\{3,6,8,9\},\ F_{5}=\{2,4,8,9\}$ ,

serves to illustrate our algorithms.

The theoretic complexity of at least three of the problems is well known. To wit, according to [V] it is $\#P$ -hard to calculate the number of models of a Boolean function $f$ given in DNF, even if $f$ is antitone. [Footnote 2: Strictly speaking that follows by de Morgan duality since Valiant only speaks about CNF’s and monotone Boolean functions. Recall that $f$ is monotone if $x\leq y\Rightarrow f(x)\leq f(y)$ , and antitone if $x\leq y\Rightarrow f(x)\geq f(y)$ .] Since (C) can be modelled by such $f$ (see (3)), this implies the $\#P$ -hardness of $(C)$ and a fortiori $(C_{\forall k})$ . Like most (unfortunately not all) authors we take enumeration as a synonym for generation, thus not to be confused with mere counting. It might be counter-intuitive that enumerating should ever be more tractable than counting. [Footnote 3: More ’philosophy’ on this matter follows in Subsection 6.3.] Yet $(E)$ amounts to enumerating the models of a specific DNF, and enumerating the models of any DNF works in ’benign’ polynomial total time, whereas (C) is $\#P$ -hard. Perhaps the complexity of $(E_{k})$ was known before but the author could not pinpoint a reference; the matter is settled anew in Theorem 2. Our main contributions are however on the practical side; when computational efficiency lacks a theoretic underpinning (which is to be expected in view of Valiant’s results) it will be evidenced by numerical experiments. The main effort will go into $(E)$ and $(E_{k})$ . That is because we strive for a compressed enumeration in both cases.

1.1.1 Compression starts with the don’t-care symbol ’2’ (other authors write $\ast$ ) which, say in $(1,0,2,0)$ , signifies that both bitstrings (=01-rows) $(1,0,0,0)$ and $(1,0,1,0)$ are allowed. This leads to 012-rows. For instance, the modelset of a (Boolean) term $T$ like $x_{2}\wedge\overline{x_{4}}\wedge\overline{x_{7}}$ is the 012-row $r(T)=(2,1,2,0,2,2,0)$ (assuming there are 7 Boolean variables altogether). Conversely, any 012-row $r^{\prime}$ of length $w$ yields a unique term $T(r^{\prime})$ with at most $w$ Boolean variables. As usual $\{0,1\}^{w}$ is isomorphic to the powerset ${\cal P}[w]$ of $[w]$ . Thus $r(T)$ above can be viewed as a 16-element interval (also called ’cube’) of ${\cal P}[7]$ , with smallest element $\{2\}$ and largest element $\{1,2,3,5,6\}$ . Suppose a Boolean function $f$ has a DNF which is orthogonal in the sense that the conjunction $T_{i}\wedge T_{j}\ (i\neq j)$ of any two terms in it is unsatisfiable. Then the modelset $Mod(f)\subseteq\{0,1\}^{w}$ is a disjoint union of the 012-rows $r(T_{i})$ . Although ’orthogonal DNF’ and ’exclusive sum of products (ESOP)’ are often used synonymously, in the present article ESOP always refers to a representation of $Mod(f)$ as a disjoint union of 012-rows.
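To make the don’t-care symbol concrete, here is a minimal Python sketch (Python is our choice for illustration; the function names `expand_012_row` and `row_as_interval` are hypothetical and do not occur in the article). It expands a 012-row into its bitstrings and reads off the smallest and largest member of the corresponding interval.

```python
from itertools import product

def expand_012_row(row):
    """Return all bitstrings (as tuples) represented by a 012-row."""
    # A symbol 2 is a don't-care (0 or 1 allowed); symbols 0 and 1 are fixed.
    choices = [(0, 1) if symbol == 2 else (symbol,) for symbol in row]
    return list(product(*choices))

def row_as_interval(row):
    """Return (smallest, largest) member of the row, viewed as subsets of [w]."""
    smallest = {i + 1 for i, s in enumerate(row) if s == 1}
    largest = {i + 1 for i, s in enumerate(row) if s in (1, 2)}
    return smallest, largest

# Modelset of the term x2 AND (NOT x4) AND (NOT x7) on 7 variables.
r_T = (2, 1, 2, 0, 2, 2, 0)
members = expand_012_row(r_T)
```

Here `len(members)` is 16, and `row_as_interval(r_T)` returns the pair `({2}, {1, 2, 3, 5, 6})`, in line with the interval described above.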

Apart from ’2’ novel types of wildcards will be introduced. We mainly deal with 012e-rows but in the last Section glimpse at 012men-rows like $(e,n,m,2,n,e,m,0,m,e)$ (Table 11). Here $e..e,n..n,m..m$ respectively mean: at least one 1 here (so $(e,e):=\{(0,1),(1,0),(1,1)\}$ ); at least one 0 here; at least one 1 and one 0 here.

1.2 Here comes the Section break-up. Section 2 deals with (C). After dispensing with inclusion-exclusion we turn to so-called Binary Decision Diagrams (BDD’s), that will accompany us throughout the article. We use ${\cal S}{\cal C}_{1}$ to illustrate the basic structure of BDD’s and how they solve (C). The third method handling (C) applies the e-algorithm of [W2], whose main features are quickly reviewed. Section 3 is dedicated to $(C_{\forall k})$ . Inclusion-exclusion can still be used but remains awfully slow. As to BDD’s, an elegant method of Knuth is mentioned. The third method (with the prosaic name e+rp+sub) again exploits the $e$ -algorithm and adds another gadget. The core Section 4 deals with (E). We start with two naive (yet intriguing) methods solving (E). Then come binary decision diagrams, which offer some compression via 012-rows. Our Facets-To-Faces algorithm does better by employing 012e-rows and, as opposed to BDD’s, it has a theoretic backbone (Theorem 1). Connections to combinatorial topology and convex polytopes are pointed out. The numerical experiments in Section 5 show that Facets-To-Faces always compresses better than the Mathematica command BooleanConvert, and way better than BDD’s. Timewise Facets-To-Faces keeps at bay BDD’s but yields to BooleanConvert if instead of few large facets there are many small facets. The fact that Facets-To-Faces is programmed in high-level Mathematica, whereas BooleanConvert is ’hardwired’, admittedly does not fully account for this. But then again, it matters little since Facets-To-Faces is easy to parallelize. Section 6 offers two algorithms for $(E_{k})$ . While polynomial total time can be proven for one, the other performs better in practice (due to compression).

The last two Sections can be viewed as ’side-shows’. Section 7 investigates what happens when instead of the facets the minimal non-faces of a simplicial complex are given. The four problems $(E),(E_{k}),(C),(C_{\forall k})$ can then be handled in a more or less dual fashion. Section 8 harks back to Section 4 and makes first strides to lift Facets-To-Faces from antitone DNF’s (=simplicial complexes) to arbitrary DNF’s.

1.3 Here come four areas of application; the latter two are currently of a more tentative nature.

First, Reliability Analysis. In this domain the usual name for ’simplicial complex’ is ’coherent system’ (or ’independence system’). The reliability polynomial of a coherent system ${\cal S}{\cal C}\subseteq{\cal P}[w]$ is defined as $RP(z):=\sum_{k=0}^{w}N_{k}z^{k}(1-z)^{w-k}$ where the $N_{k}$ ’s are the face-numbers of ${\cal S}{\cal C}$ (see above). In several areas of engineering (e.g. network analysis or stack filters for nonlinear signal processing) it is important to calculate $RP(z)$ fast, and many methods have been proposed in the last six decades. Some of them (like our e+rp+sub) target the face-numbers. In another vein, a partitioning of ${\cal S}{\cal C}$ into few intervals (=012-rows) would yield $RP(z)$ immediately. Such a partitioning was found in [BN] for matroid-complexes, i.e. simplicial complexes consisting of all independent sets of a matroid. Our Facets-To-Faces succeeds for every simplicial complex and uses more powerful 012e-rows.

This leads to Combinatorial Topology. Namely, the number of 012-rows used in [BN] is as small as it can possibly be; it equals the number of bases of the matroid. Generally a simplicial complex with $h$ facets is called partitionable if it can be represented as a disjoint union of $h$ many 012-rows. This is a popular concept in combinatorial topology. Many deep connections to other concepts have been established. For instance: matroid-complex $\Rightarrow$ shellable $\Rightarrow$ Cohen-Macaulay, and shellable $\Rightarrow$ partitionable. The long conjectured implication Cohen-Macaulay $\Rightarrow$ partitionable was falsified in [DKM]. A few ideas on how Facets-To-Faces and e+rp+sub may touch upon these matters follow in Section 4.4.

Third, consider the classic Inclusion-Exclusion formula with its exponentially many summands. It is vexing that many summands are often zero, but pleasant that the nonzero summands match a simplicial complex (aka ’nerve’). Isolation and compression of the nerve speed up classic inclusion-exclusion. See arXiv:1309.6927.

Last but hardly least, a prominent area of data mining is Frequent Set Mining. Specifically, Facets-To-Faces can compress all frequent sets from a knowledge of either the maximal frequent sets (i.e. the facets), or the minimal infrequent sets (Sec.7). Many algorithms (e.g. the A priori method, listed in [WK]) have been proposed for these problems, all proceeding one-by-one. See arXiv:1910.14508, which also discusses how to get the maximal frequent sets in the first place.

2 Calculating the cardinality of ${\cal S}{\cal C}$ from its facets

After inclusion-exclusion (2.1) and BDD’s (2.2), a novel method to solve (C) is introduced in 2.3.

2.1 Consider the simplicial complex ${\cal S}{\cal C}_{1}$ whose $h=5$ facets are listed in (1). Using inclusion-exclusion one finds

$(2)\quad|{\cal S}{\cal C}_{1}|=|{\cal P}(F_{1})|+\cdots+|{\cal P}(F_{5})|-|{\cal P}(F_{1}\cap F_{2})|-\cdots-|{\cal P}(F_{4}\cap F_{5})|+\cdots+|{\cal P}(F_{1}\cap\cdots\cap F_{5})|=52.$

Having complexity $O(2^{h}w)$ , this method is only efficient for small $h$ , but for such $h$ has the advantage that the cardinalities of the facets $F_{i}$ hardly matter, as opposed to competing methods.
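The inclusion-exclusion count of 2.1 can be sketched in a few lines of Python. This is only an illustration under our own naming (`count_faces_inclusion_exclusion` is hypothetical); it uses that an intersection of powersets ${\cal P}(F_{i})\cap{\cal P}(F_{j})$ is the powerset of $F_{i}\cap F_{j}$ , so each group of facets contributes $2^{m}$ where $m$ is the size of the common intersection.

```python
from itertools import combinations

# Facets of SC_1 as listed in (1).
FACETS = [{1, 2, 3, 9}, {3, 5, 7, 9}, {2, 7, 9}, {3, 6, 8, 9}, {2, 4, 8, 9}]

def count_faces_inclusion_exclusion(facets):
    """Count the faces of the complex generated by `facets` (2^h summands)."""
    total = 0
    for size in range(1, len(facets) + 1):
        sign = 1 if size % 2 == 1 else -1  # odd-sized groups are added
        for group in combinations(facets, size):
            common = set.intersection(*group)
            total += sign * 2 ** len(common)  # |P(common)| = 2^|common|
    return total

number_of_faces = count_faces_inclusion_exclusion(FACETS)
```

For the five facets in (1) the loop visits $2^{5}-1=31$ groups, and `number_of_faces` evaluates to 52.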

2.2 Another established method uses Binary Decision Diagrams (BDD’s); we recommend [K,Sec.7.1.4] as a general reference. To warm up with Boolean functions and to survey the essentials of BDD’s, consider this (antitone) Boolean function:

$(3)\quad\psi_{1}(x_{1},\cdots,x_{9}):=(\overline{x}_{4}\wedge\overline{x}_{5}\wedge\overline{x}_{6}\wedge\overline{x}_{7}\wedge\overline{x}_{8})\vee(\overline{x}_{1}\wedge\overline{x}_{2}\wedge\overline{x}_{4}\wedge\overline{x}_{6}\wedge\overline{x}_{8})\vee(\overline{x}_{1}\wedge\overline{x}_{3}\wedge\overline{x}_{4}\wedge\overline{x}_{5}\wedge\overline{x}_{6}\wedge\overline{x}_{8})\vee(\overline{x}_{1}\wedge\overline{x}_{2}\wedge\overline{x}_{4}\wedge\overline{x}_{5}\wedge\overline{x}_{7})\vee(\overline{x}_{1}\wedge\overline{x}_{3}\wedge\overline{x}_{5}\wedge\overline{x}_{6}\wedge\overline{x}_{7}).$

The models $x$ of $\psi_{1}$ (i.e. the bitstrings $x\in\{0,1\}^{9}$ with $\psi_{1}(x)=1$ ) match the faces of ${\cal S}{\cal C}_{1}$ . For instance $\{2,8\}\in{\cal S}{\cal C}_{1}$ since $\{2,8\}\subseteq F_{5}$ . Accordingly

$\psi_{1}(0,1,0,0,0,0,0,1,0)=(1\wedge 1\wedge 1\wedge 1\wedge 0)\vee(1\wedge 0\wedge 1\wedge 1\wedge 0)\vee(1\wedge 1\wedge 1\wedge 1\wedge 1\wedge 0)\vee(1\wedge 0\wedge 1\wedge 1\wedge 1)\vee(1\wedge 1\wedge 1\wedge 1\wedge 1)=0\vee 0\vee 0\vee 0\vee 1=1.$

On the other hand, $\{1,2,8\}\not\in{\cal S}{\cal C}_{1}$ and accordingly $\psi_{1}(1,1,0,0,0,0,0,1,0)=0\vee 0\vee 0\vee 0\vee 0=0.$
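The correspondence between models of $\psi_{1}$ and faces of ${\cal S}{\cal C}_{1}$ can be checked mechanically. The following Python sketch (function names `psi` and `is_face` are ours, chosen for illustration) builds one term per facet, namely the conjunction of all negated variables outside the facet, and compares it with the subset test.

```python
from itertools import product

W = 9
FACETS = [{1, 2, 3, 9}, {3, 5, 7, 9}, {2, 7, 9}, {3, 6, 8, 9}, {2, 4, 8, 9}]

def psi(x, facets=FACETS):
    """Evaluate the antitone DNF (3) on a bitstring x = (x_1, ..., x_9)."""
    # The term belonging to facet F is satisfied iff x_i = 0 for all i outside F.
    return int(any(all(x[i - 1] == 0 for i in range(1, W + 1) if i not in F)
                   for F in facets))

def is_face(X, facets=FACETS):
    """A set X is a face iff it is contained in some facet."""
    return any(X <= F for F in facets)

# Number of models, obtained here by plain brute force over all 2^9 bitstrings.
model_count = sum(psi(x) for x in product((0, 1), repeat=W))
```

For the two bitstrings evaluated above, `psi((0,1,0,0,0,0,0,1,0))` gives 1 and `psi((1,1,0,0,0,0,0,1,0))` gives 0, and `model_count` equals 52.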

Whether or not a bitstring $x$ is a model of a Boolean function can (excluding trivial cases) be decided faster by feeding $x$ to the BDD than by evaluating a potentially large Boolean formula. The BDD of $\psi_{1}$ is rendered in Figure 1. If $x=(x_{1},..,x_{9})=(0,0,1,0,0,0,0,0,1)$ , then $x_{1}=x[1]=0$ tells us that at the top node (=root) $x[1]$ of the BDD we must take the dashed branch (it being labelled by 0). It leads us to one of the two sons of $x[1]$ , i.e. the one labelled $x[2]$ . Since $x_{2}=0$ , the dashed path leads us to a node labelled $x[3]$ . Since $x_{3}=1$ we now take the solid path (it being labelled by 1), which brings us to the rightmost node labelled $x[4]$ . Because $x_{4}=x_{5}=x_{6}=0$ , three dashed paths bring us to a node labelled $x[7]$ . Because $x_{7}=0$ , the dashed path brings us to the leaf 1 (distinguished from ordinary nodes by a square frame). By construction of the BDD that signifies $\psi_{1}(x)=1$ . (Notice that the values of $x_{8},x_{9}$ were irrelevant.) One checks that indeed $\{3,9\}\in{\cal S}{\cal C}_{1}$ . If the value of $x_{4}$ had been 1 instead of 0, then we would have reached the leaf 0 (with square frame) at once, indicating that $\psi_{1}(0,0,1,1,0,0,0,0,1)=0$ .

2.2.1 BDD’s allow one to determine the number of models fast. For this purpose we recursively assign a probability to each node. One starts by assigning probability 0 to the leaf 0, and probability 1 to the leaf 1. Working one’s way from bottom to top, if $\alpha$ has sons $\beta,\gamma$ with probabilities $p_{\beta},p_{\gamma}$ , assign to it probability $p_{\alpha}:={\frac{1}{2}}p_{\beta}+{\frac{1}{2}}p_{\gamma}$ . For $\psi_{1}$ in the end the root gets probability $\frac{13}{128}$ . Since the total number of length 9 bitstrings is $2^{9}$ , a moment’s thought shows that the cardinality of the model set $Mod(\psi_{1}):=\{x\in\{0,1\}^{9}:\ \psi_{1}(x)=1\}$ is ${\frac{13}{128}}\cdot 2^{9}=52$ , which matches (2). The cost of calculating (C) this way is linear in the size of the BDD (=number of its nodes).

Figure 1: One (of many) BDD of $\psi_{1}$ in (3)

2.3 The third way to settle (C) is based on a certain e-algorithm, which in turn is based on 012e-rows. Extending the concept of a 012-row (Introduction), by definition a 012e-row contains one or more wildcards of type $(e,e,..,e)$ , each one of which demanding ’at least one 1 here’. Thus the 012e-row $(e,0,1,e,2)$ is the set of bitstrings $(e,0,1,e,{\bf 0})\cup(e,0,1,e,{\bf 1})$ , where e.g. $(e,0,1,e,0):=\{({\bf 0},0,1,{\bf 1},0),({\bf 1},0,1,{\bf 0},0),({\bf 1},0,1,{\bf 1},0)\}$ . If several wildcards occur, they are distinguished by subscripts. Calculating the number of bitstrings contained in a 012e-row is easy, say $|(1,e_{1},0,e_{1},e_{1},2,e_{2},e_{1},2,e_{2})|=2^{2}\cdot(2^{4}-1)\cdot(2^{2}-1)=180.$
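The cardinality rule for 012e-rows is short enough to state as code. In this Python sketch (an illustration only; the encoding of wildcards as strings such as `'e1'` and the name `card_012e_row` are our assumptions) each 2 contributes a factor 2 and each $e$ -wildcard of length $\varepsilon$ contributes a factor $2^{\varepsilon}-1$ .

```python
from collections import Counter

def card_012e_row(row):
    """Number of bitstrings in a 012e-row.

    Symbols: 0, 1, 2 as integers; e-wildcards as strings such as 'e1', 'e2',
    where equal strings belong to the same wildcard.
    """
    twos = sum(1 for s in row if s == 2)
    wildcard_lengths = Counter(s for s in row if isinstance(s, str))
    size = 2 ** twos
    for length in wildcard_lengths.values():
        size *= 2 ** length - 1  # all patterns except the all-zero one
    return size

example_row = (1, 'e1', 0, 'e1', 'e1', 2, 'e2', 'e1', 2, 'e2')
example_size = card_012e_row(example_row)
```

With the row from the text, `example_size` is $2^{2}\cdot 15\cdot 3=180$ , and `card_012e_row(('e', 0, 1, 'e', 2))` gives 6.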

Recall that a transversal of a hypergraph (=set system) ${\cal H}\subseteq{\cal P}(W)$ is a subset $X\subseteq W$ such that $X\cap H\neq\emptyset$ for all $H\in{\cal H}$ . Let ${\cal T}({\cal H})$ be the set of all transversals. The (transversal) e-algorithm, fully described in [W2], represents ${\cal T}({\cal H})$ as a disjoint union of $R$ many 012e-rows in polynomial total time $O(Rh^{2}w^{2})$ .

**2.3.1** Consider now any simplicial complex ${\cal S}{\cal C}\subseteq{\cal P}(W)$ with facets $F_{1},F_{2},\ldots$ . Putting $Z^{c}:=W\setminus Z$ for any $Z\subseteq W$ it holds for all $X\subseteq W$ that

(4) $X\not\in{\cal S}{\cal C}\ \Leftrightarrow\ (\forall i)X\not\subseteq F_{i}\ \Leftrightarrow\ (\forall i)(X\cap F^{c}_{i}\neq\emptyset)$ .

To fix ideas, take ${\cal S}{\cal C}={\cal SC}_{1}$ , whose five facets $F_{i}$ are listed in (1). If we apply the $e$ -algorithm to ${\cal H}=\{H_{1},\ldots,H_{5}\}:=\{F^{c}_{1},\cdots,F^{c}_{5}\}$ then it outputs ${\cal T}({\cal H})$ as a disjoint union of seven $012e$ -rows:

[TABLE]

Table 1: Compressing ${\cal P}(W)\setminus{\cal SC}_{1}$ with the transversal $e$ -algorithm

According to (4), ${\cal T}({\cal H})$ coincides with the set filter ${\cal S}{\cal F}_{1}:={\cal P}[9]\setminus{\cal SC}_{1}$ . It follows that

(5) $|{\cal SC}_{1}|=2^{9}-|{\cal S}{\cal F}_{1}|=2^{9}-|r^{\prime}_{1}|-|r^{\prime}_{2}|-\cdots-|r^{\prime}_{7}|$

$=2^{9}-2^{3}(2^{5}-1)-2^{3}-2^{3}\cdot 7\cdot 3-2-2^{3}-2^{3}\cdot 3-2=512-460=52$ ,

which matches the number obtained in 2.1 and 2.2. As will be seen in 3.3.1, inclusion-exclusion stands no chance against the method of 2.3. The bottleneck in 2.2 is the calculation of the BDD itself. That’s because the expected size of the BDD of a Boolean function $\{0,1\}^{w}\to\{0,1\}$ is $2^{w}/w$ , and hence the calculation of BDD’s cannot be done in polynomial total time; the numerical experiments in Section 5 will speak the same language. [Footnote 4: To be fair, in many scenarios the occurring Boolean functions do not represent a random sample of all $2^{2^{w}}$ many Boolean functions, and the BDD-size can be moderate then.]
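Equivalence (4) and the count in (5) can be cross-checked by brute force. The Python sketch below is not the $e$ -algorithm of [W2]; it merely enumerates all $2^{9}$ subsets, tests each against the complements $H_{i}=F_{i}^{c}$ , and compares the result with the seven row cardinalities appearing in (5). The names `is_transversal` and `ROW_SIZES` are ours.

```python
from itertools import product

W = 9
FACETS = [{1, 2, 3, 9}, {3, 5, 7, 9}, {2, 7, 9}, {3, 6, 8, 9}, {2, 4, 8, 9}]
HYPERGRAPH = [set(range(1, W + 1)) - F for F in FACETS]  # H_i = complement of F_i

def is_transversal(X, hypergraph):
    """X is a transversal iff it meets every hyperedge."""
    return all(X & H for H in hypergraph)

transversals = 0
mismatches = 0
for bits in product((0, 1), repeat=W):
    X = {i + 1 for i, b in enumerate(bits) if b}
    non_face = not any(X <= F for F in FACETS)
    if is_transversal(X, HYPERGRAPH) != non_face:
        mismatches += 1  # would contradict (4)
    transversals += is_transversal(X, HYPERGRAPH)

# Cardinalities of the seven 012e-rows as they appear in (5).
ROW_SIZES = [2**3 * (2**5 - 1), 2**3, 2**3 * 7 * 3, 2, 2**3, 2**3 * 3, 2]
faces = 2 ** W - sum(ROW_SIZES)
```

The run yields `mismatches == 0` and `transversals == 460`, so that `faces` is $512-460=52$ .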

3 Calculating the face-numbers of ${\cal S}{\cal C}$ from its facets

Here we settle $(C_{\forall k})$ by refining the three methods of Section 2.

3.1 Generalizing (2), the principle of inclusion-exclusion also applies to the calculation of the face-numbers $N_{k}$ . Thus for ${\cal S}{\cal C}={\cal S}{\cal C}_{1}$ we find

$(6)\quad N_{k}={|F_{1}|\choose k}+\cdots+{|F_{5}|\choose k}-{|F_{1}\cap F_{2}|\choose k}-\cdots-{|F_{4}\cap F_{5}|\choose k}+\cdots+{|F_{1}\cap\cdots\cap F_{5}|\choose k}.$

For say $k=3$ this gives

$(7)\quad N_{3}={4\choose 3}+{4\choose 3}+{3\choose 3}+{4\choose 3}+{4\choose 3}-0-\cdots+0=17.$
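Formula (6) translates directly into code. The following Python sketch (function name `face_numbers_inclusion_exclusion` is hypothetical) accumulates, for every nonempty group of facets, the signed binomial coefficients of the size of the common intersection.

```python
from itertools import combinations
from math import comb

FACETS = [{1, 2, 3, 9}, {3, 5, 7, 9}, {2, 7, 9}, {3, 6, 8, 9}, {2, 4, 8, 9}]

def face_numbers_inclusion_exclusion(facets, w):
    """Return [N_0, N_1, ..., N_w] via inclusion-exclusion as in (6)."""
    numbers = [0] * (w + 1)
    for size in range(1, len(facets) + 1):
        sign = 1 if size % 2 == 1 else -1
        for group in combinations(facets, size):
            m = len(set.intersection(*group))
            for k in range(m + 1):
                numbers[k] += sign * comb(m, k)  # k-subsets of the intersection
    return numbers

face_numbers = face_numbers_inclusion_exclusion(FACETS, 9)
```

For ${\cal S}{\cal C}_{1}$ this gives `face_numbers == [1, 9, 21, 17, 4, 0, 0, 0, 0, 0]`; in particular $N_{3}=17$ as in (7), and the entries sum to 52.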

**3.2** While calculating the number of models of a Boolean function from its BDD is well known, using BDD’s to calculate the number of models of fixed Hamming weight (here: faces of fixed cardinality) is less known. [Footnote 5: We mention in passing that ’its’ is imprecise. A Boolean function has a unique BDD only once a linear ordering of the Boolean variables has been fixed. The BDD in Figure 1 is based on the (popular default) ordering $x_{1},...,x_{w}$ .] Details, which somewhat streamline the account of Knuth [K,p.260,Exercise 25], can be found in the preprint arXiv:1703.08511v5. Suffice it to say that the sought numbers $N_{k}$ fall out as the coefficients of a certain polynomial that is calculated recursively by processing the BDD bottom-up, in much the same way as in 2.2.

**3.3** One ingredient of the third method for $(C_{\forall k})$ will also exploit coefficients of polynomials, but they are different from Knuth’s polynomials. The main ingredient is, as in 2.3, the $e$ -algorithm. Consider thus a generic $012e$ -row

(8) $r\quad=\quad(\underbrace{0,\cdots,0}_{\alpha},\underbrace{1,\cdots,1}_{\beta},\underbrace{2,\cdots,2}_{\gamma},\underbrace{e_{1},\cdots,e_{1}}_{\varepsilon_{1}},\cdots,\underbrace{e_{t},\cdots,e_{t}}_{\varepsilon_{t}})$

It is easy to see that the number Card $(r,k)$ of $k$ -element sets in $r$ equals the coefficient of $x^{k}$ in the row-polynomial

(9) $p(x):=x^{\beta}\cdot(1+x)^{\gamma}\cdot[(1+x)^{\varepsilon_{1}}-1]\cdot[(1+x)^{\varepsilon_{2}}-1]\cdots[(1+x)^{\varepsilon_{t}}-1].$

Details on the complexity of calculating these coefficients can be found in [W2, Theorem 1]. Here we simply apply the Mathematica command Expand to the polynomial induced by $r=r_{3}^{\prime}$ in Table 1 and obtain

(10) $p(x)=(1+x)^{3}(3x+3x^{2}+x^{3})(2x+x^{2})=6x^{2}+27x^{3}+50x^{4}+49x^{5}+27x^{6}+8x^{7}+x^{8}$ .
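The expansion (10) can be reproduced without a computer algebra system. The Python sketch below multiplies coefficient lists according to (9). Since Table 1 is not reproduced here, the row `STAND_IN_ROW` is a hypothetical stand-in that only shares the multiplicities of $r^{\prime}_{3}$ read off from (10) (no 1’s, three 2’s, wildcards of lengths 3 and 2); the positions of its symbols are an assumption and do not affect the polynomial.

```python
from collections import Counter
from math import comb

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def row_polynomial(row):
    """Coefficient list of the row-polynomial (9) of a 012e-row."""
    beta = sum(1 for s in row if s == 1)
    gamma = sum(1 for s in row if s == 2)
    wildcard_lengths = Counter(s for s in row if isinstance(s, str))
    p = [0] * beta + [1]                                   # x^beta
    p = poly_mul(p, [comb(gamma, k) for k in range(gamma + 1)])  # (1+x)^gamma
    for eps in wildcard_lengths.values():
        factor = [comb(eps, k) for k in range(eps + 1)]
        factor[0] -= 1                                     # (1+x)^eps - 1
        p = poly_mul(p, factor)
    return p

# Hypothetical row with the same symbol multiplicities as r'_3 in (10).
STAND_IN_ROW = (0, 2, 2, 2, 'e1', 'e1', 'e1', 'e2', 'e2')
coefficients = row_polynomial(STAND_IN_ROW)
```

The resulting list is `[0, 0, 6, 27, 50, 49, 27, 8, 1]`, which agrees with (10); the coefficient of $x^{3}$ is 27.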

Thus e.g. Card $(r^{\prime}_{3},3)=27$ . Recall from 2.3.1 that $H_{i}:=F_{i}^{c}$ . Let $\tau_{k}$ be the number of $k$ -element transversals of $\{H_{1},\cdots,H_{5}\}$ , i.e. the number of $k$ -element sets of ${\cal S}{\cal F}_{1}$ . By the above, all numbers $\tau_{k}$ are readily calculated as

(11) $\tau_{k}=\,\mbox{Card}(r^{\prime}_{1},k)+\mbox{Card}(r^{\prime}_{2},k)+\cdots+\mbox{Card}(r^{\prime}_{7},k).$

Hence the face-numbers $N_{k}$ of ${\cal S}{\cal C}_{1}$ (or any simplicial complex) can be calculated with this ’subtraction trick’:

(12) $N_{k}={9\choose k}-\tau_{k}\quad(1\leq k\leq 9).$

For instance $N_{3}={9\choose 3}-\tau_{3}=84-(25+3+27+1+3+7+1)=17$ , which matches (7). In view of the #P-hardness of $(C_{\forall k})$ and the costly calculation of BDD’s we consider our threefold approach

$\hbox{\sl e-algorithm}\ +\ \hbox{\sl row-polynomials}\ +\ \hbox{\sl subtraction trick (e+rp+sub)}$

a nice way to get the face numbers from the facets. The e-algorithm is easy to parallelize (by the same reason as in [W4, sec.6.5]), and therefore also e+rp+sub. In contrast, the calculation of a BDD from a Boolean formula can hardly be parallelized.
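The subtraction trick (12) can be checked independently of the $e$ -algorithm. In the Python sketch below the numbers $\tau_{k}$ are obtained by brute force over all $k$ -subsets of $[9]$ , which is of course not how e+rp+sub proceeds; the sketch only confirms the arithmetic. The name `transversal_numbers` is ours.

```python
from itertools import combinations
from math import comb

W = 9
FACETS = [{1, 2, 3, 9}, {3, 5, 7, 9}, {2, 7, 9}, {3, 6, 8, 9}, {2, 4, 8, 9}]
HYPERGRAPH = [set(range(1, W + 1)) - F for F in FACETS]  # H_i = complement of F_i

def transversal_numbers(hypergraph, w):
    """Return [tau_0, ..., tau_w], the numbers of k-element transversals."""
    tau = [0] * (w + 1)
    for k in range(w + 1):
        for X in combinations(range(1, w + 1), k):
            if all(set(X) & H for H in hypergraph):
                tau[k] += 1
    return tau

tau = transversal_numbers(HYPERGRAPH, W)
# Subtraction trick (12): faces of size k = all k-sets minus k-element non-faces.
face_numbers = [comb(W, k) - tau[k] for k in range(W + 1)]
```

Here `tau[3]` is 67, so that $N_{3}=84-67=17$ , and `face_numbers` is `[1, 9, 21, 17, 4, 0, 0, 0, 0, 0]`.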

3.3.1 In a previous version of the present article (arXiv:1302.1039v4) e+rp+sub was pitted against inclusion-exclusion on random simplicial complexes of type $(w,h,fs)$ , i.e. the $h$ facets $F_{i}\subseteq[w]$ all had facet-size $|F_{i}|=fs$ . Predictably inclusion-exclusion took time almost proportional to $2^{h}$ ; thus $(w,h,fs)=(1200,15,200)$ took 164 seconds and $(1200,20,200)$ needed $5265\approx 2^{5}\cdot 164$ seconds. In contrast, e+rp+sub took 1896 sec for the latter, and handled $(1200,40,200)$ (which triggered about 2 billion 012e-rows) in 64606 seconds. The corresponding time for inclusion-exclusion measures in centuries.

4 The Facets-To-Faces algorithm

Here we tackle the main task (E), i.e. given the facets, enumerate (preferably compressed) all faces! Section 4.1 describes two naive algorithms. The first is everybody’s first temptation, but outputs the faces one-by-one. Although the second has the potential for compression (using 012-rows), it nevertheless can be inferior. Knowing a BDD of ${\cal S}{\cal C}$ one can use 012-rows more efficiently for compression. Section 4.3 introduces the novel Facets-To-Faces algorithm which displays ${\cal S}{\cal C}$ as a disjoint union of more powerful 012e-rows. Section 4.4 relates Facets-To-Faces to facets and faces of topological simplicial complexes and convex polytopes.

**4.1** We put in front some definitions for 4.1.2. If ${\cal S}{\cal C}\subseteq{\cal P}[w]$ , call a 012-row $r$ of length $w$ feasible if $r\cap{\cal S}{\cal C}\not=\emptyset$ (which amounts to $ones(r)\subseteq F_{i}$ for some $i$ ). Further call $r$ final if $r\subseteq{\cal S}{\cal C}$ (which amounts to $ones(r)\cup twos(r)\subseteq F_{i}$ for some $i$ ).

**4.1.1** The First Naive Algorithm (FNA) for $(E)$ enumerates ${\cal S}{\cal C}_{1}$ simply as ${\cal S}{\cal C}_{1}={\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{5})$ . As to ’simply’, trouble is that multiple occurrences of faces (such as $\{2,9\}\in{\cal P}(F_{1})\cap{\cal P}(F_{3})\cap{\cal P}(F_{5})$ ) need to be pruned. Specifically, by induction suppose that for any ${\cal S}{\cal C}$ we have obtained ${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{k})=\{A_{1},...,A_{t}\}$ such that $A_{i}\neq A_{j}$ for $i\neq j$ . Then only the members of ${\cal P}(F_{k+1})$ distinct from all $A_{i}$ ’s are added to the list. Here comes a confession: This is how FNA de iure must be programmed. De facto we exploited two shortcuts provided by Mathematica. First, enumerating a powerset (such as ${\cal P}(F_{k+1})$ ) is more subtle than it looks; see [K,sec. 7.2.1.1]. We circumvented that issue with the Mathematica command Subsets, which behaves as follows: Subsets $[\{3,7\}]$ directly outputs $\{\{\},\{3\},\{7\},\{3,7\}\}$ . Second, the command Union automatically prunes multiple occurrences (and orders the output); thus Union $[\{4,11,9,2\},\{2,11,7\}]$ outputs $\{2,4,7,9,11\}$ . Therefore, if ${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{k})=\{A_{1},...,A_{t}\}$ has been computed, we get a pruned listing of ${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{k+1})$ with Union $[\{A_{1},...,A_{t}\},{\tt Subsets}[F_{k+1}]]$ .
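For readers without Mathematica, the FNA can be mimicked in Python. This is a sketch of the same idea, not the author’s code: `powerset` plays the role of Subsets, and a Python `set` of `frozenset`s plays the role of Union in that it silently discards duplicates. Both function names are ours.

```python
from itertools import combinations

FACETS = [{1, 2, 3, 9}, {3, 5, 7, 9}, {2, 7, 9}, {3, 6, 8, 9}, {2, 4, 8, 9}]

def powerset(F):
    """All subsets of F, as frozensets (analogue of Subsets[F])."""
    items = sorted(F)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]

def first_naive_algorithm(facets):
    """Enumerate all faces one-by-one; duplicates are pruned by the set union."""
    faces = set()
    for F in facets:
        faces |= set(powerset(F))  # analogue of Union[faces, Subsets[F]]
    return faces

all_faces = first_naive_algorithm(FACETS)
```

For ${\cal S}{\cal C}_{1}$ the set `all_faces` has 52 members, and the face $\{2,9\}$ , although generated three times, is stored once.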

4.1.2 The Second Naive Algorithm (SNA) uses variable-wise branching (a better name being pivotal decomposition, as argued in [W4, sec. 2.5]). Initially our Last-In-First-Out (LIFO) stack only contains the feasible row $(2,2,\ldots,2)$ . Generally the top 012-row $r$ of the LIFO-stack is always picked. The ”first” occurring digit 2 (with respect to a fixed ordering of the index set $\{1,2,...,w\}$ ) is turned to 0 and 1 respectively. This yields 012-rows $r_{0}$ and $r_{1}$ . By induction $r$ was feasible. Since subsets of faces are faces, it follows that $r_{0}$ is feasible, but not necessarily $r_{1}$ . These one or two feasible 012-rows replace $r$ on the LIFO stack (except that final rows go to an initially empty ’final stack’). As soon as the LIFO stack is empty, the union of the 012-rows in the final stack is disjoint and equals ${\cal S}{\cal C}$ . (Theorem 2 fine-tunes the above in a more sophisticated setting.)
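The SNA as described above can be sketched in Python as follows. The function name `second_naive_algorithm` and its `order` parameter are ours; as a small deviation from the description, finality is tested when a row is popped rather than when it is created, which should not change the set of final rows.

```python
def second_naive_algorithm(facets, w, order=None):
    """Return final 012-rows whose disjoint union is the simplicial complex."""
    order = list(order) if order is not None else list(range(1, w + 1))

    def ones(row):
        return {i for i in range(1, w + 1) if row[i - 1] == 1}

    def ones_and_twos(row):
        return {i for i in range(1, w + 1) if row[i - 1] in (1, 2)}

    def feasible(row):  # the row contains at least one face
        return any(ones(row) <= F for F in facets)

    def final(row):     # every member of the row is a face
        return any(ones_and_twos(row) <= F for F in facets)

    stack = [(2,) * w]  # LIFO stack, initially the all-2 row
    final_rows = []
    while stack:
        row = stack.pop()
        if final(row):
            final_rows.append(row)
            continue
        # A feasible non-final row always contains a 2; pick the "first" one.
        pivot = next(i for i in order if row[i - 1] == 2)
        for bit in (1, 0):
            child = row[:pivot - 1] + (bit,) + row[pivot:]
            if feasible(child):  # r_0 is always feasible, r_1 possibly not
                stack.append(child)
    return final_rows

FACETS = [{1, 2, 3, 9}, {3, 5, 7, 9}, {2, 7, 9}, {3, 6, 8, 9}, {2, 4, 8, 9}]
rows = second_naive_algorithm(FACETS, 9)
total_faces = sum(2 ** row.count(2) for row in rows)
```

Since the rows are disjoint, `total_faces` adds up to 52. The number `len(rows)` depends on the ordering passed via `order`; for the natural ordering it can be compared with the count reported in 4.1.4.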

**4.1.3** Here comes an experimental comparison of the two naive algorithms. For various random instances $(w,h,fs)$ (see 3.3.1) we recorded the times (rounded to full seconds) $T_{1},\ T_{2}$ needed for FNA and SNA respectively to enumerate the ensuing simplicial complex ${\cal S}{\cal C}$ . The number $R_{2}$ of final 012-rows produced by SNA is recorded as well. In contrast, FNA offers no compression but, recall, its advantage is that all faces contained in a facet $F_{i}$ are ’instantaneously’ produced by Subsets[ $F_{i}$ ]. This advantage wins out in the $(20,100,10)$ instance. The $(20,10,18)$ -instance lets SNA catch up because it compresses on average roughly 16 faces per 012-row, whereas FNA outputs faces one-by-one and invests considerable time (despite the hardwired Union command!) to prune duplicated faces. These two trends increase in the $(20,1000,16)$ -instance to the extent that SNA is more than twice as fast as FNA. The tables turn again in the extrapolated $(24,1000,16)$ -instance because the compression of SNA is just too poor (about 2 faces per 012-row). Finally, the extrapolated $(50,100,20)$ -instance with its 109’437’738 faces provides a Pyrrhic victory for SNA: The FNA ran out of memory while executing Union $[\{...\},{\tt Subsets}[F_{19}]]$ . [Footnote 6: Several of the methods discussed in Section 5 can get this number very fast.]

[TABLE]

Table 2: Comparison of the two naive algorithms

**4.1.4** For SNA the number of final 012-rows which are proper (i.e. not 01-rows) heavily depends on the particular ordering of the index set $\{1,2,..,9\}$ . For instance, using the natural ordering $1,2,..,9$ the SNA represents our 52-element example ${\cal S}{\cal C}_{1}$ as a disjoint union of 19 rows. The minimum (=13) and maximum (=44) number of final 012-rows are obtained (e.g.) for the orderings $1,2,4,5,7,6,8,3,9$ and $1,2,3,4,5,8,9,7,6$ respectively.

**4.2** Figure 1 shows the BDD of the Boolean function $\psi_{1}$ of (3). Recall from 2.2 that $\psi_{1}(x)=1$ for $x=(0,0,1,0,0,0,0,0,1)$ , and so feeding $x$ to the BDD traced a path from the root $x[1]$ to the 1-leaf. The fact that the values $x_{8},x_{9}$ were irrelevant for reaching the 1-leaf shows that $(0,0,1,0,0,0,0,2,2)\subseteq Mod(\psi_{1})$ . Generally each path from $x[1]$ to the 1-leaf yields a 012-row contained in $Mod(\psi_{1})$ , and distinct paths induce disjoint 012-rows (why?). Therefore, if there are $t$ such paths, then $Mod(\psi_{1})$ can be written as union of $t$ disjoint 012-rows. We call this the BDD-induced ESOP of $\psi_{1}$ (see 1.1.1). How to find these $t$ paths efficiently?

[TABLE]

Table 3: The ESOP induced by the BDD of $\psi_{1}$

A look at Figure 1 shows that the only $\psi_{1}$ -models $x$ with $x_{1}=1$ are the ones in the 012-row $\beta_{1}$ of Table 3. All other $\psi_{1}$ -models must fit the pattern of $\beta_{0}$ . One can get rid of the first ’?’ in $\beta_{0}$ by splitting $\beta_{0}$ as $\beta_{0,0}\uplus\beta_{0,1}$ . However, continuing in this manner can create many dead-end paths. It is better to embrace a bottom-up approach akin to 2.2.1. This would show that $\beta_{0,0}=\beta_{0,0,1}\uplus\cdots\uplus\beta_{0,0,9}$ and $\beta_{0,1}=\beta_{0,1,1}\uplus\cdots\uplus\beta_{0,1,4}$ . Hence $t=1+9+4=14$ . Adding up the cardinalities of the fourteen rows $\beta_{1},\beta_{0,0,1},\ldots,\beta_{0,1,4}$ yields $8+4+\cdots+2=52$ , as was to be expected.

4.2.1 The mere number $R_{BDD}$ of 012-rows in a BDD-induced ESOP can be predicted without having to calculate the ESOP. Namely, proceeding bottom-up, assign integers (instead of probabilities as in 2.2.1) to the BDD-nodes as follows. The 0-leaf and 1-leaf receive 0 and 1 respectively. If node $\alpha$ has sons $\beta,\gamma$ with assigned integers $i_{\beta},i_{\gamma}$ , assign to it $i_{\alpha}:=i_{\beta}+i_{\gamma}$ . The last number $i_{root}$ equals $R_{BDD}$ . The reader is invited to verify that for the BDD in Figure 1 this procedure indeed yields $R_{BDD}=14$ .
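The bottom-up count of 4.2.1 can be sketched in a few lines of Python. The BDD below is a hypothetical three-variable toy example (it is not the BDD of Figure 1); the dictionary layout and all names are assumptions made for illustration only.

```python
from functools import lru_cache

# Hypothetical toy BDD (NOT the one in Figure 1): node -> (variable, low son, high son).
# It encodes f = (not x1 or not x2) and (not x2 or not x3) for the ordering x1, x2, x3.
BDD = {
    "root": ("x1", "A", "B"),
    "A": ("x2", "1", "C"),
    "B": ("x2", "1", "0"),
    "C": ("x3", "1", "0"),
}


@lru_cache(maxsize=None)
def paths_to_one(node: str) -> int:
    """Integer assigned to `node` in 4.2.1: the number of paths from it to the 1-leaf."""
    if node == "0":
        return 0
    if node == "1":
        return 1
    _, low, high = BDD[node]
    return paths_to_one(low) + paths_to_one(high)


r_bdd = paths_to_one("root")
print(r_bdd)
```

Memoisation ensures that each node is evaluated once, so the count is linear in the number of BDD-nodes even when the number of paths is large.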

4.3 We now embark on the third method for solving (E), it being the core of our article. Suppose ${\cal S}{\cal C}$ has facets $F_{1}$ to $F_{h}$ , and by induction we have obtained for some $t\in[h-1]$ a representation

(13) ${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{t})=\rho_{1}\uplus\rho_{2}\uplus\cdots\uplus\rho_{s}$

with $012e$ -rows $\rho_{i}$ . If $r$ is the $02$ -row matching ${\cal P}(F_{t+1})$ then evidently

(14) ${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{t+1})=(\rho_{1}\setminus r)\uplus(\rho_{2}\setminus r)\uplus\cdots\uplus(\rho_{s}\setminus r)\uplus r$ ,

and so the key problem is this: For a given $012e$ -row $\rho\ (=\rho_{i})$ and $02$ -row $r$ recompress the set difference $\rho\setminus r$ as disjoint union of $012e$ -rows. Let us do away with the two extreme cases first. First, $\rho\setminus r=\rho$ iff $\rho\cap r=\emptyset$ thus iff either a 1 or $e$ -wildcard of $\rho$ falls into zeros $(r)$ . Second, $\rho\setminus r=\emptyset$ iff $\rho\subseteq r$ , thus iff zeros $(r)\subseteq\mbox{zeros}(\rho)$ . For instance $(e,e,1,2,0,0)\setminus(2,2,2,2,2,0)=\emptyset$ .
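The two extreme cases can be tested directly on the symbols of $\rho$ and $r$. The following Python sketch assumes a hypothetical encoding of rows as tuples with entries 0, 1, 2, or a string label such as "e1" for an e-wildcard; the brute-force function `members` is only there to cross-check the two criteria on small rows.

```python
from itertools import product


def members(row):
    """Brute-force the bitstrings in a 012e-row (entries: 0, 1, 2, or a label like "e1")."""
    out = set()
    for bits in product((0, 1), repeat=len(row)):
        seen, ok = {}, True
        for b, s in zip(bits, row):
            if s in (0, 1):
                ok = ok and b == s
            elif s != 2:                      # an e-wildcard label
                seen[s] = seen.get(s, 0) | b
        if ok and all(seen.values()):         # every e-wildcard has at least one 1
            out.add(bits)
    return out


def zeros(row):
    return {i for i, s in enumerate(row) if s == 0}


def is_disjoint(rho, r):
    """rho and r are disjoint iff a 1 or a whole e-wildcard of rho falls into zeros(r)."""
    z = zeros(r)
    if any(rho[i] == 1 for i in z):
        return True
    labels = {s for s in rho if isinstance(s, str)}
    return any(all(i in z for i, s in enumerate(rho) if s == lab) for lab in labels)


def is_contained(rho, r):
    """rho is a subset of r iff zeros(r) is a subset of zeros(rho)."""
    return zeros(r) <= zeros(rho)


rho, r = ("e1", "e1", 1, 2, 0, 0), (2, 2, 2, 2, 2, 0)   # the example from the text
print(is_contained(rho, r), members(rho) - members(r))

rho2, r2 = ("e1", "e1", 2, 0), (0, 0, 2, 2)             # a whole e-wildcard inside zeros(r)
print(is_disjoint(rho2, r2))
```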

[TABLE]

Table 4: Recompressing the type $(012e)\setminus(02)$ set difference $\rho\setminus r$

In all other cases we focus on the flexible (i.e. $\neq 0,\,1$ ) symbols of $\rho$ , thus for $\rho$ in Table 4 the symbols on the positions 1 to 11. The only way for $X\in\rho$ to detach itself from (the ’plebs’ in) $r$ is to employ those flexible symbols of $\rho$ that are “above” a $0$ of $r$ , in the sense that they occupy a position which in $r$ is occupied by $0$ . For the particular $\rho$ and $r$ in Table 4 a bitstring $X\in\rho$ detaches itself from $r$ iff ones $(X)\cap[7]\neq\emptyset$ . Depending on whether the smallest element of ones $(X)\cap[7]$ belongs to $\{1,2\},\{3,4\},\{5\},\{6,7\}$ (this partition is dictated by the wildcard pattern of $\rho$ ), the bitstring $X$ belongs to exactly one of the sons $\rho_{1}^{\prime},\rho_{2}^{\prime},\rho_{3}^{\prime},\rho_{4}^{\prime}$ .

The powersets of the five facets $F_{i}$ of ${\cal S}{\cal C}_{1}$ (see (1)) are listed as the first five $02$ -rows $r_{i}$ in Table 5. Applying detachment repeatedly yields:

$\begin{array}[]{rll}r_{1}\cup r_{2}&=&(r_{1}\setminus r_{2})\uplus r_{2}=:r_{6}\uplus r_{2}\\ \\ (r_{6}\uplus r_{2})\cup r_{3}&=&(r_{6}\setminus r_{3})\uplus(r_{2}\setminus r_{3})\uplus r_{3}=:(r_{7}\uplus r_{8})\uplus r_{9}\uplus r_{3}\\ \\ (r_{7}\uplus\cdots\uplus r_{3})\cup r_{4}&=&(r_{7}\setminus r_{4})\uplus(r_{8}\setminus r_{4})\uplus(r_{9}\setminus r_{4})\uplus(r_{3}\setminus r_{4})\uplus r_{4}\\ \\ &=:&r_{7}\uplus r_{8}\uplus(r_{10}\uplus r_{11})\uplus r_{12}\uplus r_{4}\\ \\ (r_{7}\uplus\cdots\uplus r_{4})\cup r_{5}&=&(r_{7}\setminus r_{5})\uplus(r_{8}\setminus r_{5})\uplus(r_{10}\setminus r_{5})\uplus(r_{11}\setminus r_{5})\uplus(r_{12}\setminus r_{5})\uplus(r_{4}\setminus r_{5})\uplus r_{5}\\ \\ &=:&r_{7}\uplus r_{8}\uplus r_{10}\uplus r_{11}\uplus r_{13}\uplus r_{14}\uplus r_{5},\end{array}$

From Table 5 follows, as it must, that

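(15) $|r_{5}|+|r_{7}|+|r_{8}|+|r_{10}|+|r_{11}|+|r_{13}|+|r_{14}|=16+8+\cdots+12=52=|{\cal S}{\cal C}_{1}|.$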

We call this algorithm Facets-To-Faces.

[TABLE]

Table 5: Compressing ${\cal S}{\cal C}_{1}$ with Facets-To-Faces
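The detachment step and the loop (13) to (14) can be sketched in Python as follows. This is one reading of the method, not the author's Mathematica code. Rows are lists with entries 0, 1, 2 or a string label for an e-wildcard (a hypothetical encoding); the five facets are the maximal faces derived from the minimal non-faces in (19) and are assumed to agree with (1) up to ordering, so the number of final rows need not match Table 5. The pruning of duds is omitted.

```python
from itertools import count, product

_fresh = count(1)  # supplies labels "e1", "e2", ... for newly created e-wildcards


def detach(rho, zs):
    """Disjoint 012e-rows whose union is rho \\ r, where zs = zeros(r) for a 02-row r."""
    wild = {}
    for i, s in enumerate(rho):
        if isinstance(s, str):
            wild.setdefault(s, []).append(i)
    # Extreme case: a 1 or a whole e-wildcard of rho lies in zeros(r), so rho misses r.
    if any(rho[i] == 1 for i in zs) or any(set(p) <= zs for p in wild.values()):
        return [list(rho)]
    sons, cur = [], list(rho)
    twos = [i for i in sorted(zs) if rho[i] == 2]
    if twos:  # son: at least one 1 among the 2's of rho that sit above a 0 of r
        son, label = list(cur), f"e{next(_fresh)}"
        for i in twos:
            son[i] = label
            cur[i] = 0
        sons.append(son)
    for label, pos in wild.items():
        inside = [i for i in pos if i in zs]
        if not inside:
            continue
        son = list(cur)
        for i in pos:
            if i not in zs:
                son[i] = 2   # wildcard is already satisfied inside zeros(r)
        sons.append(son)
        for i in inside:
            cur[i] = 0       # later sons must satisfy this wildcard outside zeros(r)
    return sons              # empty list iff rho is contained in r


def facets_to_faces(facets, w):
    rows = []
    for F in facets:
        zs = {i for i in range(w) if i + 1 not in F}
        rows = [son for rho in rows for son in detach(rho, zs)]
        rows.append([2 if i + 1 in F else 0 for i in range(w)])
    return rows


def size(row):
    """Cardinality 2^(#twos) times the product of (2^length - 1) over the e-wildcards."""
    lens = {}
    for s in row:
        if isinstance(s, str):
            lens[s] = lens.get(s, 0) + 1
    n = 2 ** sum(s == 2 for s in row)
    for d in lens.values():
        n *= 2 ** d - 1
    return n


def members(row):
    """Brute-force expansion of a row, used only for checking."""
    out = set()
    for bits in product((0, 1), repeat=len(row)):
        seen, ok = {}, True
        for b, s in zip(bits, row):
            if s in (0, 1):
                ok = ok and b == s
            elif s != 2:
                seen[s] = seen.get(s, 0) | b
        if ok and all(seen.values()):
            out.add(bits)
    return out


# Facets derived from (19); assumed to coincide with (1) up to ordering.
FACETS = [{1, 2, 3, 9}, {2, 4, 8, 9}, {3, 5, 7, 9}, {3, 6, 8, 9}, {2, 7, 9}]
rows = facets_to_faces(FACETS, 9)
print(len(rows), sum(size(row) for row in rows))
```

Processing the facets in a different order generally changes the number of final rows, but not the sum of their cardinalities.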

Theorem 1: Let $F_{1},\ldots,F_{h}\subseteq[w]$ be the facets of a simplicial complex ${\cal S}{\cal C}$ . Then Facets-To-Faces enumerates ${\cal S}{\cal C}$ as a union of $R$ disjoint 012e-rows in time $O(R^{2}w^{2}h)$ . (Footnote 7: In view of the 012-rows entering the definition of ’ESOP’, one could call this kind of union a ’fancy ESOP’ for ${\cal S}{\cal C}$ or, more precisely, for its underlying antitone Boolean function $f$ (such as (3) for ${\cal S}{\cal C}={\cal S}{\cal C}_{1}$ ).)

Proof. By induction assume that for some $t<h$ the decomposition (13) has been achieved. If some 012e-row $\rho_{i}$ is contained in ${\cal P}(F_{t+1})\cup\cdots\cup{\cal P}(F_{h})$ then neither $\rho_{i}$ nor any of its sons and grandsons will survive in the long run. Thus $\rho_{i}$ is a dud, i.e. causing work without benefit. Moreover, unless $\rho_{i}$ is cancelled right away, it is impossible to predict the algorithm’s total time. Fortunately, letting $X=X(i)$ be the unique largest set in $\rho_{i}$ (thus $X$ is obtained by setting all $2$ ’s and $e$ ’s to $1$ ), it holds that

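$\rho_{i}\subseteq{\cal P}(F_{t+1})\cup\cdots\cup{\cal P}(F_{h})\ \Leftrightarrow\ X\in{\cal P}(F_{t+1})\cup\cdots\cup{\cal P}(F_{h})\ \Leftrightarrow\ (\exists j\in\{t+1,\ldots,h\})\ X\subseteq F_{j}.$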

Testing for all $1\leq i\leq s$ whether $\exists\ j\in\{t+1,\ldots,h\}$ with $X(i)\subseteq F_{j}$ costs $O(s(h-t)w)$ . In other words, that is the cost of pruning the righthand side of (13) from duds. What is the cost to proceed from a (pruned) representation (13) to a (not yet pruned) representation (14)? Because $\rho_{i}\setminus r$ has at most $w$ sons (which is clear from Table 4), and ’writing down’ each son is obvious (i.e. costs $O(w)$ ), the asked for cost is $O((s+1)w^{2})$ . Hence the overall cost is

[TABLE]
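The dud test used in the proof (is the largest set $X(i)$ of $\rho_{i}$ contained in some facet that is still to come?) amounts to a subset check per remaining facet. A minimal Python sketch, assuming rows are encoded as tuples with entries 0, 1, 2 or string labels for e-wildcards; the two sample rows are hypothetical:

```python
def largest_set(rho):
    """X(i): set all 2's and e's of the row to 1 (positions are numbered from 1)."""
    return {i + 1 for i, s in enumerate(rho) if s != 0}


def is_dud(rho, later_facets):
    """True iff rho lies inside the union of the powersets of the later facets."""
    top = largest_set(rho)
    return any(top <= F for F in later_facets)


# Hypothetical rows over [9], tested against three facets that are still to be processed.
later = [{3, 5, 7, 9}, {3, 6, 8, 9}, {2, 7, 9}]
keeper = ("e1", 2, "e1", 0, 0, 0, 0, 0, 2)   # largest set {1,2,3,9}: survives
dud = (0, 2, 0, 0, 0, 0, 0, 0, 2)            # largest set {2,9} lies in {2,7,9}
print(is_dud(keeper, later), is_dud(dud, later))
```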

4.3.1 Suppose Facets-To-Faces has advanced to representing ${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{s-1})$ as a disjoint union of 012e-rows. At one’s discretion one can then embark on distributing the computation to $t$ satellite stations. Say $t=3$ and

${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{s-1})=(r_{1}^{(1)}\uplus\cdots\uplus r_{\alpha}^{(1)})\uplus(r_{1}^{(2)}\uplus\cdots\uplus r_{\beta}^{(2)})\uplus(r_{1}^{(3)}\uplus\cdots\uplus r_{\gamma}^{(3)}),$

where $\alpha,\beta,\gamma$ are approximately equal. Putting $r:={\cal P}(F_{s})$ the control sends $r_{1}^{(1)},...,r_{\alpha}^{(1)};r$ to satellite 1, and $r_{1}^{(2)},...,r_{\beta}^{(2)};r$ to satellite 2, and $r_{1}^{(3)},...,r_{\gamma}^{(3)};r$ to satellite 3. After a while the control receives from satellite 1 some 012e-rows $\rho_{1}^{(1)},...,\rho_{\alpha^{\prime}}^{(1)}$ such that $(r_{1}^{(1)}\setminus r)\uplus\cdots\uplus(r_{\alpha}^{(1)}\setminus r)=\rho_{1}^{(1)}\uplus\cdots\uplus\rho_{\alpha^{\prime}}^{(1)}$ . Satellite 2 and satellite 3 send analogous rows $\rho_{1}^{(2)},...,\rho_{\beta^{\prime}}^{(2)}$ and $\rho_{1}^{(3)},...,\rho_{\gamma^{\prime}}^{(3)}$ .(Note that $\alpha^{\prime},\beta^{\prime},\gamma^{\prime}$ may differ significantly in magnitude.) The control pools the received rows, adds row $r$ , and divides the $\alpha^{\prime}+\beta^{\prime}+\gamma^{\prime}+1$ rows in three approximately equal-sized parts. The three parts, each augmented by $r^{\prime}:={\cal P}(F_{s+1})$ , are sent back to the satellites. And so forth.

4.4. This Subsection links the above to convex polytopes. We begin with the framework of $\cap$ -subsemilattices ${\cal L}\subseteq{\cal P}(W)$ , i.e. $X,Y\in{\cal L}\Rightarrow X\cap Y\in{\cal L}$ . If the set ${\cal M}({\cal L})$ of meet-irreducibles (or any $\cap$ -generating set) is known, then ${\cal L}$ can be generated one-by-one in polynomial total time by a variety of algorithms. These algorithms e.g. are of interest in Formal Concept Analysis [GO]. Ganter’s NextClosure algorithm [GO,p.44] was the first and is still popular.

Consider now a convex polytope $P\subseteq\mathbb{R}^{d}$ . Quoting from [FR,p.192]: …the combinatorial face enumeration problem (CFEP) is to enumerate all faces of $P$ in terms of their representations without duplications. What Fukuda and Rosta mean by the ’representation’ of a face $F$ is the set of facets in which $F$ is contained. Let $W$ be the set of vertices of $P$ , identify each face of $P$ with the set of vertices it contains, and let ${\cal L}\subseteq{\cal P}(W)$ be the set of all faces. In this setting CFEP reduces to enumerating ${\cal L}$ from the set ${\cal M}({\cal L})$ of facets. (As to how the facets themselves can be found, see 4.4.2.) In [KP], which improves upon results in [FR], and which was inspired by NextClosure, not only the individual faces but all covering pairs of faces are enumerated from the facets in polynomial total time.

4.4.1 A convex polytope is a simplex if any subset of (the vertex set of) any face is (the vertex set of) a face. For instance, the simplices in $\mathbb{R}^{3}$ are exactly the tetrahedrons. Gluing together simplices yields (topological) simplicial complexes. (Footnote 8: Since we are only concerned with abstract simplicial complexes (defined in 1.1) we can dispense with a formal definition of topological simplicial complexes. We mention in passing: Other than convex polytopes, simplicial complexes ${\cal S}{\cal C}$ which are not simplices have meet-irreducible faces which are not facets (which?). However this is irrelevant since ${\cal S}{\cal C}$ is already determined by its facets.) As is to be expected, the [KP]-algorithm accelerates for simplicial complexes, yet still enumerates one-by-one. In [BM], which similarly caters for combinatorial topologists, the individual faces are organized in a tree-structure. This supports various combinatorial operations (such as contracting edges), but again offers no compression.

Enter Facets-To-Faces. Apart from the practical aspects of compression, there is a connection to an important theoretical concept. Namely, in any disjoint representation of ${\cal S}{\cal C}$ by 012e-rows each facet must be the largest member in the row $r_{i}$ it happens to belong to. In particular, if there are $h$ facets then any disjoint representation comprises at least $h$ many 012e-rows. In combinatorial topology a simplicial complex is called partitionable if one can do with $h$ many 012-rows (yet other terminology is used). The relevance of this concept has been indicated in 1.3. Here are two veins for further research. Can the methods in arXiv:1811.11689 (which concern shellability) be adapted to find necessary or sufficient conditions for the partitionability of a random simplicial complex given by its facets? Defining ${\cal S}{\cal C}$ to be e-partitionable if ${\cal S}{\cal C}$ is a disjoint union of $h$ many 012e-rows, is this notion strictly weaker than being partitionable?

4.4.2 Notice that $\cap$ -subsemilattices ${\cal L}\subseteq{\cal P}(W)$ (e.g. arising from convex polytopes) are not easily compressed; it seems one needs an implicational base $\Sigma$ of the closure system ${\cal L}\cup\{W\}$ , but calculating $\Sigma$ from ${\cal M}({\cal L})$ is usually hard [W3, sec. 3.6]. Is there nevertheless a place for compression in the context of convex polytopes $P\subseteq\mathbb{R}^{d}$ ? To answer, recall the two fundamental representations of $P$ . The H-representation views $P$ as an intersection of $h$ closed half-spaces, each one of which is given by an inequality $a_{i,1}x_{1}+\cdots+a_{i,d}x_{d}\leq b_{i}$ . The V-representation gives the vertex set $W\subseteq\mathbb{R}^{d}$ , viewing $P$ as the convex hull of $W$ . Much research has been devoted to going from one kind of representation to the other. (Footnote 9: In particular, various methods have been proposed to get the ’combinatorial’ facets (i.e. the sets of incident vertices) from either the H- or the V-representation.) As to ’a place for compression’, given the H-representation of $P$ , there is hope to compress the set of interior 0,1-points (i.e. $P\cap\{0,1\}^{d}$ ) as a disjoint union of 012e-rows (work in progress).

5 Numerical experiments

Theorem 1 only implies $R\leq|{\cal S}{\cal C}|$ (due to the disjointness of 012e-rows), yet the numerical experiments below show that often $R\ll|{\cal S}{\cal C}|$ . The meaning of a random instance $(w,h,fs)$ is as in 3.3.1 and 4.1.3. The number $R$ of final 012e-rows spawned by the Facets-To-Faces algorithm, and its running time $T$ in seconds are recorded. In our implementation of Facets-To-Faces the precaution to avoid duds (see the proof of Theorem 1) was omitted because for the instances in Table 6 the cost of its incorporation would outweigh the benefits. For instance the (50,240,20)-instance features 460631 final versus 13244 wasteful 012e-rows. In the other instances the proportion wasteful/final is even smaller. In all instances more than half of the final 012e-rows were proper, i.e. not 012-rows. In the (2000,70,192)-instance only 1157 out of 70551 many 012e-rows were improper.

After introducing the two competitors of Facets-To-Faces (5.1), we assess the three algorithms’ compression capabilities in 5.2. Then we compare with respect to CPU time (5.3), and finally with respect to memory requirements (5.4).

5.1 Mathematica uses BDD’s only behind the scenes, in particular for the command SatisfiabilityCount which outputs the number of models of any Boolean function fed to it. (Footnote 10: Once Facets-To-Faces terminates, the cardinality $|{\cal S}{\cal C}|$ is easily determined (see 5.5), and it always coincided with the number produced by SatisfiabilityCount. Hence Facets-To-Faces very likely works correctly.) Therefore I am grateful to Maximilian Vides, who helped me to access BDD’s via Python. Specifically, for each instance $(w,h,fs)$ the matching antitone Boolean function was translated from Mathematica notation to Python notation, then fed to the Python command expr2bdd which calculates a BDD, then $R_{BDD}$ was calculated as described in 4.2.1. The first competitor expr2bdd always uses the natural default ordering $x_{1},...,x_{w}$ of variables. This may be part of the explanation why it is much slower than SatisfiabilityCount. In any case, even if the BDD underlying SatisfiabilityCount $[f]$ undergoes minimisation, it seems to induce an ESOP that compresses poorly. (Footnote 11: Much research has gone into optimizing the variable ordering in order to reduce (or even minimize) the size $s(BDD)$ and whence the time to calculate it. However, there is little relation between $s(BDD)$ and $R_{BDD}$ because it is the structure rather than the sheer number $s(BDD)$ of nodes that determines $R_{BDD}$ . For instance $s(BDD)=4601>R_{BDD}$ for $(50,30,10)$ but $s(BDD)=765^{\prime}720<R_{BDD}$ for $(50,40,30)$ . It seems that no research has gone into optimizing the variable ordering to make $R_{BDD}$ small. It may well be that this is fruitless since competing methods will keep on compressing $Mod(f)$ better. Likewise the compression achieved by Facets-To-Faces heavily depends on the order in which the facets are processed, and so far no research in this regard exists. We conclude: Facets-To-Faces is just as disadvantaged by the default ordering of facets as expr2bdd is by the default variable ordering.) Why else would Mathematica use a command which is not based on BDD’s (as confirmed by the author) to calculate an ESOP of a Boolean function? This command is BooleanConvert (option ’ESOP’), and it is the second competitor of Facets-To-Faces.

5.2 In Table 6 the parameter $R$ is the number of 012e-rows produced by Facets-To-Faces, $R_{BC}$ is the number of 012-rows in the ESOP calculated with BooleanConvert, and $R_{BDD}$ was defined above. One sees that for all instances $(w,h,fs)$ it holds that $R<R_{BC}<R_{BDD}$ . Let us fix $w$ and $h$ and observe what happens when $fs$ increases. If say $(w,h)=(50,10)$ , then $(50,10,{\bf 10})\rightarrow(R,R_{BC},R_{BDD})=(39,200,440)$ and $(50,10,{\bf 25})\rightarrow(285,5992,62503)$ and $(50,10,{\bf 40})\rightarrow(429,220^{\prime}432,2^{\prime}389^{\prime}705)$ . Likewise fixing $(w,h):=(50,300)$ and taking $fs=10,17,20,26$ gives a similar picture, except that expr2bdd couldn’t finish in reasonable time. In brief, all else being equal, all three suffer from increasing $fs$ , expr2bdd more than BooleanConvert, and BooleanConvert more than Facets-To-Faces. We leave it to the reader to draw conclusions (although the data is sparse) from fixing $w,fs$ and increasing $h$ , respectively fixing $h,fs$ and increasing $w$ .

5.3 As concerns CPU-times, the state of affairs is not so clear-cut. In a nutshell, the Facets-To-Faces algorithm dislikes many short facets, but likes few large facets. As to few but large facets, in such situations it may not only beat the time of BooleanConvert but even that of SatisfiabilityCount: It took Facets-To-Faces $T=1114$ seconds to squeeze $10^{92}$ faces (contained in 70 facets of size 300) into a mere 707518 many 012e-rows, whereas SatisfiabilityCount (which we only asked to count the faces) was aborted after fourteen hours. When there are many small facets (such as for $(w,h,fs)=(50,1000,10)$ ) then $T_{BC}$ is smaller than $T$ , but $T_{BDD}$ stays higher. The fact that Facets-To-Faces is implemented in high-level Mathematica code, whereas BooleanConvert is hardwired, is only part of the explanation. Fortunately according to 4.3.1 Facets-To-Faces can be parallelized. Thus, simply put, Facets-To-Faces can be accelerated by any factor $t$ , provided one is fit to ’control’ a network of $t$ colleagues who lend their PC’s.

5.4 Not only is $R_{BC}$ always larger than $R$ , the Mathematica command MemoryInUse (whatever its units) shows that BooleanConvert is also more memory-intensive than the Facets-To-Faces algorithm. For example, in a random instance of type $(50,200,20)$ the before/after measurements were ${\tt MemoryInUse}={\bf 307}^{\prime}572^{\prime}224$ and ${\tt MemoryInUse}={\bf 928}^{\prime}179^{\prime}088$ for Facets-To-Faces, but ${\tt MemoryInUse}={\bf 307}^{\prime}339^{\prime}408$ and ${\tt MemoryInUse}={\bf 3^{\prime}434}^{\prime}044^{\prime}152$ for BooleanConvert. As to SatisfiabilityCount, in the (2000,70,192)-example it also started with a modest ${\tt MemoryInUse}={\bf 62}^{\prime}485^{\prime}184$ but ended with a hefty ${\tt MemoryInUse}={\bf 5^{\prime}698}^{\prime}713^{\prime}009$ . This may be related to why ${\tt Timing[BooleanConvert]}$ and Timing[SatisfiabilityCount] were not reliable: In the $(50,240,20)$ instance, say, the claim ${\tt Timing[BooleanConvert]=47sec}$ was contradicted by a hand-stopped time of 410 seconds. In the $(2000,70,192)$ instance the claim Timing[SatisfiabilityCount]=157 was contradicted by a hand-stopped time of 785 seconds. In contrast, for the Facets-To-Faces algorithm ${\tt Timing[...]}$ always matched the hand-stopped time.

5.5 From the fancy ESOP calculated by Facets-To-Faces (say in time $T$ ) one can compute all face-numbers in a fraction of $T$ . This method may even beat e+rp+sub. Which method excels depends on the number and structure of facets and needs further investigation. For instance, e+rp+sub was slightly faster on the (50,1000,10)-example (901 seconds) but much slower on the (2000,70,192)-example which was stopped after an hour.
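One way to obtain face-numbers from disjoint 012e-rows is to attach to each row the generating polynomial of the Hamming weights of its members and to add up these polynomials. The Python sketch below uses the identity that a row with $a$ ones, $\gamma$ twos and e-wildcards of lengths $\delta_{1},\ldots,\delta_{s}$ has polynomial $x^{a}(1+x)^{\gamma}\prod_{i}((1+x)^{\delta_{i}}-1)$ ; whether this coincides in detail with the auxiliary polynomials of 3.3 is an assumption. The two sample rows are $\alpha_{1}$ and $\alpha_{2}$ as quoted in 6.1.1, with the hypothetical label "e" for the e-wildcard.

```python
from math import comb


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def row_polynomial(row):
    """Coefficient k = number of bitstrings of Hamming weight k in the 012e-row."""
    ones = sum(s == 1 for s in row)
    twos = sum(s == 2 for s in row)
    lens = {}
    for s in row:
        if isinstance(s, str):
            lens[s] = lens.get(s, 0) + 1
    poly = [0] * ones + [1]                                           # x^ones
    poly = poly_mul(poly, [comb(twos, j) for j in range(twos + 1)])   # (1+x)^twos
    for d in lens.values():
        factor = [comb(d, j) for j in range(d + 1)]
        factor[0] -= 1                                                # (1+x)^d - 1
        poly = poly_mul(poly, factor)
    return poly


def face_numbers(rows):
    """Add the row polynomials of disjoint rows; entry k counts the k-faces covered."""
    total = []
    for row in rows:
        p = row_polynomial(row)
        total += [0] * (len(p) - len(total))
        for k, c in enumerate(p):
            total[k] += c
    return total


ALPHA_1 = (2, 2, 2, 0, 0, 0, 0, 0, 2)
ALPHA_2 = (0, 0, 2, 0, "e", 0, "e", 0, 2)
print(row_polynomial(ALPHA_2))
print(face_numbers([ALPHA_1, ALPHA_2]))
```

The work per row is polynomial in $w$ and independent of the row's cardinality, which is why the face-numbers cost only a fraction of the time needed to compute the rows themselves.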

[TABLE]

Table 6: Facets-To-Faces versus BooleanConvert and expr2bdd

6 Two ways to enumerate ${\cal S}{\cal C}[k]$ from the facets of ${\cal S}{\cal C}$

From among the four tasks listed in 1.1, only $(E_{k})$ remains to be dealt with. We offer two methods. Method 1 (in 6.1) is faster, but Method 2 (in 6.2) boasts a theoretic assessment.

6.1 In what follows any representation of ${\cal S}{\cal C}$ as disjoint union of 012e-rows can be used as prerequisite for a compressed enumeration of ${\cal S}{\cal C}[k]$ . For instance, the output of the Facets-To-Faces algorithm in Table 5 would do, but for variation we chose to illustrate (see Table 7) our method on the representation ${\cal S}{\cal C}_{1}=\alpha_{1}\uplus\cdots\uplus\alpha_{6}$ ; this equality can be verified ad hoc. (Footnote 12: This representation actually stems from a variant of Facets-To-Faces discussed in Section 8.) (Footnote 13: Note that $|\alpha_{1}|+\cdots+|\alpha_{6}|=16+12+\cdots+2=52$ as it must. It thus suffices to show that each $\alpha_{i}$ is contained in ${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{5})$ , which is easy.)

In addition to the e-wildcard we now need the g-wildcard $(g(t),...,g(t))$ which means ’exactly $t$ many 1’s here’. Accordingly 01g-rows are defined, e.g. $(1,0,g(2),g(2),g(2))$ is $\{(1,0,{\bf 1},{\bf 1},0),\ (1,0,{\bf 1},0,{\bf 1}),\ (1,0,0,{\bf 1},{\bf 1})\}$ . Distinct g-wildcards within a 01g-row are distinguished by subscripts. Note that $t$ must be strictly smaller than the number of symbols $g(t)$ because $(g(3),g(3))$ is impossible, and instead of $(g(3),g(3),g(3))$ we stick to $(1,1,1)$ .
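The meaning of a 01g-row can be made concrete by expanding it. In the Python sketch below a g-wildcard symbol is encoded as a pair (label, t); this encoding is an assumption made for illustration.

```python
from itertools import combinations, product


def expand_01g(row):
    """List the bitstrings of a 01g-row; a g-wildcard entry is a pair (label, t)."""
    groups = {}
    for i, s in enumerate(row):
        if isinstance(s, tuple):
            groups.setdefault(s, []).append(i)
    base = [s if s in (0, 1) else 0 for s in row]
    # For each g-wildcard choose exactly t of its positions to carry a 1.
    choices = [list(combinations(pos, t)) for (_, t), pos in groups.items()]
    out = []
    for pick in product(*choices):
        bits = list(base)
        for chosen in pick:
            for i in chosen:
                bits[i] = 1
        out.append(tuple(bits))
    return out


g = ("g", 2)
print(expand_01g((1, 0, g, g, g)))
```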

6.1.1 We now describe the g-algorithm which, given ${\cal S}{\cal C}$ and $k\geq 0$ , represents ${\cal S}{\cal C}[k]$ as a disjoint union of 01g-rows. To fix ideas, let us target ${\cal SC}_{1}[3]$ . The subset $\alpha_{1}[3]\subseteq\alpha_{1}$ of all $3$ -faces (=bitstrings of Hamming weight 3) can be written as $\gamma_{1}=(g(3),g(3),g(3),0,0,0,0,0,g(3))$ (Table 7). Expressing $\alpha_{2}[3]$ similarly is a bit subtler. But writing $\alpha_{2}=(0,0,2,0,{\bf 1},0,{\bf 1},0,2)\uplus(0,0,2,0,{\bf g(1)},0,{\bf g(1)},0,2)$ one sees that the sets of 3-faces in the first and second part of $\alpha_{2}$ are $\gamma_{2,1}$ and $\gamma_{2,2}$ respectively. Likewise one verifies that $\alpha_{3}[3]=\gamma_{3},\ \alpha_{4}[3]=\gamma_{4,1}\uplus\gamma_{4,2},\ \alpha_{5}[3]=\gamma_{5},\ \alpha_{6}[3]=\gamma_{6,1}\uplus\gamma_{6,2}$ .

[TABLE]

Table 7: Compressing ${\cal SC}_{1}[3]$ with the $g$ -algorithm

What happens if, other than in Table 7, the $012e$ -rows $\alpha$ that constitute ${\cal S}{\cal C}$ feature several $e$ -wildcards per row? For instance if

(16) $r=(e_{1},e_{1},e_{1},e_{1},e_{1},\ e_{2},e_{2},e_{2},\ e_{3},e_{3},e_{3},\ e_{4},e_{4},\ e_{5},e_{5},\ e_{6},e_{6},\ 2,1,1,0),$

how can one represent $r[15]$ as disjoint union of preferably few $01g$ -rows? Although even one-by-one enumeration of $r[15]$ is non-trivial (this type of task is solvable in output-linear time [W2, sec. 3.2]), proceeding differently one can actually get a compressed enumeration. This is carried out (on a dual example) in a previous version of our article [arXiv:1812.02570v2, sec. 3.3.2], and it is fairly clear that matters generalize.

6.1.2 More important than giving a formal proof of ’fairly clear’ is another issue. Suppose our target had been ${\cal S}{\cal C}_{1}[4]$ , not ${\cal S}{\cal C}_{1}[3]$ . Then $\alpha_{5}[4]=\emptyset$ . Generally in the worst case a representation of ${\cal S}{\cal C}$ by disjoint 012e-rows $\alpha_{i}$ might be such that $99\%$ of all rows $\alpha_{i}$ have $\alpha_{i}[k]=\emptyset$ . Although this empty-row-issue prevents a (neat) theoretic assessment of the $g$ -algorithm, this does not preclude a good performance in practice.

6.2 Here we fine-tune the Second Naive Algorithm of 4.1.2 from outputting all faces to outputting all $k$ -faces.

Theorem 2: Suppose the $h$ facets of the simplicial complex ${\cal S}{\cal C}\subseteq{\cal P}[w]$ are given. Then for any fixed $k\in[w]$ the $R$ many $k$ -faces can be enumerated in time $O(Rhw^{2})$ .

Proof. Starting with $(2,2,\cdots,2)={\cal P}[w]$ one maintains an oscillating (LIFO) stack of $k$ -feasible $012$ -rows $r$ (i.e. $r\cap{\cal S}{\cal C}[k]\neq\emptyset$ ) until the stack is emptied. The topmost row $r$ of the stack is always processed as follows. Let $r_{0}$ and $r_{1}$ be the rows obtained from $r$ by turning its first 2 to 0 and 1 respectively. (Here ’first’ refers to some previously fixed linear ordering of the indices $1,2,...,w$ .) Row $r_{0}$ is $k$ -feasible iff for at least one facet $F_{i}$ one has

$(17)\quad ones(r_{0})\subseteq F_{i}\ \hbox{\rm and}\ |\mbox{\it ones}(r_{0})|\leq k\leq|F_{i}\setminus zeros(r_{0})|.$

Likewise for $r_{1}$ . At least one of $r_{0}$ and $r_{1}$ is $k$ -feasible because $r$ is $k$ -feasible and $r=r_{0}\uplus r_{1}$ . The feasible row(s) is (are) put back on the stack. That is unless (say) $r_{0}$ is a bitstring, i.e. twos $(r_{0})=\emptyset$ . In this case we found a $k$ -face $r_{0}$ , which is output.

As to the cost, creating $r_{0},\ r_{1}$ from $r$ and recycling at least one of them to the stack, costs $O(wh)$ . Each output $k$ -face has at most $w$ recycled ancestors. It follows that the overall cost is $O(R\cdot w\cdot wh)$ . $\square$
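The stack procedure of the proof, with the feasibility test (17), can be sketched in Python. Rows are tuples over 0, 1, 2 and ’first’ means leftmost. Two points are assumptions of this sketch: completed bitstrings are pushed and output when popped (rather than output immediately), and the five facets are the maximal faces derived from the minimal non-faces in (19), assumed to agree with (1).

```python
def k_faces(facets, w, k):
    """Enumerate all k-faces one by one with a LIFO stack of k-feasible 012-rows."""

    def feasible(row):
        ones = {i + 1 for i, s in enumerate(row) if s == 1}
        zeros = {i + 1 for i, s in enumerate(row) if s == 0}
        # Condition (17): some facet contains ones(row) and leaves room for k elements.
        return any(ones <= F and len(ones) <= k <= len(F - zeros) for F in facets)

    start = (2,) * w
    stack = [start] if feasible(start) else []
    out = []
    while stack:
        row = stack.pop()
        if 2 not in row:            # a bitstring, hence a k-face
            out.append(row)
            continue
        i = row.index(2)            # the first 2 of the row
        for bit in (0, 1):
            child = row[:i] + (bit,) + row[i + 1:]
            if feasible(child):
                stack.append(child)
    return out


# Facets derived from (19); assumed to coincide with (1) up to ordering.
FACETS = [{1, 2, 3, 9}, {2, 4, 8, 9}, {3, 5, 7, 9}, {3, 6, 8, 9}, {2, 7, 9}]
three_faces = k_faces(FACETS, 9, 3)
print(len(three_faces))
```

Since only $k$ -feasible rows are ever pushed, every row on the stack has at least one $k$ -face below it, which is what bounds the work per output face.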

6.3 Notwithstanding Theorem 2, in practice Method 2 (which like SNA in 4.1.3 suffers mediocre compression) is often inferior to Method 1 (which laughs away its empty-row-issue). At this point the author may be forgiven for reflecting more broadly about one-by-one, compression, and optimization. As mentioned in 4.1.1 enumeration of a powerset (one-by-one) is non-trivial. Even more so enumerating all $k$ -sets of a set [K, sec. 7.1.1.3]. Apart from the fun of it, arguably the only purpose of enumerating all objects of a given type, is to find the best object (e.g. one or all $f$ -minimal object(s) with respect to a target function $f$ ). Compression with multi-valued rows (be it 012e, 01g, or other kinds) serves that purpose better than one-by-one enumeration. If say $f(\{a_{1},...,a_{t}\}):=a_{1}+\cdots+a_{t}$ , then the $f$ -minimal set $A$ within the 01g-row below is quickly found to be $A=\{2,4,7,9,10\}$ (for brevity $g_{i}:=g_{i}(1)$ ):

[TABLE]

More about the interplay of compression and optimization can be found in [W4].

7 Assessing ${\cal S}{\cal C}$ from its minimal non-faces

Our results on $(E),(E_{k}),(C),(C_{\forall k})$ will be adapted (in that order) to the situation where not the facets but the minimal non-faces of a simplicial complex ${\cal S}{\cal C}$ are given. For instance let ${\cal S}{\cal C}$ be the family of all independent sets of a matroid. Then the facets are the bases and the minimal non-faces are the circuits of the matroid. Dual to the $e$ -wildcard the $n$ -wildcard $(n,n,...,n)$ means ’at least one 0 here’, and 012n-rows are defined dually to 012e-rows. Generally if $r$ is a $012n$ -row with $\gamma:=|\mbox{twos}(r)|$ and with $s$ many $n$ -wildcards of length $\delta_{1},\cdots,\delta_{s}$ respectively, then

(18) $|r|=2^{\gamma}\cdot(2^{\delta_{1}}-1)\cdot(2^{\delta_{2}}-1)\cdots(2^{\delta_{s}}-1)$ .

If ${\cal G}\subseteq{\cal P}(W)$ is any hypergraph, then $X\subseteq W$ is a ${\cal G}$ -noncover if $X\not\supseteq G$ for all $G\in{\cal G}$ . The (noncover) n-algorithm of [W1] displays the set ${\cal N}({\cal G})$ of all ${\cal G}$ -noncovers as a disjoint union of 012n-rows. (More details about the $n$ -algorithm follow in the proof of Theorem 3.)

**7.1 ** Here we settle (E). Suppose that ${\cal SC}_{1}$ was given not by its facets listed in (1), but by its minimal nonfaces, which are these:

(19) $G_{1}=\{1,5\},\ G_{2}=\{2,5\},\ G_{3}=\{1,7\},\ G_{4}=\{2,3,7\},\ G_{5}=\{1,6\},\ G_{6}=\{2,6\}$ ,

$G_{7}=\{5,6\},\ G_{8}=\{6,7\},\ G_{9}=\{1,8\},\ G_{10}=\{5,8\},\ G_{11}=\{7,8\},\ G_{12}=\{1,4\}$ ,

$G_{13}=\{3,4\},\ G_{14}=\{4,5\},\ G_{15}=\{4,6\},\ G_{16}=\{4,7\},\ G_{17}=\{2,3,8\}$ .

For instance, $G_{17}$ is not a subset of any $F_{i}$ in (1), but each 2-element subset of $G_{17}$ is contained in some $F_{i}$ . Hence $G_{17}$ is a minimal non-face. Generally, let ${\cal S}{\cal C}$ be given by its minimal nonfaces $G_{1},G_{2},..$ , and so forth. It then holds that

(20) $X\in{\cal S}{\cal C}\ \Leftrightarrow\ (\forall i)X\not\supseteq G_{i}\ \Leftrightarrow\ (\forall i)(X^{c}\cap G_{i}\neq\emptyset)$ .

From the first equivalence it follows that ${\cal SC}_{1}={\cal N}({\cal G}_{1})$ where ${\cal G}_{1}=\{G_{1},\cdots,G_{17}\}$ . Applying the $n$ -algorithm to ${\cal G}_{1}$ delivers ${\cal SC}_{1}$ as a disjoint union of the $012n$ -rows $r_{1},\cdots,r_{7}$ in Table 8. (Incidentally only $r_{2}$ is a proper 012n-row.)

[TABLE]

Table 8: Compressing ${\cal SC}_{1}$ with the noncover $n$ -algorithm
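The first equivalence in (20) can be checked by brute force on the toy example: the noncovers of the seventeen sets in (19) should be exactly the 52 faces of ${\cal SC}_{1}$ . The Python sketch below enumerates all $2^{9}$ subsets one by one, so it is a sanity check and not the $n$ -algorithm of [W1]; the function name is hypothetical.

```python
from itertools import combinations

# The minimal non-faces G_1, ..., G_17 of (19).
NONFACES = [
    {1, 5}, {2, 5}, {1, 7}, {2, 3, 7}, {1, 6}, {2, 6},
    {5, 6}, {6, 7}, {1, 8}, {5, 8}, {7, 8}, {1, 4},
    {3, 4}, {4, 5}, {4, 6}, {4, 7}, {2, 3, 8},
]


def noncovers(hypergraph, ground):
    """All X within `ground` that contain no member of `hypergraph` (first part of (20))."""
    ground = sorted(ground)
    out = []
    for size in range(len(ground) + 1):
        for combo in combinations(ground, size):
            X = frozenset(combo)
            if not any(G <= X for G in hypergraph):
                out.append(X)
    return out


faces = noncovers(NONFACES, range(1, 10))
facets = [X for X in faces if not any(X < Y for Y in faces)]
print(len(faces), sorted(sorted(F) for F in facets))
```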

Theorem 3: Assume the $h$ minimal non-faces of the simplicial complex ${\cal S}{\cal C}\subseteq{\cal P}[w]$ are known. Then ${\cal S}{\cal C}$ can be represented as a disjoint union of $R$ many $012n$ -rows in polynomial total time $O(Rh^{2}w^{2})$ .

Proof. The minimal non-faces $G_{i}$ in (19) suggest to view ${\cal SC}_{1}$ (or any ${\cal S}{\cal C})$ as the model set $\mbox{Mod}(\varphi_{1}):=\{u\in\{0,1\}^{9}:\varphi_{1}(u)=1\}$ of the following Boolean function. (Footnote 14: Because of $Mod(\varphi_{1})=Mod(\psi_{1})={\cal S}{\cal C}_{1}$ we have $\varphi_{1}=\psi_{1}$ , despite appearances.)

(21) $\varphi_{1}(x_{1},\cdots,x_{9}):=(\overline{x}_{1}\vee\overline{x}_{5})\wedge(\overline{x}_{2}\vee\overline{x}_{5})\wedge\cdots\wedge(\overline{x}_{2}\vee\overline{x}_{3}\vee\overline{x}_{8})$

This is a Horn-CNF since each clause has at most one positive literal (in fact none). Generally, if $\varphi:\{0,1\}^{w}\rightarrow\{0,1\}$ is a Horn-CNF with $h$ clauses then the Horn- $n$ -algorithm of [W1, Cor.6] enumerates $\mbox{Mod}(\varphi)$ as a union of $R$ many disjoint $012n$ -rows in total polynomial time $O(Rh^{2}w^{2})$ . $\square$

When the Horn-CNF has only negative clauses, the Horn $n$ -algorithm simplifies and was called ’noncover $n$ -algorithm’ in [W1]. The impression from (20) that the noncover n-algorithm is related to the transversal e-algorithm is justified; in fact a moment’s thought reveals that upon switching the roles of 0 and 1 the n-algorithm becomes the e-algorithm, and vice versa.

One application of Theorem 3 was alluded to in Section 1.3: From a knowledge of all minimal infrequent sets, one can compress the simplicial complex of all frequent sets.

7.1.2 As to problem $(E_{k})$ , i.e. the enumeration of all $k$ -faces from the minimal non-faces, this can be handled by applying the (dual) $g$ -algorithm to the individual 012n-rows in Table 8. Trouble is, as in 6.1 this does not yield a polynomial total time algorithm because of the empty-row-issue. It remains an open question whether the analogon of Theorem 2 holds. More precisely by ’analogon’ we mean the statement that ensues from Theorem 2 when the part ’the $h$ facets’ is replaced by ’the $h$ minimal non-faces’. The problem is that (17) does not translate smoothly from facets $F_{i}$ to minimal non-faces $G_{i}$ .

7.2 As to the counting problem (C), the cardinality of ${\cal SC}_{1}$ is readily obtained from Table 8:

(22) $|{\cal SC}_{1}|=|r_{1}|+\cdots+|r_{7}|=16+6+8+8+4+2+8=52.$

As to problem $(C_{\forall k})$ , each face-number $N_{k}$ of ${\cal SC}_{1}$ can be calculated from Table 8 by matching each 012n-row with some auxiliary polynomial, akin to 3.3. We hence call this method n+rp+sub.

**7.3 ** Hypergraph Dualization (HD) is the task to calculate the set ${\cal M}{\cal T}({\cal H})$ of all minimal transversals of a hypergraph ${\cal H}\subseteq{\cal P}[w]$ . This has plenty of applications. As to HD in the present situation, let ${\cal S}{\cal C}$ be a simplicial complex. Then by (20), the complements of its facets $F_{i}$ ’s are exactly the minimal transversals of its minimal non-faces $G_{i}$ ’s, and vice versa. Thus if HD was easy, one could switch back and forth between the $F_{i}$ ’s and $G_{i}$ ’s at one’s convenience; for instance discarding the seventeen $G_{i}$ ’s in (19) in favor of the five $F_{i}$ ’s in (1).
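The duality between facets and minimal non-faces can be illustrated on a tiny hypothetical complex on $\{1,2,3\}$ with minimal non-faces $\{1,2\}$ and $\{2,3\}$ (this example is not taken from the article). The Python sketch below computes both sides by brute force, which of course does not address the hardness of HD.

```python
from itertools import combinations


def subsets(ground):
    """All subsets of `ground` as frozensets."""
    ground = sorted(ground)
    for k in range(len(ground) + 1):
        for combo in combinations(ground, k):
            yield frozenset(combo)


def minimal_transversals(hypergraph, ground):
    """Inclusion-minimal sets that meet every member of the hypergraph."""
    ts = [T for T in subsets(ground) if all(T & G for G in hypergraph)]
    return {T for T in ts if not any(S < T for S in ts)}


def facets_from_nonfaces(nonfaces, ground):
    """Maximal sets that contain no minimal non-face."""
    faces = [X for X in subsets(ground) if not any(G <= X for G in nonfaces)]
    return {X for X in faces if not any(X < Y for Y in faces)}


GROUND = frozenset({1, 2, 3})
NONFACES = [frozenset({1, 2}), frozenset({2, 3})]   # hypothetical toy example

facets = facets_from_nonfaces(NONFACES, GROUND)
complements = {GROUND - F for F in facets}
print(sorted(sorted(F) for F in facets))
print(complements == minimal_transversals(NONFACES, GROUND))
```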

Unfortunately HD is hard. Despite partial successes it remains an open problem whether HD can be solved in polynomial total time. We stress that the e-algorithm computes the set ${\cal T}({\cal H})$ of all transversals. Extra work is required to ’sieve’ ${\cal M}{\cal T}({\cal H})$ from ${\cal T}({\cal H})$ . (Footnote 15: This can actually be done, not in total polynomial time, but whilst maintaining compression to some degree; this is work in progress, arXiv:2008.08996. In one special case this worked particularly well: If ${\cal H}$ is the set of all minimal cutsets of a graph $G$ , then ${\cal T}({\cal H})$ is the set of all connected edge-sets. In arXiv:2002.09707 it is shown how the family ${\cal M}{\cal T}({\cal H})$ of all trees can be compressed.)

8 Can one go from simplicial complexes to general DNFs ?

Suppose ${\cal S}{\cal C}$ has facets $F_{1}$ to $F_{h}$ , and by induction we have obtained for some $t\in[h-1]$ a type (13) representation. In Section 8 we handle the newcomer $02$ -row $r:={\cal P}(F_{t+1})$ in dual fashion:

(23) ${\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{t+1})=\rho_{1}\uplus\cdots\uplus\rho_{s}\uplus(r\setminus(\rho_{1}\uplus\cdots\uplus\rho_{s})).$

We keep the notation $r_{i}={\cal P}(F_{i})$ for $i\leq 5$, and refer to Table 9 for the definition of $r^{\prime}_{i}\ (i\geq 6)$. Furthermore, set differences are read from left to right; for instance $A\setminus B\setminus C\setminus D:=((A\setminus B)\setminus C)\setminus D$. Based on (23) our Tentative Facets-To-Faces algorithm proceeds as follows in our toy example ${\cal S}{\cal C}_{1}={\cal P}(F_{1})\cup\cdots\cup{\cal P}(F_{5})$:

$\begin{array}[]{rll}r_{1}\cup r_{2}=r_{1}\uplus(r_{2}\setminus r_{1})&=:&r_{1}\uplus r^{\prime}_{6}\\ \\ r_{1}\uplus r^{\prime}_{6}\uplus(r_{3}\setminus(r_{1}\uplus r^{\prime}_{6}))=r_{1}\uplus r^{\prime}_{6}\uplus(r_{3}\setminus r_{1}\setminus r^{\prime}_{6})&=:&r_{1}\uplus r^{\prime}_{6}\uplus r^{\prime}_{7}\\ \\ r_{1}\uplus r^{\prime}_{6}\uplus r^{\prime}_{7}\uplus(r_{4}\setminus r_{1}\setminus r^{\prime}_{6}\setminus r^{\prime}_{7})&=:&r_{1}\uplus r^{\prime}_{6}\uplus r^{\prime}_{7}\uplus r^{\prime}_{8}\\ \\ r_{1}\uplus r^{\prime}_{6}\uplus r^{\prime}_{7}\uplus r^{\prime}_{8}\uplus(r_{5}\setminus r_{1}\setminus r^{\prime}_{6}\setminus r^{\prime}_{7}\setminus r^{\prime}_{8})&=:&r_{1}\uplus r^{\prime}_{6}\uplus r^{\prime}_{7}\uplus r^{\prime}_{8}\uplus\rho^{\prime}_{1}\uplus\rho^{\prime}_{2}\\ \end{array}$

Note that $r_{4}\setminus r_{1}$ is disjoint from $r^{\prime}_{6}$ and $r^{\prime}_{7}$, and hence $r_{4}\setminus r_{1}\setminus r^{\prime}_{6}\setminus r^{\prime}_{7}=r_{4}\setminus r_{1}=:r^{\prime}_{8}$. Likewise $r_{5}\setminus r_{1}$ being disjoint from $r^{\prime}_{6}$ and $r^{\prime}_{7}$ implies $r_{5}\setminus r_{1}\setminus r^{\prime}_{6}\setminus r^{\prime}_{7}=r_{5}\setminus r_{1}=:\rho^{\prime}$. The remaining detachment $\rho^{\prime}\setminus r_{8}^{\prime}$ is of type $012e\setminus 012e$, as opposed to $012e\setminus 02$ in Section 4. Before we look at type $012e\setminus 012e$ detachments more systematically we argue ad hoc as follows. Since $\rho^{\prime}\cap r^{\prime}_{8},\ \rho^{\prime}_{1},\ \rho^{\prime}_{2}$ are contained in $\rho^{\prime}$, are mutually disjoint, and have cardinalities summing up to $2+4+6=12=|\rho^{\prime}|$, it follows that $\rho^{\prime}\setminus r^{\prime}_{8}=\rho^{\prime}_{1}\uplus\rho^{\prime}_{2}$. One checks that

(24) $|r_{1}|+|r^{\prime}_{6}|+|r^{\prime}_{7}|+|r^{\prime}_{8}|+|\rho^{\prime}_{1}|+|\rho^{\prime}_{2}|=16+12+2+12+4+6=52,$

which matches the cardinality $|{\cal S}{\cal C}_{1}|$ (which we previously derived in various ways).
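The bookkeeping behind (23) and (24) can be illustrated without any compression, by storing every piece as an explicit set of faces. The sketch below rests on assumptions: the facets are hypothetical (they are not the facets of ${\cal S}{\cal C}_{1}$), the names `power_set` and `tentative_pieces` are invented for illustration, and plain Python sets stand in for the $012e$-rows of Table 9.

```python
from itertools import combinations

def power_set(facet):
    """P(F): all subsets of the facet F, as a set of frozensets."""
    items = sorted(facet)
    return {frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)}

def tentative_pieces(facets):
    """Disjoint pieces P(F_1), P(F_2) minus P(F_1), ... in the spirit of (23).

    Each newcomer r = P(F_{t+1}) is trimmed by everything collected so far,
    so the pieces are mutually disjoint and their union is the whole complex.
    """
    pieces, covered = [], set()
    for facet in facets:
        r = power_set(facet)
        new_piece = r - covered      # r minus the union of the earlier pieces
        pieces.append(new_piece)
        covered |= new_piece
    return pieces

# Hypothetical facets, chosen only to keep the numbers small.
facets = [{1, 2, 3}, {3, 4}, {1, 4, 5}]
pieces = tentative_pieces(facets)
sizes = [len(p) for p in pieces]
complex_size = len(set().union(*(power_set(f) for f in facets)))

print(sizes, sum(sizes), complex_size)
```

As in (24), the piece cardinalities add up to the cardinality of the complex, here $8+2+5=15$. The actual difficulty addressed in this section, namely expressing each trimmed piece again with few wildcard rows, is deliberately left out of this sketch.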

[TABLE]

Table 9: Compressing ${\cal S}{\cal C}_{1}$ with a Tentative Facets-To-Faces algorithm

8.1 We saw that initial $02\setminus 02$ detachments can quickly 'deteriorate' to $012e\setminus 012e$ detachments such as $\rho^{\prime}\setminus r_{8}^{\prime}$. While $\rho^{\prime}\setminus r_{8}^{\prime}$ was handled ad hoc, let us now dig deeper. To this end we introduce a new wildcard: by definition $mm..m$ means 'at least one 1 and at least one 0 here'. Let $\rho$ and $r$ be as in Table 10. With our new wildcard the row difference $\rho\setminus r$ can be neatly expressed as $\rho_{1}\uplus\rho_{2}$. Indeed, clearly $\rho_{1}\uplus\rho_{2}\subseteq\rho\setminus r$. Conversely, if there were ${\bf x}\in\rho\setminus r$ with ${\bf x}\not\in\rho_{1}\uplus\rho_{2}$, then $x_{4}=x_{5}=x_{6}=0$ leads to the contradiction ${\bf x}\in(2,2,1,0,0,0)\subseteq r$.

Table 10: Using the $mm...m$ wildcard to recompress $\rho\setminus r$
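To make the semantics of the $m$-wildcard concrete, the following sketch expands rows into the bitstrings they represent and verifies one row difference by brute force. Several things here are assumptions rather than content of this section: the rows are hypothetical and are not the rows of Table 10; the readings of $e$ ('at least one 1 here') and $n$ ('at least one 0 here') are taken from the usual wildcard conventions; and each wildcard letter is assumed to form at most one group per row.

```python
from itertools import product

def expand(row):
    """Return the set of 0/1-tuples represented by a row over 0,1,2,e,n,m.

    Assumption of this sketch: each wildcard letter forms at most one group
    per row. '2' is a free position; the e-group needs at least one 1,
    the n-group at least one 0, the m-group at least one 1 and one 0.
    """
    result = set()
    for bits in product((0, 1), repeat=len(row)):
        # Positions fixed to 0 or 1 must agree with the candidate bitstring.
        if any(s in "01" and b != int(s) for s, b in zip(row, bits)):
            continue
        group = {w: [b for s, b in zip(row, bits) if s == w] for w in "enm"}
        if group["e"] and 1 not in group["e"]:
            continue
        if group["n"] and 0 not in group["n"]:
            continue
        if group["m"] and not (0 in group["m"] and 1 in group["m"]):
            continue
        result.add(bits)
    return result

# Hypothetical rows: detaching r = (1,1,1,1) from rho = (2,e,e,e).
rho, r = "2eee", "1111"
rho1, rho2 = "2mmm", "0111"

difference = expand(rho) - expand(r)
print(len(difference), len(expand(rho1)), len(expand(rho2)))
```

Here $\rho\setminus r$ has $14-1=13$ elements, and it splits into the disjoint rows $(2,m,m,m)$ with $12$ elements and $(0,1,1,1)$ with one element.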

As appealing as this may look, the downside is that embracing $012men$-rows forces us to cope with detachments of type $012men\setminus 012men$. Table 11 must suffice as an indication that things do not get out of hand. The verification that indeed $\rho\setminus r=\rho_{1}\uplus\cdots\uplus\rho_{13}$ is left to the dedicated reader.

[TABLE]

Table 11: Recompression of a set difference $\rho\setminus r$ of type $(012men)\setminus(012en)$

Once $012men\setminus 012men$ detachments are mastered, any DNF can be transformed to a fancy ESOP that uses $012men$ -rows. Given a CNF instead of a DNF, a wholly different method to transform the CNF to a fancy ESOP (using 012n-rows) is presented in [W4].

References

[BM] J.D. Boissonnat, C. Maria, The simplex-tree: An efficient data structure for general simplicial complexes, Algorithmica 70 (2014) 406-427.

[BN] M.O. Ball, G.L. Nemhauser, Matroids and reliability analysis problem, Math. of Oper. Res. 4 (1979) 132-143.

[DKM] A.M. Duval, J. Klivans, J.L. Martin, The partitionability conjecture, Notices of the AMS 64 (2017) 117-122.

[FR] K. Fukuda, V. Rosta, Combinatorial face enumeration in convex polytopes, Computational Geometry 4 (1994) 191-198.

[GO] B. Ganter, S. Obiedkov, Conceptual Exploration, Springer 2016.

[KP] V. Kaibel, M.E. Pfetsch, Computing the face lattice of a polytope from its vertex-facet incidences, Computational Geometry 23 (2002) 281-290.

[K] D. Knuth, The Art of Computer Programming, Volume 4A, Addison-Wesley 2011.

[V] L.G. Valiant, The complexity of enumeration and reliability problems, SIAM J. Comput. 8 (1979) 410-421.

[W1] M. Wild, Compactly generating all satisfying truth assignments of a Horn formula, J. Satisf. Boolean Model. Comput. 8 (2012) 63-82.

[W2] M. Wild, Counting or producing all fixed cardinality transversals, Algorithmica 69 (2014) 117-129.

[W3] M. Wild, The joy of implications, aka pure Horn formulas: Mainly a survey, Theoretical Computer Science 658 (2017) 264-292.

[W4] M. Wild, Compression with wildcards: From CNFs to orthogonal DNFs by imposing the clauses one-by-one, to appear in The Computer Journal.

[WK] X. Wu, V. Kumar, The top ten algorithms in data mining, Chapman and Hall 2009.
