Partial Function Extension with Applications to Learning and Property Testing

Umang Bhaskar; Gunjan Kumar

arXiv:1812.05821·cs.DS·December 17, 2018


TL;DR

This paper investigates the problem of extending partial functions to satisfy properties like subadditivity, submodularity, and convexity, providing complexity bounds, algorithms, and applications in learning and property testing.

Contribution

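It gives complexity results and algorithms for extending partial functions to subadditive, XOS, submodular, and convex functions, and applies them to obtain an improved lower bound for learning subadditive functions and the first nontrivial testers for subadditive and XOS functions.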

Findings

01

Extension for subadditive functions is coNP-complete with tight approximability bounds.

02

Submodular extension can be decided in a number of cases, including when $n$ is a constant or the points are nearly the same size, but the complexity in general remains unresolved.

03

Whether a convex extension exists can be determined efficiently, but computing the value of a widely-studied convex extension at a given point is strongly NP-hard.

Abstract

In partial function extension, we are given a partial function consisting of $n$ points from a domain and a function value at each point. Our objective is to determine if this partial function can be extended to a function defined on the domain, that additionally satisfies a given property, such as convexity. This basic problem underlies research questions in many areas, such as learning, property testing, and game theory. We formally study the problem of extending partial functions to satisfy fundamental properties in combinatorial optimization, focusing on upper and lower bounds for extension and applications to learning and property testing. (1) For subadditive functions, we show the extension problem is coNP-complete, and we give tight bounds on the approximability. We also give an improved lower bound for learning subadditive functions, and give the first nontrivial testers for…

Equations

$$\Pr_{S_{1},\dots,S_{l}\sim D^{*}}\Big[\Pr_{S\sim D^{*}}\big[f^{*}(S)\leq f(S)\leq\alpha f^{*}(S)\big]\geq 1-\epsilon\Big]\geq 1-\delta.$$

$$w_{A}+w_{B}\geq w_{A\cup B}+w_{A\cap B}\quad\forall A,B\subseteq[m],\qquad w_{A}=f_{A}\quad\forall A\in\mathcal{D}.$$

$$\sum_{B}y_{\{A,B\}}-\sum_{P,Q:P\cup Q=A}y_{\{P,Q\}}-\sum_{P,Q:P\cap Q=A}y_{\{P,Q\}}$$

$$\sum_{A\in\mathcal{D}}f_{A}\Big(\sum_{P,Q:P\cup Q=A}y_{\{P,Q\}}+\sum_{P,Q:P\cap Q=A}y_{\{P,Q\}}-\sum_{B}y_{\{A,B\}}\Big)$$

$$y_{\{A,B\}}$$

$$f(\lambda x+(1-\lambda)y)\leq\lambda f(x)+(1-\lambda)f(y).$$

$$\hat{g}(x)=\inf\Big\{\sum_{y\in C}\lambda_{y}f(y):\lambda\geq 0,\ \sum_{y\in C}\lambda_{y}=1,\ \text{and}\ \sum_{y\in C}\lambda_{y}y=x\Big\}.$$

$$\tilde{g}(x)=\sup\{\lambda f(y)+(1-\lambda)f(z):x=\lambda y+(1-\lambda)z,\ y,z\in\mathrm{Conv}(C),\ \lambda\geq 1\}.$$

$$\text{Convex-P}:\quad\min\sum_{i=1}^{n}\lambda_{i}f_{i}\quad\text{s.t.}\quad\sum_{i=1}^{n}\lambda_{i}T_{i}=x,\quad\sum_{i=1}^{n}\lambda_{i}=1,\quad\text{and}\quad\lambda_{i}\geq 0\ \ \forall i\in[n].$$

$$\text{Convex-D}:\quad\max\ \mu+\sum_{j=1}^{m}x_{j}y_{j}\quad\text{s.t.}\quad\sum_{j=1}^{m}(T_{i})_{j}y_{j}+\mu\leq f_{i}\ \ \forall i\in[n].$$

$$\tilde{f}(x)=\max_{1\leq i\leq N}\Big\{\sum_{j=1}^{m}x_{j}y_{j}^{i}+\mu^{i}\Big\}.$$

$$\tilde{g}(x)=\sup\{\lambda\hat{g}(y)+(1-\lambda)\hat{g}(z):x=\lambda y+(1-\lambda)z,\ y,z\in\mathrm{Conv}(\mathcal{D}),\ \lambda\geq 1\}.$$

$$\min\ \alpha\quad\text{s.t.}\quad f_{i}\leq w_{i}^{T}\chi(T_{i})\leq\alpha f_{i}\ \ \forall i\in[n],\qquad w_{i}^{T}\chi(T_{i})\geq w_{j}^{T}\chi(T_{i})\ \ \forall i,j\in[n],\qquad w_{i}\in\mathbb{R}_{+}^{m},\qquad\alpha\geq 1.$$

$$L(w,x)=\lim_{z\to w}\big(\lambda\hat{g}(w)+(1-\lambda)\hat{g}(z)\big).$$

$$L(w,x)=\lim_{z\to w}\lambda\big(\hat{g}(w)-\hat{g}(z)\big)+\hat{g}(z)=\hat{g}(w)+\lim_{z\to w}\lambda\sum_{j=1}^{m}(w_{j}-z_{j})y_{j}^{k}.$$

$$L(w,x)=\hat{g}(w)+\lim_{z\to w}\lambda\sum_{j=1}^{m}(w_{j}-z_{j})y_{j}^{k}=\hat{g}(w)+\lim_{z\to w}\sum_{j=1}^{m}(x_{j}-z_{j})y_{j}^{k}=\hat{g}(w)+\sum_{j=1}^{m}(x_{j}-w_{j})y_{j}^{k}=\sum_{j=1}^{m}x_{j}y_{j}^{k}+\mu^{k}.$$


Full text


Partial Function Extension with Applications to Learning and Property Testing

Umang Bhaskar Work supported in part by a Ramanujan fellowship. Email: [email protected] Tata Institute of Fundamental Research, Mumbai

Gunjan Kumar Email: [email protected] Tata Institute of Fundamental Research, Mumbai

Abstract

In partial function extension, we are given a partial function consisting of $n$ points from a domain and a function value at each point. Our objective is to determine if this partial function can be extended to a function defined on the domain, that additionally satisfies a given property, such as convexity. This basic problem underlies research questions in many areas, such as learning, property testing, and game theory. We formally study the problem of extending partial functions to satisfy fundamental properties in combinatorial optimization, focusing on upper and lower bounds for extension and applications to learning and property testing.

• For subadditive functions, we show the extension problem is coNP-complete, and we give tight bounds on the approximability. We also give an improved lower bound of $\Omega(\sqrt{m})$ for learning subadditive functions. Previously, Balcan et al. (2012) gave a lower bound of $\Omega(\sqrt{m}/\log m)$ for this problem. We also give the first nontrivial testers for subadditive and XOS functions.

• For submodular functions, we show that if a partial function can be extended to a submodular function on the lattice closure of the partial function, it can be extended to a submodular function on the entire domain. (The lattice closure $LC(\mathcal{D})$ of a set of points $\mathcal{D}$ is the minimal set that contains $\mathcal{D}$ and is closed under union and intersection.) We obtain algorithms for determining extendibility in a number of cases, including if $n$ is a constant, or the points are nearly the same size. The result uses a combinatorial certificate for non-extendibility which we call a square certificate. Seshadhri and Vondrak (2014) previously gave a characterization in terms of path certificates. The complexity of extendibility is in general unresolved.

• Lastly, for convex functions in $\mathbb{R}^{m}$, we show an interesting juxtaposition: while we can determine the existence of an extension efficiently, computing the value of a widely-studied convex extension at a given point is strongly NP-hard.

1 Introduction

A partial function consists of a set $\mathcal{D}$ of points from a domain, and a real value at each of the points. Given a property $P$ , the partial function extension problem is to determine if there exists a total function $f$ ( $f$ is defined on the entire domain) that extends the partial function ( $f$ equals the given value at each point in $\mathcal{D}$ ) and satisfies $P$ . E.g., property $P$ could be linearity, and we are required to determine if there exists a linear function that extends the given partial function. In this paper, we study partial function extension when $\mathcal{D}$ is finite, to fundamental properties in combinatorial optimization.

The problem of partial function extension underlies research and techniques in a number of different areas, and is hence intensely studied. We mention three such areas. Firstly, in property testing, a function is given by an oracle, and the problem is to determine with high probability by querying the oracle whether the function satisfies a required property, or is far from it. The focus in property testing is on algorithms with optimal query-complexity. Typically a testing algorithm cleverly queries some sets and rejects if the values at the queried sets cannot be extended to a function with the required property. Clearly characterizing when a partial function can be extended plays an important role here. The problem of partial function extension, and its connection to property testing, is also explicitly raised by Seshadhri and Vondrak [31].

Secondly, in learning theory, the goal is to understand if a family of functions can be learned by random samples. That is, does there exist an efficient algorithm that for any target function in the family takes as input the function values at a set of sampled points, and returns a function that is “close” to the target function? Here, partial function extension can be used to give lower bounds on the learnability of various function classes (and has been used thus in previous papers, e.g., Balcan, Constantin, Iwata and Wang [3]).

Thirdly, in economics, given data from experiments regarding agent behaviour (such as the purchases made by an agent, or the bids of a (truthful) bidder in auctions), revealed preference theory studies whether the data is rationalizable by utility functions with a particular property. That is, whether there exists a utility function with a particular property that is consistent with the observed data. A natural assumption for utility functions is “diminishing marginal returns”, which translates to submodularity for indivisible goods, and concavity for divisible goods [29]. Other assumptions on utility functions are also common, e.g., subadditivity, XOS, etc [3, 11]. The problem of deciding rationalizability by utility functions with these properties is exactly the problem we study.

In each of the above areas, a basic step towards a solution is often to determine if a given partial function can be extended to a total function. In this paper, rather than a means to an end, we study the problem of partial function extension itself. We focus on the complexity of deciding if a partial function can be extended to functions satisfying fundamental properties — subadditivity, XOS, submodularity, and convexity. These represent perhaps the most commonly studied classes of functions in all of combinatorial optimization. In obtaining our results on partial function extension, we show that the structural lemmas can be used to obtain several results for property testing and learning, thus validating a direct study of partial function extension.

Formally, a partial function is a set of duples $H=\{(T_{1},f_{1}),$ $(T_{2},f_{2}),$ $\dots,$ $(T_{n},f_{n})\}$ , with $T_{i}$ in the domain $\{0,1\}^{m}$ or $\mathbb{R}^{m}$ , and $f_{i}\in\mathbb{R}$ the observed function value at $T_{i}$ . Additionally, we are given a property $P$ . The $P$ -Extension problem is to determine if there exists a total function $f$ defined on the domain $\{0,1\}^{m}$ or $\mathbb{R}^{m}$ that satisfies property $P$ and extends the given partial function $H$ , i.e., $f(T_{i})=f_{i}$ for all $i\in\{1,\dots,n\}$ . We also consider the Approximate $P$ -Extension problem, where we want to determine the minimum multiplicative error for a given partial function to extend to a function that satisfies the given property. That is, in Approximate $P$ -Extension, we want to find the minimum $\alpha\geq 1$ such that a function $f$ satisfies property $P$ and additionally, $f_{i}\leq f(T_{i})\leq\alpha f_{i}$ for all $i\in\{1,\dots,n\}$ . If Approximate $P$ -Extension is computationally hard, we are interested in approximating $\alpha$ .

Note that in our case, our input is $H$ . An algorithm is efficient if it runs in time polynomial in the size of $H$ , which may be exponential in the dimension $m$ .

We note a basic difference between partial function extension on the one hand, and property testing and learning on the other. In partial function extension, there is no target function $f^{*}$ ; we are interested in determining if any total function that extends the given partial function has the required property. In property testing and learning, there exists a target function $f^{*}$ which we access via an oracle (in property testing) or via samples from a distribution (in learning).

A function $f:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$ is subadditive if $f(A)+f(B)\geq f(A\cup B)$ for all sets $A$ and $B$ . A function $f:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$ is an XOS function if it can be expressed as the maximum of $k$ linear functions for some $k\geq 1$ . XOS functions are a subclass of subadditive functions. A function $f$ is submodular if $f(A)+f(B)\geq f(A\cup B)+f(A\cap B)$ for all $A,B\subseteq[m]$ .
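To make these definitions concrete, the following is a minimal brute-force sketch that checks them on a small ground set. The helper names (`all_subsets`, `is_subadditive`, `is_submodular`, `xos`) and the example weight vectors are illustrative assumptions, not taken from the paper; the checks enumerate all pairs of subsets and are only practical for very small $m$.

```python
from itertools import combinations

def all_subsets(m):
    """All subsets of {0, ..., m-1} as frozensets."""
    return [frozenset(c) for r in range(m + 1) for c in combinations(range(m), r)]

def is_subadditive(f, m):
    """Check f(A) + f(B) >= f(A | B) for every pair of subsets."""
    sets = all_subsets(m)
    return all(f(a) + f(b) >= f(a | b) for a in sets for b in sets)

def is_submodular(f, m):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for every pair of subsets."""
    sets = all_subsets(m)
    return all(f(a) + f(b) >= f(a | b) + f(a & b) for a in sets for b in sets)

def xos(weights):
    """XOS function: maximum of the linear functions given by the weight vectors."""
    return lambda s: max(sum(w[j] for j in s) for w in weights)

# Hypothetical example: an XOS function on m = 3 elements with two linear pieces.
g = xos([(2, 2, 0), (0, 0, 3)])
print(is_subadditive(g, 3), is_submodular(g, 3))
```

In this example the XOS function passes the subadditivity check but fails submodularity at $A=\{0,2\}$, $B=\{1,2\}$, which illustrates that the two classes differ.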

For notation, $\mathcal{D}:=\{T_{i}\}_{i\in[n]}$ is the set of points in the given partial function $H$ . These are called defined points, and $\mathcal{U}:=2^{[m]}\setminus\mathcal{D}$ are undefined points. Points on the hypercube $\{0,1\}^{m}$ are naturally subsets of $[m]$ , and for $S\subseteq[m]$ , $\chi(S)\in\{0,1\}^{m}$ is its characteristic vector. We frequently use this correspondence. All missing proofs are in the appendices.

Our Contribution.

We show the following main results.

Result 1.

Subadditive Extension is coNP-complete. There is an $O(\log m)$ approximation algorithm for Approximate Subadditive Extension, and if $P\neq NP$ , this is tight.

The lower bounds in the theorem depend upon characterizations of partial functions that can be extended to subadditive functions. The upper bound uses the fact that for XOS functions, a well-studied subclass of subadditive functions, Approximate Extension (and hence Extension) can be solved in polynomial time. Further, any subadditive function can be approximated by an XOS function, by a factor of $O(\log m)$ [11, 20].

Our characterization for subadditive functions, as well as known characterizations for XOS functions, can be used to give the following results for learning and property testing.

Result 2.

Subadditive functions cannot be learned by a factor of $o(\sqrt{m})$ .

This improves upon a previous lower bound of $\Omega(\sqrt{m}/\log m)$ [3]. We combine the characterization of subadditive functions with recent results on the size of combinatorial families of sets called $r$ -cover free families [23, 22] for the above result.

Result 3.

Given $\epsilon>0$ , there are testers for subadditive and XOS functions that make $2^{m/2+O(\sqrt{m\log(1/\epsilon)})}$ queries. Further, there is a tester for nonmonotone subadditive functions that makes $2^{O(\sqrt{m\log(1/\epsilon)}\log m)}$ queries.

We thus obtain the first nontrivial testers for subadditive and XOS functions.

For submodular functions, we show the following main result. Given $\mathcal{F}\subseteq\{0,1\}^{m}$ , we say a function $f$ is submodular in $\mathcal{F}$ if $f(A)+f(B)\geq f(A\cup B)+f(A\cap B)$ for all $A,B,A\cup B,A\cap B\in\mathcal{F}$ .

Result 4.

For partial function $H$ with defined points $\mathcal{D}$, let $\mathcal{F}$ be the family of sets that (i) both contain some set in $\mathcal{D}$ and are contained in some set in $\mathcal{D}$, and (ii) are obtained by the union and intersection of sets in $\mathcal{D}$ (that is, are in the lattice closure of $\mathcal{D}$). Then the partial function is extendible to a submodular function in $\{0,1\}^{m}$ iff it can be extended to a submodular function in $\mathcal{F}$.
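As a rough illustration of the family $\mathcal{F}$, the sketch below computes the lattice closure of a small set of defined points and then keeps the sets that lie between two defined points. The function names are hypothetical, and reading condition (i) as "contains some defined set and is contained in some defined set" is an interpretation of the statement above, not something the paper spells out in this form.

```python
def lattice_closure(defined):
    """Smallest family containing `defined` that is closed under union and intersection."""
    closure = set(defined)
    changed = True
    while changed:
        changed = False
        for a in list(closure):
            for b in list(closure):
                for c in (a | b, a & b):
                    if c not in closure:
                        closure.add(c)
                        changed = True
    return closure

def sandwiched_family(defined):
    """Sets of the lattice closure that contain a defined set and lie inside a defined set."""
    return {s for s in lattice_closure(defined)
            if any(a <= s for a in defined) and any(s <= b for b in defined)}

# Hypothetical antichain: neither set contains the other.
antichain = {frozenset({1, 2}), frozenset({2, 3})}
print(sandwiched_family(antichain) == antichain)
```

For an antichain the computed family coincides with $\mathcal{D}$ itself, even though the lattice closure also contains the union and the intersection of the two sets.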

Thus if $|\mathcal{F}|$ is $poly(m,|\mathcal{D}|)$ then Submodular Extension can be solved in polynomial time. This includes the case when $|\mathcal{D}|$ is a constant, when all points in $\mathcal{D}$ have size difference $O(\log m)$, and when $\mathcal{D}$ is an antichain. Further, if $\mathcal{D}$ is an antichain, then $\mathcal{F}=\mathcal{D}$ and any assignment of values to the points in $\mathcal{D}$ is trivially submodular in $\mathcal{D}$. Hence if $\mathcal{D}$ is an antichain, it can always be extended to a submodular function. Our results for submodular functions depend on a combinatorial characterization of nonextendibility, which we call a square certificate. Our proof of the above result and development of square certificates forms our main technical contribution. Seshadhri and Vondrak [31] also study property testing of submodular functions, and develop an alternative certificate called a path certificate. We believe that square certificates are more natural, and may lead to improved testers for submodular functions. In general the problem of Submodular Extension remains open. However, we can use our result for the extendibility of antichains to obtain the following result for learning submodular functions.

Result 5.

Submodular functions cannot be learned. (In contrast, the class of nonnegative, monotone submodular functions can be learned with approximation ratio $O(\sqrt{m})$ [6].)

Lastly, we consider convex functions. The problem of Convex Extension has been studied before in convex analysis (e.g., [21, 33]). We however show an interesting juxtaposition of results: While it can be determined in polynomial time if a partial function is extendible to a convex function, determining the value of a natural and widely-studied extension at a given point is NP-hard.

Result 6.

Approximate Convex Extension (and hence Convex Extension) is in $P$ . However, determining the value of a canonical extension at a given point is strongly NP-hard.

2 Subadditive and XOS Functions

We now consider the problem of extending a given partial function $H$ to monotone subadditive and XOS functions. A function $f:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$ is subadditive if $f(A)+f(B)\geq f(A\cup B)$ for all sets $A$ and $B$, and monotone if $f(A)\geq f(B)$ for all $A\supseteq B$. A function $f:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$ is an XOS function if it can be expressed as the maximum of $k$ linear functions for some $k\geq 1$, i.e., there exist vectors $w_{i}\in\mathbb{R}_{\geq 0}^{m}$ for $1\leq i\leq k$ such that $f(S)=\max_{1\leq i\leq k}w_{i}^{T}\chi(S)$ for every $S\subseteq[m]$. XOS functions are a subclass of subadditive functions and are equivalent to fractionally subadditive functions [24]. A function $f:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$ is fractionally subadditive if $f(T)\leq\sum_{S}\lambda_{S}f(S)$ for all $T\subseteq[m]$ and all coefficients $\lambda_{S}\geq 0$ such that $\sum_{S:s\in S}\lambda_{S}\geq 1$ for each $s\in T$. Subadditive functions capture the important case of complement-free functions, for which no two subsets of the ground set $[m]$ “complement” each other. This is a natural assumption in many applications, and hence these functions and various subclasses, including XOS functions, are widely used in game theory [3, 11, 29].

For XOS functions, the following positive result follows from writing a linear program with the vectors $(w_{i})_{i\leq k}$ with $w_{i}\in\mathbb{R}^{m}_{+}$ as variables. While in general $k$ may be exponential, a partial function $H$ is extendible iff the linear program is feasible for $k=n$ .

Theorem 1.

Approximate XOS Extension (and hence XOS Extension) can be solved efficiently.
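One way to write the linear program mentioned before Theorem 1 is sketched below, with one weight vector $w_{i}$ per defined point $T_{i}$ (so $k=n$) and the approximation factor $\alpha$ as an additional variable. This form matches the program in the page's equation list; pairing it with Theorem 1 is our reading of the text rather than an explicit statement in this section.

```latex
\begin{align*}
\min\ & \alpha \\
\text{s.t.}\ & f_{i} \le w_{i}^{T}\chi(T_{i}) \le \alpha f_{i} && \forall i \in [n], \\
& w_{i}^{T}\chi(T_{i}) \ge w_{j}^{T}\chi(T_{i}) && \forall i, j \in [n], \\
& w_{i} \in \mathbb{R}_{+}^{m}, \quad \alpha \ge 1 .
\end{align*}
```

Under this reading, a feasible solution yields the XOS function $f(S)=\max_{i\in[n]}w_{i}^{T}\chi(S)$. The second constraint ensures that the maximum at $T_{i}$ is attained by $w_{i}$, so $f(T_{i})=w_{i}^{T}\chi(T_{i})$ lies in $[f_{i},\alpha f_{i}]$, and exact extendibility corresponds to an optimal value of $\alpha=1$.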

We give the following characterization for subadditive functions, also implicit in Lemma 3.3 of [2].

Lemma 1 ([2]).

Partial function $H$ is extendible to a subadditive function iff $\sum_{i=1}^{r}f(T_{i})\geq f(T_{r+1})$ for all $T_{1},\dots,T_{r},T_{r+1}\in\mathcal{D}$ such that $\cup_{i=1}^{r}T_{i}\supseteq T_{r+1}$ .

Proof.

For the first direction, if there exist $T_{1},\dots,T_{r},T_{r+1}\in\mathcal{D}$ such that $\cup_{i=1}^{r}T_{i}\supseteq T_{r+1}$ and $\sum_{i=1}^{r}f(T_{i})<f(T_{r+1})$, then for any extension $f$ either $f(T_{r+1})>f(\cup_{i=1}^{r}T_{i})$ or $f(\cup_{i=1}^{r}T_{i})>\sum_{i=1}^{r}f(T_{i})$, and hence either monotonicity or subadditivity is violated. For the other direction, first assume that $\cup_{i=1}^{n}T_{i}=[m]$. Then the function $\hat{f}(S)=\min\{\sum_{i=1}^{r}f(T_{i}) : S\subseteq\cup_{i=1}^{r}T_{i},\thinspace T_{i}\in\mathcal{D}\thinspace\forall i\in[r]\}$ can be seen to be a monotone subadditive extension. If $\cup_{i=1}^{n}T_{i}\subsetneq[m]$ then the function $\tilde{f}$ is a monotone subadditive extension, where $\tilde{f}(S)=\hat{f}(S)$ for all $S\subseteq\cup_{i=1}^{n}T_{i}$, and otherwise $\tilde{f}(S)=\hat{f}(S^{\prime})$ where $S^{\prime}=S\cap\left(\cup_{i=1}^{n}T_{i}\right)$. ∎

We immediately obtain the following result.

Corollary 1.

Subadditive Extension is in coNP, and can be solved in $poly(m,2^{n})$ time.
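The $poly(m,2^{n})$ bound can be pictured as a brute-force check of the condition in Lemma 1: for every defined point, enumerate all sub-collections of defined points that cover it and compare their total value with the value at that point. The sketch below is one such enumeration; the function names and the dictionary representation of $H$ are assumptions made for illustration.

```python
from itertools import combinations

def min_cover_cost(target, partial):
    """Cheapest total value of defined sets whose union contains `target`.

    `partial` maps frozensets (defined points) to their values. When the defined
    points cover the ground set, this is the function f-hat from the proof of Lemma 1.
    """
    sets = list(partial)
    best = float("inf")
    for r in range(1, len(sets) + 1):
        for combo in combinations(sets, r):
            if target <= frozenset().union(*combo):
                best = min(best, sum(partial[s] for s in combo))
    return best

def is_subadditive_extendible(partial):
    """Lemma 1: extendible iff no defined point is covered more cheaply than its own value."""
    return all(min_cover_cost(t, partial) >= v for t, v in partial.items())

# Hypothetical partial function: the pair {1, 2} is valued above the sum of its parts.
H = {frozenset({1}): 1, frozenset({2}): 1, frozenset({1, 2}): 3}
print(is_subadditive_extendible(H))
```

The case $r=1$ of the enumeration covers monotonicity: a defined set that is contained in a cheaper defined set is also reported as a violation.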

We use the characterization to show that $\Theta(\log m)$ is a tight bound on the approximability of Approximate Subadditive Extension, unless $P=NP$ (and that the Extension problem is coNP-complete). For the lower bound, we give a reduction from Set-Cover. For the upper bound, we use earlier results which show that any subadditive function can be $O(\log m)$ -approximated by an XOS function [11, 20]. Since Approximate XOS Extension can be efficiently solved (Theorem 1), this gives us our upper bound.

Theorem 2.

Subadditive Extension is coNP-complete. There is an $O(\log m)$ approximation algorithm for Approximate Subadditive Extension, and if $P\neq NP$ , this is optimal.

Proof.

Recall the Set-Cover problem. An instance of Set-Cover is a universe $[m]$ , family of sets $V=\{S_{1},\dots,S_{n}\}$ such that $S_{i}\subseteq[m]$ and an integer $k$ . We need to determine if there exists a cover of universe $[m]$ of size at most $k$ .

First we prove that Subadditive Extension is coNP-hard by a reduction from Set-Cover. Construct a partial function that is defined on each set in $V$ and on $[m]$. The value at each set $S_{i}\in V$ is $1$, and the value at $[m]$ is $k+1$. If this partial function can be extended then every cover of $[m]$ must have size at least $k+1$. On the other hand, if the partial function cannot be extended then there must exist a cover of size at most $k$. Both of the above facts easily follow from Lemma 1.
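The reduction can be exercised on a tiny instance. The sketch below builds the partial function from a Set-Cover instance, checks extendibility by brute force via Lemma 1, and compares the answer with a brute-force Set-Cover decision. All names are hypothetical, and the code assumes the universe itself is not one of the sets in the family.

```python
from itertools import combinations

def reduction(universe, family, k):
    """Partial function of the reduction: value 1 on each S_i, value k + 1 on the universe."""
    partial = {frozenset(s): 1 for s in family}
    partial[frozenset(universe)] = k + 1  # assumes the universe is not itself in the family
    return partial

def extendible(partial):
    """Lemma 1 check by brute force over all sub-collections of defined sets."""
    sets = list(partial)
    for r in range(1, len(sets) + 1):
        for combo in combinations(sets, r):
            union = frozenset().union(*combo)
            cost = sum(partial[s] for s in combo)
            if any(t <= union and partial[t] > cost for t in sets):
                return False
    return True

def has_cover(universe, family, k):
    """Brute-force Set-Cover decision: is there a cover of size at most k?"""
    u = frozenset(universe)
    return any(u <= frozenset().union(*c)
               for r in range(1, k + 1) for c in combinations(family, r))

# Hypothetical instance: {1,2} and {3,4} cover the universe, but no single set does.
universe = {1, 2, 3, 4}
family = [{1, 2}, {3, 4}, {2, 3}]
for k in (1, 2):
    print(k, has_cover(universe, family, k), extendible(reduction(universe, family, k)))
```

On this instance the two answers are always opposite, as the reduction predicts: the partial function is extendible exactly when no cover of size at most $k$ exists.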

For the lower bound of $\Omega(\log m)$ for Approximate Extension, the partial function is, as before, defined on the sets in $V\cup\{[m]\}$; the value at each set in $V$ is $1$, and the value at $[m]$ is $m$. Suppose we have a $\rho$-approximation algorithm for Approximate Extension, which for this instance returns value $\beta$. Then note that $\alpha^{*}\geq\beta/\rho$, where $\alpha^{*}$ is the optimal value of $\alpha$ for Approximate Extension.

Since the algorithm returns value $\beta$, there exists an extension $f$ such that $1\leq f(S_{i})\leq\beta$ for all $S_{i}\in V$ and $f([m])\geq m$. Therefore, by Lemma 1, every cover of $[m]$ has size at least $m/\beta$. Now we claim that there must exist a cover with size at most $m\rho/\beta$. If not, then all covers of $[m]$ have size at least $\gamma>m\rho/\beta$, and it is easy to see that the partial function $\{(S_{1},m/\gamma),\dots,(S_{n},m/\gamma),([m],m)\}$ is extendible by Lemma 1. This implies $m/\gamma\geq\alpha^{*}\geq\beta/\rho$, which is a contradiction. This then gives a $\rho$-approximation algorithm for Set-Cover, and since Set-Cover cannot be approximated by a factor better than $(1-\epsilon)\log m$ [19], this is true of Approximate Extension also.

The upper bound for Approximate Extension is shown in Appendix A. ∎

A lower bound on learning subadditive functions.

We now show that subadditive functions cannot be learned by a factor of $o(\sqrt{m})$ in the PMAC model of learning. The PMAC (Probably Mostly Approximately Correct) model seeks to determine for a family $\mathcal{F}$ of functions, if it is possible to efficiently obtain a function $f$ “close to” a target function $f^{*}\in\mathcal{F}$, given samples from some distribution over $2^{[m]}$ and the value of $f^{*}$ at the sampled points. Formally, let $\mathcal{F}\subseteq\{f \mid f:2^{[m]}\rightarrow\mathbb{R}\}$ be a family of set functions (e.g., subadditive functions).

Definition 1 ([7]).

An Algorithm $\mathcal{A}$ PMAC-learns a family of functions $\mathcal{F}$ with approximation factor $\alpha$ , if for any distribution $\mu$ (on $2^{[m]}$ ) and any target function $f^{*}\in\mathcal{F}$ , and for any sufficiently small $\epsilon,\delta>0$ :

• $\mathcal{A}$ takes the sequence $\{(S_{i},f^{*}(S_{i}))\}_{1\leq i\leq l}$ as input, where $l$ is $poly(m,1/\delta,1/\epsilon)$ and the sequence $\{S_{i}\}_{1\leq i\leq l}$ is drawn i.i.d. from the distribution $\mu$,

• $\mathcal{A}$ runs in $poly(m,1/\delta,1/\epsilon)$ time,

• $\mathcal{A}$ returns a function $f:2^{[m]}\rightarrow\mathbb{R}$ such that

$$\Pr_{S_{1},\dots,S_{l}\sim\mu}\Big[\Pr_{S\sim\mu}\big[f^{*}(S)\leq f(S)\leq\alpha f^{*}(S)\big]\geq 1-\epsilon\Big]\geq 1-\delta.$$

That is, with at least $1-\delta$ probability (over examples drawn from $\mu$ ), the value of the returned function $f$ should be within an $\alpha$ factor of the target function $f^{*}$ for at least $1-\epsilon$ fraction of the probability mass according to $\mu$ . The following lemma makes explicit the connection between PMAC-learning and extending partial functions (we use it later in showing lower bounds for learning submodular functions as well). The lemma has been implicitly used earlier to obtain lower bounds on learning subadditive and submodular functions [4, 7].

Lemma 2.

Suppose there exists a family $\mathcal{D}=\{T_{1},\dots,T_{n}\}$ of subsets of $[m]$ such that $n$ is superpolynomial in $m$ , and the partial function $H=\{(T_{1},f_{1}),\dots,(T_{n},f_{n})\}$ is extendible to a function in $\mathcal{F}$ for any values of $f_{i}\in[1,r]$ (where $r\geq 1$ ), $i\in[n]$ . Then the family of functions $\mathcal{F}$ cannot be learned by any factor $<r$ .

The above bound holds even if the algorithm knows the distribution $\mu$ , is allowed unbounded computation and chooses samples adaptively.

Balcan et al. [4] proved an upper bound of $O(\sqrt{m}\log m)$ and an $\Omega(\sqrt{m}/\log m)$ lower bound for learning subadditive functions. Using Lemmas 1, 2 and a known result for cover-free families [23, 22], we show an improved lower bound of $\Omega(\sqrt{m})$.

A family of sets $\mathcal{F}\subseteq 2^{[m]}$ is called an $r$ -cover free family [23, 22] if for all distinct sets $A_{1},\dots,A_{r},A_{r+1}\in\mathcal{F}$ we have $A_{r+1}\not\subseteq\cup_{i=1}^{r}A_{i}$ . Let $f_{r}(m)$ be the cardinality of the largest $r$ -cover free family.
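The definition can be checked directly on small families. The sketch below follows the definition as stated, testing every set against every choice of $r$ other distinct sets; the function name and the example family are illustrative assumptions, and the input family is assumed to contain no repeated sets.

```python
from itertools import combinations

def is_r_cover_free(family, r):
    """True if no set of the family is contained in the union of r other distinct sets."""
    sets = [frozenset(s) for s in family]
    for i, target in enumerate(sets):
        others = sets[:i] + sets[i + 1:]
        for combo in combinations(others, r):
            if target <= frozenset().union(*combo):
                return False
    return True

# Hypothetical family: the three 2-element subsets of {1, 2, 3}.
pairs = [{1, 2}, {2, 3}, {1, 3}]
print(is_r_cover_free(pairs, 1), is_r_cover_free(pairs, 2))
```

The family of pairs is $1$-cover free because no pair contains another, but it is not $2$-cover free because each pair lies in the union of the other two.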

Theorem 3 ([22]).

$f_{r}(m)=2^{\Theta\left(\frac{m\log r}{r^{2}}\right)}$ .

Lemma 3.

If $\mathcal{D}=\{T_{1},\dots,T_{n}\}$ is an $r$ -cover free family then the partial function $\{(T_{1},f_{1}),\dots,(T_{n},f_{n})\}$ is extendible to a subadditive function for any value of $f_{i}\in[1,r+1],i\in[n]$ .

Proof.

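Suppose the partial function is not extendible. Then, by Lemma 1, there exist sets $T_{1},\dots,T_{k},T_{k+1}\in\mathcal{D}$ for some $k\geq 1$ such that $T_{k+1}\subseteq\cup_{i=1}^{k}T_{i}$ and $f_{k+1}>\sum_{i=1}^{k}f_{i}$. The sets $T_{1},\dots,T_{k}$ are distinct from $T_{k+1}$, since otherwise the sum would be at least $f_{k+1}$. Since each $f_{i}\geq 1$ and $f_{k+1}\leq r+1$, we have $r+1\geq f_{k+1}>k$, so $k\leq r$. Thus $T_{k+1}$ is covered by at most $r$ other sets of $\mathcal{D}$, which is a contradiction as $\mathcal{D}$ is an $r$-cover free family. ∎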

Theorem 4.

In the PMAC model, subadditive functions cannot be learned by any $o(\sqrt{m})$ factor.

Proof.

We have $f_{r}(m)\geq 2^{\frac{cm\log r}{r^{2}}}=m^{\frac{cm}{r^{2}}(\frac{1}{2}-\frac{\log(\sqrt{m}/r)}{\log m})}$ for some constant $c$. For $r\leq m^{1/4}$, $f_{r}(m)$ is clearly superpolynomial in $m$. Also, for $r\geq m^{1/4}$, $\frac{1}{2}-\frac{\log(\sqrt{m}/r)}{\log m}\geq\frac{1}{4}$. Hence, $2^{\frac{cm\log r}{r^{2}}}$ is superpolynomial for $r=o(\sqrt{m})$. Therefore, for any such $r$, by Theorem 3 there exists an $r$-cover free family $\mathcal{D}=\{T_{1},\dots,T_{n}\}$ such that $n$ is superpolynomial. The theorem is directly implied by Lemmas 2 and 3. ∎
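The change of base used in the first line of the proof can be spelled out as follows. This is a short worked derivation, assuming logarithms are taken base 2 (which the $2^{(\cdot)}$ form suggests but the text does not state).

```latex
\begin{align*}
2^{\frac{cm\log r}{r^{2}}}
  &= m^{\frac{cm}{r^{2}}\cdot\frac{\log r}{\log m}},
  &
\frac{\log r}{\log m}
  &= \frac{\tfrac{1}{2}\log m-\log(\sqrt{m}/r)}{\log m}
   = \frac{1}{2}-\frac{\log(\sqrt{m}/r)}{\log m}.
\end{align*}
```

For $m^{1/4}\leq r\leq\sqrt{m}$ the ratio $\log r/\log m$ is at least $1/4$, so the exponent of $m$ is at least $cm/(4r^{2})$, which grows without bound when $r=o(\sqrt{m})$. This is the sense in which the family size is superpolynomial in $m$.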

Testers for subadditive and XOS functions.

We now describe testers for subadditive and XOS functions that make $2^{m/2+O(\sqrt{m\log(1/\epsilon)})}$ queries; these are the first nontrivial testers for either of these functions. For definitions, we focus on subadditive functions, but the corresponding definitions for XOS functions are obvious. A function $f:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$ is $\epsilon$-far from subadditive if for any subadditive function $g:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$, we have $|\{S\subseteq[m]:f(S)\neq g(S)\}|\geq\epsilon 2^{m}$. A tester for subadditive functions is a randomized algorithm that takes distance parameter $\epsilon$ and oracle access to a function $f:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$ as inputs, and accepts if $f$ is subadditive, and rejects with constant probability if $f$ is $\epsilon$-far from subadditive.

We describe the testers here. Let $\lambda=\sqrt{\log(4/\epsilon)}$, and define $M_{\lambda}=\{S\subseteq[m]:|S|\leq m/2+\lambda\sqrt{m}\}$. The tester repeats the following steps $1/\epsilon$ times.

• Randomly pick a set $T\in M_{\lambda}$ and query the sets $Q(T)=\{S:S\subseteq T\}$.

• (Subadditive tester) Reject if EITHER there exists $T^{\prime}\in Q(T)$ such that $f(T)<f(T^{\prime})$ OR there exist $T_{1},\dots,T_{r}\in Q(T)$ for some $r\geq 1$ such that $T=\cup_{i=1}^{r}T_{i}$ and $f(T)>\sum_{i=1}^{r}f(T_{i})$.

• (XOS tester) Reject if EITHER there exists $T^{\prime}\in Q(T)$ such that $f(T)<f(T^{\prime})$ OR there exist $T_{1},\dots,T_{r}\in Q(T)$ and $\alpha_{1},\dots,\alpha_{r}\in\mathbb{R}_{+}$ for some $r\geq 1$ such that for all elements $s\in T$, $\sum_{j:s\in T_{j}}\alpha_{j}\geq 1$ and $f(T)>\sum_{j=1}^{r}\alpha_{j}f(T_{j})$.
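The subadditive variant of the steps above can be sketched for very small $m$ as follows. This is an illustrative brute-force version, not the authors' implementation: the function names are hypothetical, the logarithm in $\lambda$ is assumed to be base 2, sampling from $M_{\lambda}$ is assumed to be uniform, and the cover condition is checked with a simple dynamic program that assumes $f$ is nonnegative.

```python
import math
import random
from itertools import combinations

def subsets(ground):
    """All subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_bad(T, f):
    """True if the values of f on subsets of T make the subadditive tester reject."""
    Q = sorted(subsets(T), key=len)
    if any(f(S) > f(T) for S in Q):        # monotonicity violated inside Q(T)
        return True
    # h[U]: cheapest sum of f-values over queried sets whose union is exactly U
    # (assumes f is nonnegative, so redundant sets never help).
    h = {}
    for U in Q:
        best = f(U)
        for A in subsets(U):
            if not A or A == U:
                continue
            for B in subsets(A):
                C = (U - A) | B            # second part of the split; A | C == U
                if C != U:
                    best = min(best, h[A] + h[C])
        h[U] = best
    return h[T] < f(T)

def subadditive_tester(f, m, eps, rng=random):
    """Accept (True) unless one of the sampled sets is bad."""
    lam = math.sqrt(math.log2(4 / eps))    # base-2 logarithm is an assumption
    cap = m / 2 + lam * math.sqrt(m)
    M = [S for S in subsets(range(m)) if len(S) <= cap]
    for _ in range(math.ceil(1 / eps)):
        if is_bad(rng.choice(M), f):
            return False
    return True

print(subadditive_tester(len, 3, 0.5))
```

Since the cardinality function is monotone and subadditive, no sampled set can be bad and the sketch always accepts it. The XOS variant would replace the cover check with a fractional-cover linear program, which is omitted here.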

It is clear that the tester makes $|Q(T)|/\epsilon$ queries, and since $|T|\leq m/2+\lambda\sqrt{m}$ we have $|Q(T)|\leq 2^{m/2+\lambda\sqrt{m}}\leq 2^{m/2+O(\sqrt{m\log(1/\epsilon)})}$, which is also a bound on the number of queries by the tester.

Recall that XOS functions are fractionally subadditive functions, i.e., a function $f:2^{[m]}\rightarrow\mathbb{R}_{\geq 0}$ is XOS iff for all $T$ and $\alpha_{S}\geq 0$ such that $\sum_{S:s\in S}\alpha_{S}\geq 1$ for each $s\in T$, $f(T)\leq\sum_{S}\alpha_{S}f(S)$. Clearly if the function $f$ is subadditive or XOS then the tester accepts. Now we need to show that if the function $f$ is $\epsilon$-far from subadditive or XOS then the tester rejects with high probability. Before that, we show the following characterization for extension of a partial function to an XOS function.

Lemma 4.

Partial function $H$ is extendible to an XOS function iff for all $T\in\mathcal{D}$ and all $\alpha_{S}\geq 0$ for all $S\in\mathcal{D}$ such that $\sum_{S\in\mathcal{D}:s\in S}\alpha_{S}\geq 1$ for each $s\in T$, we have $f(T)\leq\sum_{S\in\mathcal{D}}\alpha_{S}f(S)$.

Proof.

Clearly, if there is an XOS extension then the above property must hold, since XOS functions are fractionally subadditive. For the other direction, consider $\hat{f}(T)=\min\{\sum_{S\in\mathcal{D}}\alpha_{S}f(S)\,|\,\sum_{S\in\mathcal{D}}\alpha_{S}\chi(S)\geq\chi(T),\thinspace\alpha_{S}\geq 0\}$ . We claim that $\hat{f}$ is an XOS extension of the partial function. Let $T\in\mathcal{D}$ . By the assumption, we have $f(T)\leq\hat{f}(T)$ . Also, setting $\alpha_{T}=1$ and $\alpha_{S}=0$ for all other $S$ gives $\hat{f}(T)\leq f(T)$ . Therefore, $\hat{f}$ is an extension. Now suppose that for some $T,T_{1},\dots,T_{n}\subseteq[m]$ and $\beta_{1},\dots,\beta_{n}\geq 0$ , we have $\sum_{j=1}^{n}\beta_{j}\chi(T_{j})\geq\chi(T)$ . We will show that $\hat{f}(T)\leq\sum_{j=1}^{n}\beta_{j}\hat{f}(T_{j})$ , which will complete the proof. Let $\hat{f}(T_{j})=\sum_{S\in\mathcal{D}}\alpha^{j}_{S}f(S)$ for all $j\in[n]$ , where $\{\alpha^{j}_{S}\}_{S\in\mathcal{D}}$ are optimal values in the definition of $\hat{f}(T_{j})$ , so that $\sum_{S\in\mathcal{D}}\alpha^{j}_{S}\chi(S)\geq\chi(T_{j})$ . Therefore, we need to show that $\hat{f}(T)\leq\sum_{j=1}^{n}\beta_{j}\sum_{S\in\mathcal{D}}\alpha^{j}_{S}f(S)=\sum_{S\in\mathcal{D}}(\sum_{j=1}^{n}\beta_{j}\alpha^{j}_{S})f(S)$ . Note that $\chi(T)\leq\sum_{j=1}^{n}\beta_{j}\chi(T_{j})\leq\sum_{j=1}^{n}\beta_{j}\sum_{S\in\mathcal{D}}\alpha^{j}_{S}\chi(S)=\sum_{S\in\mathcal{D}}(\sum_{j=1}^{n}\beta_{j}\alpha^{j}_{S})\chi(S)$ . Hence the coefficients $\sum_{j=1}^{n}\beta_{j}\alpha^{j}_{S}$ are feasible in the definition of $\hat{f}(T)$ , and we have $\hat{f}(T)\leq\sum_{S\in\mathcal{D}}(\sum_{j=1}^{n}\beta_{j}\alpha^{j}_{S})f(S)$ . ∎

A set $T\in M_{\lambda}$ is called bad if it causes the tester to reject. For subadditive functions, the set of bad sets $\mathcal{B}$ consists of $T\in M_{\lambda}$ such that either there exists $T^{\prime}\subseteq T$ such that $f(T)<f(T^{\prime})$ or there exists $T_{1},\dots,T_{r}$ for some $r\geq 1$ such that $T=\cup_{i=1}^{r}T_{i}$ and $f(T)>\sum_{i=1}^{r}f(T_{i})$ . Similarly, for XOS functions, $T\in M_{\lambda}$ is in $\mathcal{B}$ if there exists $T^{\prime}\subseteq T$ such that $f(T)<f(T^{\prime})$ or there exists, for some $r\geq 1$ , $T_{1},\dots,T_{r}\subseteq T$ such that $T=\cup_{i=1}^{r}T_{i}$ and $\alpha_{1},\dots,\alpha_{r}\geq 0$ such that for all elements $s\in T$ , $\sum_{T_{j}:s\in T_{j}}\alpha_{j}\geq 1$ and $f(T)>\sum_{j=1}^{r}\alpha_{j}f(T_{j})$ .

We show that removing all sets not in $M_{\lambda}$ , as well as the bad sets, gives a partial function that can be extended to a subadditive (or XOS) function. Since the function is $\epsilon$ -far and $M_{\lambda}$ is large by our choice of $\lambda$ , there must be many bad sets.

Lemma 5.

The partial function $H=\{(S,f(S))|S\in M_{\lambda}\quad\text{and}\quad S\not\in\mathcal{B}\}$ is extendible to a subadditive (XOS) function.

Proof.

Suppose the partial function is not extendible, and let $\mathcal{D}=\{S|S\in M_{\lambda}\quad\text{and}\quad S\not\in\mathcal{B}\}$ be the defined points in $H$ . Then for subadditive functions, by Lemma 1, there exist $T_{1},\dots,T_{r},T\in\mathcal{D}$ such that $T\subseteq\cup_{i=1}^{r}T_{i}$ and $f(T)>\sum_{i=1}^{r}f(T_{i})$ . Then either $f(T)>\sum_{i=1}^{r}f(T\cap T_{i})$ or for some $j\in[r]$ , $f(T_{j})<f(T\cap T_{j})$ . Note that since $T,T_{j}\in M_{\lambda}$ , so is $T\cap T_{j}$ . Further, $T=\cup_{i\in[r]}T\cap T_{i}$ . Thus, in the first case, $T$ is in $\mathcal{B}$ while in the second case, $T_{j}\in\mathcal{B}$ , giving a contradiction.

For XOS functions, by Lemma 4 there exist $T,T_{1},\dots,T_{r}\in\mathcal{D}$ and $\alpha_{1},\dots,\alpha_{r}\geq 0$ for some $r\geq 1$ , such that $\sum_{T_{j}:s\in T_{j}}\alpha_{j}\geq 1$ for each $s\in T$ and $f(T)>\sum_{j=1}^{r}\alpha_{j}f(T_{j})$ . As in the subadditive case, either $f(T)>\sum_{j=1}^{r}\alpha_{j}f(T\cap T_{j})$ or for some $j\in[r]$ , $f(T_{j})<f(T\cap T_{j})$ . Again, in the first case, $T$ is in $\mathcal{B}$ , while in the second case $T_{j}\in\mathcal{B}$ . ∎

Theorem 5.

If $f$ is $\epsilon$ -far from subadditive (XOS) functions then the above tester rejects with constant probability.

Proof.

Let $\mathcal{D}=\{S|S\in M_{\lambda}\quad\text{and}\quad S\not\in\mathcal{B}\}$ and $\mathcal{U}=2^{[m]}\setminus\mathcal{D}$ . Since the partial function $\{(S,f(S))|S\in\mathcal{D}\}$ is extendible and $f$ is $\epsilon$ -far, we have $|\mathcal{U}|\geq\epsilon 2^{m}$ . Note that $|\mathcal{U}|=|\mathcal{B}|+\sum_{i=m/2+\lambda\sqrt{m}}^{m}\binom{m}{i}=|\mathcal{B}|+\sum_{i=1}^{m/2-\lambda\sqrt{m}}\binom{m}{i}$ . By the Chernoff bound, $\sum_{i=1}^{m/2-\lambda\sqrt{m}}\binom{m}{i}=2^{m}Pr(X\leq m/2-\lambda\sqrt{m})\leq 2^{m}e^{-\lambda^{2}}$ where $X$ is a binomial random variable $Bi(m,1/2)$ . By the choice of $\lambda$ , $\sum_{i=1}^{m/2-\lambda\sqrt{m}}\binom{m}{i}\leq\epsilon 2^{m}/4$ . Hence we have $|\mathcal{B}|\geq 3\epsilon 2^{m}/4$ . Therefore, in a single iteration the tester picks a bad set with probability at least $\frac{3}{4}\epsilon$ . Hence, after $1/\epsilon$ iterations, the tester picks a bad set with constant probability.

∎
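The tail estimate used in the proof can be checked numerically. The sketch below computes the exact fraction of subsets of $[m]$ of size at most $m/2-\lambda\sqrt{m}$ and compares it with $\epsilon/4$. The function name `lower_tail_fraction` is hypothetical, and the sketch assumes that the logarithm in $\lambda$ is natural and that the cutoff is rounded down.

```python
import math


def lower_tail_fraction(m, eps):
    """Exact fraction of subsets of [m] with size at most m/2 - lambda*sqrt(m)."""
    lam = math.sqrt(math.log(4 / eps))  # natural log: an assumption of this sketch
    cutoff = math.floor(m / 2 - lam * math.sqrt(m))
    if cutoff < 0:
        return 0.0
    tail = sum(math.comb(m, i) for i in range(cutoff + 1))
    return tail / 2 ** m


for m in (20, 100, 400):
    for eps in (0.1, 0.01):
        # The proof needs this fraction to be at most eps / 4.
        print(m, eps, lower_tail_fraction(m, eps), eps / 4)
```

With these conventions the printed fraction stays below $\epsilon/4$ for every pair tried, as the bound $e^{-\lambda^{2}}=\epsilon/4$ predicts.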

A subexponential tester for nonmonotone subadditive functions.

We now describe a property testing algorithm for general (nonmonotone) subadditive functions that makes $2^{O(\sqrt{m\log(1/\epsilon)}\log m)}$ queries; in this subsection, subadditive refers to nonmonotone subadditive functions.

Let $\lambda=\sqrt{\ln(4/\epsilon)}$ , and define $M_{\lambda}=\{S\subseteq[m]|m/2-\lambda\sqrt{m}\leq|S|\leq m/2+\lambda\sqrt{m}\}$ . The tester repeats the following steps $1/\epsilon$ times:

•

Randomly pick a set $T\in M_{\lambda}$ and query the sets $Q(T)=\{S\in M_{\lambda}|S\subseteq T\}$ .

•

If there exists $T_{1},\dots,T_{r}\in Q(T)$ for some $r\geq 1$ such that $T=\cup_{i=1}^{r}T_{i}$ and $f(T)>\sum_{i=1}^{r}f(T_{i})$ then reject.

The tester makes $|Q|/\epsilon$ queries, where $|Q|\leq O(\binom{m/2+\lambda\sqrt{m}}{2\lambda\sqrt{m}})$ , and hence $|Q|=2^{O(\sqrt{m\log(1/\epsilon)}\log m)}$ , which is also a bound on the number of queries by the tester.
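To get a feeling for the size of $Q(T)$, the following sketch counts it exactly for a largest set $T\in M_{\lambda}$. The function name `max_queries_per_round` is hypothetical, the logarithm in $\lambda$ is assumed to be natural, and the size bounds are rounded to integers.

```python
import math


def max_queries_per_round(m, eps):
    """Return (|Q(T)|, |T|) for a largest T in M_lambda."""
    lam = math.sqrt(math.log(4 / eps))  # natural log: an assumption of this sketch
    lo = max(0, math.ceil(m / 2 - lam * math.sqrt(m)))
    hi = min(m, math.floor(m / 2 + lam * math.sqrt(m)))
    # Q(T) keeps exactly the subsets of T whose size is at least lo.
    count = sum(math.comb(hi, k) for k in range(lo, hi + 1))
    return count, hi


count, size = max_queries_per_round(10_000, 0.1)
# Compare the number of bits of |Q(T)| with |T|, the exponent of the
# 2^{|T|} sets that would be queried without the lower bound on set sizes.
print(count.bit_length(), size)
```

The saving only shows for large $m$: for small $m$ the band $M_{\lambda}$ covers most set sizes, so almost every subset of $T$ is queried.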

Clearly, if the function $f$ is subadditive then the tester accepts. We now show that if $f$ is $\epsilon$ -far from subadditive, then the tester rejects with constant probability.

We first give a characterization of partial function extension for general subadditive functions, similar to Claim 1.

Lemma 6.

The partial function $H$ is extendible to a subadditive function (not necessarily monotone) iff $\sum_{i=1}^{r}f(T_{i})\geq f(\cup_{i=1}^{r}T_{i})$ for all $T_{1},\dots,T_{r}\in\mathcal{D}$ such that $\cup_{i=1}^{r}T_{i}\in\mathcal{D}$ .
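The condition in Lemma 6 can be checked by brute force on small instances. The sketch below represents the partial function as a dictionary from `frozenset` to a nonnegative value and, for each defined set $T$, computes the cheapest family of defined subsets of $T$ whose union is exactly $T$. The name `find_violation` is hypothetical, and the running time is exponential in the worst case.

```python
def find_violation(partial):
    """Return a defined set T with a strictly cheaper cover by defined sets, or None.

    partial: dict mapping frozenset -> nonnegative value.
    """
    for T, fT in partial.items():
        if not T:
            continue  # the empty set can only be covered by itself
        best = {frozenset(): 0.0}  # cheapest cost found for each reachable union
        for S in partial:
            if S <= T:  # only defined subsets of T can appear in a cover of T
                for U, cost in list(best.items()):
                    V = U | S
                    best[V] = min(best.get(V, float("inf")), cost + partial[S])
        if best.get(T, float("inf")) < fT:
            return T
    return None


fs = frozenset
print(find_violation({fs({1}): 1, fs({2}): 1, fs({1, 2}): 2}))  # no violation
print(find_violation({fs({1}): 1, fs({2}): 1, fs({1, 2}): 3}))  # {1, 2} is violated
```

Since using a set twice never lowers the cost when values are nonnegative, one pass over the defined subsets suffices.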

A set $T\in M_{\lambda}$ is called bad if it causes the tester to reject. The set of bad sets $\mathcal{B}$ consists of $T\in M_{\lambda}$ such that there exists $T_{1},\dots,T_{r}\in M_{\lambda}$ for some $r\geq 1$ such that $T=\cup_{i=1}^{r}T_{i}$ and $f(T)>\sum_{i=1}^{r}f(T_{i})$ .

We show that removing all sets not in $M_{\lambda}$ , as well as the bad sets, gives a partial function that can be extended to a subadditive function. Since the function is $\epsilon$ -far and $M_{\lambda}$ is large by our choice of $\lambda$ , there must be many bad sets.

Lemma 7.

The partial function $H=\{(S,f(S))|S\in M_{\lambda}\quad\text{and}\quad S\not\in\mathcal{B}\}$ is extendible to a subadditive function.

Proof.

Suppose the partial function is not extendible. Let $\mathcal{D}=\{S|S\in M_{\lambda}\quad\text{and}\quad S\not\in\mathcal{B}\}$ be the defined sets in $H$ . Then by Lemma 6, there will exist $T_{1},\dots,T_{r},T\in\mathcal{D}$ such that $T=\cup_{i=1}^{r}T_{i}$ and $\sum_{i=1}^{r}f(T_{i})<f(T)$ . This implies $T\in\mathcal{B}$ which is a contradiction. ∎

Theorem 6.

If $f$ is $\epsilon$ -far from subadditive functions then the above tester rejects with constant probability.

Proof.

Let $\mathcal{D}=\{S|S\in M_{\lambda}\quad\text{and}\quad S\not\in\mathcal{B}\}$ and $\mathcal{U}=2^{[m]}\setminus\mathcal{D}$ . Since the partial function $\{(S,f(S))|S\in\mathcal{D}\}$ is extendible and $f$ is $\epsilon$ -far, we have $|\mathcal{U}|\geq\epsilon 2^{m}$ . Note that $|\mathcal{U}|=|\mathcal{B}|+2\sum_{i=1}^{m/2-\lambda\sqrt{m}}\binom{m}{i}$ . Hence, again using the Chernoff bound and the value of $\lambda$ , we have $|\mathcal{B}|\geq\epsilon 2^{m}/2$ . Therefore, in a single iteration the tester picks a bad set with probability at least $\epsilon/2$ . Hence, after $1/\epsilon$ iterations, the tester picks a bad set with constant probability. ∎

3 Submodular functions

Submodular functions are perhaps the most important functions in combinatorial optimization. They capture diminishing marginal returns for set functions, a property satisfied in many practical applications as well as theoretical constructs, and can be viewed as the discrete analog of concave functions. Coverage functions, matroid rank functions, and many others are special cases of submodular functions. A function $f:2^{[m]}\rightarrow\mathbb{R}$ is submodular if $f(A)+f(B)\geq f(A\cup B)+f(A\cap B)$ for all $A,B\subseteq[m]$ . As before, $\mathcal{D}:=\{T_{i}\}_{i\in[n]}$ is the set of points in the given partial function $H$ , and $\mathcal{U}:=2^{[m]}\setminus\mathcal{D}$ . These are called defined and undefined points respectively. The lattice closure $LC(\mathcal{D})$ is the minimal set that contains $\mathcal{D}$ and is closed under union and intersection. For a family of sets $\mathcal{F}\subseteq 2^{[m]}$ , we say a function $f$ is submodular on $\mathcal{F}$ if for all sets $A$ , $B\in\mathcal{F}$ such that $A\cup B$ , $A\cap B$ are also in $\mathcal{F}$ , $f(A)+f(B)\geq f(A\cup B)+f(A\cap B)$ .
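The notion of being submodular on a family can be made concrete with a small check. In this sketch the function is a dictionary from `frozenset` to a value, and `is_submodular_on` is a hypothetical helper name; only pairs whose union and intersection both lie in the family are tested, exactly as in the definition above.

```python
from itertools import combinations


def is_submodular_on(f, family):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all applicable pairs in family.

    f: dict mapping frozenset -> value, defined on every set in `family`.
    """
    fam = set(family)
    for A, B in combinations(fam, 2):
        union, inter = A | B, A & B
        if union in fam and inter in fam and f[A] + f[B] < f[union] + f[inter]:
            return False
    return True


fs = frozenset
g = {fs(): 0, fs({1}): 2, fs({2}): 2, fs({1, 2}): 5}
print(is_submodular_on(g, g.keys()))                          # False
print(is_submodular_on(g, [fs({1}), fs({2}), fs({1, 2})]))    # True
```

The second call illustrates that restricting the family can hide a violation: once the empty set is dropped, the pair $\{1\},\{2\}$ is no longer tested because its intersection is outside the family.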

Seshadri and Vondrak [31] study property testing of submodular functions and give the first subexponential time non-adaptive tester for submodularity. They also introduce the problem we study, Submodular Extension (extending a partial function to a submodular function), and note its usefulness in analyzing property testing algorithms. They give a partial function $H$ that is defined on an exponential number of points and is not extendible to a submodular function. However, if any point is removed from $H$ , then it can be extended to a submodular function, indicating the difficulty in designing a tester for submodularity.

Seshadri and Vondrak introduce and use a combinatorial structure called a path certificate, the existence of which certifies that a given partial function cannot be extended to a submodular function. Our results are instead based on a structure called a square certificate. A square certificate is a multiset of tuples $(\{A,B\},A\cup B,A\cap B)$ with $A,B\subseteq[m]$ , with additional constraints on the multiset. Since the union and intersection of sets can be seen as the bitwise OR and AND of the characteristic vectors of the sets, our analysis proceeds by viewing a square certificate as a special monotone Boolean circuit. Since submodular functions are defined using union and intersection, we believe that square certificates are more natural than path certificates. It is easy to show that a square certificate exists iff the given partial function is not extendible. Our main technical result is that if a partial function is not extendible, then there exists a square certificate where the sets in every square $(\{A,B\},A\cup B,A\cap B)$ are in the lattice closure of $\mathcal{D}$ . We use this to obtain a number of results on the extendibility of a partial function.

Theorem 7.

1. If the sets in $\mathcal{D}$ form an antichain (a family of sets is an antichain if no set in the family is contained in another set), then the partial function can always be extended to a submodular function. Hence, submodular functions cannot be PMAC-learned.

2. Let $\mathcal{F}:=LC(\mathcal{D})\cap\{S\,:\,\exists T_{i},T_{j}\in\mathcal{D}\mbox{ s.t. }T_{i}\supseteq S\supseteq T_{j}\}$ be the sets obtained by the union and intersection of sets in $\mathcal{D}$ , that are also both contained in and contained by sets in $\mathcal{D}$ . If the partial function can be extended to a submodular function on $\mathcal{F}$ , then it can be extended to a submodular function on $2^{[m]}$ . Hence, Submodular Approximate Extension (and thus Extension) can be solved in $O(\operatorname{poly}(|\mathcal{F}|,m,n))$ time.

Let $r$ be the maximum difference in the size of two sets $T_{i},T_{j}\in\mathcal{D}$ . Then an upper bound on the size of $\mathcal{F}$ is $O(\operatorname{poly}(n,2^{r}))$ . Thus if all sets are roughly of the same size, Approximate Extension can be solved in polynomial time. Further, if $n$ is a constant then $|LC(\mathcal{D})|$ is a constant and hence Approximate Extension can be solved in polynomial time. In the proof of the theorem, the second result in particular is highly non-trivial and requires insights into the structure of square certificates. We start by formally defining square certificates.
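The family $\mathcal{F}$ from the second part of Theorem 7 can be computed directly from $\mathcal{D}$ on small instances. The sketch below first builds the lattice closure by repeatedly adding unions and intersections, and then keeps the sets sandwiched between two defined sets. The names `lattice_closure` and `sandwiched_family` are hypothetical, and no attempt is made to match the size bounds discussed above.

```python
def lattice_closure(defined):
    """Smallest family containing `defined` that is closed under union and intersection."""
    closure = set(defined)
    frontier = list(closure)
    while frontier:
        A = frontier.pop()
        for B in list(closure):
            for C in (A | B, A & B):
                if C not in closure:
                    closure.add(C)
                    frontier.append(C)  # new sets are later combined with all others
    return closure


def sandwiched_family(defined):
    """Sets of the lattice closure lying between two defined sets (the family F)."""
    D = set(defined)
    return {
        S
        for S in lattice_closure(D)
        if any(S <= T for T in D) and any(T <= S for T in D)
    }


fs = frozenset
antichain = [fs({1}), fs({2})]
print(lattice_closure(antichain))    # also contains the empty set and {1, 2}
print(sandwiched_family(antichain))  # only {1} and {2} remain
```

For an antichain, $\mathcal{F}$ collapses to $\mathcal{D}$ itself, on which no submodularity constraint applies, which is consistent with the first part of the theorem.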

Square Certificates. A square tuple is a triple $(\{A,B\},A\cup B,A\cap B)$ where $A,B\subseteq[m]$ . The sets $A$ and $B$ are called middle points, $A\cup B$ is the top point and $A\cap B$ is the bottom point of this square tuple. Sets $A,B,A\cup B,A\cap B$ are said to be part of this square tuple. Given a multiset of square tuples, for a set $S\subseteq[m]$ , define $m(S)$ to be the number of square tuples with $S$ as a middle point, and $tb(S)$ to be the number of square tuples with $S$ as a top or bottom point. We say a set $S$ is an input set if $m(S)>tb(S)$ , an intermediate set if $m(S)=tb(S)$ , and an output set if $m(S)<tb(S)$ . A square certificate is a multiset of square tuples with the following properties:

(P1) If $S$ is an input or an output set, i.e., $m(S)\neq tb(S)$ , then $S$ must be in $\mathcal{D}$ .

(P2) $\sum_{i\in[n]}f_{i}\left(tb(T_{i})-m(T_{i})\right)>0$ .

A set $S$ is involved if it is a part of some square tuple in the square certificate. By definition, a partial function can be extended iff the following linear program (with variables $w_{A}$ for all $A\subseteq[m]$ ) is feasible:

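$w_{A}+w_{B}\geq w_{A\cup B}+w_{A\cap B}$ for all $A,B\subseteq[m]$ ,

$w_{T_{i}}=f_{i}$ for all $i\in[n]$ .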

A square certificate can be seen as a dual solution obtained from Farkas’ lemma.
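Properties (P1) and (P2) are easy to verify mechanically for a given multiset of square tuples. In the sketch below a square tuple is given by its two middle points, with repetitions in the list standing for multiplicity, and the partial function is a dictionary from `frozenset` to value. The name `is_square_certificate` is hypothetical, and a tuple with $A=B$ is counted twice in $m(A)$, which is an assumption of this sketch.

```python
from collections import Counter


def is_square_certificate(pairs, partial):
    """Check (P1) and (P2) for the square tuples ({A, B}, A | B, A & B), (A, B) in pairs."""
    m, tb = Counter(), Counter()
    for A, B in pairs:
        m[A] += 1          # A and B are middle points
        m[B] += 1
        tb[A | B] += 1     # top point
        tb[A & B] += 1     # bottom point
    involved = set(m) | set(tb)
    # (P1): every input or output set (m != tb) must be a defined set.
    p1 = all(m[S] == tb[S] or S in partial for S in involved)
    # (P2): the weighted imbalance over the defined sets is strictly positive.
    p2 = sum(value * (tb[T] - m[T]) for T, value in partial.items()) > 0
    return p1 and p2


fs = frozenset
H = {fs(): 0, fs({1}): 1, fs({2}): 1, fs({1, 2}): 3}
print(is_square_certificate([(fs({1}), fs({2}))], H))  # True
```

In the example, $f(\{1\})+f(\{2\})=2<3=f(\{1,2\})+f(\emptyset)$, so the single square on these four defined sets already certifies that $H$ has no submodular extension.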

Lemma 8.

A partial function is extendible iff there does not exist a square certificate.

Proof.

A partial function can be extended iff the following linear program (with variables $w_{S}$ for all $S\subseteq[m]$ ) is feasible:

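$w_{A}+w_{B}\geq w_{A\cup B}+w_{A\cap B}$ for all $A,B\subseteq[m]$ ,

$w_{T_{i}}=f_{i}$ for all $i\in[n]$ .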

Using Farkas’ lemma, the following linear program with variables $y_{\{A,B\}}$ for all $A,B\subseteq[m]$ must then be infeasible:

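$\sum_{B}y_{\{A,B\}}=\sum_{P,Q:P\cup Q=A}y_{\{P,Q\}}+\sum_{P,Q:P\cap Q=A}y_{\{P,Q\}}$ for all $A\in 2^{[m]}\setminus\mathcal{D}$ ,

$\sum_{i\in[n]}f_{i}\left(\sum_{P,Q:P\cup Q=T_{i}}y_{\{P,Q\}}+\sum_{P,Q:P\cap Q=T_{i}}y_{\{P,Q\}}-\sum_{B}y_{\{T_{i},B\}}\right)>0$ ,

$y_{\{A,B\}}\geq 0$ for all $A,B\subseteq[m]$ .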

We will show that there exists a square certificate iff this dual linear program is feasible. Firstly, note that if the dual is feasible then it has a rational solution, and any rational solution to the dual can be converted into an integral solution by multiplying by the product of the denominators. Now given a square certificate, we identify each variable $y_{\{A,B\}}$ in the dual with the square tuple $(\{A,B\},A\cup B,A\cap B)$ , and set $y_{\{A,B\}}=k$ if this square tuple appears $k$ times in the square certificate. Then for a set $A$ , $m(A)=\sum_{B}y_{\{A,B\}}$ , and $tb(A)$ $=\sum_{P,Q:P\cup Q=A}y_{\{P,Q\}}$ $+\sum_{P,Q:P\cap Q=A}y_{\{P,Q\}}$ . The first constraint then says that if $A$ is not defined, then $A$ must appear an equal number of times as a middle point, and as a top or bottom point. That is, $A$ must be an intermediate set. The second constraint in the linear program corresponds exactly to property (P2). Thus, a square certificate gives us a feasible dual solution. The converse can be proved with the same construction: given an integral dual solution, we create a square certificate by including a square tuple $(\{A,B\},A\cup B,A\cap B)$ $y_{\{A,B\}}$ times. The first and second constraints in the linear program give us properties (P1) and (P2) of the square certificate exactly. ∎

Lemma 9.

If there is a square certificate for a partial function, then for any extension $f(\cdot)$ to the hypercube, there is a square tuple $(\{A,B\},A\cup B,A\cap B)$ in the square certificate with $f(A)+f(B)<f(A\cup B)+f(A\cap B)$ .

Our main technical result is the following lemma.

Lemma 10.

If there is a square certificate for a partial function, then there is a square certificate where, if $S$ is an involved set, then (i) there exist $T_{i}$ , $T_{j}\in\mathcal{D}$ so that $T_{i}\supseteq S\supseteq T_{j}$ , and (ii) $S\in LC(\mathcal{D})$ .

We will develop the proof of the second part of the lemma in the remainder of the paper. The first part, however, is easily seen. Let $S$ be an involved set. Then either by property (P1) $S\in\mathcal{D}$ , in which case $T_{i}=T_{j}=S$ , or $S$ is undefined and hence $S$ is an intermediate set. In the latter case, $m(S)=tb(S)>0$ , and hence $S$ is a middle set in some square $(S,S^{\prime},S\cup S^{\prime},S\cap S^{\prime})$ . If $S\cup S^{\prime}$ is defined, then $T_{i}=S\cup S^{\prime}$ . Otherwise, $S\cup S^{\prime}$ is an intermediate set, and we can continue in this way until we find a defined set that contains $S$ . Similarly, considering the bottom point $S\cap S^{\prime}$ of the square, we can find a defined set that is contained by $S$ .

Proof of Theorem 7 (assuming Lemma 10).

For the first part of the theorem, if the partial function cannot be extended, there must exist a square certificate. In particular, from (P2), there must exist $T_{i}\in\mathcal{D}$ with $tb(T_{i})-m(T_{i})>0$ . Thus, $T_{i}$ must be part of at least one square tuple $(A,B,A\cup B,A\cap B)$ that is not trivial, i.e., $A\neq B$ (otherwise all four sets are just equal to $T_{i}$ , and this square contributes to $tb(T_{i})$ and $m(T_{i})$ equally). Then $A\cap B\neq A\cup B$ , and either $T_{i}\subsetneq A\cup B$ , or $T_{i}\supsetneq A\cap B$ (or both). In the former case, by Lemma 10, there must exist $T_{j}\supseteq A\cup B\neq T_{i}$ with $T_{j}\in\mathcal{D}$ , or in the latter case, there must exist $T_{j}\subseteq A\cap B\neq T_{i}$ with $T_{j}\in\mathcal{D}$ . However, since $\mathcal{D}$ is an antichain, no such sets $T_{i}$ , $T_{j}$ can exist.

To prove that submodular functions cannot be learned, consider the family $\mathcal{F}={[m]\choose\frac{m}{2}}$ of sets of size $m/2$ , and note that $|\mathcal{F}|=\Theta(2^{m}/\sqrt{m})$ . Since $\mathcal{F}$ is an antichain, any partial function with sets in $\mathcal{F}$ and arbitrary values for the sets is extendible to a submodular function, and hence submodular functions cannot be learned to within any factor (by Lemma 2).

For the second part of the theorem, let $f(\cdot)$ be a submodular extension of the partial function to $\mathcal{F}$ . Then for any points $A,B\in\mathcal{F}$ so that $A\cup B,A\cap B$ are also in $\mathcal{F}$ , $f(A)+f(B)\geq f(A\cup B)+f(A\cap B)$ . Suppose for a contradiction that the partial function cannot be extended to a submodular function on the entire domain. Then by Lemma 8 and Lemma 10, there is a square certificate where all involved sets are in $\mathcal{F}$ . But then we have a square certificate and an extension $f(\cdot)$ for which $f(A)+f(B)\geq f(A\cup B)+f(A\cap B)$ for all square tuples in the square certificate. This gives us a contradiction, since by Lemma 9 the two cannot coexist. Thus, it is necessary and sufficient to find a submodular extension to points in $\mathcal{F}$ . Therefore, Approximate Extension can be solved by writing a linear program with the variable $w_{S}$ for each set $S\in\mathcal{F}$ and objective of minimizing $\alpha\geq 1$ subject to submodularity constraints on the sets $w_{S}+w_{T}\geq w_{S\cup T}+w_{S\cap T}$ for $S,T,S\cup T,S\cap T\in\mathcal{F}$ , and the constraints $f_{i}\leq w_{T_{i}}\leq\alpha f_{i}$ for each set $T_{i}\in\mathcal{D}$ . This gives an algorithm with the desired running time. ∎

Given a square certificate, we need to construct another square certificate where all involved sets are in the lattice closure of $\mathcal{D}$ , i.e., are obtained by the union and intersection of sets in $\mathcal{D}$ . In particular, this means that all intermediate sets must be in $LC(\mathcal{D})$ , since by property (P1) input and output sets are already in $\mathcal{D}$ . Our insight is to think of the union and intersection of two sets as the bitwise OR and AND of the characteristic vectors respectively. Given a square certificate, we will construct a boolean circuit with certain properties, which we call a boolean certificate. Similar to square certificates, boolean certificates are also necessary and sufficient conditions for a partial function to not be extendible to a submodular function.

As background for defining boolean certificates, we first define boolean circuits. Formally, a boolean circuit is a directed graph with three types of nodes: input nodes IP with no incoming edges, intermediate nodes IM with incoming and outgoing edges, and output nodes OP with no outgoing edges. Intermediate and output nodes are labelled with logic gates and perform the labelled boolean operation. The computed value at an intermediate node is fed to other intermediate or output nodes, and the computed values at the output nodes are the output of the circuit. The fan-in of a gate is its indegree, which is the number of arguments for the boolean operation. Throughout the paper, we will consider only AND and OR gates with fan-in 2. An assignment $A:\textsf{IP}\cup\textsf{OP}\cup\textsf{IM}\rightarrow\{0,1\}$ is called satisfying if the boolean operation at each gate is satisfied by the assignment. That is, if $g$ is an AND gate with inputs $g_{1}$ and $g_{2}$ , then $A(g)=A(g_{1})\land A(g_{2})$ , and $A(g)=A(g_{1})\lor A(g_{2})$ if $g$ is an OR gate.

The boolean circuits we consider are constructed from square certificates and can contain cycles. In a cyclic boolean circuit, given values to the input nodes, there may be many satisfying assignments (see Figure 1). The technical difficulty in proving the second part of Lemma 10 is due to the presence of cycles in the constructed boolean circuit (the square certificate may correspondingly have analogous “cycles”). If the constructed boolean circuit does not have cycles, then the proof of the lemma follows trivially.
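A toy cyclic circuit makes the non-uniqueness of satisfying assignments concrete. The circuit below is a hypothetical two-gate example chosen for illustration (it is not the circuit of Figure 1 and is not claimed to meet the degree conditions of a boolean certificate): $g_{1}=x\lor g_{2}$ and $g_{2}=g_{1}\land y$. The sketch enumerates all assignments to the non-input gates and keeps the satisfying ones.

```python
from itertools import product

# gate name -> (operation, first input, second input); all names are hypothetical
GATES = {"g1": ("OR", "x", "g2"), "g2": ("AND", "g1", "y")}


def is_satisfying(assign):
    """True if every gate's value equals its operation applied to its two inputs."""
    for g, (op, a, b) in GATES.items():
        want = (assign[a] & assign[b]) if op == "AND" else (assign[a] | assign[b])
        if assign[g] != want:
            return False
    return True


def satisfying_assignments(input_values):
    """All satisfying assignments that agree with the given input values."""
    names = list(GATES)
    result = []
    for bits in product((0, 1), repeat=len(names)):
        assign = dict(input_values, **dict(zip(names, bits)))
        if is_satisfying(assign):
            result.append(assign)
    return result


print(satisfying_assignments({"x": 0, "y": 1}))  # two assignments: g1 = g2 = 0 or 1
print(satisfying_assignments({"x": 1, "y": 0}))  # a single assignment
```

With $x=0$ and $y=1$ the two gates only constrain each other to be equal, so both the all-zero and the all-one assignment are satisfying.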

We now define the notion of computation in cyclic boolean circuits. We use G to denote the set of all gates. We assume an ordering of the gates, with the input gates appearing first in this order, followed by the output gates and then the intermediate gates. An assignment to the input gates is denoted $(x_{1},\dots,x_{|\textsf{IP}|})\in\{0,1\}^{|\textsf{IP}|}$ . An assignment to all gates will be denoted by $(x_{1},\dots,x_{|\textsf{IP}|},y_{1},\dots,y_{|\textsf{OP}|},z_{1},\dots,z_{|\textsf{IM}|})\in\{0,1\}^{|\textsf{G}|}$ . We use $X_{1}$ , $X_{2}$ , $\ldots$ , $X_{|\textsf{IP}|}$ to refer to the input gates.

Recall that an assignment $(x_{1},\dots,x_{|\textsf{IP}|},y_{1},\dots,y_{|\textsf{OP}|},z_{1},\dots,z_{|\textsf{IM}|})$ is called satisfying if the boolean operation at each gate is satisfied by the assignment. Given an assignment $(x_{1},\dots,x_{|\textsf{IP}|})$ to input gates, we say a gate $g^{*}$ computes a value $b^{*}\in\{0,1\}$ if the value at $g^{*}$ is $b^{*}$ in every satisfying assignment to the gates, with the input gates as specified. It is possible that a gate does not compute any value. We now formally define the notion of a gate being fixed to a value $b\in\{0,1\}$ . The equivalence of the two definitions is shown in Lemma 12.

Definition 2.

Given an assignment $(x_{1},\dots,x_{|\textsf{IP}|})\in\{0,1\}^{|\textsf{IP}|}$ to the input gates, a gate $g^{*}$ is fixed to value $b^{*}\in\{0,1\}$ if there exists a subgraph $G^{\prime}=(V^{\prime},E^{\prime})$ of the boolean circuit with values $\operatorname{\textsf{val}}(g)\in\{0,1\}$ for all gates $g\in V^{\prime}$ , so that:

•

$G^{\prime}$ is a rooted tree with all edges directed towards the root.

•

$g^{*}$ is the root, and all leaves are input nodes.

•

$\operatorname{\textsf{val}}(g^{*})=b^{*}$ , and $\operatorname{\textsf{val}}(X_{i})=x_{i}$ for each input node $X_{i}$ that is a leaf.

•

If gate $g$ has only one child $g^{\prime}$ in the tree then either (i) $g$ is an AND gate and $\operatorname{\textsf{val}}(g)=\operatorname{\textsf{val}}(g^{\prime})=0$ or (ii) $g$ is an OR gate and $\operatorname{\textsf{val}}(g)=\operatorname{\textsf{val}}(g^{\prime})=1$ .

•

If gate $g$ has two children $g^{\prime}$ , $g^{\prime\prime}$ in the tree then either (i) $g$ is an AND gate and $\operatorname{\textsf{val}}(g)=\operatorname{\textsf{val}}(g^{\prime})=\operatorname{\textsf{val}}(g^{\prime\prime})=1$ or (ii) $g$ is an OR gate and $\operatorname{\textsf{val}}(g)=\operatorname{\textsf{val}}(g^{\prime})=\operatorname{\textsf{val}}(g^{\prime\prime})=0$ .

We call the above rooted tree and associated values a proof of $g^{*}$ fixed to $b^{*}$ for the given assignment to the input gates. We make a few observations regarding the definition. First, since in the boolean circuit the fan-in for each gate is 2, the rooted tree is a binary tree. Second, since the value of every gate in the rooted tree must be the same as its children, in fact every gate in the rooted tree must have the same value.
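One way to read Definition 2 is as a forward propagation from the inputs: a single decisive child (a 0 into an AND gate, a 1 into an OR gate) or two agreeing non-decisive children force the value of a gate. The sketch below computes the values forced in this way. It is an interpretation rather than the paper's procedure: the name `fixed_values` is hypothetical, and the sketch assumes that a gate may be reused in several branches of a proof, which the tree condition in the definition may or may not allow.

```python
def fixed_values(gates, inputs):
    """Propagate forced values through a (possibly cyclic) circuit.

    gates: dict name -> (op, in1, in2) with op in {"AND", "OR"};
    inputs: dict name -> 0/1 for the input gates.
    Returns a dict with the inputs and every gate whose value got forced.
    """
    val = dict(inputs)
    changed = True
    while changed:
        changed = False
        for g, (op, a, b) in gates.items():
            if g in val:
                continue
            known = [val[c] for c in (a, b) if c in val]
            if op == "AND" and 0 in known:
                val[g] = 0            # one child at 0 forces an AND gate to 0
            elif op == "OR" and 1 in known:
                val[g] = 1            # one child at 1 forces an OR gate to 1
            elif len(known) == 2:
                val[g] = known[0]     # AND of two 1s, or OR of two 0s
            else:
                continue
            changed = True
    return val


gates = {"g1": ("OR", "x", "g2"), "g2": ("AND", "g1", "y")}  # toy cyclic circuit
print(fixed_values(gates, {"x": 0, "y": 1}))  # only the inputs: g1, g2 are not forced
print(fixed_values(gates, {"x": 1, "y": 1}))  # g1 and g2 are both forced to 1
```

In the toy circuit, the input $x=0$, $y=1$ leaves both gates unforced, matching the fact that this input admits two different satisfying assignments.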

Note that a priori, a gate may be fixed to different values for the same input. We now show that in fact this cannot happen.

Lemma 11.

If a gate $g$ is fixed to both $b^{\prime}$ and $b^{\prime\prime}$ for a particular assignment to the input gates, then $b^{\prime}=b^{\prime\prime}$ .

Define $\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}$ to be the set of all gates that get fixed by the assignment $(x_{1},\dots,x_{|\textsf{IP}|})$ to the input gates. We say such gates are fixed, with the inputs clear from the context. If $g$ is fixed, then Lemma 11 allows us to define $\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}(g)$ as the value that gate $g$ is fixed to.

Lemma 12 says that the value of fixed gates should be consistent with any satisfying assignment and Lemma 13 says that $\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}(g)$ is monotone for fixed gate $g$ as a function of $(x_{1},\dots,x_{|\textsf{IP}|})$ . These lemmas are intuitive and are proven by induction on the height of the rooted tree corresponding to any proof that fixes the gate $g$ .

Lemma 12.

Let $A:\textsf{IP}\cup\textsf{OP}\cup\textsf{IM}\rightarrow\{0,1\}$ be a satisfying assignment to a boolean circuit, and gate $g$ be fixed by the assignment to the input gates. Then $g$ is fixed to $A(g)$ .

For two vectors $v$ , $v^{\prime}$ we use the standard notation that $v\geq v^{\prime}$ if this is true of each component.

Lemma 13.

Let $(x_{1},\ldots,x_{|\textsf{IP}|})\geq(x_{1}^{\prime},\ldots,x_{|\textsf{IP}|}^{\prime})$ be two assignments to the input gates, and consider a gate $g$ . If $g$ is fixed to $0$ by the assignment $(x_{1},\ldots,x_{|\textsf{IP}|})$ , then this is true for $(x_{1}^{\prime},\ldots,x_{|\textsf{IP}|}^{\prime})$ as well. Conversely, if $g$ is fixed to $1$ by the assignment $(x_{1}^{\prime},\ldots,x_{|\textsf{IP}|}^{\prime})$ , then this is true for $(x_{1},\ldots,x_{|\textsf{IP}|})$ as well.

Boolean Certificates.

Given a partial function $H=\{(T_{1},f_{1}),\ldots,(T_{n},f_{n})\}$ , a boolean certificate, similar to a square certificate, characterizes partial functions that cannot be extended to a submodular function. A boolean certificate consists of the following two parts.

1. A boolean circuit with AND and OR gates that satisfies two conditions: (i) Input gates and intermediate gates have outdegree two, and output and intermediate gates have indegree two. (ii) If $g$ , $g^{\prime}$ are inputs to an AND gate, then these are inputs to an OR gate as well. As before, IP, OP, and IM are the sets of input, output, and intermediate gates, and G is the set of all gates.

2. A function $\mathcal{C}:\textsf{G}\rightarrow 2^{[m]}$ called the creator function, that satisfies three conditions: (i) $\mathcal{C}(g)\in\mathcal{D}$ for all $g\in\textsf{IP}\cup\textsf{OP}$ , i.e., all input and output gates map to defined sets. (ii) For each $i\in[m]$ , the assignment $A_{i}(g)=(\mathcal{C}(g))_{i}$ (which assigns $1$ to the gate if the $i$ th element is present in $\mathcal{C}(g)$ , and $0$ otherwise) is a satisfying assignment. This gives $m$ satisfying assignments to the boolean circuit. (iii) For $T_{i}\in\mathcal{D}$ , let $n_{i}^{\textsf{IP}}$ be the number of input gates that $\mathcal{C}$ maps to $T_{i}$ , and $n_{i}^{\textsf{OP}}$ be the number of output gates that $\mathcal{C}$ maps to $T_{i}$ . Then $\sum_{i\in[n]}f_{i}\left(n_{i}^{\textsf{OP}}-n_{i}^{\textsf{IP}}\right)>0$ .

Figure 2 shows an example of a square certificate and the corresponding boolean certificate.

Lemma 14.

Given a partial function, there exists a square certificate iff there exists a boolean certificate. If there is a boolean certificate with creator function $\mathcal{C}$ then there is a square certificate with $\{\mathcal{C}(g)|g\in\textsf{G}\}$ as the family of involved sets.

Proof.

Given a square certificate, we first show how to obtain a boolean certificate. The construction proceeds in two steps. In the first step, for each square tuple $(\{A,B\},A\cup B,A\cap B)$ in the square certificate, we create four gates $g_{1}$ , $g_{2}$ , $g_{3}$ , $g_{4}$ , and set $\mathcal{C}(g_{1})=A$ , $\mathcal{C}(g_{2})=B$ , $\mathcal{C}(g_{3})=A\cup B$ , and $\mathcal{C}(g_{4})=A\cap B$ . Gates $g_{1}$ , $g_{2}$ are input gates, $g_{3}$ is an output OR gate, and $g_{4}$ is an output AND gate. We add edges from both $g_{1}$ and $g_{2}$ to $g_{3}$ and $g_{4}$ . We do this for each square tuple in the square certificate. Thus, if there are $N$ square tuples, we obtain a boolean circuit with $N$ components and $4N$ gates. We call this the primary boolean circuit, and note its properties below.

1. The circuit only has input gates and output gates. Each input gate has fan-out 2, and each output gate has fan-in 2. Further, if $g$ , $g^{\prime}$ are inputs to an AND gate, these are inputs to an OR gate as well.

2. Consider an involved set $S$ in the square certificate. For every square tuple that has $S$ as a top or bottom point, there is a component in the circuit that has $g$ as an output gate, and $\mathcal{C}(g)=S$ . For every square tuple that has $S$ as a middle point, there is a component in the circuit that has $g$ as an input gate, and $\mathcal{C}(g)=S$ . The converse also holds. Thus, if $n_{S}^{\textsf{IP}}$ and $n_{S}^{\textsf{OP}}$ are the number of input and output gates respectively that the creator function maps to $S$ , then $n_{S}^{\textsf{OP}}-n_{S}^{\textsf{IP}}=tb(S)-m(S)$ . Hence, $\sum_{i\in[n]}f_{i}\left(n_{i}^{\textsf{OP}}-n_{i}^{\textsf{IP}}\right)>0$ .

3. For each $i\in[m]$ , the assignment $A_{i}(g)=(\mathcal{C}(g))_{i}$ is a satisfying assignment. This can be verified by checking each component.

4. However, $g$ may be an input or output gate, but $\mathcal{C}(g)$ an intermediate set, and hence not in $\mathcal{D}$ .
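The first step of the construction, together with the third property above, can be sketched as follows. Each square tuple, given by its two middle points, becomes one four-gate component, and the check confirms that every bitwise assignment $A_{i}$ induced by the creator function is satisfying. The gate naming scheme and the function names `primary_circuit` and `bitwise_assignments_satisfy` are hypothetical.

```python
def primary_circuit(pairs):
    """Build one component per square tuple ({A, B}, A | B, A & B).

    Returns (gates, creator): gates maps each output gate to (op, in1, in2),
    creator maps every gate name to its set.
    """
    gates, creator = {}, {}
    for k, (A, B) in enumerate(pairs):
        g1, g2, g3, g4 = (f"t{k}_{role}" for role in ("in1", "in2", "or", "and"))
        creator.update({g1: A, g2: B, g3: A | B, g4: A & B})
        gates[g3] = ("OR", g1, g2)    # the top point is computed by an OR gate
        gates[g4] = ("AND", g1, g2)   # the bottom point is computed by an AND gate
    return gates, creator


def bitwise_assignments_satisfy(gates, creator, m):
    """Check that A_i(g) = [i in C(g)] is a satisfying assignment for every i in [m]."""
    for i in range(1, m + 1):
        bit = {g: int(i in S) for g, S in creator.items()}
        for g, (op, a, b) in gates.items():
            want = (bit[a] & bit[b]) if op == "AND" else (bit[a] | bit[b])
            if bit[g] != want:
                return False
    return True


fs = frozenset
gates, creator = primary_circuit([(fs({1, 2}), fs({2, 3})), (fs({1}), fs({3}))])
print(len(creator), bitwise_assignments_satisfy(gates, creator, 3))
```

The check succeeds by construction, since membership of an element in $A\cup B$ and $A\cap B$ is the OR and AND of its membership in $A$ and $B$.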

Hence, the primary boolean circuit (with the creator function) satisfies all the conditions to be a boolean certificate, except that an input or output gate may be mapped to a set not in $\mathcal{D}$ . In the second step, we fix this. Now we take two gates $g_{1}$ , $g_{2}$ of which one (say $g_{1}$ ) is an input gate, one ( $g_{2}$ ) is an output gate, and $\mathcal{C}(g_{1})=\mathcal{C}(g_{2})$ . We replace these two by a single intermediate gate $g_{3}$ that takes inputs as did $g_{2}$ , provides output as did $g_{1}$ , has logical operation (AND or OR) as did $g_{2}$ , and set $\mathcal{C}(g_{3})=\mathcal{C}(g_{1})$ . Note that $g_{3}$ has fan-in and fan-out 2. We do this for all pairs of gates that satisfy these conditions, one pair at a time. We call the circuit thus obtained the secondary boolean circuit. It is easy to verify by induction on the number of gates replaced that the properties of a boolean certificate satisfied earlier are satisfied now as well. Finally, consider a gate $g\in\textsf{IP}$ , i.e., an input gate in the secondary boolean circuit. Let $\mathcal{C}(g)=S$ . Since $g$ has not been replaced, there is no output gate that the creator function maps to set $S$ . Then since $n_{S}^{\textsf{OP}}-n_{S}^{\textsf{IP}}=tb(S)-m(S)$ , it must be true that $m(S)>tb(S)$ , and hence $S\in\mathcal{D}$ as required. Similarly, if $g$ is an output gate in the secondary boolean circuit, then $\mathcal{C}(g)\in\mathcal{D}$ . This concludes the construction of the boolean certificate.

Given a boolean certificate, we now want to construct a square certificate so that the involved sets are exactly $\{\mathcal{C}(g)|g\in\textsf{G}\}$ . For this, we first convert the boolean circuit to a primary boolean circuit by reversing the earlier procedure. That is, if $g$ is an intermediate gate (and hence has two inputs and two outputs), we replace it by an input gate $g_{1}$ that provides output as did $g$ , and an output gate $g_{2}$ that takes inputs as did $g$ , with $\mathcal{C}(g_{1})=\mathcal{C}(g_{2})=\mathcal{C}(g)$ , and the output gate has the same logical operation as $g$ . Continuing this process, we obtain a primary boolean circuit consisting only of input and output gates. This maintains the property that if $g_{1}$ , $g_{2}$ are inputs to an AND gate, then they are inputs to an OR gate as well. Thus, this primary boolean circuit must consist of components with four gates each: inputs $g_{1}$ and $g_{2}$ , an output OR gate $g_{3}$ , and an output AND gate $g_{4}$ . We now construct our square certificate by including a square $(\{\mathcal{C}(g_{1}),\mathcal{C}(g_{2})\},\mathcal{C}(g_{3}),\mathcal{C}(g_{4}))$ for each such component in the boolean circuit. It is easy to see that for any input or output set $T_{i}\in\mathcal{D}$ in the constructed square certificate, $m(T_{i})-tb(T_{i})=n_{i}^{\textsf{IP}}-n_{i}^{\textsf{OP}}$ . Similarly, for any input or output gate $g$ in the boolean certificate with $\mathcal{C}(g)=T_{i}\in\mathcal{D}$ , we have $n_{i}^{\textsf{IP}}-n_{i}^{\textsf{OP}}=m(T_{i})-tb(T_{i})$ . Thus $\sum_{i\in[n]}f_{i}\left(tb(T_{i})-m(T_{i})\right)=\sum_{i\in[n]}f_{i}\left(n_{i}^{\textsf{OP}}-n_{i}^{\textsf{IP}}\right)>0$ . Therefore, the multiset of squares thus obtained is indeed a square certificate. Further, if $S$ is an involved set, then $S=\mathcal{C}(g)$ for some gate $g$ in the boolean certificate, as required. ∎

Recall that in a cyclic boolean circuit, a gate may not get fixed to some value. The next lemma crucially shows that any assignment to the input gates of a boolean circuit (of a boolean certificate) always fixes the values at the output gates, and hence for any satisfying assignments $A(\cdot)$ , $A^{\prime}(\cdot)$ to the gates that assign the same values to the input sets, the values assigned to the output gates must be the same as well.

The proof requires the special structure of the boolean circuit in our boolean certificate, namely that if $g$ , $g^{\prime}$ are inputs to an AND gate, they are also inputs to an OR gate. First, since intermediate gates in the circuit have fan-in = fan-out, and fan-out for input gates = fan-in for output gates = 2, $|\textsf{IP}|=|\textsf{OP}|$ . We will show that $\textsf{OP}\subseteq\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}$ for any assignment $(x_{1},\dots,x_{|\textsf{IP}|})$ to the input gates. That is, all the output gates get fixed for any assignment to the input gates.

We show this by giving an algorithm (Algorithm 1) that takes as input an assignment $(x_{1},\dots,x_{|\textsf{IP}|})$ to the input gates, and assigns value $0$ or $1$ to some subset of gates $\textsf{G}^{\prime}\supseteq\textsf{OP}$ by setting $\operatorname{\textsf{val}}(g)$ for gates $g\in\textsf{G}^{\prime}$ . We will prove that all gates in $\textsf{G}^{\prime}$ are in $\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}$ and that $\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}(g)=\operatorname{\textsf{val}}(g)$ .

Initially, $\textsf{G}^{\prime}=\textsf{IP}$ . The algorithm maintains the invariant that for each gate $g\in\textsf{G}^{\prime}$ , there is a value $\operatorname{\textsf{val}}(g)$ and a proof in the sense of Definition 2 that $\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}(g)=\operatorname{\textsf{val}}(g)$ using gates that were assigned values before $g$ . It starts with a gate $X_{i}\in\textsf{G}^{\prime}$ as the current gate. If the current gate is not an output gate, then it is an input to an OR gate (say $g_{2}$ ) and an AND gate (say $g_{3}$ ). Further, let $g_{1}$ be the other input to these gates (such a gate must exist, by property of boolean certificates). Then it uses properties from Definition 2 to show that at least one of $g_{2}$ , $g_{3}$ has not been assigned a value, and can be chosen as the current gate and assigned a value so that the invariant is maintained. If the (new) current gate is not an output gate, the algorithm continues by considering the AND and OR gates that the current gate is input to.

Lemma 15.

Let $\textsf{G}^{\prime}$ be the set of all gates $g$ such that $\operatorname{\textsf{val}}(g)$ is set by the Fixing Algorithm. Then $\textsf{OP}\subseteq\textsf{G}^{\prime}$ , and for all gates $g\in\textsf{G}^{\prime}$ , $g$ is in $\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}$ and $\operatorname{\textsf{fix}}^{(x_{1},\dots,x_{|\textsf{IP}|})}(g)=\operatorname{\textsf{val}}(g)$ .

Proof.

The algorithm starts from gate $X_{1}$ , and picks a sequence of gates and assigns them a value. Assume first that if we enter the while loop, at least one of the conditions from lines 6, 8, 10, or 12 must be true. Then starting with an input gate $X_{i}$ (i.e., in the $i$ th iteration of the for loop), the algorithm assigns values to distinct gates that did not earlier have a value, until it assigns a value to an output gate. Then for each $X_{i}$ , a distinct output gate is assigned a value, and since $|\textsf{IP}|=|\textsf{OP}|$ , $\textsf{OP}\subseteq\textsf{G}^{\prime}$ .

We now show that one of the conditions from lines 6, 8, 10, or 12 must be true at any step, and that for all gates $g$ in $\textsf{G}^{\prime}$ , we have a proof of $g$ being fixed to $\operatorname{\textsf{val}}(g)$ . We claim that the following statement is an invariant: if gate $g$ is assigned a value, then $g$ has a proof of being fixed to this value consisting only of gates that were assigned values before $g$ . To show this, note that after the execution of line 3, the invariant will continue to hold if it was true previously, as input gates are trivially fixed to their value. We show that if we enter the while loop with the invariant condition then one of the conditions in lines 6, 8, 10, or 12 must be true and the invariant must hold true after execution of lines 5 to 14. This will complete the proof. Let $g$ be the current gate $CG$ . Then $g$ must have been assigned a value in the previous iteration of the while loop. Since we now enter the while loop, $g$ is not an output gate. Let $g_{2}$ , $g_{3}$ be the OR and AND gates for which $g$ is an input, and let $g_{1}$ be the other input to these gates. Suppose for a contradiction that $\operatorname{\textsf{val}}(g_{2})\neq\bot$ and $\operatorname{\textsf{val}}(g_{3})\neq\bot$ . By the invariant condition, both $g_{2}$ and $g_{3}$ have a proof of being fixed that does not use gate $g$ . But $g_{2}$ is an OR gate, and can have a proof without $g$ only if $g_{1}$ is fixed to $1$ , and $g_{3}$ is an AND gate, and can have a proof without $g$ only if $g_{1}$ is fixed to $0$ . By Lemma 11, this is a contradiction. Thus one of the conditions 6, 8, 10, or 12 must hold true.

We now show that the invariant condition holds after execution of one of the lines 6, 8, 10, 12. Suppose condition 6 holds (the case when condition 8 holds is very similar). Then $g_{3}$ is assigned a value in the current iteration of the while loop. Since $g_{2}$ is an OR gate and has a value assigned previously, by the invariant condition $g_{1}$ must previously (before $g_{2}$ was assigned its value) have been assigned $\operatorname{\textsf{val}}(g_{1})=1$ . We then get the proof of $g_{3}$ being fixed to $\operatorname{\textsf{val}}(g_{3})$ by making the roots of the proofs of $g$ and $g_{1}$ the children of $g_{3}$ when $\operatorname{\textsf{val}}(g)=1$ , or the root of the proof of $g$ the only child of $g_{3}$ when $\operatorname{\textsf{val}}(g)=0$ .

Suppose condition $10$ holds (the case when condition 12 holds is very similar). Then gate $g_{3}$ is assigned a value in this while loop. Since $g_{3}$ is an AND gate, we get a proof of its being fixed to $\operatorname{\textsf{val}}(g_{3})=0$ by making the root of proof of $g$ the only child of $g_{3}$ . ∎

A function $f:\{0,1\}^{n}\rightarrow\{0,1\}$ is a monotone function if for any two assignments $(x_{1},\dots,x_{n})$ and $(x^{\prime}_{1},\dots,x^{\prime}_{n})$ such that for all $i\in[n],x_{i}\leq x^{\prime}_{i}$ , $f(x_{1},\dots,x_{n})\leq f(x^{\prime}_{1},\dots,x^{\prime}_{n})$ . It is well known that a function $f$ is monotone iff $f(x_{1},\dots,x_{n})=\sum_{S\subseteq[n]}\alpha_{S}\prod_{j\in S}x_{j}$ for some $\alpha_{S}\in\{0,1\}$ for all $S\subseteq[n]$ , and hence $f(\cdot)$ can be written as the OR of the AND of the subsets $S$ with $\alpha_{S}=1$ .
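The characterization above can be checked by brute force on small examples. The following is a minimal sketch, not part of the paper: the helper names `is_monotone`, `minimal_terms`, and `dnf_eval` are hypothetical, and the code enumerates all $2^{n}$ inputs, so it is only meant for tiny $n$ . It tests monotonicity by single-bit flips and rebuilds a monotone function as the OR of the ANDs of its minimal true sets.

```python
from itertools import product


def is_monotone(f, n):
    """Check f(x) <= f(x') whenever x <= x' coordinate-wise.

    Checking single 0 -> 1 flips suffices, since any x <= x' can be
    reached by a chain of such flips.
    """
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if f(x) > f(y):
                    return False
    return True


def minimal_terms(f, n):
    """Return the inclusion-minimal sets S whose indicator vector has f = 1."""
    trues = [
        frozenset(i for i in range(n) if x[i])
        for x in product((0, 1), repeat=n)
        if f(x)
    ]
    return [s for s in trues if not any(t < s for t in trues)]


def dnf_eval(terms, x):
    """Evaluate the OR, over the sets in terms, of the AND of their variables."""
    return int(any(all(x[i] for i in s) for s in terms))


# Example: (x0 AND x1) OR x2 is monotone and is recovered from its minimal terms.
f = lambda x: int((x[0] and x[1]) or x[2])
terms = minimal_terms(f, 3)
```

For a monotone $f$ , `dnf_eval(terms, x)` agrees with `f(x)` on every input; for a non-monotone function such as $1-x_{0}$ , `is_monotone` returns `False`.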

Given a boolean certificate, we associate with each gate $g$ a function $f_{g}:\{0,1\}^{|\textsf{IP}|}\rightarrow\{0,1\}$ as follows. $f_{g}(x_{1},\ldots,x_{|\textsf{IP}|})=b\in\{0,1\}$ if the assignment $(x_{1},\dots,x_{|\textsf{IP}|})$ to the input gates fixes gate $g$ to $b$ , else $f_{g}(x_{1},\ldots,x_{|\textsf{IP}|})=1$ . The next two lemmas show that for each gate $g$ , the function $f_{g}$ is a monotone function, and that for any $(x_{1},\dots,x_{|\textsf{IP}|})\in\{0,1\}^{|\textsf{IP}|}$ , the assignment $A(g)=f_{g}(x_{1},\dots,x_{|\textsf{IP}|})$ is a satisfying assignment. These allow us to complete the proof of Lemma 10.

Lemma 16.

The function $f_{g}$ is monotone for each intermediate and output gate $g\in\textsf{OP}\cup\textsf{IM}$ .

Lemma 17.

For any assignment $(x_{1},\dots,x_{|\textsf{IP}|})$ to the input gates, the extension $A$ given by $A(X_{i})=x_{i}$ for all $X_{i}\in\textsf{IP},A(g)=f_{g}(x_{1},\dots,x_{|\textsf{IP}|})$ for all $g\in\textsf{OP}\cup\textsf{IM}$ is a satisfying assignment.

Proof of Lemma 10.

We prove the second property, since the first was proved earlier. If there is a square certificate for a partial function, then there exists a boolean certificate. From the definition of boolean certificates, for each $i\in[m]$ , the assignment to all gates in which gate $g\in\textsf{G}$ is assigned $(\mathcal{C}(g))_{i}$ is a satisfying assignment. From Lemmas 12 and 15, we have $\operatorname{\textsf{fix}}^{((\mathcal{C}(X_{1}))_{i},\dots,(\mathcal{C}(X_{|\textsf{IP}|}))_{i})}(g)=(\mathcal{C}(g))_{i}$ for all output gates $g$ and $i\in[m]$ (recall that $\textsf{IP}=\{X_{1},\dots,X_{|\textsf{IP}|}\}$ ). Consider the assignment $B^{i}:\textsf{G}\rightarrow\{0,1\}$ , for each $i\in[m]$ , given by $B^{i}(g)=(\mathcal{C}(g))_{i}$ for all $g\in\textsf{IP}$ and $B^{i}(g)=f_{g}((\mathcal{C}(X_{1}))_{i},\dots,(\mathcal{C}(X_{|\textsf{IP}|}))_{i})$ for all $g\in\textsf{OP}\cup\textsf{IM}$ . Hence, $B^{i}(g)=(\mathcal{C}(g))_{i}$ for all $g\in\textsf{IP}\cup\textsf{OP}$ and $i\in[m]$ . Consider the function $\mathcal{C}^{\prime}:\textsf{G}\rightarrow 2^{[m]}$ given by $\mathcal{C}^{\prime}(g)=\{i\in[m]|B^{i}(g)=1\}$ . Note that $\mathcal{C}^{\prime}(g)=\mathcal{C}(g)$ for all $g\in\textsf{IP}\cup\textsf{OP}$ and hence $\mathcal{C}^{\prime}$ satisfies the second property of a boolean certificate. Also $(\mathcal{C}^{\prime}(g))_{i}$ is a satisfying assignment for each $i\in[m]$ by Lemma 17. Therefore, there exists a boolean certificate with creator function $\mathcal{C}^{\prime}$ instead of $\mathcal{C}$ . By Lemma 14, there exists a square certificate whose involved sets are exactly $\{\mathcal{C}^{\prime}(g)|g\in\textsf{G}\}$ . Since $f_{g}$ is a monotone function of the values at the input gates, we have for all $i\in[m]$ , $B^{i}(g)=\sum_{S\subseteq\textsf{IP}}\alpha_{S}\prod_{j\in S}(\mathcal{C}(X_{j}))_{i}$ for some $\alpha_{S}\in\{0,1\}$ . Therefore, for all $g\in\textsf{G}$ , $\mathcal{C}^{\prime}(g)$ can be obtained by union and intersection of sets $\mathcal{C}(X_{j})$ for $X_{j}\in\textsf{IP}$ , i.e., of input sets. ∎

4 Convex functions

A function $f:K\rightarrow\mathbb{R}$ is convex ( $K\subseteq\mathbb{R}^{m}$ ) iff for all $x,y\in K$ and $\lambda\in[0,1]$ ,

$$f(\lambda x+(1-\lambda)y)\leq\lambda f(x)+(1-\lambda)f(y).$$

The problem of extending a convex function is extensively studied in convex analysis. We briefly discuss the work of Yan [33] and Dragomirescu and Ivan [21], focusing on the presentation by Yan. Given a partial function $f$ that is convex on a non-convex domain $C$ , Yan considers extending $f$ to a convex function both within and outside the convex hull $\operatorname{Conv}(C)$ . For a point $x\in\operatorname{Conv}(C)$ , he defines function $\hat{g}$ using the well-known convex roof construction:

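$$\hat{g}(x)=\inf\Big\{\sum_{y\in C}\lambda_{y}f(y)\;:\;\sum_{y\in C}\lambda_{y}y=x,\ \sum_{y\in C}\lambda_{y}=1,\ \lambda_{y}\geq 0\Big\}.\tag{4}$$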

The infimum thus runs over all possible convex combinations of points $y\in C$ that evaluate to $x$ . If the function $f$ is bounded below or the domain $C$ contains a point in the relative interior of $\operatorname{Conv}(C)$ , then $\hat{g}$ is a convex extension [33]. For a point $x$ outside the convex hull of $C$ , assuming now that $f$ is defined and convex inside $\operatorname{Conv}(C)$ , Yan defines

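$$\tilde{g}(x)=\sup\Big\{\lambda f(y)+(1-\lambda)f(z)\;:\;y,z\in\operatorname{Conv}(C),\ \lambda\geq 1,\ \lambda y+(1-\lambda)z=x\Big\}.$$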

The function $\tilde{g}$ is a convex extension iff $f$ satisfies the Lipschitz property on $\operatorname{Conv}(C)$ [33]. (A function $f$ has the Lipschitz property on $K$ if there exists a constant $L$ such that $|f(x)-f(y)|\leq L||x-y||$ for all $x,y\in K$ .) The extensions $\hat{g}(x)$ and $\tilde{g}(x)$ are optimal in the following sense. Within the convex hull, $\hat{g}(x)$ is maximal, i.e., for any convex function $g$ that extends $f$ to the convex hull of $C$ , for any point $x$ inside the convex hull, $g(x)\leq\hat{g}(x)$ . This is because for optimal $\lambda$ in the definition of $\hat{g}$ , $g(x)\leq\sum_{y\in C}\lambda_{y}g(y)=\sum_{y\in C}\lambda_{y}f(y)=\hat{g}(x)$ . Similarly, outside the convex hull, $\tilde{g}(x)$ is minimal, i.e., for any convex function $g$ that extends $f$ (assuming $f$ is defined and convex on the convex hull) to $\mathbb{R}^{m}$ , $g(x)\geq\tilde{g}(x)$ for any point $x$ outside the convex hull. This again can be easily seen, as for optimal $\lambda$ and $y,z\in\operatorname{Conv}(C)$ in the definition of $\tilde{g}$ , $g(x)\geq\lambda g(y)+(1-\lambda)g(z)=\lambda f(y)+(1-\lambda)f(z)=\tilde{g}(x)$ .

Our results.

The set of points $\mathcal{D}=\{T_{1},\dots,T_{n}\}$ in the given partial function $H$ corresponds to the domain $C$ described above. We assume for simplicity that $\operatorname{Conv}(\mathcal{D})$ has non-zero volume. We show that Approximate Convex Extension (and hence Convex Extension) can be solved in polynomial time. We also give a unified construction for a convex function $\tilde{f}$ that equals $\hat{g}(x)$ inside the convex hull $\operatorname{Conv}(\mathcal{D})$ and $\tilde{g}(x)$ outside, and show that if there exists a convex extension of $H$ , then our construction is a convex extension. However, we show that evaluating $\tilde{f}$ for a point $x\in\operatorname{Conv}(\mathcal{D})$ is strongly NP-hard. Our results hold for concave functions as well, using the fact that $f$ is a convex function iff $-f$ is concave. Recall that $H=\{(T_{1},f_{1}),\dots,(T_{n},f_{n})\}$ is our partial function, with each $T_{i}\in\mathbb{R}^{m}$ and $f_{i}\in\mathbb{R}.$

We now give the construction of our convex extension $\tilde{f}$ . Since our partial function is defined on a finite set of points $\mathcal{D}$ , the convex roof function $\hat{g}(x)$ from (4) is the optimal value of the linear program Convex-P:

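$$\begin{array}{llll}\textsf{Convex-P:}&\min&\sum_{i=1}^{n}\lambda_{i}f_{i}&\\&\text{s.t.}&\sum_{i=1}^{n}\lambda_{i}(T_{i})_{j}=x_{j}&\forall j\in[m]\\&&\sum_{i=1}^{n}\lambda_{i}=1&\\&&\lambda_{i}\geq 0&\forall i\in[n]\\[6pt]\textsf{Convex-D:}&\max&\sum_{j=1}^{m}x_{j}y_{j}+\mu&\\&\text{s.t.}&\sum_{j=1}^{m}(T_{i})_{j}y_{j}+\mu\leq f_{i}&\forall i\in[n]\end{array}$$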

The linear program Convex-D is the dual of Convex-P and has variables $(y_{1},\dots,y_{m},\mu)$ . Now let $Q=\{(y_{1},\dots,y_{m},\mu)|\sum_{j=1}^{m}(T_{i})_{j}y_{j}+\mu\leq f_{i}\quad\forall i\in[n]\}$ denote the polyhedron of dual feasible solutions. By our assumption that $\operatorname{Conv}(\mathcal{D})$ has non-zero volume, $Q$ has at least one vertex (shown in Appendix C). Let $V=\{v_{1},\dots,v_{N}\}$ be the set of vertices of the polyhedron $Q$ , and let $v_{i}=(y^{i},\mu^{i})$ for all $i\in[N]$ , with $y^{i}\in\mathbb{R}^{m}$ and $\mu^{i}\in\mathbb{R}$ . We define $\tilde{f}$ as the maximum, over all vertices of $Q$ , of the objective of Convex-D:

$$\tilde{f}(x)=\max_{i\in[N]}\Big(x^{T}y^{i}+\mu^{i}\Big).\tag{6}$$

Since $\tilde{f}(x)$ is the maximum of linear functions, it is convex. Also, $\tilde{f}(x)=\hat{g}(x)$ if $x\in\operatorname{Conv}(\mathcal{D})$ . This is because Convex-P is feasible for $x\in\operatorname{Conv}(\mathcal{D})$ , and hence the dual is bounded and there is an optimal vertex.

Lemma 18.

A partial function $H$ can be extended to a convex function on $\mathbb{R}^{m}$ iff $\tilde{f}(T_{i})=f_{i}$ for all $i\in[n]$ .

Proof.

Since $\tilde{f}(x)$ is convex, if $\tilde{f}(T_{i})=f_{i}$ for all $i\in[n]$ then clearly $\tilde{f}$ is a convex extension to $\mathbb{R}^{m}$ .

Now suppose that $\tilde{f}(T_{k})=\hat{g}(T_{k})$ is not equal to $f_{k}$ for some $k\in[n]$ . For any $x=T_{i}$ , it is clear that $\hat{g}(x)\leq f_{i}$ (if we set $\lambda_{i}=1$ , then the objective value is $f_{i}$ ). Thus $\hat{g}(T_{k})<f_{k}$ . If there exists a convex extension $g$ then $f_{k}=g(T_{k})\leq\hat{g}(T_{k})<f_{k}$ , which is a contradiction. The first inequality is due to the earlier observation that if $g$ is any convex extension of $H$ on $\operatorname{Conv}(\mathcal{D})$ then $g(x)\leq\hat{g}(x)$ for all $x\in\operatorname{Conv}(\mathcal{D})$ . ∎

Theorem 8.

Convex Extension is in P, and if $H$ is extendible then $\tilde{f}$ is an extension.

Proof.

By Lemma 18, determining the existence of extension boils down to checking $\tilde{f}(T_{i})=\hat{g}(T_{i})=f_{i}$ for all $i\in[n]$ . Since $\hat{g}$ can be efficiently computed by solving Convex-P, Convex Extension is in P. If $H$ is extendible then we have by Lemma 18, $\tilde{f}(T_{i})=f_{i}$ for all $i\in[n]$ . Hence, $\tilde{f}$ is an extension. ∎
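The test in Theorem 8 can be illustrated in one dimension without an LP solver. The sketch below is an illustration under stated assumptions, not the paper's implementation: it assumes $m=1$ and distinct points $T_{i}$ , and the names `convex_roof_1d` and `is_convex_extendible_1d` are hypothetical. For $m=1$ , Convex-P has two equality constraints, so some optimal solution is supported on at most two points, and $\hat{g}(x)$ is the smallest value obtained by interpolating between a point to the left and a point to the right of $x$ .

```python
from fractions import Fraction


def convex_roof_1d(points, x):
    """Convex roof at x for a 1-D partial function.

    points: list of (t, f) pairs with distinct t (an assumption of this sketch).
    Returns None when x lies outside the convex hull of the points.
    """
    x = Fraction(x)
    best = None
    for a, fa in points:
        for b, fb in points:
            a, b = Fraction(a), Fraction(b)
            if not (a <= x <= b):
                continue
            if a == b:
                val = Fraction(fa)  # x coincides with a given point
            else:
                lam = (b - x) / (b - a)  # weight on the left point
                val = lam * Fraction(fa) + (1 - lam) * Fraction(fb)
            if best is None or val < best:
                best = val
    return best


def is_convex_extendible_1d(points):
    """Lemma 18 in 1-D: extendible iff the roof equals f_i at every T_i."""
    return all(convex_roof_1d(points, t) == Fraction(f) for t, f in points)
```

For instance, the points $(0,0),(1,1),(2,4)$ pass the test, while $(0,0),(1,3),(2,4)$ fail it because the roof at $1$ is $2<3$ .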

Theorem 9.

Approximate Convex Extension is in P.

We have shown that $\tilde{f}$ is the maximal extension $\hat{g}$ inside $\operatorname{Conv}(\mathcal{D})$ . Interestingly, $\tilde{f}(x)$ is also the minimal extension $\tilde{g}(x)$ outside the convex hull, where $\tilde{g}(x)$ is defined as follows.

[TABLE]

Lemma 19.

For any partial function $H=\{(T_{1},f_{1}),\dots,(T_{n},f_{n})\}$ , we have $\tilde{f}(x)=\tilde{g}(x)$ for all $x$ outside the convex hull $\operatorname{Conv}(\mathcal{D})$ .

The function $\tilde{f}(x)$ is thus a natural and canonical convex extension of a given partial function. In fact, we know of no other convex extensions that are widely studied. It is then natural to ask, given a partial function, to evaluate the convex extension $\tilde{f}(x)$ at a given point. Surprisingly, we show that this problem is strongly NP-hard, and hence this canonical extension cannot be efficiently evaluated at a given point, unless P = NP. Since $\tilde{f}$ is the maximum over vertices $V=\{v_{1},\dots,v_{N}\}$ of the polyhedron $Q$ , one may wonder why it cannot be computed by solving the linear program Convex-D. This is because, for $x$ outside $\operatorname{Conv}(\mathcal{D})$ , Convex-P is infeasible and hence Convex-D is unbounded. So the optimal value is $\infty$ , whereas $\tilde{f}(x)$ is finite.

We show a reduction from the Optimal Vertex problem, which is strongly NP-hard [25]. In this problem, we are given an $n\times m$ rational matrix $A$ , rational $n$ -vector $b$ , $m$ -vector $c$ and a rational number $K$ . Then the objective is to decide if there exists a vertex $v$ of the polyhedron $Ay\leq b$ with $c^{T}v\geq K$ .

In the proof of hardness of Optimal Vertex in [25], the instance $A$ and $b$ satisfy the property that the polyhedron $Ay\leq b$ has at least one vertex. We will use this property in our proof. Before we give the reduction, we state a property of vertices of polyhedra.

Lemma 20 (Lemma 8.2 of [10]; the first part of this lemma is 8.2(a) of [10], whereas the second part follows from the proof of 8.2(a)).

Suppose $A$ is an $n\times m$ matrix and $b$ is an $n$ -vector. Also assume all the entries of $A$ and $b$ are integers and $U$ is the largest absolute value of the entries in $A$ and $b$ . Then every extreme point $y$ of the polyhedron $P=\{y\in\mathbb{R}^{m}|Ay\leq b\}$ satisfies:

1. $|y_{j}|\leq(mU)^{m}$ for $j=1,\dots,m$ .

2. If $y_{j}\neq 0$ then $|y_{j}|\geq\frac{1}{(mU)^{m}}$ , for $j=1,\dots,m$ .

Theorem 10.

Given a partial function $H$ , a point $x\in\mathbb{R}^{m}$ and rational $k$ , it is strongly NP-hard to determine if $\tilde{f}(x)\geq k$ . However, if the dimension $m$ is constant, $\tilde{f}(x)$ can be computed in polynomial time.

Proof.

First note that the number of vertices of the dual polyhedron $Q$ is bounded by $O(n^{m})$ , and can be enumerated [18]. If $m$ is constant, this gives a polynomial time algorithm to compute $\tilde{f}(x)$ .
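To make the enumeration concrete, here is a minimal sketch for the special case $m=1$ , where $Q$ lives in the $(y,\mu)$ plane and a vertex is a feasible point at which two constraints with distinct $T_{i}$ are tight. The function names `dual_vertices_1d` and `f_tilde_1d` are hypothetical, and the sketch assumes at least two distinct points so that $Q$ has a vertex; the general constant-dimension algorithm would enumerate subsets of $m+1$ tight constraints instead.

```python
from fractions import Fraction
from itertools import combinations


def dual_vertices_1d(points):
    """Vertices (y, mu) of Q = {(y, mu) : t*y + mu <= f for all (t, f)}, m = 1."""
    pts = [(Fraction(t), Fraction(f)) for t, f in points]
    verts = set()
    for (ti, fi), (tj, fj) in combinations(pts, 2):
        if ti == tj:
            continue  # the two constraints are parallel, no unique intersection
        y = (fi - fj) / (ti - tj)  # slope of the line through both points
        mu = fi - ti * y           # its intercept
        if all(t * y + mu <= f for t, f in pts):
            verts.add((y, mu))     # the line lies below every given point
    return verts


def f_tilde_1d(points, x):
    """Maximum of the Convex-D objective x*y + mu over the vertices of Q."""
    x = Fraction(x)
    return max(x * y + mu for y, mu in dual_vertices_1d(points))
```

On the points $(0,0),(1,1),(2,4)$ the vertices are $(1,0)$ and $(3,-2)$ , so the sketch returns $1/2$ at $x=1/2$ (inside the hull) and $7$ at $x=3$ (outside the hull), where Convex-D itself would be unbounded.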

Let the instance of Optimal Vertex have parameters $A$ , $b$ , $c$ and $K$ , and as noted we assume that the polyhedron $Ay\leq b$ has at least one vertex. We will also assume wlog that all entries in $A$ and $b$ are integers. Also let $U$ be the largest absolute value among all entries of $A$ and $b$ . Let $A_{i}=[a_{i1}a_{i2}\dots a_{im}]^{T}$ be the $i$ th row of matrix $A$ .

Let $M=(mU)^{m}$ . We construct the instance for our problem as follows. We set the partial function $H=\{(A_{1},b_{1}),\dots,(A_{n},b_{n}),(\bm{0},0)\}$ , $x=\frac{c}{L}$ and $k=K/L$ , where $L=3M^{2}\sum_{j=1}^{m}|c_{j}|$ (we need $L>2M^{2}\sum_{j=1}^{m}|c_{j}|$ ). Note that the size of $x$ is polynomial in the sizes of $A,b$ and $c$ . Our claim is that $\tilde{f}(x)=\frac{\max_{v}c^{T}v}{L}$ where the maximum is taken over all vertices of the polyhedron $Ay\leq b$ . Thus $\tilde{f}(x)\geq k$ iff there exists a vertex $v$ such that $c^{T}v\geq K$ . To prove the claim, we observe that the polyhedron $Q$ associated with the definition of $\tilde{f}$ is $\{Ay+\bm{\mu}\leq b,\mu\leq 0\}$ (where $\bm{\mu}$ is an $n$ -vector with each entry $\mu$ ). Let $Q^{\prime}$ be the polyhedron $Ay\leq b$ and $V^{\prime}$ be the set of all vertices of $Q^{\prime}$ ( $V^{\prime}$ is non-empty). It is easy to see that $\hat{y}$ is a vertex of $Q^{\prime}$ iff $(\hat{y},0)$ is a vertex of $Q$ . Therefore, the polyhedron $Q$ also has at least one vertex. Let $V$ be the set of all vertices of $Q$ . We denote any vertex in $Q$ as $(y,\mu)$ . Since $V$ is non-empty, we have $\tilde{f}(x)=\tilde{f}(\frac{c}{L})=\max_{(y,\mu)\in V}(\frac{c}{L})^{T}y+\mu$ from (6). If we prove that this maximum can only be attained at the vertices with $\mu=0$ then we will be done. Note that any vertex $(y,0)\in V$ has the property $(\frac{c}{L})^{T}y+\mu\geq-\frac{M\sum_{j}|c_{j}|}{L}$ (by the first part of Lemma 20). Consider a vertex $(y,\mu)\in V$ with $\mu<0$ (recall that $\mu\leq 0$ ). For such a vertex $(\frac{c}{L})^{T}y+\mu\leq\frac{M\sum_{j}|c_{j}|}{L}-\frac{1}{M}$ (Lemma 20). For our choice of $L$ , $-\frac{M\sum_{j}|c_{j}|}{L}>\frac{M\sum_{j}|c_{j}|}{L}-\frac{1}{M}$ . Hence, $\tilde{f}(x)=\max_{v\in V^{\prime}}\frac{c^{T}v}{L}$ , completing the proof of the claim. ∎
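The instance built in this reduction can be written out mechanically. The sketch below only constructs $H$ , $x$ and $k$ from an Optimal Vertex instance with integer $A$ and $b$ ; it does not solve either problem. The function name `reduce_optimal_vertex` is hypothetical, and the sketch assumes $c\neq 0$ so that $L>0$ .

```python
from fractions import Fraction


def reduce_optimal_vertex(A, b, c, K):
    """Build (H, x, k) from an Optimal Vertex instance (A, b, c, K).

    A: list of n integer rows of length m; b: n integers; c: m integers;
    K: integer or Fraction. Assumes c is not the zero vector.
    """
    m = len(A[0])
    # U is the largest absolute value among the entries of A and b.
    U = max(max(abs(v) for row in A for v in row), max(abs(v) for v in b))
    M = (m * U) ** m
    L = 3 * M * M * sum(abs(cj) for cj in c)  # satisfies L > 2 M^2 sum |c_j|
    # The partial function has one point per row of A, plus the origin with value 0.
    H = [(tuple(row), bi) for row, bi in zip(A, b)] + [((0,) * m, 0)]
    x = tuple(Fraction(cj, L) for cj in c)
    k = Fraction(K, L)
    return H, x, k
```

For example, with $A$ describing the unit square, $b=(1,1,0,0)$ , $c=(1,2)$ and $K=3$ , we get $m=2$ , $U=1$ , $M=4$ and $L=144$ , so $x=(1/144,1/72)$ and $k=1/48$ .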

Conclusion. Our work is the first to formally study the complexity of partial function extension. We show that results can often be counterintuitive, and shed new light on problems previously studied. Our work also gives a number of new results for learning and property testing. While there are clearly a large number of interesting open problems, one we particularly would like to highlight is the basic question of membership in NP (or coNP) of partial function extension. We are able to resolve this for XOS, subadditive, and convex functions, but leave it open for submodular functions. Resolving this problem may lead to further insights on the structure of these functions.

Appendix A Subadditive and XOS functions

Proof of Theorem 1

Let the given partial function be $H=\{(T_{1},f_{1}),\dots,(T_{n},f_{n})\}$ . We claim that the optimal value of $\alpha$ for Approximate XOS Extension (say $\hat{\alpha}$ ) is equal to the optimal value of $\alpha$ in the following linear program (say $\alpha^{*}$ ), with variables $\alpha$ and $w_{ij}$ for all $1\leq i\leq n$ and $1\leq j\leq m$ . Since the linear program can be solved in polynomial time, this claim implies that Approximate XOS Extension can be efficiently solved.

[TABLE]

Let the XOS function $g$ corresponding to the optimal solution $\hat{\alpha}$ for Approximate XOS Extension be given by linear functions $v_{1},\dots,v_{k}\in\mathbb{R}^{m}_{+}$ for some $k\geq 1$ , and let the linear functions be indexed so that $g(T_{i})=v_{i}^{T}\chi(T_{i})$ for $i\in[n]$ (the same linear function can appear with multiple indices, i.e., $v_{i}=v_{j}$ for $i\neq j$ ). Then $f_{i}\leq g(T_{i})=v_{i}^{T}\chi(T_{i})\leq\hat{\alpha}f_{i}$ for all $i\in[n]$ , and $v_{i}^{T}\chi(T_{i})\geq v_{j}^{T}\chi(T_{i})$ for all $i,j\in[n]$ . It is clear that $\hat{\alpha},\langle v_{i}\rangle_{i\in[n]}$ are feasible for the linear program, hence $\alpha^{*}\leq\hat{\alpha}$ . By definition, $\hat{\alpha}\leq\alpha^{*}$ , since the linear program produces an XOS function that has value within $\alpha^{*}$ factor at each $T_{i}$ for all $i\in[n]$ . Hence $\hat{\alpha}=\alpha^{*}$ .
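The quantities in this argument are easy to evaluate for a fixed XOS function. The following sketch is illustrative only: it takes the linear functions $v_{1},\dots,v_{k}$ as given rather than solving the linear program, the names `xos_value` and `approximation_factor` are hypothetical, and it assumes integer weights and positive integer values $f_{i}$ . It computes $g(S)=\max_{i}v_{i}^{T}\chi(S)$ and the smallest $\alpha\geq 1$ with $f_{i}\leq g(T_{i})\leq\alpha f_{i}$ for all $i$ .

```python
from fractions import Fraction


def xos_value(clauses, S):
    """g(S) = max over linear functions v of the sum of v[j] for j in S."""
    return max(sum(v[j] for j in S) for v in clauses)


def approximation_factor(clauses, partial):
    """Smallest alpha >= 1 with f <= g(T) <= alpha * f for every (T, f).

    Returns None if g(T) < f for some pair, i.e. g does not dominate the
    partial function at that set.
    """
    alpha = Fraction(1)
    for T, f in partial:
        g = xos_value(clauses, T)
        if g < f:
            return None
        alpha = max(alpha, Fraction(g, f))
    return alpha
```

With clauses $(1,0,2)$ and $(0,3,0)$ and partial function $\{(\{0\},1),(\{1,2\},2)\}$ , the XOS values are $1$ and $3$ , giving $\alpha=3/2$ .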

Upper bound for Approximate Extension

We now show the upper bound for the Approximate Extension. Let $\mathcal{F}$ and $\mathcal{G}$ be two classes of functions. We say that $\mathcal{G}$ $\theta$ -approximates $\mathcal{F}$ if for all functions $f$ in $\mathcal{F}$ , there exists a function $g$ in $\mathcal{G}$ such that $g(S)\leq f(S)\leq\theta g(S)$ for all $S\subseteq[m]$ . We first prove the following lemma.

Lemma 21.

Let $\mathcal{F}$ and $\mathcal{G}$ be two classes of functions so that $\mathcal{G}$ $\theta_{1}$ -approximates $\mathcal{F}$ and $\mathcal{F}$ $\theta_{2}$ -approximates $\mathcal{G}$ . If there is a $\rho$ -approximation algorithm for Approximate Extension for $\mathcal{F}$ then there is a $\rho\theta_{1}\theta_{2}$ -approximation algorithm for Approximate Extension for $\mathcal{G}$ .

Proof.

For a given instance of partial function extension, let $\alpha^{*}_{\mathcal{F}}$ and $\alpha^{*}_{\mathcal{G}}$ be the optimal values of $\alpha$ in the Approximate Extension problem for $\mathcal{F}$ and $\mathcal{G}$ respectively. Let $A$ be the $\rho$ -approximation algorithm for $\mathcal{F}$ . A $\rho\theta_{1}\theta_{2}$ -approximation algorithm for Approximate Extension for $\mathcal{G}$ is as follows: given any partial function $H$ , return $\theta_{1}\alpha$ where $\alpha$ is the value returned by algorithm $A$ on $H$ . We have $\alpha^{*}_{\mathcal{F}}\geq\frac{\alpha}{\rho}$ as $A$ is a $\rho$ -approximation algorithm. Since $\mathcal{G}$ $\theta_{1}$ -approximates $\mathcal{F}$ , we have $\alpha^{*}_{\mathcal{G}}\leq\theta_{1}\alpha^{*}_{\mathcal{F}}$ . As $\alpha^{*}_{\mathcal{F}}\leq\alpha$ , we have $\alpha^{*}_{\mathcal{G}}\leq\theta_{1}\alpha$ . Also, $\mathcal{F}$ $\theta_{2}$ -approximates $\mathcal{G}$ , so we have $\alpha^{*}_{\mathcal{F}}\leq\theta_{2}\alpha^{*}_{\mathcal{G}}$ . Then $\alpha^{*}_{\mathcal{G}}\geq\frac{\alpha^{*}_{\mathcal{F}}}{\theta_{2}}\geq\frac{\alpha}{\rho\theta_{2}}$ . Hence $\frac{\alpha}{\rho\theta_{2}}\leq\alpha^{*}_{\mathcal{G}}\leq\theta_{1}\alpha$ , so the returned value $\theta_{1}\alpha$ is within a factor $\rho\theta_{1}\theta_{2}$ of $\alpha^{*}_{\mathcal{G}}$ . This proves our result. ∎

Recall that XOS functions are a subclass of subadditive functions. We will use the following result:

Theorem 11 ([11, 20]).

For any subadditive function $f$ , there exists an XOS function $g$ such that $g(S)\leq f(S)\leq O(\log m)g(S)$ for all $S\subseteq[m]$ .

From the above results and Lemma 21, the upper bound for Theorem 2 follows, with $\theta_{1}=1,\theta_{2}=O(\log m)$ and $\rho=1$ .

Proof of Lemma 2

Consider the distribution $\mu$ that assigns probability mass uniformly to $\mathcal{D}=\{T_{1},\dots,T_{n}\}$ and $0$ elsewhere. We restrict the target function to the family $\mathcal{F^{\prime}}=\{f\in\mathcal{F}|f(T_{i})\in[1,r]\thinspace\forall i\in[n]\}\subseteq\mathcal{F}$ . Suppose there is an algorithm that PMAC-learns $\mathcal{F^{\prime}}$ with approximation factor $\alpha<r$ . Let $g$ be an arbitrary function in $\mathcal{F^{\prime}}$ . Let the input to the algorithm be $\{(S_{i},g(S_{i}))\}_{1\leq i\leq l}$ where $l$ is $poly(m)$ (let $\epsilon$ and $\delta$ be constant). Let $\mathcal{F}^{*}=\{h\in\mathcal{F^{\prime}}|h(S_{i})=g(S_{i})\thinspace\forall i\in[l]\}$ . By our assumption, given any values in $[1,r]$ at the sets $\mathcal{D}\setminus\{S_{1},\dots,S_{l}\}$ , there is a function in $\mathcal{F}^{*}$ that takes those values. Let $S\sim\mu$ and $v$ be the value returned by the algorithm at $S$ . If $\alpha<r$ then $f^{*}(S)\leq v\leq\alpha f^{*}(S)$ for all target functions $f^{*}\in\mathcal{F}^{*}$ iff $S\in\{S_{1},\dots,S_{l}\}$ . Since $l$ is polynomial (whereas $n$ is superpolynomial), the algorithm then returns a value within an $\alpha$ factor with only small probability. Hence $\alpha$ must be at least $r$ . The above argument, and hence the lower bound, holds even if the algorithm knows the distribution $\mu$ , is allowed unbounded computation, and chooses the samples $\{S_{i}\}_{1\leq i\leq l}$ adaptively.

Proof of Lemma 6

One direction is trivial. If there exist $T_{1},\dots,T_{r},\cup_{i=1}^{r}T_{i}\in\mathcal{D}$ such that $\sum_{i=1}^{r}f(T_{i})<f(\cup_{i=1}^{r}T_{i})$ then the partial function is not extendible. Now assume this is not the case. Let $\mathcal{D}^{c}:=\{S|S=\cup_{i=1}^{r}A_{i}\quad\text{for some}\quad A_{1},\dots,A_{r}\in\mathcal{D}\}$ be the union-closure of $\mathcal{D}$ . We now define $\hat{f}$ , which is an extension of $f$ to $\mathcal{D}^{c}$ . If $S\in\mathcal{D}$ then $\hat{f}(S)=f(S)$ . If $S\not\in\mathcal{D}$ (and $S\in\mathcal{D}^{c}$ ) then $\hat{f}(S)$ is the minimum value of $\sum_{i=1}^{k}f(S_{i})$ over all families of sets $(S_{1},\dots,S_{k})$ such that each $S_{i}\in\mathcal{D}$ , and $\cup_{i=1}^{k}S_{i}=S$ . Let $M$ be the maximum value of $\hat{f}$ on $\mathcal{D}^{c}$ . We define an extension of $\hat{f}$ to $2^{[m]}$ by assigning value $M$ to each set not in $\mathcal{D}^{c}$ . Let this extension be $\tilde{f}$ . We claim that $\tilde{f}$ is subadditive.

Note that $M$ is the maximum value of $\tilde{f}$ . Let $A$ and $B$ be any two sets. If either $A$ or $B$ is not in $\mathcal{D}^{c}$ then $\tilde{f}(A)+\tilde{f}(B)$ is at least $M$ and thus $\tilde{f}(A)+\tilde{f}(B)\geq\tilde{f}(A\cup B)$ . Therefore, we assume both $A$ and $B$ are in $\mathcal{D}^{c}$ , which implies $A\cup B$ is also in $\mathcal{D}^{c}$ . Let $A$ be the union of $A_{1},\dots,A_{r}\in\mathcal{D}$ ( $r\geq 1$ ) and $B$ be the union of $B_{1},\dots,B_{r^{\prime}}\in\mathcal{D}$ ( $r^{\prime}\geq 1$ ). Therefore, $A\cup B$ is the union of $A_{1},\dots,A_{r},B_{1},\dots,B_{r^{\prime}}$ . If $A\cup B$ is in $\mathcal{D}$ then by assumption, and otherwise by definition of $\hat{f}$ , we have $\tilde{f}(A)+\tilde{f}(B)\geq\tilde{f}(A\cup B)$ .
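The construction in this proof can be tried out on small ground sets. The sketch below is a brute-force illustration, not an efficient algorithm: it is exponential in $m$ , it encodes sets as bitmasks, it assumes positive values $f(T_{i})$ , and the names `min_cover_costs`, `subadditive_extension`, and `is_subadditive` are hypothetical. It computes the cheapest cover of each set in the union-closure $\mathcal{D}^{c}$ , rejects the instance if some $T\in\mathcal{D}$ has a cover cheaper than $f(T)$ , and otherwise returns $\tilde{f}$ .

```python
def min_cover_costs(partial):
    """Cheapest sum of f over families from D whose union is S, for S in D^c.

    partial: dict mapping bitmask -> positive value (an assumption of this sketch).
    """
    cost = dict(partial)
    changed = True
    while changed:  # relax until no cover can be improved
        changed = False
        for s, cs in list(cost.items()):
            for t, ft in partial.items():
                u = s | t
                if cs + ft < cost.get(u, float("inf")):
                    cost[u] = cs + ft
                    changed = True
    return cost


def subadditive_extension(partial, m):
    """Return the extension over all subsets of [m] as a dict, or None."""
    cost = min_cover_costs(partial)
    if any(cost[t] != f for t, f in partial.items()):
        return None  # some T in D is covered more cheaply than f(T)
    big = max(cost.values())  # the value M from the proof
    return {s: cost.get(s, big) for s in range(1 << m)}


def is_subadditive(ext):
    """Brute-force check of f(A) + f(B) >= f(A | B) over all pairs."""
    return all(ext[a] + ext[b] >= ext[a | b] for a in ext for b in ext)
```

For example, with $m=3$ and values $1,1,2,5$ on $\{1\},\{2\},\{1,2\},\{3\}$ , the extension exists and passes the brute-force subadditivity check, whereas changing the value on $\{1,2\}$ to $3$ makes the instance non-extendible.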

Appendix B Submodular functions

Proof of Lemma 9.

Assume for a contradiction that there exists a square certificate and an extension $f(\cdot)$ without a square tuple as described in the lemma. Summing the inequalities $f(A)+f(B)\geq f(A\cup B)+f(A\cap B)$ for each square tuple $(A,B,A\cup B,A\cap B)$ in the square certificate, we observe that all the intermediate sets cancel out, since they appear an equal number of times on the left and right hand sides. We get $\sum_{i\in[n]}f_{i}\thinspace m(T_{i})\geq\sum_{i\in[n]}f_{i}\thinspace tb(T_{i})$ which contradicts property (P2) of the square certificate.

Proof of Lemma 11.

If gate $g$ is fixed to both $b^{\prime}$ and $b^{\prime\prime}$ , there must be two proofs $G^{\prime}=(V^{\prime},E^{\prime})$ with values $\operatorname{\textsf{val}}^{\prime}(\cdot)$ , and $G^{\prime\prime}=(V^{\prime\prime},E^{\prime\prime})$ with values $\operatorname{\textsf{val}}^{\prime\prime}(\cdot)$ as described above. In the rooted tree $G^{\prime}$ , let $g_{0}$ be a gate at maximum distance from the root $g$ that has different values in the two proofs (such a gate must exist, since $g$ is such a gate). Gate $g_{0}$ cannot be a leaf, since all leaves have the same values by definition. Assume that $g_{0}$ is an AND gate; a similar proof holds if $g_{0}$ is an OR gate. If $g_{0}$ has one child in both proofs, since $g_{0}$ is an AND gate, by definition it must have value $0$ in both proofs. Similarly if $g_{0}$ has two children in both proofs, it must have value $1$ in both proofs. Now suppose without loss of generality that $g_{0}$ has one child (say $h_{0}$ ) in $G^{\prime}$ , and two children in $G^{\prime\prime}$ . Then $\operatorname{\textsf{val}}^{\prime}(g_{0})=\operatorname{\textsf{val}}^{\prime}(h_{0})=0$ . Since the fan-in for each gate is 2 in the boolean circuit, $h_{0}$ must be a child of $g_{0}$ in $G^{\prime\prime}$ as well, and by assumption, $\operatorname{\textsf{val}}^{\prime}(h_{0})=\operatorname{\textsf{val}}^{\prime\prime}(h_{0})=0$ . But this gives us a contradiction, since if $g_{0}$ is an AND gate with two children in any proof, then both children must have value 1.

Proof of Lemma 12.

Let $G^{\prime}=(V^{\prime},E^{\prime})$ and $\operatorname{\textsf{val}}^{\prime}(\cdot)$ be a proof for gate $g$ for the given assignment to the input gates. Let $g_{0}$ be the gate at maximum distance from the root for which $\operatorname{\textsf{val}}^{\prime}(g_{0})\neq A(g_{0})$ . The children of $g_{0}$ by assumption have the same value as in the satisfying assignment $A(\cdot)$ , and considering the cases in the construction of the proof gives us a contradiction.

Proof of Lemma 13.

Let $G^{\prime}=(V^{\prime},E^{\prime})$ and $\operatorname{\textsf{val}}^{\prime}(\cdot)$ be a proof that fixes $\operatorname{\textsf{val}}(g^{\prime})=0$ for the assignment $(x_{1},\ldots,x_{|\textsf{IP}|})$. Then as observed previously, every gate in the proof has value $0$, including the input gates as the leaves. Since $(x_{1},\ldots,x_{|\textsf{IP}|})\geq(x_{1}^{\prime},\ldots,x_{|\textsf{IP}|}^{\prime})$, the values at the leaves of the rooted tree remain unchanged, and the proof $G^{\prime}=(V^{\prime},E^{\prime})$ and $\operatorname{\textsf{val}}^{\prime}(\cdot)$ is a proof with inputs $(x_{1}^{\prime},\ldots,x_{|\textsf{IP}|}^{\prime})$ as well, that fixes gate $g$ to $0$. The proof of the converse holds similarly.

Proof of Lemma 16.

Consider an assignment in which an input $x_{i}$ is set to $0$ and $f_{g}$ is $1$ for some $g\in\textsf{OP}\cup\textsf{IM}$. If the gate $g$ is fixed to $1$ by the assignment then it will continue to be fixed to $1$ after setting $x_{i}$ to $1$ by Lemma 13. Suppose instead that the gate $g$ is not fixed by the initial assignment with $x_{i}=0$. In this case also, after setting $x_{i}=1$, gate $g$ cannot get fixed to $0$ by Lemma 13. So, flipping any input $x_{i}$ from $0$ to $1$ cannot make the function $f_{g}$ change from $1$ to $0$. Hence, $f_{g}$ is monotone.
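The notion of monotonicity used in this proof (raising a single input from $0$ to $1$ never lowers the output from $1$ to $0$) can be checked by brute force on small examples. The following is a minimal sketch; the name `is_monotone` and the representation of a boolean function as a Python callable on 0/1 tuples are assumptions made for illustration and are not part of the paper's construction.

```python
from itertools import product


def is_monotone(f, n):
    """Brute-force check that f: {0,1}^n -> {0,1} is monotone.

    f is monotone if flipping any single input from 0 to 1 never
    changes the output from 1 to 0. Runs in time O(n * 2^n).
    """
    for x in product((0, 1), repeat=n):
        if f(x) != 1:
            continue
        for i in range(n):
            if x[i] == 0:
                flipped = x[:i] + (1,) + x[i + 1:]
                if f(flipped) == 0:
                    return False
    return True
```

For instance, any function built only from AND and OR gates passes this check, while negation of an input does not.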

Proof of Lemma 17.

We will show that the boolean operation at any gate $g$ is not violated by this assignment. Let $g$ be an AND gate with incoming edges from gates $g_{1}$ and $g_{2}$ (the case when $g$ is an OR gate is similar). Suppose first that $g$ is fixed by the assignment $(A(X_{1}),\dots,A(X_{n}))$ to the input gates. Since $g$ is fixed, by Definition 2, either all of $g$, $g_{1}$ and $g_{2}$ are fixed to $1$, or $g$ and one of $g_{1}$ and $g_{2}$ are fixed to $0$. In either case, the boolean operation is satisfied. Now assume that $g$ is not fixed and hence it is assigned $1$. For a violation, either $g_{1}$ or $g_{2}$ must be assigned $0$. Say $g_{1}$ is assigned $0$. But then $g_{1}$ is fixed, and since $g$ is an AND gate, $g$ must also be fixed to $0$, giving a contradiction.

Appendix C Convex functions

Proof that $Q$ has at least one vertex

Claim 1.

If $\operatorname{Conv}(\mathcal{D})$ has non-zero volume then the polyhedron $Q$ has at least one vertex.

Proof.

Since $\operatorname{Conv}(\mathcal{D})$ has non-zero volume, the $T_{i}$'s do not lie on a hyperplane. Formally, this means that $(y,\mu)$ (with $y\in\mathbb{R}^{m}$, $\mu\in\mathbb{R}$) satisfies the equations $(T_{i})^{T}y+\mu=0$ for all $i\in[n]$ iff $(y,\mu)=0$. This assumption implies that $Q$ cannot contain a line. (A polyhedron $P\subseteq\mathbb{R}^{m}$ contains a line if there exist a vector $x\in P$ and a non-zero vector $d\in\mathbb{R}^{m}$ such that $x+\lambda d\in P$ for all $\lambda\in\mathbb{R}$.) This is because, for any polyhedron $P=\{Ax\leq b\}$ to contain a line, there must exist a point $x\in P$ and a non-zero vector $d$ such that $Ax+\lambda Ad\leq b$ for all $\lambda\in\mathbb{R}$, which is possible only if $Ad=0$. For $Q$, the condition $Ad=0$ with $d=(y,\mu)$ reads $(T_{i})^{T}y+\mu=0$ for all $i\in[n]$, which forces $d=0$. Since $Q$ does not contain a line, it must have at least one vertex (Theorem 2.6 of [10]). ∎
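The non-degeneracy condition in this proof, that $(T_{i})^{T}y+\mu=0$ for all $i$ only when $(y,\mu)=0$, is equivalent to the $n\times(m+1)$ matrix with rows $(T_{i},1)$ having rank $m+1$. The sketch below tests this with exact rational arithmetic; the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    mat = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(mat[0]) if mat else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below position `rank` with a nonzero entry.
        pivot = next(
            (r for r in range(rank, len(mat)) if mat[r][col] != 0), None
        )
        if pivot is None:
            continue
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        for r in range(rank + 1, len(mat)):
            factor = mat[r][col] / mat[rank][col]
            mat[r] = [a - factor * b for a, b in zip(mat[r], mat[rank])]
        rank += 1
    return rank


def points_not_on_hyperplane(points):
    """True iff (T_i)^T y + mu = 0 for all i forces (y, mu) = 0."""
    m = len(points[0])
    augmented = [list(p) + [1] for p in points]  # rows (T_i, 1)
    return matrix_rank(augmented) == m + 1
```

Three non-collinear points in the plane satisfy the condition, whereas three collinear points do not.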

Proof of Theorem 9

We will show that for any $\alpha\geq 1$, it can be efficiently determined whether there exists a convex function $f$ such that $f_{i}\leq f(T_{i})\leq\alpha f_{i}$ for all $i\in[n]$. Approximate Extension can then be solved efficiently by binary search over $\alpha$. The existence check follows from Lemma 22 below by setting $c_{i}=f_{i}$ and $d_{i}=\alpha f_{i}$.

Lemma 22.

Given a set $\mathcal{D}=\{T_{1},\dots,T_{n}\}$ and pairs $c_{i},d_{i}$ ( $c_{i}\leq d_{i}$ ) associated with each $i\in[n]$ , it can be efficiently determined if there exists a convex function $f:\mathbb{R}^{m}\rightarrow\mathbb{R}$ such that $c_{i}\leq f(T_{i})\leq d_{i}\quad\forall i\in[n]$ .

Proof.

Let $\hat{g}$ be the convex roof function corresponding to the partial function $\{(T_{1},d_{1}),\dots,(T_{n},d_{n})\}$ . Our claim is that there exists a convex function $f:\mathbb{R}^{m}\rightarrow\mathbb{R}$ such that $c_{i}\leq f(T_{i})\leq d_{i}\quad\forall i\in[n]$ iff $c_{i}\leq\hat{g}(T_{i})\leq d_{i}\quad\forall i\in[n]$ . Suppose $\hat{g}(T_{i})\in[c_{i},d_{i}]$ for all $i\in[n]$ . Consider the function $\tilde{f}$ defined by (6). We know that $\tilde{f}$ is convex and $\tilde{f}(T_{i})=\hat{g}(T_{i})\in[c_{i},d_{i}]$ for all $i\in[n]$ . Thus $\tilde{f}$ is the required convex function. For the other direction, suppose for a contradiction that $\hat{g}(T_{i})\not\in[c_{i},d_{i}]$ for some $i=i^{\prime}$ and there exists a required convex function $f$ . Therefore, we have $\hat{g}(T_{i^{\prime}})<c_{i^{\prime}}$ as $\hat{g}(T_{i})\leq d_{i}$ for all $i\in[n]$ (since the partial function is $\{(T_{1},d_{1}),\dots,(T_{n},d_{n})\}$ ). Now consider a partial function $H^{\prime}=\{(T_{1},f(T_{1})),\dots,(T_{n},f(T_{n}))\}$ and its convex roof function $\hat{g^{\prime}}$ . Since $H^{\prime}$ is clearly extendible, we have $\hat{g^{\prime}}(T_{i})=f(T_{i})$ for all $i\in[n]$ (Lemma 18). Further from the definition of $\hat{g}$ and since $f(T_{i})\leq d_{i}$ for all $i$ , it follows that $\hat{g^{\prime}}(T_{i})\leq\hat{g}(T_{i})$ for all $i\in[n]$ . Therefore we have $\hat{g^{\prime}}(T_{i^{\prime}})\leq\hat{g}(T_{i^{\prime}})<c_{i^{\prime}}\leq f(T_{i^{\prime}})$ . This is a contradiction since as noted, $\hat{g^{\prime}}(T_{i})=f(T_{i})$ for all $i\in[n]$ .

This proves the lemma, as the conditions $c_{i}\leq\hat{g}(T_{i})\leq d_{i}$ for all $i\in[n]$ can be checked efficiently. ∎
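The check in Lemma 22 and the binary search over $\alpha$ can be illustrated in one dimension. The sketch below assumes $m=1$, distinct points $T_{i}$, positive values $f_{i}$, and that $Q=\{(y,\mu):T_{i}y+\mu\leq d_{i}\ \forall i\}$, so that $\hat{g}$ restricted to $\operatorname{Conv}(\mathcal{D})$ is the lower convex envelope of the points $(T_{i},d_{i})$. All function names are hypothetical; for general $m$, evaluating $\hat{g}(T_{i})$ would instead require solving a linear program.

```python
def lower_hull(points):
    """Lower convex hull of 2-D points with distinct x (monotone chain)."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or above the segment hull[-2] -> p.
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def envelope_at(hull, t):
    """Value at t of the piecewise-linear function through the hull."""
    if len(hull) == 1:
        return hull[0][1]
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= t <= x2:
            return y1 + (y2 - y1) * (t - x1) / (x2 - x1)
    raise ValueError("t lies outside the convex hull of the points")


def interval_extendible(T, c, d, eps=1e-12):
    """Check c_i <= g_hat(T_i) for all i; g_hat(T_i) <= d_i always holds."""
    hull = lower_hull(list(zip(T, d)))
    return all(ci <= envelope_at(hull, ti) + eps for ti, ci in zip(T, c))


def min_alpha(T, f, tol=1e-9):
    """Binary search for the least alpha >= 1 with f_i <= f(T_i) <= alpha*f_i."""
    def feasible(alpha):
        return interval_extendible(T, f, [alpha * v for v in f])

    if feasible(1.0):
        return 1.0
    lo, hi = 1.0, 2.0
    while not feasible(hi):  # terminates when all f_i > 0
        lo, hi = hi, 2 * hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

With $T=(0,1,2)$ and $f=(1,2,1)$, the middle value must be lowered to the chord between its neighbours, which is consistent with this sketch returning an approximation factor of about 2.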

Proof of Lemma 19

Fix $x$ outside and $w$ inside $\operatorname{Conv}(\mathcal{D})$ . Let $z\in\operatorname{Conv}(\mathcal{D})$ and $\lambda\geq 1$ be such that $x=\lambda w+(1-\lambda)z$ . In the proof we vary $z$ , and assume that $\lambda$ changes accordingly so that $x=\lambda w+(1-\lambda)z$ . Since $\hat{g}$ is convex, the value of $\lambda\hat{g}(w)+(1-\lambda)\hat{g}(z)$ increases as $z$ gets close to $w$ . Therefore, $\tilde{g}(x)=\sup_{w\in\operatorname{Conv}(\mathcal{D})}L(w,x)$ where

[TABLE]

We know that $\hat{g}(s)=\max_{1\leq i\leq N}\{\sum_{j=1}^{m}s_{j}y^{i}_{j}+\mu^{i}\}$ (where $(y^{i},\mu^{i})$, $i\in[N]$, are the vertices of the polyhedron $Q$, as defined in the main section). Let $1\leq k\leq N$ be the index of the vertex that attains the maximum in $\hat{g}(w)$. Then $\hat{g}(s)=\sum_{j=1}^{m}s_{j}y^{k}_{j}+\mu^{k}$ at $s=w$ and in the vicinity of $w$ on the line segment joining $z$ and $w$. Therefore,

[TABLE]

Since $x-z=\lambda(w-z)$, we have $\lambda=\frac{x_{j}-z_{j}}{w_{j}-z_{j}}$ for all $j\in[m]$. Now

[TABLE]

Thus for any $w\in\operatorname{Conv}(\mathcal{D})$ , $L(w,x)$ is equal to $\sum_{j=1}^{m}x_{j}y^{k}_{j}+\mu^{k}$ where $1\leq k\leq N$ is the index of the vertex that maximizes $\hat{g}(w)=\{\sum_{j=1}^{m}w_{j}y^{k}_{j}+\mu^{k}\}$ . This implies that $\tilde{g}(x)=\sup_{w\in\operatorname{Conv}(\mathcal{D})}L(w,x)$ is at most $\tilde{f}(x)=\max_{1\leq i\leq N}\{\sum_{j=1}^{m}x_{j}y^{i}_{j}+\mu^{i}\}$ .
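The value of $L(w,x)$ stated above can be checked directly. The following is a sketch of the computation, assuming $z$ is close enough to $w$ on the segment that the same vertex $(y^{k},\mu^{k})$ attains the maximum in $\hat{g}$ at both $w$ and $z$; it uses only the relation $x=\lambda w+(1-\lambda)z$.

```latex
\begin{align*}
\lambda\hat{g}(w)+(1-\lambda)\hat{g}(z)
  &= \lambda\Big(\sum_{j=1}^{m}w_{j}y^{k}_{j}+\mu^{k}\Big)
     +(1-\lambda)\Big(\sum_{j=1}^{m}z_{j}y^{k}_{j}+\mu^{k}\Big)\\
  &= \sum_{j=1}^{m}\big(\lambda w_{j}+(1-\lambda)z_{j}\big)y^{k}_{j}+\mu^{k}\\
  &= \sum_{j=1}^{m}x_{j}y^{k}_{j}+\mu^{k}.
\end{align*}
```

The right-hand side does not depend on $z$, so it is also the limiting value as $z$ approaches $w$.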

Now we will show $\tilde{g}(x)\geq\tilde{f}(x)$. Let $1\leq t\leq N$ be the index of the vertex that attains the maximum in $\tilde{f}(x)$. Consider the vertex $(y^{t},\mu^{t})$. By definition of a vertex, there exist $c_{1},\dots,c_{m},c$ such that $\sum_{j=1}^{m}c_{j}y^{t}_{j}+c\mu^{t}>\sum_{j=1}^{m}c_{j}y_{j}+c\mu$ for all other $(y,\mu)\in Q$. We claim that $c>0$. First note that the left side of the above inequality is a finite quantity. Also, for any $y\in\mathbb{R}^{m}$, the point $(y,\mu)$ is in $Q$ once $\mu$ is sufficiently small. If $c<0$ then the right side can be made arbitrarily large by letting $\mu\to-\infty$. For the case $c=0$, at least one of the $c_{j}$'s must be nonzero. Therefore, by choosing $\mu$ sufficiently small and $y_{j}<0$ if $c_{j}<0$ (and $y_{j}>0$ if $c_{j}>0$), the quantity $\sum_{j=1}^{m}c_{j}y_{j}$ can be made arbitrarily large. Both cases contradict the inequality, so $c>0$. Now the above inequality can be written as $\sum_{j=1}^{m}\frac{c_{j}}{c}y^{t}_{j}+\mu^{t}>\sum_{j=1}^{m}\frac{c_{j}}{c}y_{j}+\mu$ for all other $(y,\mu)\in Q$. Let $y=(\frac{c_{1}}{c},\dots,\frac{c_{m}}{c})$. The vector $y$ must be in $\operatorname{Conv}(\mathcal{D})$, otherwise Convex-D should be unbounded (and hence the right side should be unbounded), which is a contradiction. Now $\tilde{g}(x)\geq L(y,x)$ as $y\in\operatorname{Conv}(\mathcal{D})$. Since $\sum_{j=1}^{m}\frac{c_{j}}{c}y^{t}_{j}+\mu^{t}>\sum_{j=1}^{m}\frac{c_{j}}{c}y_{j}+\mu$ for all other $(y,\mu)\in Q$, the function $\hat{g}(s)=\max_{1\leq i\leq N}\{\sum_{j=1}^{m}s_{j}y^{i}_{j}+\mu^{i}\}$ is given by $\hat{g}(s)=\sum_{j=1}^{m}s_{j}y^{t}_{j}+\mu^{t}$ in a sufficiently small neighbourhood of $y$. Therefore, as before, $L(y,x)=\sum_{j=1}^{m}x_{j}y^{t}_{j}+\mu^{t}$ and hence $\tilde{g}(x)\geq\tilde{f}(x)$.

Bibliography


[1] S. N. Afriat. The construction of utility functions from expenditure data. International Economic Review, 8(1):67–77, Feb. 1967.
[2] Ashwinkumar Badanidiyuru, Shahar Dobzinski, Hu Fu, Robert Kleinberg, Noam Nisan, and Tim Roughgarden. Sketching valuation functions. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1025–1035. Society for Industrial and Applied Mathematics, 2012.
[3] Maria-Florina Balcan, Florin Constantin, Satoru Iwata, and Lei Wang. Learning valuation functions. In COLT 2012 - The 25th Annual Conference on Learning Theory, June 25-27, 2012, Edinburgh, Scotland, pages 4.1–4.24, 2012.
[4] Maria-Florina Balcan, Florin Constantin, Satoru Iwata, and Lei Wang. Learning valuation functions. In Conference on Learning Theory, pages 4–1, 2012.
[5] Maria-Florina Balcan, Amit Daniely, Ruta Mehta, Ruth Urner, and Vijay V. Vazirani. Learning economic parameters from revealed preferences. In International Conference on Web and Internet Economics, pages 338–353. Springer, 2014.
[6] Maria-Florina Balcan and Nicholas J. A. Harvey. Learning submodular functions. In Proceedings of the 43rd ACM Symposium on Theory of Computing, STOC 2011, San Jose, CA, USA, 6-8 June 2011, pages 793–802, 2011.
[7] Maria-Florina Balcan and Nicholas J. A. Harvey. Learning submodular functions. In Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing, pages 793–802. ACM, 2011.
[8] Xiaohui Bei, Wei Chen, Jugal Garg, Martin Hoefer, and Xiaoming Sun. Learning market parameters using aggregate demand queries. In AAAI, pages 411–417, 2016.
