System of unbiased representatives for a collection of bicolorings

Niranjan Balachandran, Rogers Mathew, Tapas Kumar Mishra and Sudebkumar Prasant Pal

arXiv:1704.07716·math.CO·April 26, 2017


TL;DR

This paper investigates the minimal size of a family of subsets of [n] such that each bicoloring in a given set has an unbiased representative, contributing to combinatorial design theory.

Contribution

It introduces the concept of unbiased representatives for bicolorings and analyzes the minimal family size needed for coverage.

Findings

01

Established bounds on the minimum size of such families.

02

Characterized conditions for the existence of unbiased representatives.

03

Provided algorithms for constructing small unbiased representative families.



Equations

$$\gamma(n,k,r)=\max_{\mathcal{B}}\gamma(\mathcal{B},k,r).$$

$$\gamma(n,[k_{1},k_{2}],[r_{1},r_{2}])=\max_{\mathcal{B}}\gamma(\mathcal{B},[k_{1},k_{2}],[r_{1},r_{2}]).$$

$$\frac{\binom{n}{k}}{\binom{2k}{k}}\leq\gamma(n,k,2k)\leq\frac{\binom{n}{k}}{\binom{2k}{k}}(1+o(1)),$$

$$H(x_{i})=\prod_{s\in S_{i}\setminus\{s_{i}\}}(x_{i}-s).$$

$$G(x_{1},\ldots,x_{n})=\prod_{i=1}^{n}H(x_{i}).$$

$$P(Y_{B})=\prod_{A\in\mathcal{A}}\left\langle X_{A},Y_{B}\right\rangle.$$

$$P^{\prime}(X=(x_{1},\ldots,x_{n}))=P(Y_{B}=(1-2x_{1},\ldots,1-2x_{n}))\,(x_{1}+\ldots+x_{n}-n).$$

$$1-\binom{r}{\frac{r}{2}}\left(\frac{\alpha n}{n}\right)^{\frac{r}{2}}\left(\frac{(1-\alpha)n}{n}\right)^{\frac{r}{2}}\leq 1-C\frac{2^{r}}{\sqrt{r}}\alpha^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}<e^{-C\frac{2^{r}}{\sqrt{r}}\alpha^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}},\text{ where }C=\frac{1}{\sqrt{\pi}}.$$

$$|\mathcal{A}|\leq\frac{\sqrt{r}}{C2^{r}\alpha^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}}\ln(|\mathcal{B}|),$$

$$|\mathcal{A}|\leq\frac{\sqrt{r}}{C(1-4\epsilon^{2})^{\frac{r}{2}}}\ln(|\mathcal{B}|).$$

$$\gamma(n,k,r)\geq\frac{\binom{n}{k}}{\binom{r}{\frac{r}{2}}\binom{n-r}{k-\frac{r}{2}}}.$$

$$Cov(F)\leq\frac{|F|}{v}(1+\ln a).$$

$$\frac{\binom{n}{k}}{\binom{r}{\frac{r}{2}}\binom{n-r}{k-\frac{r}{2}}}\leq\gamma(n,k,r)\leq\frac{\binom{n}{k}}{\binom{r}{\frac{r}{2}}\binom{n-r}{k-\frac{r}{2}}}\left(1+0.7r+\ln\binom{n-r}{k-\frac{r}{2}}\right).$$

$$\gamma(n,k,r)\leq\frac{\binom{n}{r}}{\binom{k}{\frac{r}{2}}\binom{n-k}{\frac{r}{2}}}\left(1+\ln\left(\binom{r}{\frac{r}{2}}\binom{n-r}{k-\frac{r}{2}}\right)\right).$$

$$\binom{n}{k}\binom{k}{\frac{r}{2}}\binom{n-k}{\frac{r}{2}}=\binom{n}{r}\binom{r}{\frac{r}{2}}\binom{n-r}{k-\frac{r}{2}}.$$

$$v(B,D)=\sum_{i,j\,:\,j\leq x,\ i\leq k-x,\ i+j=\frac{r}{2}}\binom{x}{j}\binom{n-2k+x}{j}\binom{k-x}{i}^{2}.$$

$$\frac{v_{pair}}{v}=\frac{\binom{n-k-1}{\frac{r}{2}-1}}{\binom{k}{\frac{r}{2}}\binom{n-k}{\frac{r}{2}}}=\frac{r}{2(n-k)}.$$

$$\max\left(\left\lceil\frac{n}{2r}\right\rceil,c_{1}\sqrt{\frac{r(n-r)}{n}}\right)\leq\gamma\left(n,\frac{n}{2},r\right)\leq c_{2}n\sqrt{\frac{r(n-r)}{n}},\text{ where }c_{1}\text{ and }c_{2}\text{ are constants.}$$

$$|OPT(S^{\prime})|\leq|OPT(S)|\leq|OPT(S^{\prime})|+1\leq 2|OPT(S^{\prime})|.$$

$$|OPT_{HIT}(S)|\leq|V|\leq r\cdot|ALG(\mathcal{B})|\leq r\cdot f\cdot|OPT_{SUR}(\mathcal{B})|<r\cdot f\cdot|OPT_{HIT}(S)|.$$


Full text


1 Department of Mathematics, Indian Institute of Technology, Bombay 400076, India. Email: [email protected]

2 Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, India. Email: {rogers,tkmishra,spp}@cse.iitkgp.ernet.in

System of unbiased representatives for a collection of bicolorings

Niranjan Balachandran$^{1}$, Rogers Mathew$^{2}$, Tapas Kumar Mishra$^{2}$, Sudebkumar Prasant Pal$^{2}$

Abstract

Let $\mathcal{B}$ denote a set of bicolorings of $[n]$ , where each bicoloring is a mapping of the points in $[n]$ to $\{-1,+1\}$ . For each $B\in\mathcal{B}$ , let $Y_{B}=(B(1),\ldots,B(n))$ . For each $A\subseteq[n]$ , let $X_{A}\in\{0,1\}^{n}$ denote the incidence vector of $A$ . A non-empty set $A$ is said to be an ‘unbiased representative’ for a bicoloring $B\in\mathcal{B}$ if $\left\langle X_{A},Y_{B}\right\rangle=0$ . Given a set $\mathcal{B}$ of bicolorings, we study the minimum cardinality of a family $\mathcal{A}$ consisting of subsets of $[n]$ such that every bicoloring in $\mathcal{B}$ has an unbiased representative in $\mathcal{A}$ .

1 Introduction

Let $\mathcal{B}$ denote a set of bicolorings of $[n]=\{1,\ldots,n\}$ , where each bicoloring $B\in\mathcal{B}$ maps each point $x\in[n]$ to either -1 or +1. Let $Y_{B}$ denote the $n$ -dimensional vector representing the bicoloring $B$ , i.e. $Y_{B}=(B(1),\ldots,B(n))$ . A non-empty set $A\subseteq[n]$ is said to be an unbiased representative for a bicoloring $B\in\mathcal{B}$ if $\left\langle X_{A},Y_{B}\right\rangle=0$ , where $X_{A}$ denotes the 0-1 $n$ -dimensional incidence vector corresponding to $A$ . We call a family $\mathcal{A}$ of subsets of $[n]$ a system of unbiased representatives (or ‘SUR’) for $\mathcal{B}$ if for every bicoloring $B\in\mathcal{B}$ , there exists at least one set $A\in\mathcal{A}$ such that $\left\langle X_{A},Y_{B}\right\rangle=0$ . Note that the two monochromatic bicolorings can never have any unbiased representatives - we call these bicolorings ‘trivial’. Let $\gamma(\mathcal{B})$ denote the minimum cardinality of a system of unbiased representatives for $\mathcal{B}$ . We define the maximum of $\gamma(\mathcal{B})$ over all possible families $\mathcal{B}$ of non-trivial bicolorings of $[n]$ as $\gamma(n)$ . Note that no singleton set of $[n]$ is a member of any optimal system of unbiased representatives.
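To make the definitions above concrete, here is a minimal Python sketch that tests whether a set is an unbiased representative for a bicoloring and whether a family is a SUR. The function names and the encoding of a bicoloring as a tuple of $\pm 1$ values (position $x-1$ holding the color of point $x$) are illustrative assumptions, not notation from the paper.

```python
from typing import Iterable, Sequence, Set

def inner_product(A: Set[int], B: Sequence[int]) -> int:
    """<X_A, Y_B>: the sum of the colors B(x) over the points x in A (points are 1-indexed)."""
    return sum(B[x - 1] for x in A)

def is_unbiased_representative(A: Set[int], B: Sequence[int]) -> bool:
    """A non-empty set A is an unbiased representative for B if <X_A, Y_B> = 0."""
    return len(A) > 0 and inner_product(A, B) == 0

def is_sur(family: Iterable[Set[int]], bicolorings: Iterable[Sequence[int]]) -> bool:
    """A family is a SUR if every bicoloring has an unbiased representative in it."""
    family = list(family)
    return all(any(is_unbiased_representative(A, B) for A in family)
               for B in bicolorings)

# Example on [4]: B colors points 1 and 2 with +1, points 3 and 4 with -1.
B = (+1, +1, -1, -1)
print(is_unbiased_representative({1, 3}, B))  # one +1 and one -1, so balanced
print(is_unbiased_representative({1, 2}, B))  # two +1's, so biased
```

A set of odd cardinality can never pass this test, which is why the paper later restricts attention to even-sized sets.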

Unbiased representatives are useful in testing products such as drugs over a large population where the effectiveness (or side-effect) of a new drug is studied in correlation with a large set of patient attributes such as body weight, height, age, etc. Complementary extremes in the attributes, such as being obese or underweight, tall or short, and young or old, are relevant in such correlation studies. Such studies require patients with complementary ranges of values of a certain attribute to be present in equal (or roughly equal) numbers in the representative group for that attribute; such a group may be deemed to be an unbiased representative for the attribute. However, selecting a separate sample of individuals for each attribute having equal representation of the complementary traits is practically impossible. So, one needs to select a family $\mathcal{A}$ of samples of individuals such that for any attribute $B$, there exists a sample $A\in\mathcal{A}$ which has an equal representation of individuals from the complementary traits of $B$. It is desirable to choose a family $\mathcal{A}$ of such groups of representatives of the smallest possible cardinality. It is not hard to see the direct mapping of this problem to the problem addressed in this paper. In a generic setting, SURs are useful in various applications where a collection of items (like individual patients) have many attributes (like weight, height and age), where the objective is to form a small collection of subsets of items with almost equal representation of opposite or complementary traits for each attribute.

1.1 Definitions and notations

We use ‘SUR’ to denote the phrase ‘system of unbiased representatives’. For integers $n$ and $p$, let $[n]$ denote the set $\{1,\ldots,n\}$, and $[n\pm p]$ denote the set $\{n-p,n-p+1,\ldots,n+p\}$. A bicoloring $B$ of $[n]$ is called a $k$-bicoloring if the number of +1’s in $B$ is exactly $k$. For a bicoloring $B:[n]\rightarrow\{-1,1\}$, we use $B(+1)$ (respectively, $B(-1)$) to denote the set of points receiving color +1 (respectively, -1) under $B$. We use $Y_{B}$ (respectively, $X_{A}$) to denote the $n$-dimensional $\pm 1$ vector (respectively, 0-1 vector) representing the bicoloring $B$ (respectively, the set $A\subseteq[n]$), i.e. $Y_{B}=(B(1),\ldots,B(n))$. Note that $\langle Y_{B},X_{A}\rangle=0$ for some $A\in\binom{[n]}{r}$ implies that $r$ is even. Throughout the rest of the paper, we consider only the non-trivial bicolorings and assume that every set in a SUR is of even cardinality. Let $\gamma(\mathcal{B},k,r)$ (respectively, $\gamma(\mathcal{B},[k_{1},k_{2}],[r_{1},r_{2}])$) be the minimum cardinality of a SUR $\mathcal{A}$ for $\mathcal{B}$, where (i) each $B\in\mathcal{B}$ is a bicoloring of $[n]$ consisting of exactly $k$ +1’s (respectively, at least $k_{1}$ and at most $k_{2}$ +1’s), and (ii) each $A\in\mathcal{A}$ is an $r$-sized (respectively, at least $r_{1}$-sized and at most $r_{2}$-sized) subset of $[n]$. We define $\gamma(n,k,r)$ (respectively, $\gamma(n,[k_{1},k_{2}],[r_{1},r_{2}])$) as follows.

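$$\gamma(n,k,r)=\max_{\mathcal{B}}\gamma(\mathcal{B},k,r).$$

$$\gamma(n,[k_{1},k_{2}],[r_{1},r_{2}])=\max_{\mathcal{B}}\gamma(\mathcal{B},[k_{1},k_{2}],[r_{1},r_{2}]).$$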

Since no singleton set of $[n]$ can be a member of any optimal system of unbiased representatives and the monochromatic bicolorings, consisting of exactly zero (or $n$) +1’s, are trivial, $\gamma(n,[1,n-1],[2,n])$ is the same as $\gamma(n)$.

1.2 Relation to existing works

Given a family $\mathcal{F}$ of subsets of $[n]$, finding another family $\mathcal{F}^{\prime}$ with certain properties in relation to $\mathcal{F}$ has been well investigated. One of the most studied problems in this direction is the computation of separating families (see [16]). Let $\mathcal{F}$ consist of pairs $\{i,j\}$, $i,j\in[n]$, $i\neq j$, and let $\mathcal{S}$ be another family of subsets of $[n]$. A subset $S$ separates a pair $\{i,j\}$ if $i\in S$ and $j\not\in S$ or vice versa. The family $\mathcal{S}$ is a separating family for $\mathcal{F}$ if every pair $\{i,j\}\in\mathcal{F}$ is separated by some $S\in\mathcal{S}$ (see [27, 16, 33, 10, 31] for detailed results and related problems on separating families). Separating families have many applications like ‘Wasserman-type’ blood tests of large populations, diagnosis and chemical analysis, locating defective items, etc. (see [17]). An extension of the separating family problem is the ‘test cover’ problem: “Given a family $\mathcal{F}$ of subsets of $[n]$, find a sub-collection $\mathcal{T}\subseteq\mathcal{F}$ of minimum cardinality such that every pair of $[n]$ is separated by some $S\in\mathcal{T}$”. The test cover problem is studied in the context of drug testing, biology [26, 35, 20] and pattern recognition [9]. For results and related notions, see [23, 13, 7, 8, 6]. In the above problems, any two-sized set $F=\{i,j\}$ can be viewed as a partial bicoloring $\chi:[n]\rightarrow\{-1,0,1\}$ where $\chi(i)=-1$, $\chi(j)=+1$, and $\chi(p)=0$ for any $p\in[n]\setminus\{i,j\}$, and a set $S$ covers $F$ if and only if $\left\langle X_{S},Y_{\chi}\right\rangle\in\{-1,+1\}$.

An affine hyperplane is a set of vectors $H(a,b)=\{x\in\mathbb{R}^{n}:\left<a,x\right>=b\}$ , where $a\in\mathbb{R}^{n}$ is a nonzero vector, $b\in\mathbb{R}$ . Covering the $\{0,1\}^{n}$ Hamming cube with the minimum number of affine hyperplanes has been well studied - a point $x\in\{0,1\}^{n}$ is said to be covered by a hyperplane $H(a,b)$ if $\left<a,x\right>=b$ (see [1, 21, 30]). It is not hard to see that any SUR for the $2^{n}-2$ non-trivial bicolorings is a covering for all the points of the $\{-1,1\}^{n}$ Hamming cube, except $\{(-1,\ldots,-1),(1,\ldots,1)\}$ , by hyperplanes $H(a,b)$ satisfying (i) $a\in\{0,1\}^{n}$ and (ii) $b=0$ .

1.3 Summary of results

The paper is divided into three logical sections. The first section (Section 2) focuses on obtaining $O(\log|\mathcal{B}|)$ upper bounds for SURs when (i) the collection $\mathcal{B}$ of bicolorings is unrestricted or has minor restrictions, and (ii) the sets in the SURs are unrestricted or have minor restrictions. When $\mathcal{B}$ consists of all the $2^{n}-2$ non-monochromatic bicolorings, it is not difficult to show that $\frac{n}{2}\leq\gamma(\mathcal{B},[1,n-1],[2,n])\leq n-1$ . Using a nice application of Combinatorial Nullstellensatz [2], we improve the above lower bound to $n-1$ .

Theorem 1.1

Let $n$ be a positive integer and $k\in[n]$ . Then, $\gamma(n,[1,n-k],[2,n])=n-1$ , where $1\leq k\leq\lceil\frac{n}{2}\rceil$ .

We relate the problem of SUR to the hitting set problem, which in turn implies relations with ‘VC-dimension’ provided $\epsilon n\leq|B(+1)|\leq(1-\epsilon)n$ for each $B\in\mathcal{B}$ . For such families $\mathcal{B}$ , this relationship assists in establishing an $O(\log|\mathcal{B}|)$ upper bound for cardinalities of any optimal SUR. Under a similar restriction for each $B\in\mathcal{B}$ , if it is mandatory that each set in the SUR is of cardinality exactly $r$ , the best upper bound obtained is large ( $\Omega(\sqrt{r}\log|\mathcal{B}|)$ ). In order to establish an $O(\log|\mathcal{B}|)$ upper bound for size of an optimal SUR under this restriction, we introduce some error in the representations and we have the following theorem.

Theorem 1.2

Let $r^{\prime}\in[r\pm\lceil\frac{r}{2}\rceil]$, where $r\geq 8$ is an integer. Let $\mathcal{B}$ denote the set of all bicolorings $B\in\{-1,+1\}^{n}$, where $\big||B(+1)|-|B(-1)|\big|\leq d$, for some $d\in\mathbb{N}$. Then, with high probability, one can construct a family $\mathcal{A}$ of cardinality at most $\ln|\mathcal{B}|$ in $O(n|\mathcal{B}|\ln|\mathcal{B}|)$ time consisting of $r^{\prime}$-sized subsets such that for every $B\in\mathcal{B}$, there exists a set $A\in\mathcal{A}$ with $|\left\langle Y_{B},X_{A}\right\rangle|\leq e\sqrt{r}+\frac{dr}{n}$.

In the second part of the paper (Section 3), we study the SUR problem where each $B\in\mathcal{B}$ is restricted to have exactly $k$ +1’s and each set in the SUR is required to be of cardinality exactly $r$ , for some $r,k\in[n]$ , $2\leq r\leq 2k$ . We relate the SUR problem under such restrictions to ‘covering’ problems, that enables us to use a deterministic algorithm of Lovász [22] and Stein [32] to compute such a SUR in polynomial time. In particular, for sufficiently large values of $n$ , and $k\leq\log_{4}\log_{4}(n^{0.5-\epsilon})$ , we use a result of Alon et al. [3, Corollary 1.3] to establish the following asymptotically tight bound on $\gamma(n,k,2k)$ .

Theorem 1.3

For sufficiently large values of $n$ ,

$$\frac{\binom{n}{k}}{\binom{2k}{k}}\leq\gamma(n,k,2k)\leq\frac{\binom{n}{k}}{\binom{2k}{k}}(1+o(1)),$$

provided $k\leq\log_{4}\log_{4}(n^{0.5-\epsilon})$ , for any $0<\epsilon<0.5$ .

The problem of estimation of $\gamma(n,k,r)$ becomes interesting when $k=\frac{n}{2}$ - the reduction to coverings gives a lower and upper bound of $\max\left(\left\lceil\frac{n}{2r}\right\rceil,c_{1}\sqrt{\frac{r(n-r)}{n}}\right)$ and $O(n\sqrt{\frac{r(n-r)}{n}})$ , respectively. For $r=f(n)$ , where $f(n)$ is an increasing function in $n$ , this establishes only sub-linear lower bounds for $\gamma(n,\frac{n}{2},r)$ . We use a vector space orthogonality argument combined with a theorem of Keevash and Long [18] to obtain a linear lower bound on $\gamma(n,k,r)$ under certain restrictions on $n$ , $k$ and $r$ .

Theorem 1.4

Let $r=2c$ for any odd integer $c\in\{1,\ldots,\frac{n}{2}\}$ . Let $k$ be an even integer, where $\epsilon n<k<(1-\epsilon)n$ for some $0<\epsilon<0.5$ . Then, $\gamma(n,k,r)\geq\delta n$ , where $\delta=\delta(\epsilon)$ is some real positive constant.

Combined with an upper bound construction given in Lemma 4, this establishes an asymptotically tight bound for $\gamma(n,\frac{n}{2},\frac{n}{2})$ , when $\frac{n}{2}\equiv 2\ (\mathrm{mod}\ 4)$ .

In the third part of the paper (Section 4), we obtain the following inapproximability result for computing optimal SURs by using a result of Dinur and Steurer [11] on the inapproximability of the hitting set problem.

Theorem 1.5

Let $r\leq(1-\Omega(1))\frac{\ln n}{2.34}$, where $n\geq 4$ is an integer. Then, no deterministic polynomial time algorithm can approximate the system of unbiased representatives problem for a family of bicolorings on $[n]$ to within a factor $(1-\Omega(1))\frac{\ln n}{2.34r}$ of the optimal when each set chosen in the representative family is required to have cardinality at most $r$, unless P=NP.

2 When cardinalities of sets in the ‘SUR’ are unrestricted or semi-restricted

2.1 Bounds on $\gamma(n,[k,n-1],[2,n])$

Recall that $\gamma(n)=\max_{\mathcal{B}}\gamma(\mathcal{B})$, where $\gamma(\mathcal{B})$ is the cardinality of an optimal system of unbiased representatives for $\mathcal{B}$. Observe that $\gamma(\mathcal{B}_{1})\leq\gamma(\mathcal{B}_{2})$ when $\mathcal{B}_{1}\subseteq\mathcal{B}_{2}$. So, to establish bounds on $\gamma(n)$, it suffices to take $\mathcal{B}$ to be the set of all the $2^{n}-2$ non-monochromatic bicolorings and establish bounds on $\gamma(\mathcal{B})$. We have the following proposition.

Proposition 1

Let $n$ be an integer and $k\in[n]$ .

(i) $\gamma(n,[k,n-1],[2,n])=\gamma(n,[1,n-k],[2,n])$.

(ii) $\gamma(n,[1,n-k],[2,n])=\gamma(n,[1,\lfloor\frac{n}{2}\rfloor],[2,n])$, for any $1\leq k\leq\lceil\frac{n}{2}\rceil$.

(iii) $\gamma(n,[1,n-k],[2,n])\leq n-1$, for $1\leq k\leq n$.

(iv) $\frac{n}{2}\leq\gamma(n,1,[2,n])\leq\gamma(n,[1,n-k],[2,n])$, for $1\leq k\leq n-1$.

Proof

(i) For any $k$ -bicoloring $B$ , any unbiased representative $A$ for $B$ is also an unbiased representative for the bicoloring $B^{\prime}$ , where $B^{\prime}(+1)=B(-1)$ and $B^{\prime}(-1)=B(+1)$ .

(ii) The proof follows from the proof of Statement (i) in Proposition 1.

(iii) Let $\mathcal{B}$ denote the set of all the $2^{n}-2$ non-monochromatic bicolorings. It is not hard to see that $\mathcal{A}=\{\{1,2\},\{1,3\},\ldots,\{1,n\}\}$ is a SUR of cardinality $n-1$ for $\mathcal{B}$ .

(iv) Let $\mathcal{B}=\{B:|B(+1)|=1\}$, so $|\mathcal{B}|=n$. For any $B\in\mathcal{B}$ and any $A\subseteq[n]$, $\left\langle Y_{B},X_{A}\right\rangle=0$ implies $|A|=2$. Moreover, for any $A\in\binom{[n]}{2}$, exactly two bicolorings $B\in\mathcal{B}$ have $\left\langle Y_{B},X_{A}\right\rangle=0$. So, we need at least $\frac{n}{2}$ two-sized sets to form a SUR for $\mathcal{B}$. The second inequality follows from the containment. $\Box$
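Statements (iii) and (iv) are easy to check by exhaustive enumeration for small $n$. The sketch below is an illustration under assumed names (`star_family`, `has_representative`), with bicolorings encoded as tuples of $\pm 1$. It verifies that the star family $\{\{1,2\},\ldots,\{1,n\}\}$ is a SUR for all non-monochromatic bicolorings, and that a fixed pair is an unbiased representative for exactly two of the $n$ bicolorings with a single +1.

```python
from itertools import product

def nontrivial_bicolorings(n):
    """All 2^n - 2 non-monochromatic colorings of [n], as tuples of +1/-1."""
    return [B for B in product((-1, 1), repeat=n) if len(set(B)) == 2]

def has_representative(family, B):
    """True if some set A in the family satisfies <X_A, Y_B> = 0 (points are 1-indexed)."""
    return any(sum(B[x - 1] for x in A) == 0 for A in family)

def star_family(n):
    """The family {{1,2}, {1,3}, ..., {1,n}} used for Statement (iii)."""
    return [{1, j} for j in range(2, n + 1)]

n = 6
family = star_family(n)
is_sur = all(has_representative(family, B) for B in nontrivial_bicolorings(n))
print(len(family), is_sur)

# Statement (iv): among bicolorings with exactly one +1, a pair represents exactly two.
one_plus = [B for B in nontrivial_bicolorings(n) if B.count(1) == 1]
covered_by_pair = [B for B in one_plus if has_representative([{2, 5}], B)]
print(len(one_plus), len(covered_by_pair))
```

The star family works because a non-monochromatic $B$ must give some point $j$ a color different from that of point 1.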

In the construction leading to the proof of Statement (iii) in Proposition 1, only two-sized sets are used as unbiased representatives. We have the following slightly non-trivial construction assuming $n=2^{p}$, for some integer $p$, giving similar bounds. Let $\mathcal{A}_{2}=\{\{1,2\},\{3,4\},\ldots,\{n-1,n\}\}$: a partition of $[n]$ into two-sized sets. Let $\mathcal{A}_{4}=\{\{1,2,3,4\},\{5,6,7,8\},\ldots,\{n-3,n-2,n-1,n\}\}$: a partition of $[n]$ into four-sized sets taken in that order. Similarly, repeating the construction for $p-2$ more steps, we obtain a sequence of partitions of $[n]$, $\mathcal{A}_{2},\mathcal{A}_{4},\ldots,\mathcal{A}_{n}$, where $\mathcal{A}_{i}$ is a partition of $[n]$ into $\frac{n}{i}$ parts of size $i$, i.e., $\mathcal{A}_{i}=\{\{1,\ldots,i\},\{i+1,\ldots,2i\},\ldots,\{n-i+1,\ldots,n\}\}$. Let $\mathcal{A}=\mathcal{A}_{2}\cup\mathcal{A}_{4}\cup\cdots\cup\mathcal{A}_{n}$. It follows that $|\mathcal{A}|=2^{p-1}+2^{p-2}+\ldots+1=2^{p}-1=n-1$. To see that this is indeed a SUR for the set of all the $2^{n}-2$ non-monochromatic bicolorings, let $B\in\{-1,1\}^{n}$ denote any non-trivial bicoloring of $[n]$. Let $i$ ($2\leq i\leq n$) be the minimum index such that there exists an $A\in\mathcal{A}_{i}$ with $A\setminus B(+1)\neq\emptyset$ and $A\cap B(+1)\neq\emptyset$; such an index exists because $[n]\in\mathcal{A}_{n}$ and $B$ is non-trivial. From the construction of $\mathcal{A}_{i}$ and the minimality of $i$, it follows that there exist consecutive parts $A_{1},A_{2}\in\mathcal{A}_{\frac{i}{2}}$ (singletons when $i=2$), one contained in $B(+1)$ and the other disjoint from $B(+1)$, with $A=A_{1}\cup A_{2}$. Since $|A_{1}|=|A_{2}|$, it follows that $A$ is an unbiased representative for $B$.
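The dyadic construction above can be verified exhaustively for small powers of two. The following Python sketch is an illustration only; the helper names `dyadic_family` and `is_sur_for_all` are assumptions made for this example.

```python
from itertools import product

def dyadic_family(n):
    """A_2 ∪ A_4 ∪ ... ∪ A_n for n a power of two: blocks of consecutive points."""
    assert n >= 2 and n & (n - 1) == 0, "n must be a power of two"
    family, size = [], 2
    while size <= n:
        # A_size: {1..size}, {size+1..2*size}, ..., {n-size+1..n}
        family += [set(range(start, start + size)) for start in range(1, n + 1, size)]
        size *= 2
    return family

def is_sur_for_all(family, n):
    """Check every non-monochromatic bicoloring of [n] against the family."""
    for B in product((-1, 1), repeat=n):
        if len(set(B)) == 1:
            continue  # skip the two trivial (monochromatic) bicolorings
        if not any(sum(B[x - 1] for x in A) == 0 for A in family):
            return False
    return True

n = 8
family = dyadic_family(n)
print(len(family), is_sur_for_all(family, n))
```

For $n=8$ the family has $4+2+1=7=n-1$ sets, matching the count derived in the text.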

To establish a tight lower bound on $\gamma(n,[1,\lceil\frac{n}{2}\rceil],[2,n])$ ( $\gamma(n,[1,n-1],[2,n])$ ), we need the following lemma.

Lemma 1

Let $F\in\mathbb{F}[x_{1},\ldots,x_{n}]$ be a polynomial and $S_{1},\ldots,S_{n}$ be non-empty subsets of $\mathbb{F}$, for some field $\mathbb{F}$. If $F$ vanishes on all but one point $(s_{1},\ldots,s_{n})\in S_{1}\times\cdots\times S_{n}\subseteq\mathbb{F}^{n}$, then deg($F$) $\geq\sum_{i=1}^{n}(|S_{i}|-1)$.

Proof

For the sake of contradiction, assume that deg( $F$ ) $<\sum_{i=1}^{n}(|S_{i}|-1)$ . Consider the polynomials.

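$$H(x_{i})=\prod_{s\in S_{i}\setminus\{s_{i}\}}(x_{i}-s).$$

$$G(x_{1},\ldots,x_{n})=\prod_{i=1}^{n}H(x_{i}).$$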

Note that deg($G$) is $\sum_{i=1}^{n}(|S_{i}|-1)$. Let $F(s_{1},\ldots,s_{n})=c_{1}$ and $G(s_{1},\ldots,s_{n})=c_{2}$; note that $c_{1}\neq 0$ since $F$ does not vanish at $(s_{1},\ldots,s_{n})$. Then, the polynomial $c_{2}F-c_{1}G$ vanishes on all points of $S_{1}\times\cdots\times S_{n}$. However, $c_{2}F-c_{1}G$ has degree $\sum_{i=1}^{n}(|S_{i}|-1)$: since deg($F$) is smaller than this, the monomial $x_{1}^{|S_{1}|-1}\cdots x_{n}^{|S_{n}|-1}$ has $-c_{1}$ as its coefficient. Using Combinatorial Nullstellensatz [5], there exists at least one point in $S_{1}\times\cdots\times S_{n}$ where $c_{2}F-c_{1}G$ is non-zero, which is a contradiction. $\Box$
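As a quick illustration of Lemma 1 (a worked example added here, not part of the original argument), take $S_{i}=\{0,1\}$ for every $i$ and let the exceptional point be the origin. The polynomial below vanishes on every other point of $\{0,1\}^{n}$ and has degree exactly $\sum_{i=1}^{n}(|S_{i}|-1)=n$, so the bound of the lemma is attained in this case.

```latex
F(x_{1},\ldots,x_{n})=\prod_{i=1}^{n}(1-x_{i}),
\qquad F(0,\ldots,0)=1,
\qquad F(x)=0 \ \text{ for all } x\in\{0,1\}^{n}\setminus\{(0,\ldots,0)\},
\qquad \deg(F)=n=\sum_{i=1}^{n}(|S_{i}|-1).
```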

Proof of Theorem 1.1

Statement of Theorem 1.1. Let $n$ be a positive integer and $k\in[n]$ . Then, $\gamma(n,[1,n-k],[2,n])=n-1$ , where $1\leq k\leq\lceil\frac{n}{2}\rceil$ .

Proof

From Statements (ii) and (iii) of Proposition 1, we know that in order to prove Theorem 1.1, we only need to establish a lower bound of $n-1$ for $\gamma(n,[1,n-1],[2,n])$ .

Let $\mathcal{B}$ denote the set of all the $2^{n}-2$ non-monochromatic bicolorings of $[n]$. Let $\mathcal{A}$ be a SUR of minimum cardinality for $\mathcal{B}$. Let $Y_{B}$ (respectively, $X_{A}$) denote the $n$-dimensional $\pm 1$ vector (respectively, 0-1 vector) representing the bicoloring $B$ (respectively, the set $A\subseteq[n]$). Consider the polynomial $P(Y_{B})$, $B\in\mathcal{B}$.

$$P(Y_{B})=\prod_{A\in\mathcal{A}}\left\langle X_{A},Y_{B}\right\rangle.$$

From the definition of $\mathcal{A}$ , $P(Y_{B})$ vanishes on all non-trivial bicolorings of $[n]$ . Now, consider the following polynomial $P^{\prime}(X)$ .

$$P^{\prime}(X=(x_{1},\ldots,x_{n}))=P(Y_{B}=(1-2x_{1},\ldots,1-2x_{n}))\,(x_{1}+\ldots+x_{n}-n).$$

$P^{\prime}(X)$ vanishes at every $X\in\{0,1\}^{n}$ except at the point $(0,\ldots,0)$ : $P$ vanishes at every $X\in\{0,1\}^{n}$ except the two points $(0,\ldots,0)$ and $(1,\ldots,1)$ and $(x_{1}+\ldots+x_{n}-n)$ vanishes at $(1,\ldots,1)$ . $P^{\prime}(X)$ has degree at most $deg(P)+1$ (note that one can repeatedly replace $x_{i}^{2}$ with $x_{i}$ since $x_{i}\in\{0,1\}$ ). Using Lemma 1 with each $S_{i}=\{0,1\}$ , $1\leq i\leq n$ , it follows that $deg(P)+1\geq deg(P^{\prime})\geq n$ . So, $|\mathcal{A}|=deg(P)\geq n-1$ . $\Box$
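For very small $n$, the value $\gamma(n,[1,n-1],[2,n])=n-1$ can be confirmed by brute force. The sketch below is only an illustration (the function name `min_sur_size` and the 0-indexed encoding of points are assumptions); it searches over families of non-empty even-sized subsets, which loses nothing because odd-sized sets are never unbiased representatives.

```python
from itertools import combinations, product

def min_sur_size(n):
    """Smallest family of even-sized subsets of an n-point set that is a SUR
    for all 2^n - 2 non-monochromatic bicolorings (exhaustive search)."""
    colorings = [B for B in product((-1, 1), repeat=n) if len(set(B)) == 2]
    even_sets = [A for size in range(2, n + 1, 2)
                 for A in combinations(range(n), size)]  # points are 0-indexed here
    # covers[j]: indices of colorings for which even_sets[j] is an unbiased representative
    covers = [frozenset(i for i, B in enumerate(colorings)
                        if sum(B[x] for x in A) == 0)
              for A in even_sets]
    everything = frozenset(range(len(colorings)))
    for t in range(1, len(even_sets) + 1):
        for choice in combinations(covers, t):
            if frozenset().union(*choice) == everything:
                return t
    return None

for n in (2, 3, 4, 5):
    print(n, min_sur_size(n))  # Theorem 1.1 predicts n - 1 in each case
```

The search is exponential in the number of candidate sets, so it is only practical for the tiny values of $n$ used here.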

Remark 1

Lemma 1 can also be used to obtain an alternative proof of the induction base case of the Cayley-Bacharach theorem by Riehl and Graham [28] (see Appendix A). An alternative proof of the above lower bound can also be obtained using the Cayley-Bacharach theorem by Riehl and Graham [28].

Note that in Section 2.1, the underlying set $\mathcal{B}$ of all the non-trivial bicolorings of $[n]$ , has cardinality $|\mathcal{B}|=2^{n}-2$ . In this case, Theorem 1.1 establishes that $\gamma(n,[1,n-1],[2,n])=n-1=\Theta(\log|\mathcal{B}|)$ . In the following section, we match the $O(\log|\mathcal{B}|)$ upper bound for slightly restricted sets $\mathcal{B}$ of bicolorings.

2.2 Relation to hitting sets for arbitrary collection of bicolorings

Let $\mathcal{S}$ denote a collection of subsets of $[n]$ . A subset $V\subseteq[n]$ is a hitting set for $\mathcal{S}$ if for every $S\in\mathcal{S}$ , $V\cap S$ is non-empty. Let $H(\mathcal{S})$ denote a minimum cardinality hitting set of $\mathcal{S}$ . The decision version of the Hitting set problem is: “Given the pair $(\mathcal{S},[n])$ and an integer $k$ as input, decide whether there exists a hitting set of cardinality at most $k$ for $\mathcal{S}$ ”.

Lemma 2

Let $\mathcal{B}=\{B_{0},\ldots,B_{m-1}\}\subseteq\{-1,+1\}^{n}$ be a family of bicolorings of $[n]$ . Construct the family $\mathcal{C}=\{C_{1},\ldots,C_{2m}\}$ where $C_{2i+1}=B_{i}(+1)$ and $C_{2i+2}=B_{i}(-1)$ , for $0\leq i\leq m-1$ . Let $H=\{h_{1},h_{2},h_{3},\ldots\}$ denote a hitting set for $\mathcal{C}$ . Define $\mathcal{A}=\{(h_{1},h_{q})|h_{q}\in H,q>1\}$ . Then, $\mathcal{A}$ is a SUR for $\mathcal{B}$ of cardinality $|H|-1$ .

Proof

For the sake of contradiction, assume that $B_{i}\in\mathcal{B}$ has no unbiased representative in $\mathcal{A}$ . Assume that $h_{1}\in B_{i}(+1)$ . Since $H$ is a hitting set for $\mathcal{C}$ , there exists some $h_{q}\in H$ such that $h_{q}$ hits $C_{2i+2}$ (and, thereby $B_{i}(-1)$ ). Then, the pair $(h_{1},h_{q})$ is an unbiased representative for $B_{i}$ , a contradiction to our assumption. So, $h_{1}\not\in B_{i}(+1)$ . But this implies that $h_{1}\in B_{i}(-1)$ . A similar contradiction can be obtained in this case. $\Box$
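The construction of Lemma 2 translates directly into code. In the minimal sketch below, the hitting set is obtained with a simple greedy heuristic; this choice is an assumption made for illustration, since the lemma works with any hitting set of the color classes. The function names are likewise hypothetical.

```python
def greedy_hitting_set(sets, universe):
    """Greedy heuristic: repeatedly pick the point that hits the most un-hit sets."""
    assert all(len(S) > 0 for S in sets), "an empty set cannot be hit"
    remaining = [set(S) for S in sets]
    hitting = []
    while remaining:
        best = max(universe, key=lambda x: sum(x in S for S in remaining))
        hitting.append(best)
        remaining = [S for S in remaining if best not in S]
    return hitting

def sur_from_hitting_set(bicolorings):
    """Build the color classes C, hit them with H, and pair h_1 with every other h_q.
    Assumes every bicoloring is non-trivial (both color classes non-empty)."""
    n = len(bicolorings[0])
    universe = range(1, n + 1)
    color_classes = []
    for B in bicolorings:
        color_classes.append({x for x in universe if B[x - 1] == +1})  # B(+1)
        color_classes.append({x for x in universe if B[x - 1] == -1})  # B(-1)
    H = greedy_hitting_set(color_classes, universe)
    return [{H[0], h} for h in H[1:]]  # |H| - 1 two-element sets

bicolorings = [(1, 1, -1, -1), (1, -1, 1, -1), (-1, 1, 1, 1)]
family = sur_from_hitting_set(bicolorings)
print(family)
print(all(any(sum(B[x - 1] for x in A) == 0 for A in family) for B in bicolorings))
```

Because the two color classes of a non-trivial bicoloring are disjoint and both must be hit, the hitting set always has at least two points, so the resulting family is non-empty.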

Let $\mathcal{B}$ be restricted to a special family of bicolorings: the number of +1’s for each $B\in\mathcal{B}$ lies between $\epsilon n$ and $(1-\epsilon)n$, i.e., $\epsilon n\leq|B(+1)|\leq(1-\epsilon)n$, for some fixed $0<\epsilon<\frac{1}{2}$. Construct the family $\mathcal{C}$ as above and let $d$ be the VC-dimension of $\mathcal{C}$. Note that every $C\in\mathcal{C}$ has size at least $\epsilon n$. Using a result of Haussler and Welzl [14], which was improved by Komlós et al. [19], we can get an ‘epsilon net’ $H$ (which is a hitting set for $\mathcal{C}$) of cardinality at most $\frac{d}{\epsilon}(\ln\frac{1}{\epsilon}+2\ln\ln\frac{1}{\epsilon}+6)$ (see Corollary 15.6 of [25] for this exact bound). Using Lemma 2, it follows that we can construct a SUR for $\mathcal{B}$ of cardinality at most $\frac{d}{\epsilon}(\ln\frac{1}{\epsilon}+2\ln\ln\frac{1}{\epsilon}+6)-1$. Since any family $\mathcal{C}$ of VC-dimension $d$ has cardinality at least $2^{d}$, we have $d\leq\log_{2}|\mathcal{C}|$, and for fixed $\epsilon$ this establishes an $O(\log|\mathcal{C}|)=O(\log|\mathcal{B}|)$ upper bound for the cardinality of any optimal SUR under no restriction on set sizes. We state the result as a proposition below.

Proposition 2

Let $0<\epsilon<\frac{1}{2}$ be a constant. Let $\mathcal{B}$ be a family of bicolorings, where $\epsilon n\leq|B(+1)|\leq(1-\epsilon)n$, for each $B\in\mathcal{B}$. Let $\mathcal{C}$ be the family constructed from $\mathcal{B}$ as in Lemma 2. Let $d$ be the VC-dimension of $\mathcal{C}$. Then, we can construct a SUR for $\mathcal{B}$ of cardinality at most $\frac{d}{\epsilon}(\ln\frac{1}{\epsilon}+2\ln\ln\frac{1}{\epsilon}+6)-1$.

In both Section 2.1 and 2.2, the $O(\log|\mathcal{B}|)$ cardinality SURs contained sets of small sizes (2-sized sets) as well. In what follows, we study the problem of SURs made of large cardinality sets. In order to obtain a similar $O(\log|\mathcal{B}|)$ bound for such a SUR, we inevitably introduce some error in the representation.

2.3 Analysis with bias in representation

Consider the problem of estimation of $\gamma(\mathcal{B})$ for a set of bicolorings in terms of $|\mathcal{B}|$ , where (i) the number of +1’s in each $B\in\mathcal{B}$ lies in the range $\{\alpha n,\alpha n+1,\ldots,(1-\alpha)n\}$ for some $0<\alpha<\frac{1}{2}$ , and (ii) each set in the SUR is of cardinality exactly $r$ , for some $2\leq r\leq n$ . Choosing $r$ elements, namely $x_{1},\ldots,x_{r}$ , from $[n]$ independently and uniformly at random, the probability $p$ that a fixed bicoloring $B\in\mathcal{B}$ does not have $\left\langle Y_{B},X_{A}\right\rangle=0$ , where $A=\{x_{1},\ldots,x_{r}\}$ , is at most

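$$1-\binom{r}{\frac{r}{2}}\left(\frac{\alpha n}{n}\right)^{\frac{r}{2}}\left(\frac{(1-\alpha)n}{n}\right)^{\frac{r}{2}}\leq 1-C\frac{2^{r}}{\sqrt{r}}\alpha^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}<e^{-C\frac{2^{r}}{\sqrt{r}}\alpha^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}},\text{ where }C=\frac{1}{\sqrt{\pi}}.$$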

Let $\mathcal{A}$ be constructed by choosing $t$ $r$-element sets into $\mathcal{A}$ independently, where each $r$-element set is chosen as described above. Using the union bound, the probability that some $B\in\mathcal{B}$ has $\left\langle Y_{B},X_{A}\right\rangle\neq 0$ for all $A\in\mathcal{A}$ is at most $|\mathcal{B}|(e^{-C\frac{2^{r}}{\sqrt{r}}{\alpha}^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}})^{t}$. Choosing $t$ so that this quantity is less than 1 gives an upper bound of $\frac{\sqrt{r}}{C2^{r}{\alpha}^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}}\ln(|\mathcal{B}|)$ for $|\mathcal{A}|$. Using Proposition 4, the case when $k=\frac{n}{2}$ and $r=2$ yields an asymptotically tight example for this upper bound. We have the following proposition.

Proposition 3

Let $\mathcal{B}$ denote a set of bicolorings, where the number of +1’s in each $B\in\mathcal{B}$ lies in the range $\{\alpha n,\alpha n+1,\ldots,(1-\alpha)n\}$ for some $0<\alpha<\frac{1}{2}$. Let $\mathcal{A}$ denote a minimum cardinality SUR for $\mathcal{B}$, where each $A\in\mathcal{A}$ has cardinality exactly $r$. Then,

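$$|\mathcal{A}|\leq\frac{\sqrt{r}}{C2^{r}\alpha^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}}\ln(|\mathcal{B}|),$$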

where $C=\frac{1}{\sqrt{\pi}}.$
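The quantities in Proposition 3 are easy to evaluate numerically. The sketch below is a sanity check added for illustration, not a proof; the function names are assumptions. It confirms the estimate $\binom{r}{r/2}\geq C\frac{2^{r}}{\sqrt{r}}$ with $C=\frac{1}{\sqrt{\pi}}$ for a range of even $r$ and evaluates the resulting upper bound on $|\mathcal{A}|$.

```python
import math

C = 1 / math.sqrt(math.pi)

def miss_probability_bound(r: int, alpha: float) -> float:
    """1 - binom(r, r/2) * alpha^(r/2) * (1 - alpha)^(r/2), for even r."""
    return 1 - math.comb(r, r // 2) * (alpha * (1 - alpha)) ** (r // 2)

def exponent(r: int, alpha: float) -> float:
    """C * 2^r / sqrt(r) * alpha^(r/2) * (1 - alpha)^(r/2), for even r."""
    return C * 2 ** r / math.sqrt(r) * (alpha * (1 - alpha)) ** (r // 2)

def sur_size_bound(r: int, alpha: float, num_bicolorings: int) -> float:
    """sqrt(r) / (C * 2^r * alpha^(r/2) * (1 - alpha)^(r/2)) * ln|B| from Proposition 3."""
    return math.log(num_bicolorings) / exponent(r, alpha)

# Central binomial estimate used in the first inequality of the chain.
for r in range(2, 41, 2):
    assert math.comb(r, r // 2) >= C * 2 ** r / math.sqrt(r)

# The bound grows quickly with r when alpha is bounded away from 1/2.
for r in (2, 4, 8, 16):
    print(r, round(sur_size_bound(r, alpha=0.4, num_bicolorings=10**6), 1))
```

The rapid growth in $r$ is the reason the text turns to representations with a small permitted error for large sets.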

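When $\alpha=\frac{1}{2}-\epsilon$, for some $0\leq\epsilon<\frac{1}{2}$, we have $2^{r}\alpha^{\frac{r}{2}}(1-\alpha)^{\frac{r}{2}}=(1-4\epsilon^{2})^{\frac{r}{2}}$, and the inequality of Proposition 3 becomes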

$$|\mathcal{A}|\leq\frac{\sqrt{r}}{C(1-4\epsilon^{2})^{\frac{r}{2}}}\ln(|\mathcal{B}|).$$

Using the fact that $(1-\frac{1}{m+1})^{m}\geq\frac{1}{e}$ , the right hand term is at most $e^{(\frac{4\epsilon^{2}}{1-4\epsilon^{2}})\frac{r}{2}}\sqrt{\pi r}\ln|\mathcal{B}|$ . Therefore, when $r\in O(1)$ , we have an $O(\ln|\mathcal{B}|)$ upper bound for any optimal SUR consisting of $r$ sized sets for $\mathcal{B}$ . However, if $r$ is any increasing function in $n$ , the upper bound given by Proposition 3 is large (even if $\epsilon=\frac{1}{n}$ , the term $\frac{\sqrt{r}}{C(1-4\epsilon^{2})^{\frac{r}{2}}}\ln(|\mathcal{B}|)$ is $\Omega(\sqrt{r}\ln|\mathcal{B}|)$ ). For large values of $r$ , in order to obtain an $O(\ln(|\mathcal{B}|))$ upper bound for $|\mathcal{A}|$ , one may allow some error in representation studied in the following section. Let $\mathcal{B}$ denote the set of all bicolorings $B\in\{-1,+1\}^{n}$ , where $|B(+1)-B(-1)|\leq d$ , for some $d\in\mathbb{N}$ . Our problem is to find a small sized family $\mathcal{A}$ for $\mathcal{B}$ such that

1. each $A\in\mathcal{A}$ is reasonably large;

2. for every $B\in\mathcal{B}$ , there exists a set $A\in\mathcal{A}$ such that $|\left\langle Y_{B},X_{A}\right\rangle|\leq\Delta$ , where $\Delta=\Delta(r,d,n)$ is as small as possible.

Proof of Theorem 1.2

Statement of Theorem 1.2. Let $r^{\prime}\in[r\pm\lceil\frac{r}{2}\rceil]$ , where $r\geq 8$ is an integer. Let $\mathcal{B}$ denote the set of all bicolorings $B\in\{-1,+1\}^{n}$ , where $|B(+1)-B(-1)|\leq d$ , for some $d\in\mathbb{N}$ . Then, with high probability, one can construct a family $\mathcal{A}$ of cardinality at most $\ln|\mathcal{B}|$ in $O(n|\mathcal{B}|\ln|\mathcal{B}|)$ time consisting of $r^{\prime}$ -sized subsets such that for every $B\in\mathcal{B}$ , there exists a set $A\in\mathcal{A}$ with $|\left\langle Y_{B},X_{A}\right\rangle|\leq e\sqrt{r}+\frac{dr}{n}$ .

Proof

We construct a set $A\subset[n]$ of size $r^{\prime}\in[r\pm\lceil\frac{r}{2}\rceil]$ by picking each element of $[n]$ into $A$ independently with probability $\frac{r}{n}$ . Let $X_{A}=(a_{1},\ldots,a_{n})$ denote the corresponding random vector where each $a_{i}\in\{0,1\}$ . Note that $|A|=\sum_{i=1}^{n}a_{i}$ . So, using linearity of expectation, $(\mu=)\mathbb{E}[|A|]=\sum_{i=1}^{n}\mathbb{E}[a_{i}]=r$ . Moreover, since $a_{i}$ ’s are independent, $Var[|A|]=\sum_{i=1}^{n}Var[a_{i}]=r(1-\frac{r}{n})$ . So, using the following form of Chernoff’s bound $P\left(|X-\mu|>\Delta\mu\right)<(\frac{e^{\Delta}}{(1+\Delta)^{(1+\Delta)}})^{\mu}+(\frac{e^{-\Delta}}{(1-\Delta)^{(1-\Delta)}})^{\mu}$ , we get, $P\left(|\sum_{i=1}^{n}a_{i}-r|>0.5r\right)<0.72$ , for $r\geq 8$ . So, we can sample a family $\mathcal{A}$ of cardinality $t$ ( $t$ to be chosen later) consisting of sets of size $r^{\prime}\in[r\pm\frac{r}{2}]$ .

Let $B\in\mathcal{B}$ be a bicoloring, where $B(+1)-B(-1)=d_{1}$ , where $-d\leq d_{1}\leq d$ . Let $Y_{B}=(b_{1},\ldots,b_{n})$ denote the corresponding vector, where each $b_{i}\in\{-1,+1\}$ . Let $Y=\left\langle Y_{B},X_{A}\right\rangle$ . Since $Y=\sum_{i=1}^{n}a_{i}b_{i}$ , $Y$ becomes a random variable (note that the $a_{i}b_{i}$ take values in $\{-1,0,1\}$ and are independent). So, $\mathbb{E}[Y]=\sum_{i=1}^{n}b_{i}\mathbb{E}[a_{i}]=\frac{d_{1}r}{n}$ . It follows that $Var[Y]=\sum_{i=1}^{n}b_{i}^{2}Var[a_{i}]=r(1-\frac{r}{n})$ . So, using Chebyshev’s inequality, we get, $P\left(|Y-\frac{d_{1}r}{n}|\geq e\sqrt{r}\right)\leq\frac{1}{e^{2}}(1-\frac{r}{n})<\frac{1}{e^{2}}$ . That is, by the triangle inequality, the probability that $|\langle Y_{B},X_{A}\rangle|>\frac{|d_{1}|r}{n}+e\sqrt{r}$ is at most $\frac{1}{e^{2}}$ . Let $E$ denote the bad event that some $B\in\mathcal{B}$ has $|\left\langle Y_{B},X_{A}\right\rangle|>\frac{dr}{n}+e\sqrt{r}$ for all $A\in\mathcal{A}$ . Since the $t$ sets in $\mathcal{A}$ are sampled independently, the union bound gives $P\left(E\right)\leq|\mathcal{B}|(\frac{1}{e^{2}})^{t}$ . Requiring $|\mathcal{B}|(\frac{1}{e^{2}})^{t}$ to be at most $\frac{1}{2}$ , it suffices to take $t\geq\ln|\mathcal{B}|$ .

Independently choose $100t$ subsets of $[n]$ (call this collection $\mathcal{D}$ ), where each $D\in\mathcal{D}$ is constructed by picking each element of $[n]$ independently with probability $\frac{r}{n}$ . Let $\mathcal{C}\subseteq\mathcal{D}$ be the sub-collection of $r^{\prime}$ -sized subsets in $\mathcal{D}$ , where $r^{\prime}\in[r\pm\lceil\frac{r}{2}\rceil]$ . Then, $E[|\mathcal{C}|]\geq 28t$ . Since $Var[|\mathcal{C}|]\leq 25t$ , with high probability, $|\mathcal{C}|\geq 10t$ . Partition $\mathcal{C}$ into $t$ -sized sets. With high probability, one of the parts will form our desired family $\mathcal{A}$ that is a SUR (with restricted error) for $\mathcal{B}$ . $\Box$
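The sampling step in the proof above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses rejection sampling to keep sets whose size lies in $[r\pm\lceil\frac{r}{2}\rceil]$ instead of the "sample $100t$ sets, then partition" step of the proof, and the names `sample_family`, `bias` and `worst_bias` are hypothetical helpers, not part of the paper.

```python
import math
import random


def sample_family(n, r, t, rng):
    """Sample t subsets of range(n); each point is kept with probability r/n.

    Only sets whose size lies in [r - ceil(r/2), r + ceil(r/2)] are accepted
    (rejection sampling, a simplification of the 100t-sets step in the proof).
    """
    half = math.ceil(r / 2)
    family = []
    while len(family) < t:
        A = [i for i in range(n) if rng.random() < r / n]
        if r - half <= len(A) <= r + half:
            family.append(A)
    return family


def bias(B, A):
    """Inner product <Y_B, X_A> for a list B of +1/-1 values and an index list A."""
    return sum(B[i] for i in A)


def worst_bias(bicolorings, family):
    """Largest, over all bicolorings, of the smallest |<Y_B, X_A>| achieved in the family."""
    return max(min(abs(bias(B, A)) for A in family) for B in bicolorings)
```

For a concrete set of bicolorings one can compare `worst_bias(...)` against $e\sqrt{r}+\frac{dr}{n}$; the theorem only promises that this bound holds with high probability, so a single run may need to be repeated.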

Comparison between Theorem 1.2 and Proposition 3: Expressing $d$ in Theorem 1.2 in terms of $\alpha$ in Proposition 3, $(1-2\alpha)n=d$ . So, $\epsilon=\frac{1}{2}-\alpha=\frac{d}{2n}$ . Substituting this value of $\epsilon$ in Inequality 4, we get a SUR of cardinality $\Omega(\sqrt{r}\ln|\mathcal{B}|)$ with no error for $\mathcal{B}$ .

3 When cardinalities of sets in the ‘SUR’ and +1’s in the bicolorings are restricted

For any $k$ -bicoloring $B$ of $[n]$ , and any $A\subseteq[n]$ , if $A$ is an unbiased representative for $B$ , then $2\leq|A|\leq 2k$ : otherwise, $\left\langle Y_{B},X_{A}\right\rangle\neq 0$ . Recall that $\gamma(n,k,r)=\gamma(\mathcal{B})$ , where (i) $\mathcal{B}$ is the collection of the $\binom{n}{k}$ distinct $k$ -bicolorings, (ii) $\gamma(\mathcal{B})$ is the cardinality of an optimal SUR $\mathcal{A}$ for $\mathcal{B}$ , and, (iii) each $A\in\mathcal{A}$ has cardinality exactly $r$ . We have the following propositions.

Proposition 4

$\max(\lceil\frac{n-k}{r}\rceil,\lceil\frac{k}{r}\rceil)\leq\gamma(n,k,r)$ .

Proof

Consider the case when $k\leq\lfloor\frac{n}{2}\rfloor$ . Any family $\mathcal{A}$ of $\lfloor\frac{n-k}{r}\rfloor$ subsets of size $r$ covers at most $n-k$ points, so there exists a $k$ -sized subset (say, $S$ ) of $[n]$ that is disjoint from the union of these $r$ -sized subsets. The bicoloring with the points in $S$ colored +1 and the points in $[n]\setminus S$ colored -1 does not have any unbiased representative in $\mathcal{A}$ , since every $A\in\mathcal{A}$ is colored entirely -1. Hence $\gamma(n,k,r)>\lfloor\frac{n-k}{r}\rfloor$ . The case $k>\lfloor\frac{n}{2}\rfloor$ is symmetric, with the roles of +1 and -1 exchanged. $\Box$
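For very small parameters, the lower bound of Proposition 4 can be compared against the exact value of $\gamma(n,k,r)$ by exhaustive search. The sketch below is an illustrative brute force (exponential time, so only usable for tiny $n$); `is_unbiased` and `gamma_bruteforce` are hypothetical helper names.

```python
from itertools import combinations
from math import ceil


def is_unbiased(A, plus):
    """A is an unbiased representative iff exactly half of its points are colored +1."""
    return 2 * len(set(A) & plus) == len(A)


def gamma_bruteforce(n, k, r):
    """Smallest number of r-sets such that every k-bicoloring has an unbiased representative."""
    points = range(n)
    colorings = [frozenset(c) for c in combinations(points, k)]
    r_sets = list(combinations(points, r))
    for size in range(1, len(r_sets) + 1):
        for family in combinations(r_sets, size):
            if all(any(is_unbiased(A, P) for A in family) for P in colorings):
                return size
    return None  # no SUR exists for these parameters


def prop4_lower_bound(n, k, r):
    """The bound max(ceil((n-k)/r), ceil(k/r)) from Proposition 4."""
    return max(ceil((n - k) / r), ceil(k / r))
```

For instance, with $n=4$, $k=1$, $r=2$ the bound is attained, while for $n=4$, $k=2$, $r=2$ the exact value exceeds it.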

Proposition 5

$\frac{2}{r(r-1)}\gamma(n,k-1,r-2)\leq\gamma(n,k,r)\leq(n-r+1)\gamma(n,k-1,r-2)$ , for $r\geq 4$ .

See Appendix 0.B for a proof of Proposition 5.

A simple averaging argument gives the following lower bound.

[TABLE]

To establish an upper bound, we reduce this problem to a covering problem and then make use of a result by Lovász and Stein [32, 22].

Definition 1

Given a family $\mathcal{F}$ of subsets of some finite set $X$ , the cover number $Cov(\mathcal{F})$ of $\mathcal{F}$ is the minimum number of members of $\mathcal{F}$ whose union includes all the points in $X$ .

Theorem 3.1

[32, 22, 15] If each member of $\mathcal{F}$ covers at most $a$ elements and each element in $X$ is covered by at least $v$ members of $\mathcal{F}$ , then

[TABLE]

We have the following theorem.

Theorem 3.2

Let $n$ be an integer, $r,k\in[n]$ , $2\leq r\leq 2k$ and $r$ is even. Then,

[TABLE]

Proof

Consider the following construction of a uniform family of subsets based on the $\binom{n}{k}$ distinct $k$ -bicolorings and $\binom{n}{r}$ distinct $r$ -sized subsets of $[n]$ .

Construction 1

Corresponding to each distinct $k$ -bicoloring $B$ in $\binom{[n]}{k}$ , we add a point $v_{B}$ to $X$ . Corresponding to each distinct $r$ -sized subset $A$ in $\binom{[n]}{r}$ , we add a set $e_{A}$ to $\mathcal{F}$ , where $e_{A}$ is the collection of all $v_{B}$ ’s such that $\left<X_{A},Y_{B}\right>=0$ . So, $e_{A}$ ‘covers’ $v_{B}$ if and only if $v_{B}\in e_{A}$ .

So, $|X|=\binom{n}{k}$ , $|\mathcal{F}|=\binom{n}{r}$ . Clearly, $a={\binom{r}{\frac{r}{2}}\binom{n-r}{k-\frac{r}{2}}}$ , $v=\binom{k}{\frac{r}{2}}\binom{n-k}{\frac{r}{2}}$ . It follows from the construction that $\gamma(n,k,r)\leq Cov(\mathcal{F})$ . So, from Theorem 3.1, we have

[TABLE]

Double counting $(B,A)$ pairs, where $B$ is a $k$ -bicoloring and $A$ is an $r$ -sized subset that covers $B$ , we get

[TABLE]

Combining Inequalities 6 and 7, and from Inequality 5, Theorem 3.2 follows. $\Box$
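The quantities $a$ and $v$ from Construction 1 and the double counting of $(B,A)$ pairs can be sanity-checked numerically. The following sketch assumes only the formulas for $a$ and $v$ stated in the proof; the function names are hypothetical, and the brute-force count is feasible only for small $n$.

```python
from itertools import combinations
from math import comb


def a_and_v(n, k, r):
    """a = number of k-bicolorings covered by one r-set; v = number of r-sets covering one k-bicoloring."""
    h = r // 2
    a = comb(r, h) * comb(n - r, k - h)
    v = comb(k, h) * comb(n - k, h)
    return a, v


def count_pairs(n, k, r):
    """Brute-force count of pairs (B, A) with <X_A, Y_B> = 0, i.e. |A ∩ B(+1)| = r/2."""
    h = r // 2
    return sum(
        1
        for plus in combinations(range(n), k)
        for A in combinations(range(n), r)
        if len(set(plus) & set(A)) == h
    )
```

Counting the pairs once per bicoloring gives $\binom{n}{k}\,v$ and once per $r$-set gives $\binom{n}{r}\,a$, so both products should agree with `count_pairs(n, k, r)`.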

Since the Lovász-Stein method is deterministic and constructive, the above reduction gives a deterministic polynomial time algorithm for obtaining a SUR. Moreover, from Theorem 3.2, it follows that $\gamma(n,k,r)$ is $O(k\ln n)$ approximable ( $k+0.2r+(k-\frac{r}{2})\ln(\frac{n-r}{k-\frac{r}{2}})$ to be precise) and when $k=\frac{r}{2}$ , the approximation factor becomes $O(r)$ ( $1+0.7r$ to be exact). However, if $k\leq\log_{4}\log_{4}(n^{0.5-\epsilon})$ and $r=2k$ , for some $0<\epsilon<0.5$ , then this upper bound can be improved further.
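As an illustration of how a covering heuristic applied to Construction 1 yields a SUR, the sketch below runs a plain greedy set cover: it repeatedly picks the $r$-set that is an unbiased representative for the most still-unrepresented $k$-bicolorings. This greedy rule is used here as an assumed stand-in in the spirit of the Lovász-Stein argument, not as a transcription of it, and `greedy_sur` is a hypothetical name. The running time is polynomial in $\binom{n}{k}$ and $\binom{n}{r}$, so the sketch is meant for small instances.

```python
from itertools import combinations


def greedy_sur(n, k, r):
    """Greedy cover on Construction 1: points are k-bicolorings, sets are r-subsets."""
    h = r // 2
    uncovered = {frozenset(P) for P in combinations(range(n), k)}
    r_sets = list(combinations(range(n), r))
    family = []
    while uncovered:
        # Pick the r-set that is balanced on the largest number of uncovered bicolorings.
        best = max(
            r_sets,
            key=lambda A: sum(1 for P in uncovered if len(P.intersection(A)) == h),
        )
        newly = {P for P in uncovered if len(P.intersection(best)) == h}
        if not newly:
            raise ValueError("no r-set represents the remaining bicolorings")
        family.append(best)
        uncovered -= newly
    return family
```

The returned list is always a valid SUR when one exists; its size is an upper bound on $\gamma(n,k,r)$ for the chosen parameters.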

3.1 Tight upper bounds under restrictions

From Construction 1, it is clear that the approximation factor for $\gamma(n,k,r)$ in Theorem 3.2 comes as a consequence of the approximation factor for the cover number given by the Lovász-Stein Theorem. So, tighter bounds for the cover number should translate into tighter bounds for $\gamma(n,k,r)$ . Let $v(B,D)$ denote the number of $r$ -sized sets that are unbiased representatives for both $B$ and $D$ , for any pair $(B,D)$ of $k$ -bicolorings, where $B\neq D$ . Let $v_{pair}=\displaystyle\max_{\begin{subarray}{c}B,D\in\binom{[n]}{k},\\ B\neq D\end{subarray}}v(B,D)$ . The Rödl nibble method [29, 4] establishes asymptotically tight bounds for the cover number provided the uniformity $a$ of the family $\mathcal{F}$ in Construction 1 is fixed, $v\rightarrow\infty$ , and $v_{pair}\in o(v)$ . Alon et al. [3] relaxed the condition to $a=o(\log v)$ provided $v_{pair}\in o(\frac{v}{e^{2a}\log v})$ . In the estimation of $\gamma(n,k,r)$ , if $k\leq\log_{4}\log_{4}(n^{0.5-\epsilon})$ and $r=2k$ , for any $0<\epsilon<0.5$ , using Construction 1, it follows that $a<2^{r}\in O(\log n)$ and $\log n\in o(\log v)$ . So, in order to prove Theorem 1.3, it suffices to show that $v_{pair}\in o(\frac{v}{e^{2a}\log v})$ .

Lemma 3

$v_{pair}\in o(\frac{v}{e^{2a}\log v})$ , when $r=2k$ and $k\leq\log_{4}\log_{4}(n^{0.5-\epsilon})$ , for any $0<\epsilon<0.5$ .

Proof

In order to prove the lemma, it is important to note that $v(B,D)$ depends intrinsically on the cardinality of $B(+1)\cap D(+1)$ . Let $S$ be some $r$ -sized subset of $[n]$ . Let $i_{B}=S\cap(B(+1)\setminus D(+1))$ , $i_{D}=S\cap(D(+1)\setminus B(+1))$ , $j_{BD}=S\cap(B(+1)\cap D(+1))$ and $j_{\overline{BD}}=S\cap([n]\setminus(B(+1)\cup D(+1)))$ (see Figure 1). So, $S=i_{B}\cup i_{D}\cup j_{BD}\cup j_{\overline{BD}}$ . If $S$ is an unbiased representative for $B$ , then $|i_{B}|+|j_{BD}|=|i_{D}|+|j_{\overline{BD}}|=\frac{r}{2}$ . If $S$ is an unbiased representative for $D$ , then $|i_{D}|+|j_{BD}|=|i_{B}|+|j_{\overline{BD}}|=\frac{r}{2}$ . Therefore, if $S$ is an unbiased representative for both $B$ and $D$ , then (i) $|i_{B}|=|i_{D}|$ ( $=i$ , say), (ii) $|j_{BD}|=|j_{\overline{BD}}|$ ( $=j$ , say), and (iii) $2i+2j=r=2k$ . Let $x=|B(+1)\cap D(+1)|$ . We have,

[TABLE]

Since $|B(+1)|=|D(+1)|=k$ , applying Condition (iii), we get $x=j$ and $k-x=i$ . In other words, if $S$ is an unbiased representative of cardinality $r=2k$ for both the $k$ -bicolorings $B$ and $D$ , then $B(+1)\cup D(+1)\subseteq S$ . So, for any pair $B,D$ of $k$ -bicolorings, exactly one term in the summation of Equation 8 remains valid, namely $\binom{x}{x}\binom{n-2k+x}{x}\left(\binom{k-x}{k-x}\right)^{2}$ . For instance, when $x=k-1$ , $v(B,D)=\binom{n-k-1}{k-1}$ ; when $x=k-2$ , $v(B,D)=\binom{n-k-2}{k-2}$ , etc. Therefore, $\frac{v(B,D)}{v(B^{\prime},D^{\prime})}=\Omega(\frac{n}{k})$ if $|B(+1)\cap D(+1)|=k-1$ and $|B^{\prime}(+1)\cap D^{\prime}(+1)|\leq k-2$ . So, $v_{pair}=v(B,D)$ , when $|B(+1)\cap D(+1)|=k-1$ provided $r=2k$ . Thus, $v_{pair}=\binom{n-k-1}{\frac{r}{2}-1}$ , when $r=2k$ . Computing $\frac{v_{pair}}{v}$ ,

[TABLE]

Note that $\log v=O(r\log n)$ , $e^{2a}\leq n^{1-2\epsilon}$ since $k\leq\log_{4}\log_{4}(n^{0.5-\epsilon})$ . So, $\frac{v_{pair}e^{2a}\log v}{v}=O(\frac{r^{2}\log n}{n^{2\epsilon}})\rightarrow 0$ , when $n\rightarrow\infty$ . $\Box$
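The closed form $v(B,D)=\binom{n-2k+x}{x}$ used in the proof of Lemma 3 for $r=2k$ , where $x=|B(+1)\cap D(+1)|$ , can be checked by direct enumeration on small instances. The helper `common_representatives` below is a hypothetical name for a brute-force count; it takes the two $+1$ sets as Python sets.

```python
from itertools import combinations
from math import comb


def common_representatives(n, k, plus_B, plus_D):
    """Count 2k-sized subsets S of range(n) that are unbiased for both bicolorings.

    For |S| = 2k, S is unbiased for a k-bicoloring iff |S ∩ (+1 set)| = k.
    """
    count = 0
    for S in combinations(range(n), 2 * k):
        s = set(S)
        if len(s & plus_B) == k and len(s & plus_D) == k:
            count += 1
    return count
```

With $n=8$ and $k=3$ , two bicolorings sharing $x=2$ points of color $+1$ have $\binom{4}{2}=6$ common representatives, and those sharing $x=1$ point have $\binom{3}{1}=3$ , in line with the formula.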

Proof of Theorem 1.3

Statement of Theorem 1.3. For sufficiently large values of $n$ ,

$$\frac{\binom{n}{k}}{\binom{2k}{k}}\leq\gamma(n,k,2k)\leq\frac{\binom{n}{k}}{\binom{2k}{k}}(1+o(1)),$$

provided $k\leq\log_{4}\log_{4}(n^{0.5-\epsilon})$ , for any $0<\epsilon<0.5$ .

Proof

From Lemma 3, and using the result of Alon et al. [3, Corollary 1.3] to obtain coverings, the proof follows. $\Box$

3.2 $\gamma(n,k,r)$ when $k=n/2$

Let $\mathcal{B}$ denote the set of all $\binom{n}{\frac{n}{2}}$ distinct $\frac{n}{2}$ -bicolorings. It is not hard to see that $\mathcal{A}=\{\{1,2\},\{1,3\},\ldots,\{1,\frac{n}{2}+1\}\}$ is a SUR of cardinality $\frac{n}{2}$ for $\mathcal{B}$ . Together with Proposition 4, this establishes $\frac{n}{4}\leq\gamma(n,\frac{n}{2},2)\leq\frac{n}{2}$ . It is easy to see that $\gamma(n,\frac{n}{2},n)=1$ . For arbitrary values of $r$ , from Theorem 3.2 and Proposition 4, we have,

[TABLE]

When $r=\frac{n}{2}$ , this establishes a lower bound and upper bound of $\Omega(\sqrt{n})$ and $O(n\sqrt{n})$ , respectively. In general, when $r=f(n)$ is an increasing function in $n$ , this establishes sub-linear lower bounds for $\gamma(n,\frac{n}{2},r)$ .
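The star family $\{\{1,2\},\{1,3\},\ldots,\{1,\frac{n}{2}+1\}\}$ mentioned at the start of this subsection can be verified exhaustively for small even $n$ . The reason it works is that point 1 and the $\frac{n}{2}$ points $2,\ldots,\frac{n}{2}+1$ cannot all share one color in a balanced bicoloring. The sketch below uses hypothetical helper names and 1-based points to match the text.

```python
from itertools import combinations


def star_family(n):
    """The pairs {1, j} for j = 2, ..., n/2 + 1."""
    return [(1, j) for j in range(2, n // 2 + 2)]


def is_sur_for_balanced(n, family):
    """Check that every bicoloring with n/2 points colored +1 has an unbiased representative."""
    for plus in combinations(range(1, n + 1), n // 2):
        plus = set(plus)
        if not any(2 * len(plus & set(A)) == len(A) for A in family):
            return False
    return True
```

Dropping any pair from the star breaks the property: for $n=6$ , the bicoloring with $+1$ set $\{1,2,3\}$ is not represented by $\{\{1,2\},\{1,3\}\}$ .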

We use an extension of a theorem of Frankl and Rödl [12] given by Keevash and Long [18] to obtain a linear lower bound on $\gamma(n,k,r)$ under certain restrictions on $k$ and $r$ . Let $\mathcal{D}\subseteq[q]^{n}$ be a $q$ -ary code. For any $x,y\in\mathcal{D}$ , the Hamming distance between $x$ and $y$ is the number of indices where $x(i)\neq y(i)$ , for $1\leq i\leq n$ . The code $\mathcal{D}$ is called $d$ -avoiding if the Hamming distance between no pair of code-words in $\mathcal{D}$ is $d$ . The following upper bound for $d$ -avoiding codes is given in [18].

Theorem 3.3

[18] Let $\mathcal{D}\subseteq[q]^{n}$ and let $\epsilon$ satisfy $0<\epsilon<\frac{1}{2}$ . Suppose that $\epsilon n<d<(1-\epsilon)n$ and $d$ is even if $q=2$ . If $\mathcal{D}$ is $d$ -avoiding, then $|\mathcal{D}|\leq q^{(1-\delta)n}$ , for some positive constant $\delta=\delta(\epsilon)$ .

We have the following lower bound for $\gamma(n,k,r)$ , when $r=2c$ for any odd integer $c\in\{1,\ldots,\frac{n}{2}\}$ and $\epsilon n<k<(1-\epsilon)n$ , for some $0<\epsilon<0.5$ .

Proof of Theorem 1.4

Statement of Theorem 1.4. Let $r=2c$ for any odd integer $c\in\{1,\ldots,\frac{n}{2}\}$ . Let $k$ be an even integer, where $\epsilon n<k<(1-\epsilon)n$ for some $0<\epsilon<0.5$ . Then, $\gamma(n,k,r)\geq\delta n$ , where $\delta=\delta(\epsilon)$ is some real positive constant.

Proof

Let $\mathcal{B}=\{B_{1},\ldots,B_{\binom{n}{k}}\}$ denote the set of all the bicolorings of $[n]$ consisting of exactly $k$ +1’s. We construct a family $\mathcal{C}=\{C_{1},\ldots,C_{\binom{n}{k}}\}$ , where $C_{i}$ is the set of +1 colored points of $B_{i}\in\mathcal{B}$ . Let $\mathcal{A}$ be a SUR for $\mathcal{B}$ , where each $A\in\mathcal{A}$ has cardinality exactly $2c$ for some odd number $c\in[n]$ . Note that $\left\langle Y_{B_{i}},X_{A}\right\rangle=0$ implies that $\left\langle X_{C_{i}},X_{A}\right\rangle=c$ , where $X_{C_{i}}$ denotes the 0-1 incidence vector corresponding to the set $C_{i}$ . Let $V\subset\{0,1\}^{n}$ denote the vector space spanned by the vectors $X_{A}$ , $A\in\mathcal{A}$ , over $\mathbb{F}_{2}$ . Let $V^{\perp}\subset\{0,1\}^{n}$ denote the subspace orthogonal to $V$ . Since $\mathcal{A}$ is a SUR for $\mathcal{B}$ , it follows that for every $C_{i}$ , there exists a set $A\in\mathcal{A}$ such that $\left\langle X_{C_{i}},X_{A}\right\rangle\equiv 1\pmod{2}$ (since $c$ is odd). Therefore, $X_{C_{i}}\not\in V^{\perp}$ , for all $C_{i}\in\mathcal{C}=\binom{[n]}{k}$ . In other words, $V^{\perp}$ does not contain any vector consisting of exactly $k$ ones. Moreover, observe that for any $x,y\in V^{\perp}$ , the vector $x+y$ also lies in $V^{\perp}$ and the number of ones in $x+y$ is the same as the Hamming distance between $x$ and $y$ . Thus, $V^{\perp}$ is $k$ -avoiding. Since $\epsilon n<k<(1-\epsilon)n$ and $k$ is even, from Theorem 3.3, it follows that there exists a positive constant $\delta=\delta(\epsilon)$ such that $|V^{\perp}|\leq 2^{n(1-\delta)}$ . So, the dimension of $V^{\perp}$ is at most $n(1-\delta)$ . Therefore, the dimension of $V$ is at least $\delta n$ . Since $V$ is spanned by $|\mathcal{A}|$ vectors, $|\mathcal{A}|\geq\delta n$ . $\Box$

Corollary 1

$\gamma(n,\frac{n}{2},r)\geq\delta n$ provided $\frac{n}{2}$ is even and $\frac{r}{2}$ is odd, for some $0<\delta<1$ .

Let $\frac{n}{2}$ be even and $\frac{r}{2}$ be odd. From Inequality 10, we have $\gamma(n,\frac{n}{2},r)\in O(n\sqrt{r})$ . When $r$ is a constant, using Corollary 1, this upper bound is asymptotically tight. However, for larger values of $r$ , there can be a large gap (up to $O(\sqrt{n})$ when $r\in\Omega(n)$ ) between the upper and the lower bound. In what follows, we address the problem for a special case when $r=\frac{n}{2}$ and establish a better upper bound of $\frac{n}{2}$ on $\gamma(n,\frac{n}{2},\frac{n}{2})$ .

Lemma 4

$\gamma(n,\frac{n}{2},\frac{n}{2})\leq\frac{n}{2}$ , where $\frac{n}{2}$ is any even integer.

Proof

Let $\mathcal{B}$ denote the set of all the bicolorings with equal number of +1’s and -1’s. Let $A_{1}=\{1,2,\ldots,\frac{n}{2}\},A_{2}=\{2,3,\ldots,\frac{n}{2}+1\},\ldots,A_{\frac{n}{2}}=\{\frac{n}{2},\frac{n}{2}+1,\ldots,n-1\}$ . Let $c_{i}(B)=\left\langle Y_{B},X_{A_{i}}\right\rangle$ . For any $B\in\mathcal{B}$ , it is not hard to see that each $c_{i}(B)$ is even and $|c_{i}(B)-c_{i+1}(B)|\in\{0,2\}$ . Since the bicolorings consist of equal number of +1’s and -1’s, $c_{\frac{n}{2}}(B)\leq-c_{1}(B)+2$ if $c_{1}(B)\geq 0$ , and $c_{\frac{n}{2}}(B)\geq-c_{1}(B)-2$ if $c_{1}(B)<0$ . In particular, we have $c_{1}(B)c_{\frac{n}{2}}(B)\leq 0$ . Since $|c_{i}(B)-c_{i+1}(B)|\in\{0,2\}$ , this implies the existence of an index $i$ such that $c_{i}(B)=\left\langle Y_{B},X_{A_{i}}\right\rangle=0$ . This concludes the proof that $\gamma(n,\frac{n}{2},\frac{n}{2})\leq\frac{n}{2}$ . $\Box$
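The sliding-window argument of Lemma 4 is easy to trace in code. The sketch below computes $c_{1}(B),\ldots,c_{\frac{n}{2}}(B)$ for a balanced bicoloring given as a list of $+1/-1$ values (0-based indices, so window $A_{i}$ occupies positions $i-1,\ldots,i+\frac{n}{2}-2$ ), and returns a window with zero bias. The function names are hypothetical.

```python
def window_biases(B):
    """Return [c_1, ..., c_{n/2}], where c_i is the sum of B over the window A_i."""
    n = len(B)
    h = n // 2
    c = sum(B[:h])
    out = [c]
    for i in range(1, h):
        # Slide the window one step: drop position i-1, add position i+h-1.
        c += B[i + h - 1] - B[i - 1]
        out.append(c)
    return out


def unbiased_window(B):
    """Return the 1-based index i of a window A_i with c_i(B) = 0, or None if there is none."""
    for i, c in enumerate(window_biases(B), start=1):
        if c == 0:
            return i
    return None
```

For the bicoloring $(+1,+1,+1,+1,-1,-1,-1,-1)$ the window biases are $4,2,0,-2$ , so $A_{3}$ is an unbiased representative. When $\frac{n}{2}$ is even, the lemma guarantees that `unbiased_window` never returns `None` on a balanced input.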

From Corollary 1 and Lemma 4, we have the following theorem.

Theorem 3.4

$\gamma(n,\frac{n}{2},\frac{n}{2})\leq\frac{n}{2}$ . Moreover, $\gamma(n,\frac{n}{2},\frac{n}{2})\geq\delta n$ if $n/2$ is even and $n/4$ is odd, for some $0<\delta<1$ .

4 Inapproximability of the SUR problem

Firstly, we establish a hardness result of the hitting set problem for a special family of subsets.

Definition 2

A family $\mathcal{F}$ of subsets of $[n]$ is complement closed on $[n]$ if for all $F\in\mathcal{F}$ , $[n]\setminus F\in\mathcal{F}.$

Proposition 6

Let $n$ be an integer, $n\geq 4$ . No deterministic polynomial time algorithm can approximate the hitting set problem for complement closed families on $[n]$ to within a factor of $(1-\Omega(1))\frac{\ln n}{2.34}$ of the optimal, unless P=NP.

Proof

For the sake of contradiction, assume that there exists an algorithm $ALG$ that approximates the hitting set for complement closed families on $[n]$ to within a factor of $(1-\Omega(1))\frac{\ln n}{2.34}$ of the optimal. We obtain a contradiction to this assumption by the following reduction from the general hitting set problem.

Given a pair $(\mathcal{S}^{\prime},[n])$ as input to the general hitting set problem, we extend the universe to $[n+1]$ by adding the element $n+1$ . We construct $\mathcal{S}$ as follows: $\mathcal{S}=\mathcal{S}^{\prime}\cup\{[n+1]\setminus S|S\in\mathcal{S}^{\prime}\}$ . Let $OPT(\mathcal{S})$ ( $OPT(\mathcal{S}^{\prime})$ ) denote an optimal solution to the hitting set problem on $\mathcal{S}$ (respectively, $\mathcal{S}^{\prime}$ ). Let $ALG(\mathcal{S})$ denote a hitting set output by $ALG$ on $\mathcal{S}$ as input.

Observe that

[TABLE]

From our assumption, we know that $|OPT(\mathcal{S})|\leq|ALG(\mathcal{S})|\leq(1-\Omega(1))\frac{\ln(n+1)}{2.34}|OPT(\mathcal{S})|<(1-\Omega(1))\frac{\ln n}{2}|OPT(\mathcal{S})|$ , for $n\geq 4$ . Note that $ALG(\mathcal{S})$ is a valid hitting set for $\mathcal{S}^{\prime}$ . So, $|OPT(\mathcal{S}^{\prime})|\leq|OPT(\mathcal{S})|\leq|ALG(\mathcal{S})|\leq(1-\Omega(1))\frac{\ln n}{2}|OPT(\mathcal{S})|<(1-\Omega(1))\frac{\ln n}{2}2\cdot|OPT(\mathcal{S}^{\prime})|=(1-\Omega(1))\ln n|OPT(\mathcal{S}^{\prime})|$ . Therefore, $ALG$ is a $(1-\Omega(1))\ln n$ factor approximation algorithm for the general hitting set problem. However, Dinur and Steurer [11] proved that it is impossible to approximate the set cover problem to a factor of $(1-\Omega(1))\ln n$ of the optimal, unless P=NP. $\Box$
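The reduction in the proof of Proposition 6 is a simple transformation on set families, sketched below with hypothetical helper names. The universe is taken as $\{1,\ldots,n+1\}$ and each input set is assumed to be a subset of $[n]$ . The sketch also illustrates why the optimum grows by at most one element: adding the new point $n+1$ to any hitting set of $\mathcal{S}^{\prime}$ hits every added complement.

```python
def complement_closure(sets, n):
    """Return S = S' ∪ {[n+1] \\ S : S in S'} as a set of frozensets over {1, ..., n+1}."""
    universe = frozenset(range(1, n + 2))
    originals = {frozenset(S) for S in sets}
    return originals | {universe - S for S in originals}


def is_hitting_set(H, family):
    """True iff H intersects every member of the family."""
    return all(H & S for S in family)


def is_complement_closed(family, n):
    """True iff the family over {1, ..., n+1} contains the complement of each member."""
    universe = frozenset(range(1, n + 2))
    return all(universe - S in family for S in family)
```

Since every hitting set of $\mathcal{S}$ also hits $\mathcal{S}^{\prime}\subseteq\mathcal{S}$ , the two optima differ by at most one, which is consistent with the factor-2 slack used in the proof.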

We use Proposition 6 to establish the following hardness result for the system of unbiased representative problem.

Proof of Theorem 1.5

Statement of Theorem 1.5. Let $r\leq(1-\Omega(1))\frac{\ln n}{2.34}$ , where $n\geq 4$ is an integer. Then, no deterministic polynomial time algorithm can approximate the system of unbiased representative problem for a family of bicolorings on $[n]$ to within a factor $(1-\Omega(1))\frac{\ln n}{2.34r}$ of the optimal when each set chosen in the representative family is required to have its cardinality at most $r$ , unless P=NP.

Proof

We prove Theorem 1.5 by a reduction from an instance of the hitting set problem on complement closed families. Let $\mathcal{S}$ be a complement closed family on $[n]$ . From $\mathcal{S}$ , we construct a family $\mathcal{B}$ of bicolorings on $[n]$ in the following way: $\mathcal{B}=\{B|B(+1)=S,B(-1)=[n]\setminus S,S\in\mathcal{S}\}$ . For the sake of contradiction, assume that there exists an algorithm $ALG$ that approximates the system of unbiased representative problem for any family of bicolorings on $[n]$ to within a factor $f$ of the optimal, where $1\leq f\leq(1-\Omega(1))\frac{\ln n}{2.34r}$ and each set in the SUR is required to have its cardinality at most $r$ . Let $OPT_{\text{HIT}}(\mathcal{S})$ ( $OPT_{\text{SUR}}(\mathcal{B})$ ) denote an optimal solution to the hitting set problem (respectively, the system of unbiased representative problem) on $\mathcal{S}$ (respectively, $\mathcal{B}$ ). Let $ALG(\mathcal{B})$ denote a SUR output by $ALG$ with $\mathcal{B}$ as its input. Then, executing $ALG$ on $\mathcal{B}$ as input, we obtain a SUR $\mathcal{A}$ for $\mathcal{B}$ such that (i) $2\leq|A|\leq r$ for each $A\in\mathcal{A}$ , (ii) $|ALG(\mathcal{B})|=|\mathcal{A}|\leq f\cdot|OPT_{\text{SUR}}(\mathcal{B})|$ , for some $1\leq f\leq(1-\Omega(1))\frac{\ln n}{2.34r}$ . Let $V=\cup_{A\in\mathcal{A}}A$ . It follows that $|V|\leq r|\mathcal{A}|$ and $V$ is a hitting set for $\mathcal{S}$ .

From Lemma 2, we know that $|OPT_{\text{SUR}}(\mathcal{B})|\leq|OPT_{\text{HIT}}(\mathcal{S})|-1$ . Therefore,

[TABLE]

So, $ALG$ is a $(r\cdot f)$ -factor approximation algorithm for computing hitting set of $\mathcal{S}$ . Since $1\leq f\leq(1-\Omega(1))\frac{\ln n}{2.34r}$ , this is a contradiction to Proposition 6. $\Box$

Remark 2

Consider the case when the family $\mathcal{B}$ is restricted to a special family of bicolorings, where the number of +1’s (or -1’s) for each $B\in\mathcal{B}$ is exactly one, i.e. $|B(+1)|=1$ (or $|B(-1)|=1$ ). Then, the problem of system of unbiased representatives reduces to an edge cover problem [34, 24] on a complete graph $G$ , where for each $B\in\mathcal{B}$ , a vertex $v_{B(+1)}$ (respectively, $v_{B(-1)}$ ) is added to $V(G)$ . So, this reduction makes the SUR problem polynomial time solvable for such families of bicolorings.

Acknowledgment

The authors thank Prof. Niloy Ganguly for helpful discussions on the problem.

Appendix 0.A Proof of induction base case of Theorem 0.A.1

Theorem 0.A.1

[28] Given the $n$ quadratics in $n$ variables $x_{1}(x_{1}-1),\ldots,x_{n}(x_{n}-1)$ with $2^{n}$ common zeros, the maximum number of those common zeros a polynomial $P$ of degree $k$ can go through without going through them all is $2^{n}-2^{n-k}$ .

Proof

The proof is by induction on $n$ . When $k=0$ , we have nothing to prove. So, we consider all the degree $k$ polynomials $P$ on $k+1$ variables as the base case. For the sake of contradiction, assume that $P$ is a polynomial of degree $k$ on $k+1$ variables and it misses only one common zero of $x_{1}(1-x_{1}),\ldots,x_{k+1}(1-x_{k+1})$ . Then, using Lemma 1, it follows that the degree of $P$ must be $k+1$ , which is a contradiction. This completes the proof of the induction base case.

The rest of the proof is exactly the same as that given in [28]. $\Box$

Appendix 0.B Proof of Proposition 5

Statement of Proposition 5. $\frac{2}{r(r-1)}\gamma(n,k-1,r-2)\leq\gamma(n,k,r)\leq(n-r+1)\gamma(n,k-1,r-2)$ , for $r\geq 4$ .

Proof

Let $\mathcal{B}_{i}$ denote the set of all the bicolorings consisting of exactly $i$ +1’s, for $i\in\{k,k-1\}$ . Let $\mathcal{A}_{r-2}$ denote a family of $(r-2)$ -sized subsets that is an optimal unbiased representative family for $\mathcal{B}_{k-1}$ . For any $A\in\mathcal{A}_{r-2}$ , let $\bar{A}=[n]\setminus A=\{x_{1},\ldots,x_{n-r+2}\}$ . For each $A\in\mathcal{A}_{r-2}$ , we construct $(n-r+1)$ $r$ -sized subsets as follows: $A^{1}=A\cup\{x_{1},x_{2}\}$ , $A^{2}=A\cup\{x_{1},x_{3}\}$ , $\cdots$ , $A^{n-r+1}=A\cup\{x_{1},x_{n-r+2}\}$ . Let $\mathcal{A}_{r}=\cup_{A\in\mathcal{A}_{r-2}}\{A^{1},\cdots,A^{n-r+1}\}$ . To see that $\mathcal{A}_{r}$ is a system of unbiased representatives for $\mathcal{B}_{k}$ , consider any $B\in\mathcal{B}_{k}$ and a bicoloring $B^{\prime}\in\mathcal{B}_{k-1}$ with $B^{\prime}(+1)\subset B(+1)$ . Let $A^{\prime}\in\mathcal{A}_{r-2}$ be such that $\left\langle Y_{B^{\prime}},X_{A^{\prime}}\right\rangle=0$ . From the construction, it follows that there is at least one $A\in\{A^{\prime 1},\cdots,A^{\prime n-r+1}\}$ such that $\left\langle Y_{B},X_{A}\right\rangle=0$ .

For the lower bound, consider a SUR $\mathcal{A}$ for $\mathcal{B}_{k}$ of size $\gamma(n,k,r)$ . For each $A\in\mathcal{A}$ , let $\mathcal{F}_{A}$ denote the family of $\binom{r}{r-2}$ distinct $(r-2)$ -sized subsets of $A$ . Then, $\mathcal{A}^{\prime}=\cup_{A\in\mathcal{A}}\mathcal{F}_{A}$ is an unbiased representative family for $\mathcal{B}_{k-1}$ where each set in the family is of size exactly $(r-2)$ . $\Box$

Bibliography

[1] N. Alon and Z. Füredi. Covering the cube by affine hyperplanes. European Journal of Combinatorics, 14(2):79–83, 1993.
[2] Noga Alon. Combinatorial nullstellensatz. Combinatorics, Probability and Computing, 8(1-2):7–29.
[3] Noga Alon, Béla Bollobás, Jeong Han Kim, and Van H. Vu. Economical covers with geometric applications. Proceedings of the London Mathematical Society, 86(2):273, 2003.
[4] Noga Alon and Joel H. Spencer. The probabilistic method. John Wiley & Sons, 2000.
[5] Noga Alon and M. Tarsi. Combinatorial nullstellensatz. Combinatorics Probability and Computing, 8(1):7–30, 1999.
[6] Manu Basavaraju, Mathew C. Francis, M. S. Ramanujan, and Saket Saurabh. Partially polynomial kernels for set cover and test cover. SIAM Journal on Discrete Mathematics, 30(3):1401–1423, 2016.
[7] Koen M. J. De Bontridder, B. J. Lageweg, Jan K. Lenstra, James B. Orlin, and Leen Stougie. Branch-and-bound algorithms for the test cover problem. In European Symposium on Algorithms, pages 223–233. Springer, 2002.
[8] Robert Crowston, Gregory Gutin, Mark Jones, Saket Saurabh, and Anders Yeo. Parameterized study of the test cover problem. In International Symposium on Mathematical Foundations of Computer Science, pages 283–295. Springer, 2012.
