Finite Representability of Integers as $2$-Sums
Anant Godbole, Zach Higgins, Zoe Koch

TL;DR
This paper investigates the properties of randomly selected sets as truncated additive bases, providing precise asymptotic results for their likelihood of representing integers in a specified range as sums of two elements.
Contribution
It introduces the concept of truncated additive bases in which each target integer must be representable as a 2-sum in at least a prescribed number of ways, and derives sharp asymptotic probabilities for random sets to be such bases.
Findings
Sharp asymptotics for the probability of random sets being truncated additive bases.
Characterization of the number of representations of integers as 2-sums within a range.
Analysis of high and low probability regimes for these bases.
Abstract
A set of integers is said to be an additive basis of a given order if each element of the target set can be written as a sum of that many elements of the set in at least one way. We seek multiple representations as such sums, and in this paper we make a start by restricting ourselves to 2-sums. We say that a set is a truncated additive basis if each integer in a truncated target range can be represented as a 2-sum of elements of the set in at least $g$ ways. In this paper, we provide sharp asymptotics for the event that a randomly selected set is a truncated additive basis with high or low probability.
Anant Godbole
Department of Mathematics and Statistics,
East Tennessee State University,
Johnson City, TN 37614, USA
Zach Higgins
Department of Mathematics,
University of California (San Diego),
La Jolla, CA 92093, USA
Zoe Koch
Department of Mathematics,
University of Vermont,
Burlington, VT 05405, USA
1 Introduction
1.1 Balls in Boxes
We start by introducing results from the classical theory of the random allocation of balls to boxes. We will be seeing, in the rest of the paper, how and to what extent the results apply to situations such as coverage of integers by -sets of integers.
Suppose that we are trying to “pack” balls in boxes so that each box contains at most one ball. This is the so-called “birthday problem”, and it is well known (see, e.g., [2]) that if we randomly throw balls into boxes, then the threshold for the property to hold with high or low probability (whp or wlp) is , in the sense that if , then the probability that each box contains at most one ball is asymptotically 0, and this probability is asymptotically 1 if . Here and throughout this paper, we will describe these two situations by using the notation and respectively. There is a generalization of the birthday threshold to “at most balls”, which we rederived in [5] using Talagrand’s inequality [1]:
Theorem 1.1.
When balls are randomly and uniformly distributed in boxes, then letting denote the number of boxes with or more balls,
[TABLE]
and
[TABLE]
Theorem 1.1 exhibits a progression of thresholds, which get close to as . It may still be the case, however, that not all boxes will have a ball in them if , which leads us to the question of the coverage of each box by at least one ball, or “coupon collection”. It is well known that the expected waiting time for each of the boxes to be filled is where is Euler’s constant, and that the variance of the waiting time is . Together with these facts, Chebyshev’s inequality can be used to prove, with denoting the number of empty boxes, and an arbitrary function tending to , that
Theorem 1.2.
[TABLE]
and
[TABLE]
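The coupon-collection facts quoted before Theorem 1.2 can be checked numerically. The following is a minimal simulation sketch and is not part of the paper: the function names and the parameters `n` (number of boxes) and `trials` are illustrative assumptions. It compares an empirical mean waiting time with the standard exact value n·H_n, which is approximately n(ln n + γ) with γ Euler's constant.

```python
import math
import random


def waiting_time_to_fill(n, rng):
    """Throw balls uniformly at random into n boxes; return the number of
    throws needed until every box holds at least one ball."""
    seen = set()
    throws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        throws += 1
    return throws


def harmonic_mean_time(n):
    """Exact expected waiting time n * H_n for the classical coupon collector."""
    return n * sum(1.0 / k for k in range(1, n + 1))


def simulate_mean(n, trials, seed=0):
    """Average waiting time over `trials` independent runs (seeded)."""
    rng = random.Random(seed)
    return sum(waiting_time_to_fill(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    n = 200
    euler_gamma = 0.5772156649015329
    print("simulated mean   :", simulate_mean(n, trials=2000))
    print("exact n * H_n    :", harmonic_mean_time(n))
    print("n (ln n + gamma) :", n * (math.log(n) + euler_gamma))
```

The three printed values should be close to one another for moderate n; the simulation only illustrates the first-order behaviour and says nothing about the sharp thresholds in Theorem 1.2.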
Various people have asked about covering each box or more times. Generalizing work of Erdős and Rényi and of Newman and Shepp, Holst produced the following definitive result:
Theorem 1.3.
Letting denote the waiting time until each box has at least balls, we have
[TABLE]
Normalizing by setting , we have that are asymptotically independent. Moreover
[TABLE]
From Theorem 1.3, it is easy to derive the following result:
Theorem 1.4.
[TABLE]
and
[TABLE]
where is the number of boxes with balls.
Of particular note is the linearity (in ) for coverings beyond the first, showing that an additional iterated logarithmic fraction suffices for each subsequent covering (and these are asymptotically independent!). We hope to show that many of these features stay intact even as dependence is introduced into the covering scenarios. As a final note, we observe that extremal behaviour in the “balls in boxes” example is trivial: the maximal number of balls that may be placed in boxes so that each contains at most one ball is , as is the smallest number of balls so as to guarantee at least one ball per box.
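The remark that each covering beyond the first costs only an iterated-logarithmic extra amount can be explored with a short simulation. This is a hedged sketch rather than anything from the paper: the names `coverage_times` and `mean_coverage_times`, and the notation W_r for the first time every box holds at least r balls, are illustrative assumptions. It records W_1, W_2, W_3 in a single run of throws, so the increments between successive coverings can be compared with n ln n and n ln ln n.

```python
import math
import random


def coverage_times(n, r_max, rng):
    """Throw balls uniformly into n boxes. Return [W_1, ..., W_r_max], where
    W_r is the number of throws made when every box first holds >= r balls."""
    counts = [0] * n
    # boxes_below[r] = number of boxes currently holding fewer than r balls
    boxes_below = [0] + [n] * r_max
    times = [None] * (r_max + 1)
    throws = 0
    level = 1  # smallest r whose full coverage has not yet been reached
    while level <= r_max:
        box = rng.randrange(n)
        counts[box] += 1
        throws += 1
        c = counts[box]
        if c <= r_max:
            boxes_below[c] -= 1  # this box no longer holds fewer than c balls
        while level <= r_max and boxes_below[level] == 0:
            times[level] = throws
            level += 1
    return times[1:]


def mean_coverage_times(n, r_max, trials, seed=0):
    """Average each W_r over `trials` independent runs (seeded)."""
    rng = random.Random(seed)
    totals = [0.0] * r_max
    for _ in range(trials):
        for i, t in enumerate(coverage_times(n, r_max, rng)):
            totals[i] += t
    return [t / trials for t in totals]


if __name__ == "__main__":
    n = 1000
    means = mean_coverage_times(n, r_max=3, trials=200)
    print("mean W_1, W_2, W_3 :", means)
    print("increments         :", [means[i] - means[i - 1] for i in (1, 2)])
    print("n ln n             :", n * math.log(n))
    print("n ln ln n          :", n * math.log(math.log(n)))
```

At moderate n the lower-order constants are still visible, so the increments should be read as being of the same order as n ln ln n, not as matching it exactly.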
1.2 Dependence
A set is said to be a set (the totality of these for all are known as Sidon sets) if each of the sums of elements drawn with replacement from are distinct. A set is said to be an -additive basis if each can be written as the sum of elements in . Thus, a set is -Sidon or an -additive basis if each element in the potential sumset can be obtained in at most one or at least one way using elements of . It is known that maximal Sidon sets and minimal additive bases are both of order ; for example minimal 2-additive bases have size . See [7] and [8] for details on such results regarding Sidon sets and additive bases respectively. We are interested, however, in random versions of these results, and we start by noting that the corresponding balls in boxes model is as follows:
The balls are the integers randomly chosen from . However they do not “go into a single box”. Rather, each ball colludes with other chosen balls, including itself, generating sums with multisets of other balls. A ball is then placed into the box corresponding to each generated sumset. For example, if and the balls drawn in sequence are 4, 2, and 6, then balls are placed in boxes
[TABLE]
[TABLE]
[TABLE]
where the numbers in the three lines indicate what occurs with balls 4, 2, and 6 respectively. There are clearly several layers of dependence in the allocation of balls to boxes.
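The allocation rule described above can be made concrete with a small sketch. It assumes 2-sums formed with replacement (the value of the order parameter is missing from the text as extracted, so this is an assumption), and the function names `new_boxes` and `allocate` are hypothetical. For each newly drawn ball it lists the boxes (sums) that receive a ball because of it.

```python
from itertools import combinations_with_replacement


def new_boxes(drawn_so_far, ball, h=2):
    """Return the sorted sums ("boxes") created when `ball` is added, i.e. all
    h-sums with replacement over the drawn balls that use `ball` at least once.
    Assumes the drawn balls are distinct integers."""
    pool = list(drawn_so_far) + [ball]
    sums = []
    for combo in combinations_with_replacement(pool, h):
        if ball in combo:
            sums.append(sum(combo))
    return sorted(sums)


def allocate(balls, h=2):
    """Process balls in order; return a list of (ball, boxes it fills)."""
    drawn = []
    rows = []
    for ball in balls:
        rows.append((ball, new_boxes(drawn, ball, h)))
        drawn.append(ball)
    return rows


if __name__ == "__main__":
    for ball, boxes in allocate([4, 2, 6]):
        print(ball, "->", boxes)
    # 4 -> [8]
    # 2 -> [4, 6]
    # 6 -> [8, 10, 12]
```

Under this assumption box 8 is hit twice (as 4 + 4 and as 2 + 6), which is exactly the kind of collision that a Sidon set forbids and that a multiply-covering basis requires.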
Three known facts in the area of thresholds for the emergence of Sidon sets and additive bases are stated next:
Theorem 1.5.
([7]) Consider a subset of random size obtained by choosing each integer in independently with probability . Then for any ,
[TABLE]
and
[TABLE]
In [5], we find the following definition that is related to the original question of Sidon (see [9] and also the second open question in Section 3 below).
Definition 1.6.
We say that satisfies the property for integers if for all integers , is realized in at most ways as a sum
[TABLE]
for and for each .
The authors of [5] go on to generalize Theorem 1.5 as follows:
Theorem 1.7.
Let be a random subset of in which each element of is selected for membership in independently with probability . Then for any , we have:
[TABLE]
and
[TABLE]
In transitioning to the case of additive bases, we first note, as in [8], that a single input probability for integer selection will cause edge-effect issues. For example, for , since the only way to represent 1 as a 2-sum is as 0 + 1, both 0 and 1 must be selected in order for 1 to be represented. For this reason, we say that is a truncated additive basis if each integer in can be written as an -sum of elements in .
Theorem 1.8.
([8]) If we choose elements of to be in with probability
[TABLE]
where , then
[TABLE]
Even though edge effects can be eliminated by considering modular additive bases, here we consider the truncated additive basis case, where the target sumset is reduced via the parameter , since we are using the same probability of selection. The case is studied in greater detail in the next result, which addresses coverage of each sum times and which is the main result of this paper.
Theorem 1.9.
If we choose elements of to be in with probability
[TABLE]
then
[TABLE]
where a basis is one for which each integer in the target set can be written as a 2-sum in ways.
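The basis property in Theorem 1.9 is easy to test directly on a given set. The sketch below is illustrative only: it assumes the selected set lives in {0, ..., n}, that the truncated target range is the window from αn to (2 − α)n, and that g is the required number of representations. These symbols and the function names are assumptions, since the notation is missing from the text as extracted.

```python
import math
import random


def representation_counts(A, n):
    """counts[j] = number of ways to write j = x + y with x <= y and x, y in A,
    for 0 <= j <= 2n. A is a collection of integers in {0, ..., n}."""
    members = sorted(A)
    counts = [0] * (2 * n + 1)
    for i, x in enumerate(members):
        for y in members[i:]:
            counts[x + y] += 1
    return counts


def is_truncated_basis(A, n, alpha, g):
    """True if every integer in [alpha*n, (2 - alpha)*n] has >= g representations."""
    counts = representation_counts(A, n)
    lo = math.ceil(alpha * n)
    hi = math.floor((2 - alpha) * n)
    return all(counts[j] >= g for j in range(lo, hi + 1))


def random_subset(n, p, rng):
    """Select each integer of {0, ..., n} independently with probability p."""
    return {x for x in range(n + 1) if rng.random() < p}


if __name__ == "__main__":
    rng = random.Random(0)
    n, alpha, g, trials = 400, 0.5, 2, 200
    for p in (0.10, 0.20, 0.30, 0.40):
        hits = sum(
            is_truncated_basis(random_subset(n, p, rng), n, alpha, g)
            for _ in range(trials)
        )
        print(f"p = {p:.2f}: estimated basis probability {hits / trials:.3f}")
```

Running the script for increasing p gives a rough picture of the low-probability and high-probability regimes; it does not locate the threshold with the precision of the theorem.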
Both Theorems 1.8 and 1.9 are finite representability versions of the key result in [4], where a variable input probability was used and the focus was on representing each integer as a sum in logarithmically many ways. See also the key results in [6], where logarithmic representability is studied in the context of a single input probability. Theorem 1.9 exhibits the phenomenon that arose in the context of Coupon Collection. Interestingly, though, the factor is present for the first covering with a negative contribution, disappears for the second, and then reappears with a positive sign. The paper [5] provides many more examples of this phenomenon in a variety of covering and packing situations, specifically those that arise in the context of combinatorial designs, permutations, and union-free set families.
2 Proof of Theorem 1.9
We start by defining the key random variable of interest. Let be the number of integers in that are represented as a sum in fewer than ways, i.e., in ways. The threshold we seek to establish is for , and, as in so many instances where we employ the Poisson paradigm (see, e.g., [1]), this transition occurs at the level at which rapidly transitions from asymptotically 0 to asymptotically ; this is because . Towards this end we next carefully estimate . We have that
[TABLE]
where is the indicator of the event that the integer is underrepresented as defined above. By linearity of expectation,
[TABLE]
where the last equality follows from the fact that each can be represented as the sum of disjoint pairs of integers. Trivially, we have
[TABLE]
and Theorem A.2.5 (iii) in [3], which estimates the left tail of a binomial random variable with large mean by its last term, yields
[TABLE]
In deriving (3), we need to know that for ’s in the selected range. This is something we can assume, since we are seeking a threshold at , and we can suppose up front, e.g., that . Equations (2) and (3) reveal that
[TABLE]
where, in the second line of (4) we have used the facts that and is finite, and that for ’s in the specified range, we have , and . Since the function is decreasing for , we see that the summand in (4) will also be decreasing provided, e.g., that , which we will assume. Thus
[TABLE]
where the last line of (5) follows from the simplest asymptotic estimate (i.e., without error terms) of the incomplete gamma function. We first check to see when . We start by letting and find that the right side of (5) is of order
[TABLE]
for . Making the adjustment reveals that the right side of (5) tends to zero at the rate
[TABLE]
which we seek to improve. Accordingly, we set
[TABLE]
and find that the right side of (5) is of order
[TABLE]
Setting yields that the right side of (5) is constant, and the incorporation of an additional term, with , yields and leads to the conclusion that
[TABLE]
Next, we return to (4) and see that
[TABLE]
It follows, as with the analysis concerning the case, that when
[TABLE]
with .
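The expectation studied in this part of the proof can also be evaluated exactly for small parameters, which gives a feel for how sharply it moves between large and small values. The sketch below is not the paper's asymptotic computation. It assumes each integer in {0, ..., n} is selected independently with probability p, that the targets are the integers from αn to (2 − α)n, and that a target is underrepresented when it has fewer than g representations as x + y with x ≤ y; all names are illustrative. It uses the disjointness of the pairs noted above: the pairs x < y summing to j are independent, each present with probability p², and for even j the extra representation j/2 + j/2 is present with probability p.

```python
import math


def binom_cdf(m, q, t):
    """P(Bin(m, q) <= t), computed by direct summation."""
    if t < 0:
        return 0.0
    return sum(
        math.comb(m, k) * q**k * (1.0 - q) ** (m - k)
        for k in range(min(t, m) + 1)
    )


def prob_underrepresented(j, n, p, g):
    """P(j has fewer than g representations j = x + y, 0 <= x <= y <= n)."""
    # Number of pairs x < y in {0, ..., n} with x + y = j; these pairs are
    # pairwise disjoint, so their presence indicators are independent.
    m = max(0, (j + 1) // 2 - max(0, j - n))
    # The diagonal representation j = x + x exists only for even j and is
    # present with probability p (a single selected integer).
    d = p if j % 2 == 0 else 0.0
    q = p * p
    return (1.0 - d) * binom_cdf(m, q, g - 1) + d * binom_cdf(m, q, g - 2)


def expected_underrepresented(n, p, alpha, g):
    """Exact expected number of underrepresented targets in the window."""
    lo = math.ceil(alpha * n)
    hi = math.floor((2 - alpha) * n)
    return sum(prob_underrepresented(j, n, p, g) for j in range(lo, hi + 1))


if __name__ == "__main__":
    n, alpha, g = 2000, 0.5, 2
    for p in (0.05, 0.10, 0.15, 0.20, 0.25):
        print(f"p = {p:.2f}: E[X] = {expected_underrepresented(n, p, alpha, g):.6g}")
```

The printed values drop from the order of the number of targets to nearly zero over a narrow range of p, which is the qualitative behaviour the asymptotic analysis makes precise.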
The next (and critical) phase of the proof is to show that . We will exhibit this by using the Stein-Chen method of Poisson approximation [3], which will yield that
[TABLE]
for a range of ’s that encompasses our threshold. (In the above denotes the distribution of , the Poisson distribution with parameter , and the usual total variation distance.) Setting will complete the proof.
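A Poisson approximation of the count of underrepresented integers implies in particular that the probability of having none is close to the exponential of minus the mean. The following Monte Carlo sketch compares the two empirically. It is an illustration under stated assumptions and not the Stein-Chen argument: integers in {0, ..., n} are selected independently with probability p, the targets run from αn to (2 − α)n, a target is underrepresented when it has fewer than g representations, and all function names are hypothetical.

```python
import math
import random


def count_underrepresented(A, n, lo, hi, g):
    """Number of j in [lo, hi] with fewer than g representations j = x + y,
    x <= y, x and y in A, where A is a collection of integers in {0, ..., n}."""
    counts = [0] * (2 * n + 1)
    members = sorted(A)
    for i, x in enumerate(members):
        for y in members[i:]:
            counts[x + y] += 1
    return sum(1 for j in range(lo, hi + 1) if counts[j] < g)


def poisson_check(n, p, alpha, g, trials, seed=0):
    """Return (empirical mean of X, empirical P(X = 0), exp(-empirical mean))."""
    rng = random.Random(seed)
    lo, hi = math.ceil(alpha * n), math.floor((2 - alpha) * n)
    samples = []
    for _ in range(trials):
        A = [x for x in range(n + 1) if rng.random() < p]
        samples.append(count_underrepresented(A, n, lo, hi, g))
    mean = sum(samples) / trials
    p_zero = sum(1 for s in samples if s == 0) / trials
    return mean, p_zero, math.exp(-mean)


if __name__ == "__main__":
    for p in (0.22, 0.26, 0.30):
        mean, p_zero, poisson_zero = poisson_check(400, p, 0.5, 2, trials=300)
        print(f"p = {p:.2f}: mean X = {mean:.3f}, "
              f"P(X = 0) ~ {p_zero:.3f}, exp(-mean) = {poisson_zero:.3f}")
```

At such small n the agreement between the last two columns need not be tight; the sketch only shows what quantity the total variation bound controls.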
For each we seek to define an ensemble of auxiliary variables that satisfy, for each ,
[TABLE]
We do this as follows: If , we simply set for each . If, however, the integer is represented or more times, we deselect one or more integers so as to achieve the distribution corresponding to . We then set if integer is represented or fewer times after the deselection. Now the exact nature of this coupling is not important (and, in fact, is rather complicated), but what is evident is that , since would entail that flipped from being underrepresented to being represented times after the deselection of some integers. In other words, the indicator variables are positively related and Corollary 2.C.4 in [3] applies, so that we have
[TABLE]
We begin with .
[TABLE]
provided that , which we may assume without any loss. Clearly the correlation term will dictate the closeness of the Poisson approximation. Our first lemma shows that while computing , it suffices to consider the case where the sumsets for and are disjoint.
Lemma 2.1.
For some constant , we have that for each ,
[TABLE]
Proof.
We let denote the event that integers are represented and times respectively. Likewise, let denote the event that integers are represented in and entirely disjoint ways.
[TABLE]
We first calculate the contribution to (9) of the disjoint case:
[TABLE]
In the above array, we have denoted by the number of -sumsets that have an overlap with the chosen -sumsets, where . We must not choose the second of the two integers that give the sumset; this explains the term.
We next turn to , and see that
[TABLE]
Consider the summand in the third sum in (11). We see, on ignoring terms, that
[TABLE]
so that is decreasing in . It follows that
[TABLE]
and thus, by (11)
[TABLE]
Equation (9) thus yields, for some constant ,
[TABLE]
This proves Lemma 2.1.
Returning to (7), using (8), we see that for another constant ,
[TABLE]
Thus may be approximated by a Poisson random variable provided that and . The first condition may be seen to hold if, e.g.,
[TABLE]
and the second if the given by (6) is (roughly speaking) of order smaller than . We have thus established Theorem 1.9 for a range of ’s that spans part of the and regimes; the full theorem, including the delicate behavior at the threshold, follows easily by monotonicity (e.g., if is even larger than then it is even less likely that , so that this quantity tends to zero as well).
3 Open Questions
Establishing an analog of Theorem 1.9 for would, of course, be of great interest. Combating the fact that sums are not disjoint is the main technical hurdle we would need to overcome.
Our second open problem is related to an original question of Sidon (see, e.g., [9]) and pertains more to Theorem 1.7. It was suggested by Kevin O’Bryant. Sidon’s original question was “How thick can a set be if
[TABLE]
and
[TABLE]
satisfy, for each , .” Note that in this ordered set format, Sidon sets are those for which for each . It is easy to verify that . But if then it is still possible for to be unbounded. Sidon’s original question has not been the subject of a large-scale investigation. In our context, however, we might ask for thresholds for the property .
4 Acknowledgments
The research of all four authors was supported by NSF Grant 13XXXX
References
- [1] N. Alon and J. Spencer (1992). The Probabilistic Method. Wiley, New York.
- [2] A. Barbour and L. Holst (1989). Some applications of the Stein-Chen method for proving Poisson convergence, Adv. Appl. Probab. 21, 74–90.
- [3] A. Barbour, L. Holst, and S. Janson (1992). Poisson Approximation. Oxford University Press.
- [4] P. Erdős and P. Tetali (1990). Representations of integers as the sum of $k$ terms, Rand. Structures Algorithms 1, 245–261.
- [5] A. Godbole, T. Grubb, K. Han, and B. Kay (2017+). Threshold Progressions in a Variety of Covering and Packing Contexts. Preprint.
- [6] A. Godbole, S. Gutekunst, V. Lyzinski, and Y. Zhuang (2015). Logarithmic Representability of Integers as $k$-Sums, Integers: Electronic Journal of Combinatorial Number Theory 15A, Paper #A5.
- [7] A. Godbole, S. Janson, N. Locantore, and R. Rapoport (1999). Random Sidon sequences, J. Number Theory 75, 7–22.
- [8] A. Godbole, C-M. Lim, V. Lyzinski, and N. Triantafillou (2013). Sharp threshold asymptotics for the emergence of additive bases, Integers: Electronic Journal of Combinatorial Number Theory 13, Paper #A14.
