Encoding 3SUM
Sergio Cabello, Jean Cardinal, John Iacono, Stefan Langerman, Pat Morin, Aurélien Ooms

Abstract
We consider the following problem: given three sets of real numbers, output a word-RAM data structure from which we can efficiently recover the sign of the sum of any triple of numbers, one in each set. This is similar to a previous work by some of the authors to encode the order type of a finite set of points. While this previous work showed that it was possible to achieve slightly subquadratic space and logarithmic query time, we show here that for the simpler 3SUM problem, one can achieve an encoding that takes $\tilde{O}(N^{3/2})$ space for input sets of size $N$ and allows constant time queries in the word-RAM.
Sergio Cabello
University of Ljubljana
Jean Cardinal
Université libre de Bruxelles
John Iacono
Université libre de Bruxelles
New York University
Stefan Langerman
Université libre de Bruxelles
Pat Morin
Carleton University
Aurélien Ooms
Université libre de Bruxelles
1 The Problem
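Given three sets of $N$ real numbers $A = \{a_1 < a_2 < \cdots < a_N\}$, $B = \{b_1 < b_2 < \cdots < b_N\}$, and $C = \{c_1 < c_2 < \cdots < c_N\}$, we wish to build a discrete data structure (using bits, words, and pointers) such that, given any triple $(i,j,k) \in [N]^3$, it is possible to compute the sign of $a_i + b_j + c_k$ by only inspecting the data structure (we cannot consult $A$, $B$, or $C$). We refer to the map $\chi : [N]^3 \to \{-,0,+\}$, $(i,j,k) \mapsto \operatorname{sgn}(a_i + b_j + c_k)$, as the 3SUM-type of the instance $\langle A, B, C \rangle$. Obviously, one can simply construct a lookup table of size $O(N^3)$, such that triple queries can be answered in $O(1)$ time. We aim at improving on this trivial solution.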
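As a point of reference, the trivial solution can be sketched as follows. This is a minimal illustration rather than anything taken from the paper: the function names `sign` and `build_lookup_table` are hypothetical, indices are 0-based, and the inputs are assumed to be exactly representable numbers (integers or fractions).

```python
from itertools import product


def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def build_lookup_table(A, B, C):
    """Trivial encoding: store one sign per triple (i, j, k).

    After construction, a query is a single table lookup and never
    touches A, B or C again. The table has N^3 entries.
    """
    A, B, C = sorted(A), sorted(B), sorted(C)
    n = len(A)
    return {
        (i, j, k): sign(A[i] + B[j] + C[k])
        for i, j, k in product(range(n), repeat=3)
    }
```

A query for the triple `(i, j, k)` is then just `table[(i, j, k)]`.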
2 Motivation
In the 3SUM problem, we are given an array of $N$ numbers as input and are asked whether any three of them sum to 0. In the mid-nineties, this problem was identified as a bottleneck of many important problems in geometry, such as detection of affine degeneracies or motion planning [5]. Since then, it has become a central problem in fine-grained complexity theory [9]. It has long been conjectured to require $\Omega(N^2)$ time. In 2014, it was shown to be solvable in $o(N^2)$ time, but no algorithm with running time $O(N^{2-\delta})$ with constant $\delta > 0$ is known [7].
Lower bounds exist in restricted models of computation. Most notably, $\Omega(N^2)$ 3-linear queries are needed to solve 3SUM [4], and nontrivial lower bounds have also been proven for slightly more powerful linear decision trees [1]. However, in a recent breakthrough contribution, Kane, Lovett, and Moran showed that 3SUM could be solved using $O(N \log^2 N)$ 6-linear queries [8], hence within an $O(\log N)$ factor of the information-theoretic lower bound.
Linear decision trees are examples of nonuniform algorithms, in which we are allowed to have different algorithms for different input sizes. Algebraic decision trees generalize linear decision trees by allowing decision based on the sign of constant-degree polynomials at each node [10].
Any decision tree identifying the 3SUM-type of a 3SUM instance yields a concise encoding of this 3SUM-type: just write down the outcome of the successive tests. Knowing the decision tree by convention, this sequence of bits is sufficient to recover the sign of any triple.
The question we consider here is how to make such a representation efficient, in the sense that not only does it use merely a few bits, but the answer to any triple query can be recovered efficiently. Understanding the interplay between nonuniform algorithms and such data structures hopefully sheds light on the intrinsic structure of the problem.
3 Results
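See Table 1 for a summary. As there are only $N^3$ queries, a table of size $O(N^3)$ bits suffices to give constant query time [3]. This can be improved to $O(N^2 \log N)$ bits of space by storing for each pair $(i,j)$ the values $k^{-}_{i,j} = \max\{k : a_i + b_j + c_k < 0\}$ and $k^{+}_{i,j} = \min\{k : a_i + b_j + c_k > 0\}$. For a query $(i,j,k)$, we compare $k$ against the values $k^{-}_{i,j}$ and $k^{+}_{i,j}$ to recover the sign of $a_i + b_j + c_k$ in $O(1)$ time. All $k^{-}_{i,j}$ and $k^{+}_{i,j}$ can be computed in $O(N^2)$ time via the classic quadratic time algorithm for 3SUM.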
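The quadratic-space structure described above can be sketched as follows. The names `build_pair_thresholds`, `last_neg`, `first_pos` and `query` are hypothetical labels introduced here, indices are 0-based, and the sentinels `-1` and `n` stand for "no such index". The construction uses the usual two-pointer sweep, so it runs in quadratic time when the elements of each set are distinct.

```python
def build_pair_thresholds(A, B, C):
    """For every pair (i, j), store two indices into the sorted C.

    last_neg[i][j]:  largest k with A[i] + B[j] + C[k] < 0 (or -1).
    first_pos[i][j]: smallest k with A[i] + B[j] + C[k] > 0 (or n).
    """
    A, B, C = sorted(A), sorted(B), sorted(C)
    n = len(C)
    last_neg = [[-1] * n for _ in range(n)]
    first_pos = [[n] * n for _ in range(n)]
    for i in range(n):
        k = n - 1  # only moves left as j grows, because B is sorted
        for j in range(n):
            while k >= 0 and A[i] + B[j] + C[k] >= 0:
                k -= 1
            last_neg[i][j] = k
            m = k + 1  # skip the (at most one) index whose sum is zero
            while m < n and A[i] + B[j] + C[m] <= 0:
                m += 1
            first_pos[i][j] = m
    return last_neg, first_pos


def query(last_neg, first_pos, i, j, k):
    """Sign of a_i + b_j + c_k using only the two stored tables."""
    if k <= last_neg[i][j]:
        return -1
    if k >= first_pos[i][j]:
        return 1
    return 0
```

Each stored value is an index into $C$, which is where the $\log N$ factor in the space bound comes from.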
One seemingly simple representation is to store the numbers in $A$, $B$, and $C$; however, these are reals and thus we need to make them representable using a finite number of bits. In Section 4 we show that a minimal integer representation of a 3SUM instance may require $\Theta(N)$ bits per value, which would give rise to $\Theta(N^2)$ bits of space and a query time that grows with the length of the numbers, which is far from impressive. The problem of encoding a set of lines so that the orientation of any triple (the order type) can be determined was studied in [2]; our problem is the special case in which the lines have only three slopes. Can we do better for the case of 3SUM? We answer this in the affirmative. In Section 5 we show how to use an optimal $O(N \log N)$ bits of space with a polynomial query time. Finally, in Section 6 we show how to use $\tilde{O}(N^{3/2})$ space to achieve $O(1)$-time queries.
4 Representation by numbers
A first natural idea is to encode the real 3SUM instance by rounding its numbers to integers. We show a tight bound of $\Theta(N^2)$ bits for this representation.
Lemma 1.
Every 3SUM instance has an equivalent integer instance where all values have absolute value at most $2^{O(N)}$. Furthermore, there exists an instance of 3SUM where all equivalent integer instances require numbers at least as large as the $N$th Fibonacci number and where the standard binary representation of the instance requires $\Omega(N^2)$ bits.
Proof.
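Every 3SUM instance $A$, $B$, and $C$ can be interpreted as the point $(a_1,\ldots,a_N,b_1,\ldots,b_N,c_1,\ldots,c_N)$ in $\mathbb{R}^{3N}$. Let us use the variables $x_1,\ldots,x_N$ to encode the first $N$ dimensions of $\mathbb{R}^{3N}$, $y_1,\ldots,y_N$ to encode the next $N$ dimensions, and $z_1,\ldots,z_N$ for the remaining $N$ dimensions. Consider the subset $\Pi$ of $\mathbb{R}^{3N}$
\[
\Pi = \{ (x,y,z) \in \mathbb{R}^{3N} : x_1 < \cdots < x_N,\ y_1 < \cdots < y_N,\ z_1 < \cdots < z_N \}
\]
and the set $H$ of $N^3$ hyperplanes $h_{i,j,k} = \{ x_i + y_j + z_k = 0 \}$, where $1 \le i,j,k \le N$. Let $\mathcal{A}$ be the arrangement defined by $H$ inside $\Pi$. Instances of 3SUM correspond to points in $\Pi$. Moreover, two 3SUM instances have the same 3SUM-type if and only if they are in the same cell of $\mathcal{A}$.
Consider an instance $I = \langle A, B, C \rangle$ and let $\sigma$ be the cell of $\mathcal{A}$ that contains it. Then $\sigma$ is the cell defined by the inequalities
\[
\begin{aligned}
x_i + y_j + z_k &> 0 && \text{for all } (i,j,k) \text{ with } a_i + b_j + c_k > 0,\\
x_i + y_j + z_k &= 0 && \text{for all } (i,j,k) \text{ with } a_i + b_j + c_k = 0,\\
x_i + y_j + z_k &< 0 && \text{for all } (i,j,k) \text{ with } a_i + b_j + c_k < 0,\\
x_i < x_{i+1},\quad y_i &< y_{i+1},\quad z_i < z_{i+1} && \text{for all } 1 \le i < N.
\end{aligned}
\]
Let $\sigma'$ be the subset of $\mathbb{R}^{3N}$ defined by the following inequalities:
\[
\begin{aligned}
x_i + y_j + z_k &\ge 1 && \text{for all } (i,j,k) \text{ with } a_i + b_j + c_k > 0,\\
x_i + y_j + z_k &= 0 && \text{for all } (i,j,k) \text{ with } a_i + b_j + c_k = 0,\\
x_i + y_j + z_k &\le -1 && \text{for all } (i,j,k) \text{ with } a_i + b_j + c_k < 0,\\
x_{i+1} - x_i \ge 1,\quad y_{i+1} - y_i &\ge 1,\quad z_{i+1} - z_i \ge 1 && \text{for all } 1 \le i < N.
\end{aligned}
\]
Clearly $\sigma'$ is contained in $\sigma$. Moreover, for a sufficiently large $\lambda > 0$ the scaled instance $\lambda I$ belongs to $\sigma'$. Therefore, $\sigma'$ is nonempty.
Since $\sigma'$ is defined by a collection of linear inequalities defining closed halfspaces, there exists a point $p$ in $\sigma'$ defined by a subset of at most $3N$ inequalities, where the inequalities are actually equalities. Let us assume for simplicity that exactly $3N$ equalities define the point $p$. Then, $p$ is the solution to a linear system of equations $Mx = b$ where $M$ and $b$ have their entries in $\{-1,0,1\}$ and each row of $M$ has at most three non-zero entries. The solution to this system of equations is an instance equivalent to $I$.
Because of Cramer's rule, the system of linear equations has solution with entries $\det(M_i)/\det(M)$, where $M_i$ is the matrix obtained by replacing the $i$th column of $M$ by $b$. We use the following simple bound on the determinant. Since $\det(M) = \sum_{\pi} \operatorname{sgn}(\pi) \prod_{r} m_{r,\pi(r)}$, where $\pi$ iterates over the permutations of $[3N]$, there are at most $3^{3N}$ summands where $\pi$ gives a non-zero product (we have to select one non-zero entry per row), and each such product is in $\{-1,1\}$. Therefore $|\det(M)| \le 3^{3N}$. Similarly, $|\det(M_i)| \le 4^{3N}$ because each row of $M_i$ has at most $4$ non-zero entries. We conclude that the solutions to the system are rationals that can be expressed with $O(N)$ bits. This solution gives a 3SUM instance with rationals that is equivalent to $I$. Since all the rationals have the common denominator $\det(M)$, we can scale the result by $|\det(M)|$ and we get an equivalent instance with integers, where each integer has $O(N)$ bits.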
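The bit count in this last step can be written out explicitly. The sketch below assumes, as labels chosen here, that the point solves a $3N \times 3N$ system $Mx = b$ with entries in $\{-1,0,1\}$ and at most three non-zero entries per row of $M$, and that $M_i$ denotes $M$ with its $i$th column replaced by $b$.

```latex
\[
|\det(M)| \le 3^{3N}, \qquad |\det(M_i)| \le 4^{3N}
\quad\Longrightarrow\quad
\log_2 |\det(M_i)| \;\le\; 3N \log_2 4 \;=\; 6N .
\]
```

After clearing the common denominator, each coordinate is therefore an integer of magnitude at most $2^{6N}$, that is, $6N$ bits plus a sign, which is $O(N)$ bits per value.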
The proof of the second statement is by implementing the Fibonacci recurrence in each of the arrays $A$, $B$, and $C$. This can be achieved by letting:
[TABLE]
The first two sets of equations ensure that the two arrays $A$ and $B$ are identical, while the array $C$ contains the corresponding negated numbers, in reverse order. From the inequalities in the third group, and depending on the choice of the initial values, each array contains a sequence growing at least as fast as the Fibonacci sequence. ∎
Note that this is a much smaller lower bound than for order types of point sets in the plane, the explicit representation of which can be shown to require exponentially many bits per coordinate [6].
5 Space-optimal representation
By considering the arrangement of hyperplanes defining the 3SUM problem, we get an information-theoretic lower bound on the number of bits in a 3SUM-type.
Lemma 2.
There are $2^{\Theta(N \log N)}$ distinct 3SUM-types of size $N$.
Proof.
3SUM-types of size $N$ are in one-to-one correspondence with cells of the arrangement of $N^3$ hyperplanes in $\mathbb{R}^{3N}$. The number of such cells is $N^{O(N)}$ and is easily shown to be at least $N^{\Omega(N)}$. ∎
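Both directions of this count can be made explicit. The following is one possible way to fill in the steps, not necessarily the argument the authors had in mind: the upper bound uses the standard bound on the number of regions cut out by $m$ hyperplanes in $\mathbb{R}^d$, and the lower bound uses an example instance ($a_i = iN$, $b_j = j$) chosen here for illustration.

```latex
\[
\#\text{regions} \;\le\; \sum_{t=0}^{d} \binom{m}{t} \;\le\; (m+1)^{d},
\qquad m = N^{3},\ d = 3N
\;\Longrightarrow\;
\#\text{regions} \;\le\; (N^{3}+1)^{3N} = 2^{O(N \log N)} .
\]
\[
a_i = iN,\ b_j = j
\;\Longrightarrow\;
N^{2} \text{ distinct sums } a_i + b_j ,
\qquad
\binom{N^{2}+1}{N} \;\ge\; \Bigl(\frac{N^{2}+1}{N}\Bigr)^{N} \;\ge\; N^{N} = 2^{N \log_2 N} .
\]
```

In the second line, the $N^2$ distinct sums leave $N^2 + 1$ gaps on the real line; placing the values $-c_1, \ldots, -c_N$ in $N$ different gaps gives a different 3SUM-type for each choice of gaps.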
In order to reach this lower bound, we can simply encode the label of the cell of the arrangement in $O(N \log N)$ bits. However, decoding the information requires constructing the whole arrangement, which takes $N^{O(N)}$ time. An alternative solution is to store a vertex of the arrangement of the hyperplanes bounding the closed halfspaces used in the proof of Lemma 1. There exists such a vertex that has the same 3SUM-type as the input point, as shown in the proof of Lemma 1. To answer any query, either recompute the vertex from the basis then answer the query using arithmetic, or use linear programming. Hence we can build a data structure of $O(N \log N)$ bits such that triple queries can be answered in polynomial time.
Note that we do not exploit much of the 3SUM structure here. In particular, the same essentially holds for $k$-SUM, and can also be generalized to a Subset Sum data structure of $O(N^2)$ bits, from which we can extract the sign of the sum of any subset of numbers.
6 Subquadratic space and constant query time
Our encoding is inspired by Grønlund and Pettie’s non-uniform algorithm for 3SUM [7]. Our data structure stores three components, which we call the differences, the staircase and the square neighbors.
Differences.
Partition $A$ and $B$ into blocks of $\sqrt{N}$ consecutive elements. Let $D$ be the set of all differences of the form $a_i - a_j$ and $b_i - b_j$ where the items come from the same block. There are $O(N^{3/2})$ such differences. Sort $D$ and store a table indicating for each difference in $D$ its rank among all differences in $D$. This takes $O(\log N)$ bits for each of the $O(N^{3/2})$ differences, for a total of $O(N^{3/2} \log N)$ bits.
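A minimal sketch of the differences table follows. The function name `build_difference_ranks` and the key layout `("a", i, j)` / `("b", i, j)` are hypothetical choices made here; equal differences receive equal ranks so that rank comparison also detects ties. The sketch assumes exact arithmetic (integers or `fractions.Fraction`), since rounding in floating point could reorder nearly equal differences.

```python
import math


def build_difference_ranks(A, B):
    """Rank every within-block difference of A and of B.

    Returns a dict mapping ("a", i, j) to the rank of A[i] - A[j] and
    ("b", i, j) to the rank of B[i] - B[j], for i, j in the same block.
    Ranks are taken among all stored differences of both arrays.
    """
    A, B = sorted(A), sorted(B)
    n = len(A)
    g = max(1, math.isqrt(n))  # block size, about sqrt(N)
    diffs = {}
    for name, X in (("a", A), ("b", B)):
        for start in range(0, n, g):
            block = range(start, min(start + g, n))
            for i in block:
                for j in block:
                    diffs[(name, i, j)] = X[i] - X[j]
    # Order-preserving ranks; equal values share a rank.
    rank_of = {v: r for r, v in enumerate(sorted(set(diffs.values())))}
    return {key: rank_of[v] for key, v in diffs.items()}
```

With this table, comparing $a_i - a_{i'}$ against $b_{j'} - b_j$ reduces to comparing two stored integers, without access to the original numbers.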
Staircase.
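Look at the table $G$ formed by all sums of the form $a_i + b_j$, which is monotonic in its rows and columns due to $A$ and $B$ being sorted, and view it as being partitioned into a grid of size $\sqrt{N} \times \sqrt{N}$ where each square of the grid is also of size $\sqrt{N} \times \sqrt{N}$. Write $G_{i',j'}$ for the square in row $i'$ and column $j'$ of the grid. For each element $c \in C$, for each $i' \in [\sqrt{N}]$ we store the largest $j'$ such that some elements of the square $G_{i',j'}$ are $\le -c$, denote this as $S_{\le}[c,i']$. We also store, for each $c \in C$, for each $i' \in [\sqrt{N}]$ the smallest $j'$ such that some elements of the square $G_{i',j'}$ are $\ge -c$, denote this as $S_{\ge}[c,i']$. We thus store, in $S_{\le}$ and $S_{\ge}$, $O(\sqrt{N})$ values of size $O(\log N)$ for each of the $N$ elements of $C$, for a total space usage of $O(N^{3/2} \log N)$ bits. We call this the staircase as this implicitly classifies, for each $c \in C$, whether each square has only elements larger than $-c$, only elements smaller than $-c$, or elements on both sides of $-c$; only $O(\sqrt{N})$ squares can be in the last case, which we refer to as the staircase of $c$.
Square neighbors.
For each element $c \in C$, for each of the $O(\sqrt{N})$ squares on the staircase of $c$, we store the location within the square of the predecessor and successor of $-c$, that is, of a largest element that is $\le -c$ and of a smallest element that is $\ge -c$. This takes space $O(N^{3/2} \log N)$.
To execute a query $(a_i, b_j, c)$, only a constant number of lookups in the stored tables are needed. Let $G_{i',j'}$ be the square containing $a_i + b_j$. If $j' > S_{\le}[c,i']$, where $S_{\le}[c,i']$ is the largest column of row $i'$ whose square contains an element $\le -c$, then we know $a_i + b_j + c > 0$. If $j' < S_{\ge}[c,i']$, where $S_{\ge}[c,i']$ is the smallest column of row $i'$ whose square contains an element $\ge -c$, then we know $a_i + b_j + c < 0$. If neither of these is true, then the square $G_{i',j'}$ is on the staircase of $c$ and thus using the square neighbors table we can determine the location of the predecessor and successor of $-c$ in this square; suppose they are at $(i_p, j_p)$ and $(i_s, j_s)$ and thus $a_{i_p} + b_{j_p} \le -c \le a_{i_s} + b_{j_s}$. One need only determine how these two compare to $a_i + b_j$ to answer the query. This can be done using the differences as follows: comparing $a_i + b_j$ to $a_{i_p} + b_{j_p}$ amounts to determining the sign of $(a_i + b_j) - (a_{i_p} + b_{j_p})$, which is equivalent to comparing $a_i - a_{i_p}$ and $b_{j_p} - b_j$. Since both sums are in the same square, these differences are among the stored within-block differences $D$ and the comparison can be obtained by examining their stored ranks. By doing this for the predecessor and successor we determine the relationship between $a_i + b_j$ and $-c$.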
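The three components and the query can be put together in a short sketch. This is an illustration under stated assumptions, not the authors' implementation: the class name `ThreeSumEncoding` and the attribute names `rank`, `last_le`, `first_ge` and `neighbors` are hypothetical, the preprocessing is deliberately naive (cubic time), Python dictionaries stand in for the packed tables, indices are 0-based, and the inputs are assumed to be exact numbers. The staircase is defined with non-strict comparisons ($\le -c$ and $\ge -c$) so that a zero sum is reported correctly.

```python
import math


def sign(x):
    return (x > 0) - (x < 0)


class ThreeSumEncoding:
    """Differences, staircase and square neighbors for sorted A, B, C."""

    def __init__(self, A, B, C):
        A, B, C = sorted(A), sorted(B), sorted(C)
        n = len(A)
        g = self.g = max(1, math.isqrt(n))  # block side, about sqrt(N)
        nb = -(-n // g)                     # number of blocks per array
        block = lambda t: range(t * g, min((t + 1) * g, n))

        # Differences: rank of every within-block difference of A and B.
        raw = {}
        for name, X in (("a", A), ("b", B)):
            for t in range(nb):
                for p in block(t):
                    for q in block(t):
                        raw[(name, p, q)] = X[p] - X[q]
        order = {v: r for r, v in enumerate(sorted(set(raw.values())))}
        self.rank = {key: order[v] for key, v in raw.items()}

        # Staircase: per element of C and per row of squares, the largest
        # column holding an element <= -c and the smallest holding one >= -c.
        self.last_le = [[-1] * nb for _ in C]
        self.first_ge = [[nb] * nb for _ in C]
        # Square neighbors: predecessor/successor locations, staircase only.
        self.neighbors = {}
        for k, c in enumerate(C):
            for bi in range(nb):
                for bj in range(nb):
                    cells = [(A[i] + B[j], i, j)
                             for i in block(bi) for j in block(bj)]
                    le = [s for s in cells if s[0] <= -c]
                    ge = [s for s in cells if s[0] >= -c]
                    if le:
                        self.last_le[k][bi] = bj
                    if ge and self.first_ge[k][bi] == nb:
                        self.first_ge[k][bi] = bj
                    if le and ge:  # square is on the staircase of c
                        self.neighbors[(k, bi, bj)] = (max(le)[1:], min(ge)[1:])

    def query(self, i, j, k):
        """Sign of a_i + b_j + c_k, reading only the stored tables."""
        bi, bj = i // self.g, j // self.g
        if bj > self.last_le[k][bi]:
            return 1   # every element of the square exceeds -c
        if bj < self.first_ge[k][bi]:
            return -1  # every element of the square is below -c
        (pi, pj), (si, sj) = self.neighbors[(k, bi, bj)]
        # sign((a_i + b_j) - (a_s + b_t)) from ranks of a_i - a_s and b_t - b_j
        cmp = lambda s, t: sign(self.rank[("a", i, s)] - self.rank[("b", t, j)])
        if cmp(si, sj) < 0:
            return -1  # below the smallest element >= -c
        if cmp(pi, pj) > 0:
            return 1   # above the largest element <= -c
        return 0
```

Note that `query` performs a fixed number of table lookups and integer comparisons and never reads `A`, `B` or `C`. A compact implementation would replace the dictionaries by arrays indexed along the staircase.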
- [1] Nir Ailon and Bernard Chazelle. Lower bounds for linear degeneracy testing. J. ACM, 52(2):157–171, 2005.
- [2] Jean Cardinal, Timothy M. Chan, John Iacono, Stefan Langerman, and Aurélien Ooms. Subquadratic encodings for point configurations. In Symposium on Computational Geometry, volume 99 of LIPIcs, pages 20:1–20:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018.
- [3] Yevgeniy Dodis, Mihai Patrascu, and Mikkel Thorup. Changing base without losing space. In Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, Massachusetts, USA, 5-8 June 2010, pages 593–602, 2010.
- [4] Jeff Erickson. Lower bounds for linear satisfiability problems. Chicago J. Theor. Comput. Sci., 1999.
- [5] Anka Gajentaan and Mark H. Overmars. On a class of $O(n^2)$ problems in computational geometry. Comput. Geom., 5:165–185, 1995.
- [6] Jacob E. Goodman, Richard Pollack, and Bernd Sturmfels. Coordinate representation of order types requires exponential storage. In STOC, pages 405–410. ACM, 1989.
- [7] Allan Grønlund and Seth Pettie. Threesomes, degenerates, and love triangles. J. ACM, 65(4):22:1–22:25, 2018.
- [8] Daniel M. Kane, Shachar Lovett, and Shay Moran. Near-optimal linear decision trees for k-sum and related problems. In STOC, pages 554–563. ACM, 2018.
