The de Bruijn–Erdös–Hanani theorem
Nikolai V. Ivanov
Contents
1. N. Bourbaki and the “Kvant” magazine
2. A solution of the N. Bourbaki exercise
3. The de Bruijn–Erdös proof
4. From de Bruijn–Erdös to systems of distinct representatives
5. Linear algebra and the inequality m⩾n
6. Hanani’s theorem
7. Another proof of Hanani’s theorem
8. All the de Bruijn–Erdös inequalities
References
© Nikolai V. Ivanov, 2017. Neither the work reported in this paper nor its preparation was supported by any governmental or non-governmental agency, foundation, or institution.
Preface
The present paper is devoted to a somewhat idiosyncratic account of the theorem of de Bruijn–Erdös and Hanani from the combinatorics of finite geometries and of its various proofs. Among the proofs discussed are the original proofs by de Bruijn–Erdös and by Hanani (the latter seems to be largely forgotten, having been published in a hard-to-access journal) and a few others. Each of these proofs sheds new light on the theorem, illustrating the maxim that proofs are more important than the theorems proved. Some proofs and arguments in this paper seem to be new. I explain how one of the proofs was discovered, and how another one could have been discovered. See Section 4 (From de Bruijn–Erdös to systems of distinct representatives) and Section 8 (All the de Bruijn–Erdös inequalities).
I am grateful to F. Petrov for stimulating correspondence and to M. Prokhorova for a careful reading of this paper, numerous suggestions, and for providing me with copies of H. Hanani’s papers [H1, H2].
1. N. Bourbaki and the “Kvant” magazine
Problem.
*Let E be a set of n elements. Suppose that m different subsets of E (not equal to E itself) are selected in such a way that for every two elements of E there is exactly one selected subset containing both these elements. Prove that m⩾n.*
When an equality is possible?
In 1970 this problem was included as Problem M5 in the very first issue of the Soviet “Kvant” magazine and attributed to N. Bourbaki [B-70]. The intended audience of “Kvant” (its name means “Quantum”) was students in the last two or three grades of Soviet schools. Nowadays the audacity of the editorial board inspires awe: Problem M5 was offered to this audience exactly as it is cited above, as an abstract problem about finite sets without any motivation or hints. The readers were expected to be interested in this problem and to appreciate its beauty without any crutches.
In 1970 I was among the intended audience of “Kvant”, but I was more interested in the foundations of mathematics and in set theory than in the combinatorics of finite sets. I easily found this problem in the Russian translation [B-65] of the “Théorie des ensembles” by N. Bourbaki. It turned out to be Exercise 12 to the section “Calcul sur les entiers”. In all editions this exercise is marked as one of the most difficult.
The editors of “Kvant” were faithful to N. Bourbaki in not offering any motivation. But, in contrast with “Kvant”, N. Bourbaki split the result into a few steps, offered a hint for the key one, and stated the expected result in the case of equality. The first two steps were rather easy, but the hint for the key step turned out to be incomprehensible to me.
According to the authors of the solution [T] published in “Kvant” a few months later, they followed “the hints of the author of the problem, N. Bourbaki himself” and referred to [B-65]. The habit of N. Bourbaki of including recent results in his treatise as exercises without attribution is well known, and was well known in 1970 in the Soviet Union. But it seems that neither the editors of “Kvant” nor the authors of the solution [T] were aware that this result is due to N.G. de Bruijn and P. Erdös [dB-E] and H. Hanani [H1, H2].
Neither was I, until I returned to this problem by accident in 2016. By that time I was able to recognize immediately that this exercise from [B-65] is about points and lines in a geometry, and this realization quickly led me to the de Bruijn–Erdös paper [dB-E]. The exercise turned out to be a quite faithful summary of the de Bruijn–Erdös proof, and the key part of the proof, summarized by N. Bourbaki as a hint, turned out to be nearly as obscure as this hint.
Here is my translation of this exercise
based on the reprint [B-06] of the 1970 edition (where it appears as the Exercise 14 to § 5). It is slightly different from the translation in [B-04].
Exercise.
Let E be a finite set of n elements, (a_j)_{1⩽j⩽n} be the sequence of elements of E arranged in some order, (A_i)_{1⩽i⩽m} be a sequence of parts of E.
(a) For each index j, let k_j be the number of indices i such that a_j∈A_i; for each index i let s_i=Card(A_i). Show that
\[ \sum_{j=1}^{n} k_j = \sum_{i=1}^{m} s_i . \]
(b) *Suppose that for each subset {x,y} of two elements of E, there exists one and only one index i such that x and y are contained in A_i. Show that if a_j∉A_i, then s_i⩽k_j.*
(c) *Under the assumptions of (b), show that m⩾n.*
(Let k_n be the least of the numbers k_j; show that one may assume that, whenever i⩽k_n, j⩽k_n and i≠j, one has a_j∉A_i, and a_n∉A_j for all j⩾k_n.)
(d) Under the assumptions of (b), show that in order for m=n to hold, it is necessary and sufficient that one of the following two cases occurs:
(i) A_1={a_1,a_2,…,a_{n−1}}, A_i={a_{i−1},a_n} for i=2,…,n;
(ii) n=k(k−1)+1, *each A_i is a set of k elements, and each element of E belongs to exactly k sets A_i.*
Remarks.
Two aspects of this exercise need to be clarified. First, the parts A_i are implicitly assumed to be different from E. Second, the case ( ii ) of the part (d) is expected to hold only up to renumbering of elements a_i and parts A_j.
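The two cases of the part (d) can be checked by machine on small examples. The following is a minimal Python sketch, not part of the original exposition; the names `is_linear_space`, `near_pencil`, `FANO_POINTS` and `FANO` are hypothetical helpers introduced only for this illustration, and the labelling of the 7-point plane by the difference set {0, 1, 3} mod 7 is one convenient choice among many.

```python
from itertools import combinations

def is_linear_space(points, lines):
    """Check the assumption of part (b): every line is a proper subset and
    every 2-element subset of `points` lies in exactly one line."""
    if any(set(l) == set(points) for l in lines):
        return False
    return all(
        sum(1 for l in lines if x in l and y in l) == 1
        for x, y in combinations(points, 2)
    )

def near_pencil(n):
    """Case (i): A_1 = {a_1, ..., a_{n-1}}, A_i = {a_{i-1}, a_n} for i = 2..n.
    Points are numbered 1..n."""
    lines = [frozenset(range(1, n))]
    lines += [frozenset({i - 1, n}) for i in range(2, n + 1)]
    return list(range(1, n + 1)), lines

# Case (ii) with k = 3: a plane with n = 3 * 2 + 1 = 7 points (the Fano plane).
FANO_POINTS = list(range(7))
FANO = [frozenset({i, (i + 1) % 7, (i + 3) % 7}) for i in range(7)]

pts, lns = near_pencil(6)
print(is_linear_space(pts, lns), len(lns) == len(pts))
print(is_linear_space(FANO_POINTS, FANO), len(FANO) == len(FANO_POINTS))
```

In both examples the number of lines equals the number of points, in agreement with the statement of the part (d).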
The troubles with the hint.
The parts (a) and (b) of this exercise are rather easy, and there is a hint for the part (c). But for me this hint turned out to be more of a riddle than a help.
It would be quite easy to accept and follow the suggestion to consider the least of the numbers k_j. But why should it be k_n? The phrase “Let k_n be the least of the numbers k_j” is fairly hard to interpret (the expressions used in the French original and in the Russian translation have the same meaning). The standard usage of “Let” (and of “Soit” in French) in mathematics is to introduce new notations. But k_n is already defined.
The authors of the solution [T] found a clever way out. They introduce the number k_n before introducing the other numbers k_j! This trick helps only partially: the question “Why k_n?” remains.
The de Bruijn–Erdös exposition [dB-E] is better. They write “Assume now that k_n is the smallest k_i …”. This is less obscure, and amounts to renumbering the elements of E, but leaves the question “Why k_n?” unanswered.
If one manages to put this question aside, there is another riddle: how can the subscripts i, j, which merely mark the points (and do not even need to be numbers), be compared with k_n, which is a genuine characteristic of the point marked by n? Perhaps this difficulty is encountered only by categorically minded mathematicians; analysts appear to be quite comfortable with using the values of a function in its domain of definition.
Here de Bruijn and Erdös [dB-E] again do better. They write “Assume … that A_1, A_2, … , A_{k_n} are lines through a_n” (they call the parts A_i lines). This amounts to renumbering the parts A_i, and one may wonder why renumbering is treated as an assumption. The trick of the authors of [T] saves the day for them here. They simply denote the k_n lines through a_n by A_1, A_2, … , A_{k_n} and the other lines by A_{k_n+1}, A_{k_n+2}, … , A_m.
There is one more riddle in store. How does one use the assumption that k_n is the least of the numbers k_j in the proof of the claim in the hint? One does not; this claim is true without it.
Partially decrypting the hint.
Even if one encounters all these troubles and is not aware of the de Bruijn–Erdös paper (like me in 1970), the hint may still be of some help. The first message is that it is important to know when an element a_j is not in the part A_i. Together with the part (b) this suggests that the inequalities s_i⩽k_j, which hold for a_j∉A_i, should play a key role.
Another message is that the least of the numbers k_j should play some role. After wasting some time assuming that for a given u the number k_u is minimal among all numbers k_j and trying to use this minimality to prove something like what is stated in the hint, it is only natural to abandon this assumption and consider an arbitrary subscript u such that 1⩽u⩽n.
The 1970 proof of m⩾n.
With no more than this limited help from the exercise (in 1970 I definitely understood less than in 2016) I managed to prove the inequality m⩾n in early 1970. Among my schoolmates this qualified as a solution of Problem M5. This solution was lost a long time ago. In April of 2016, and again one year later, I attempted to reconstruct this proof. In these attempts I encountered the same difficulties as in 1970, and it is likely that I dealt with them in the same manner. At the very least, the resulting proof does not use any tools not known to me at the time, and does not involve any tricks (such as the cyclic ordering of some parts A_i by de Bruijn–Erdös) which I was unlikely to discover at the time. It is presented in Section 2 (A solution of the N. Bourbaki exercise) below.
The question “When an equality is possible?” was considered by my classmates as too vague to be addressed seriously, and this was indirectly admitted by the authors of the solution [T]. If m=n, then (d) easily implies that A_i∩A_j≠∅ if i≠j. In fact, proving this property is an almost inevitable part of the proof of (d). This property means that the set E together with the parts A_i is a finite projective plane, possibly degenerate in the case (i) of the part (d). Therefore, this question amounts to the classification of finite projective planes and, to the best of my knowledge, it remains largely open. See the paper by Ch. Weibel [W] for a survey of the state of the art as of 2007, and [I1] for an introduction (not focusing on the finite case).
“Kvant” publishes a solution.
“Kvant” published a solution [T] of Problem M5 in August or September of 1970, close to the beginning of the school year in the USSR (always September 1). The editors of the problem section wrote (see [T], p. 49):
The letters to the editors indicate that this problem is extremely difficult, but interesting. As a matter of fact, here we have two problems: 1) prove that m⩾n, 2) when an equality is possible?
The first problem was completely solved only by A. Suslin from the city of Leningrad. His proof is based on a basic theorem of linear algebra: if the number of n-vectors is greater than n, then they are linearly dependent.
Looking for such a proof will be interesting for those who are familiar with these notions. Nobody solved the second problem completely. Of course, this is not surprising, since, as will be explained below, it can be reduced to a well-known but unsolved problem in mathematics.
Among my schoolmates, these remarks stirred a renewed interest in the problem. A. Suslin was known as a very strong problem solver and as a winner of a gold medal at the 1967 International Mathematical Olympiad. Since only he submitted a complete solution, the problem had to be really difficult. Since he used tools going beyond the school level, the problem had to be even more difficult. And this caused a real interest in my solution, which I had not submitted to “Kvant”. I had an outline in the form of a sheet of paper sparsely filled with formulas. One of my schoolmates borrowed this sheet for a few days, and I never saw it again.
But I am not aware of any serious attempt to study the published solution [T]. For me it was almost as condensed and obscure as the N. Bourbaki hint. The role of the numbering of elements and parts is overemphasized:
Let us pay attention once again to the way we numbered elements and sets.
First of all, k_n is the least of the numbers k_1,k_2,…,k_{n−1} (sic! – N.I.). …
See [T], p. 51. And I always disliked random numerical examples, which are supposed to help the reader and are extensively used in [T]. I must admit that I did not even look at the last two pages of [T] before writing these comments and, in particular, before writing down the proof in the next section. Surprisingly, it turned out that the proof in [T] contains a gap: it is mentioned that k_n=2 in the situation described in the case (i) of the Bourbaki exercise, but no proof that this is the only possibility is even attempted.
2. A solution of the N. Bourbaki exercise
The terminology and notations.
In contrast with N. Bourbaki and with “Kvant”, I have no reason to hide the geometric content of this result. Following de Bruijn and Erdös, I will call the elements of E points and the sets A_i lines. Since the lines are assumed to be proper subsets of E, every point is contained in at least 2 lines. Indeed, if a point is contained in only one line, then all points are contained in this line, i.e. it is not a proper subset.
It is convenient to explicitly introduce a counterpart to the set E of points, namely the set of lines L={A_1,A_2,…,A_m}. If the case (i) of the part (d) of the Bourbaki exercise occurs, up to renumbering of points and lines, then the pair (E,L) is called a near-pencil. If the case (ii) of the part (d) occurs, then (E,L) is called a projective plane.
I also do not see any reason to follow the outdated fashion of using numerical indices (i.e. subscripts), which amounts to ordering objects even when their order is irrelevant. Instead, for every point z we will denote by k_z the number of lines containing z, and for every line l we will denote by s_l the number of points in l, i.e. the number of elements of the set l.
The part (a) of the Bourbaki exercise.
With the above notations the part (a) takes the form
\[ \sum_{l\in L} s_l = \sum_{z\in E} k_z \tag{2.1} \]
after interchanging the sides. This immediately follows from counting in two different ways the pairs (z,l)∈E×L such that z∈l.
The part (b) of the Bourbaki exercise.
For the rest of the paper we will assume that the assumption of the part (b) holds, i.e. that for every pair of distinct points there is exactly one line containing both of them. If a line l contains ⩽1 points, then removing l from the set of lines does not affect this assumption, and at the same time decreases the number of lines by 1. Hence we may assume for the rest of the paper that every line contains at least 2 points.
With the above notations the part (b) takes the form
\[ s_l \leqslant k_z \quad\text{if}\quad z\notin l . \]
We will call these inequalities the de Bruijn–Erdös inequalities.
In order to prove the de Bruijn–Erdös inequalities, suppose that z∉l. Then for every y∈l there is a unique line containing {z,y}, and it is different from l because z∉l. These lines are pairwise distinct because if y,y′∈l and y≠y′, then l is the only line containing {y,y′}. There are s_l such lines and all of them contain z; therefore s_l⩽k_z.
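The double counting of the part (a) and the inequalities of the part (b) are easy to test numerically. The sketch below is only an illustration under assumptions of my choosing: the configuration (5 points, one line of 3 points, and all remaining pairs as 2-point lines) and the variable names are not taken from the original text.

```python
from itertools import combinations

points = range(5)
big = frozenset({0, 1, 2})
# One line of 3 points; every remaining pair of points forms a 2-point line.
lines = [big] + [frozenset(p) for p in combinations(points, 2)
                 if not set(p) <= big]

k = {z: sum(1 for l in lines if z in l) for z in points}  # k_z
s = {l: len(l) for l in lines}                            # s_l

# Part (a): the incident pairs (z, l) are counted in two ways.
double_count_ok = sum(s.values()) == sum(k.values())

# Part (b): s_l <= k_z whenever the point z is not on the line l.
inequalities_ok = all(s[l] <= k[z]
                      for l in lines for z in points if z not in l)

print(len(lines), double_count_ok, inequalities_ok)
```

Here m = 8 and n = 5, so the example also illustrates that the inequality m⩾n may be strict.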
Lines through an arbitrary point.
Let u∈E be an arbitrary point, let p=k_u be the number of lines containing u, and let 𝒰 be the set of these lines. By the definition of 𝒰, if a line l is not in 𝒰, then u∉l. For every l∉𝒰 we have the de Bruijn–Erdös inequality s_l⩽k_u. By summing all these inequalities and taking into account that there are m−p lines not belonging to 𝒰, we see that
\[ \sum_{l\notin\mathcal{U}} s_l \leqslant (m-p)\,k_u . \tag{2.2} \]
Since every set of the form {u,y} with y≠u is contained in one and only one line, the sets l∖{u} with l∈𝒰 are pairwise disjoint and form a partition of E∖{u}. Since we assumed that s_l⩾2 for all lines l, all these sets are non-empty. Let U be a set of representatives of these sets. In other terms, U is contained in E∖{u} and intersects every set l∖{u} with l∈𝒰 in exactly 1 point. In particular, U consists of exactly p points.
If (l,z)∈𝒰×U and z∉l, then the de Bruijn–Erdös inequality s_l⩽k_z holds. There are p(p−1) such pairs (l,z) and hence p(p−1) such inequalities. For each l∈𝒰 the number s_l occurs p−1 times in the left hand sides of them, and for each z∈U the number k_z occurs p−1 times in the right hand sides. Hence the sum of all these inequalities is
\[ (p-1)\sum_{l\in\mathcal{U}} s_l \leqslant (p-1)\sum_{z\in U} k_z . \tag{2.3} \]
After dividing (2.3) by p−1 (recall that p⩾2) we get
\[ \sum_{l\in\mathcal{U}} s_l \leqslant \sum_{z\in U} k_z . \tag{2.4} \]
Now it is only natural to take the sum of the inequalities (2.2) and (2.4) and conclude that
\[ \sum_{l\in L} s_l \leqslant (m-p)\,k_u + \sum_{z\in U} k_z . \tag{2.5} \]
The left hand side of the inequality (2.5) is the same as the left hand side of the equality (2.1). The right hand side of (2.5) can be compared with the right hand side of the equality (2.1) if k_u is the least among the numbers k_z and m⩽n.
Proof of the inequality m⩾n.
Now we are ready to do the part (c) of the Bourbaki exercise. Let u∈E be a point such that k_u is the least of the numbers k_z over all points z∈E. Then
\[ (m-p)\,k_u \leqslant \sum_{z\in Y} k_z \tag{2.6} \]
for every subset Y⊂E consisting of m−p points.
Suppose that m⩽n. Then the subset Y can be chosen to be disjoint from the set of representatives U (because U consists of p points). Let us choose an arbitrary Y disjoint from U and let Z=Y∪U. Then Z is a subset of E consisting of m points and the inequalities (2.5) and (2.6) imply that
\[ \sum_{l\in L} s_l \leqslant \sum_{z\in Z} k_z \leqslant \sum_{z\in E} k_z , \tag{2.7} \]
where the last inequality is strict unless Z=E. In view of (2.1) this inequality cannot be strict and hence Z=E and m=n. Since m⩽n implies that m=n, we see that m⩾n.
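For very small n the inequality m⩾n can also be confirmed by exhaustive search. The following Python sketch is a brute-force check under my own assumptions (it is not the argument of this section): the function name `linear_spaces` is hypothetical, and the search simply enumerates all families of proper subsets with at least 2 points in which every pair of points is covered exactly once.

```python
from itertools import combinations

def linear_spaces(n):
    """Yield every family of lines on the points 0..n-1 such that each line
    is a proper subset with at least 2 points and each pair of points lies
    in exactly one line (backtracking over the uncovered pairs)."""
    points = range(n)
    candidates = [frozenset(c) for size in range(2, n)
                  for c in combinations(points, size)]
    all_pairs = [frozenset(p) for p in combinations(points, 2)]

    def extend(chosen, covered):
        # Take the first pair not yet covered; some line must contain it.
        target = next((p for p in all_pairs if p not in covered), None)
        if target is None:
            yield list(chosen)
            return
        for line in candidates:
            if not target <= line:
                continue
            new_pairs = {frozenset(p) for p in combinations(sorted(line), 2)}
            if new_pairs & covered:
                continue  # some pair would be covered twice
            yield from extend(chosen + [line], covered | new_pairs)

    yield from extend([], frozenset())

for n in range(3, 7):
    sizes = [len(ls) for ls in linear_spaces(n)]
    print(n, min(sizes), all(m >= n for m in sizes))
```

Since the line containing the first uncovered pair is determined by the family, every family is produced exactly once. The search grows quickly with n and is meant only as a sanity check.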
The case m=n.
After the work done in the proofs of (a), (b), and (c), the part (d) nearly proves itself. As we will see, in this case all inequalities (2.2) – (2.7) are, in fact, equalities.
By (2.1) the leftmost and the rightmost sums in (2.7) are equal. It follows that Z=E and hence Y=E∖U, where U is the set of representatives. Moreover, the sides of each of the inequalities (2.5) and (2.6) are equal. Since k_u is the least of the numbers k_z, the equality of the sides of (2.6) implies that
\[ k_z = k_u \quad\text{for all}\quad z\in E\setminus U . \tag{2.8} \]
The fact that the sides of (2.5) are equal implies that the sides of each of the inequalities (2.2) and (2.4) are equal also. The equality of the sides of (2.2) implies that
\[ s_l = k_u \quad\text{for all}\quad l\notin\mathcal{U} , \tag{2.9} \]
where 𝒰 is the set of lines containing u. Since the sides of (2.4) are equal, the sides of (2.3) are also equal. Since (2.3) is the sum of the inequalities s_l⩽k_z over all pairs (l,z)∈𝒰×U such that z∉l, the equality of the sides of (2.3) implies that s_l=k_z for all (l,z)∈𝒰×U such that z∉l. Equivalently,
\[ s_l = k_z \quad\text{if}\quad l\in\mathcal{U},\ z\in U,\ \text{and}\ z\notin l . \tag{2.10} \]
The rest of the proof splits into two subcases depending on whether p=2 or p⩾3.
The subcase p=2.
In this case the set 𝒰 of lines containing u is equal to {l,l′} for some lines l,l′ and hence E=l∪l′. It follows that every line different from l,l′ contains only 2 points, namely the points of its intersection with the lines l,l′. If s_l,s_{l′}⩾3, then there are at least 4 points z≠u and the part (b) implies that k_z⩾3 for every z≠u. On the other hand, (2.8) implies that
\[ k_z = k_u = 2 \]
for every z∉U. But U consists of only two points and hence k_z⩾3 for no more than two points z. The contradiction shows that either s_l=2 or s_{l′}=2. We may assume that s_l=2. Then l={u,a} for some a∈E and every line different from l,l′ has the form {a,z} with z∈l′∖{u}. It follows that (E,L) is a near-pencil.
The subcase p⩾3.
The set U is a set of representatives of the sets l∖{u} with l∈𝒰, where 𝒰 is the set of lines containing u. For any two lines l,l′∈𝒰 the assumption p⩾3 implies that there exists a point z∈U such that z∉l,l′. If z is such a point, then (2.10) implies that
\[ s_l = k_z = s_{l'} . \]
Similarly, if z,z′∈U, then there exists a line l∈𝒰 such that z,z′∉l and hence
\[ k_z = s_l = k_{z'} . \]
It follows that in the subcase p⩾3 all numbers s_l, k_z with l∈𝒰 and z∈U are equal. Since k_u=p⩾3 is the smallest of the numbers k_z over all z∈E, it follows that
\[ s_l = k_z \geqslant 3 \quad\text{for all}\quad l\in\mathcal{U},\ z\in U . \]
Let l∈𝒰, and let y be the unique element of U contained in l. Since s_l⩾3, there exists a point x∈l not equal to u,y. We can replace in U the point y by the point x and get a new set of representatives U′. Then all previous results apply to U′ in the role of U. In particular, k_x=k_z for all z∈U∖l=U′∖l and hence
\[ k_x = k_z \quad\text{for all}\quad z\in U . \]
On the other hand, x∉U and hence k_x=k_u by (2.8) applied to the original set U. At the same time (2.8) implies that k_u=k_z for all z∉U and hence
\[ k_z = k_u \quad\text{for all}\quad z\in E . \]
It follows that all numbers k_z are equal. At the same time by (2.9) and (2.10) every s_l is equal to some k_z. It follows that all numbers s_l, k_z with l∈L and z∈E are equal. It remains to apply the following lemma.
Lemma 1.
If all the numbers s_l, k_z are equal, then (E,L) is a projective plane.
Proof.
Let k be the common value of the numbers s_l, k_z, and let y∈E. The sets l∖{y} with y∈l are pairwise disjoint and form a partition of E∖{y}. Each of them consists of
\[ s_l - 1 = k - 1 \]
points, and there are k_y=k such sets. It follows that the number n of elements of E is equal to k(k−1)+1. Therefore, (E,L) is a projective plane. ■
Remarks.
A key step of this solution and of the solution [T] differs from the de Bruijn–Erdös paper in the same way: the cyclic order argument of de Bruijn–Erdös (see Section 3, The de Bruijn–Erdös proof) is replaced by the inequalities (2.3) and (2.4).
3. The de Bruijn–Erdös proof
Since the proof presented in Section 2 (A solution of the N. Bourbaki exercise) grew out of a summary of the de Bruijn–Erdös proof, albeit one not quite understood, it is not surprising that the two proofs have a lot in common. In the following exposition of the de Bruijn–Erdös proof we will use the notations of Section 2 and will refer to Section 2 for the arguments which differ from [dB-E] only in the notations and the amount of detail. The de Bruijn–Erdös paper is concise to the point of being cryptic.
The de Bruijn–Erdös proof begins with the parts (a) and (b) of the Bourbaki exercise. After this de Bruijn–Erdös introduce k_u as the smallest among all numbers k_z (and denote it by k_n). Then de Bruijn–Erdös observe that it can be assumed that every line contains at least two points. Following the notations of Section 2, let us denote by 𝒰 the set of all lines containing u. By the de Bruijn–Erdös inequalities, s_l⩽k_u for every l∉𝒰. The inequalities (2.2) and (2.6) follow. The following argument plays a role similar to the role of the inequality (2.3).
The cyclic order argument.
Let l_1,l_2,…,l_p be a cyclically ordered list of the elements of the set 𝒰 of lines containing u. We treat the subscripts 1,2,…,p as integers mod p. For each i=1,2,…,p let us choose some point a_i∈l_i∖{u} and let U be the set of these points. Let
\[ s_i = s_{l_i} \quad\text{and}\quad k_i = k_{a_i} . \]
Since a_{i+1}∉l_i, by the de Bruijn–Erdös inequalities s_i⩽k_{i+1} for all i=1,2,…,p, i.e.
\[ s_1\leqslant k_2,\quad s_2\leqslant k_3,\quad \ldots,\quad s_p\leqslant k_1 . \tag{3.1} \]
By summing the inequalities (3.1) one concludes that
\[ \sum_{i=1}^{p} s_i \leqslant \sum_{i=1}^{p} k_i . \tag{3.2} \]
The inequality (3.2) is nothing else but another form of (2.4). The arguments of Section 2 (A solution of the N. Bourbaki exercise) show that (3.2) implies that m⩾n. In fact, de Bruijn and Erdös do not bother to write down even the inequality (3.2), to say nothing of the other details presented in Section 2.
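The cyclic shift is easy to see on a concrete example. The following Python sketch is only an illustration under assumptions of my choosing: the 7-point plane is labelled by the difference set {0, 1, 3} mod 7, the point u = 0 and the representatives a_i are picked arbitrarily, and the variable names are hypothetical.

```python
# A 7-point projective plane built from the difference set {0, 1, 3} mod 7.
points = range(7)
lines = [frozenset({i, (i + 1) % 7, (i + 3) % 7}) for i in range(7)]

u = 0
through_u = [l for l in lines if u in l]        # l_1, ..., l_p
p = len(through_u)
a = [min(l - {u}) for l in through_u]           # a_i chosen in l_i \ {u}

k = {z: sum(1 for l in lines if z in l) for z in points}

# Cyclic shift: a_{i+1} is not on l_i, hence s_i <= k_{i+1} (indices mod p).
shift_ok = all(a[(i + 1) % p] not in through_u[i] for i in range(p))
ineq_31 = all(len(through_u[i]) <= k[a[(i + 1) % p]] for i in range(p))
ineq_32 = sum(len(l) for l in through_u) <= sum(k[z] for z in a)
print(p, shift_ok, ineq_31, ineq_32)
```

The check `shift_ok` relies only on the fact that two different lines through u have no other common point, so it would succeed for any configuration satisfying the assumption of the part (b).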
The case m=n.
In view of the equality (2.1), in this case the left hand and the right hand sides of the inequality (3.2) are equal. Together with (3.1) this implies that
\[ s_1 = k_2,\quad s_2 = k_3,\quad \ldots,\quad s_p = k_1 . \tag{3.3} \]
Similarly, in this case the left hand and the right hand sides of the inequality (2.6) are equal. Since m=n, one can take Y=E∖U in (2.6), where U is the set of the points chosen in the cyclic order argument. It follows that k_u=k_z for all z∈E∖U. Finally, the left hand and the right hand sides of the inequality (2.2) are equal. It follows that s_l=k_u for all lines l∉𝒰, i.e. for all lines not containing u. By combining the last two observations, we see that
\[ s_l = k_z \quad\text{for all}\quad l\in L\setminus\mathcal{U},\ z\in E\setminus U . \]
Since m=n, both sets L∖𝒰 and E∖U consist of n−p elements. It follows that one can number the points and lines in such a way that (in the notation of the Bourbaki exercise)
\[ s_i = k_i \quad\text{for all}\quad i=1,2,\ldots,n . \]
As the next step, let us renumber the points and lines once more and assume that
\[ k_1 \geqslant k_2 \geqslant \ldots \geqslant k_n . \tag{3.4} \]
The rest of the proof splits into two subcases depending on whether k_1>k_2 or not.
The subcase k_1>k_2.
In this case s_1=k_1>k_i for all i⩾2. By the de Bruijn–Erdös inequalities this implies that a_i∈A_1 for all i⩾2. It follows that (E,L) is a near-pencil.
The subcase k_1=k_2.
Suppose that k_j < k_1=k_2 for some j. By the de Bruijn–Erdös inequalities a_j belongs to both lines A_1 and A_2. This is possible for only one point, namely the point of intersection of the lines A_1 and A_2. In view of (3.4), this may happen only if
\[ k_1 = k_2 = \ldots = k_{n-1} > k_n \]
and hence s_j=k_j>k_n⩾2 for all j≠n. It follows that s_j⩾3 if j < n. In particular, all k_n lines containing a_n consist of ⩾2 points and all of them except, perhaps, the line A_n, consist of ⩾3 points. Therefore one can choose 2 points x,y≠a_n on one of these lines, and a point z≠a_n on some other line. Let l_j,l_{j′} be the lines containing the pairs {x,z} and {y,z} respectively. Then j≠j′ and a_n∉l_j,l_{j′}. Hence the de Bruijn–Erdös inequalities imply that s_j,s_{j′}⩽k_n, contrary to the fact that s_j>k_n if j≠n. The contradiction shows that all numbers k_j are equal, and hence all numbers s_i,k_j are equal. Now the observation at the end of Section 2 (A solution of the N. Bourbaki exercise) implies that (E,L) is a projective plane.
Intersection of lines.
After the proof is completed, de Bruijn–Erdös point out that in the subcase k_1=k_2 of the case m=n every two lines intersect. Indeed, if l′,l′′ are two disjoint lines and a∈l′′, then there are s_{l′} lines containing a and intersecting l′, and still one more line, namely l′′, containing a. Therefore k_a⩾s_{l′}+1, contrary to the fact that all numbers k_z,s_l are equal. In fact, every two lines obviously intersect in the subcase k_1>k_2 also.
Why k_n?
Now it is clear why the smallest of the numbers k_z is denoted by k_n. The number k_n is indeed the smallest if the points are ordered in such a way that (3.4) holds. At the same time (3.4) plays almost no role in the proof. One may speculate that (3.4) and the notation k_n for the smallest of the numbers k_z are remnants of an earlier approach to the theorem.
4. From de Bruijn–Erdös to systems of distinct representatives
The cyclic order argument and systems of distinct representatives.
The key step of the de Bruijn–Erdös proof is the cyclic order argument used to prove the inequality (3.2) and the equalities (3.3) in the case m=n. Ultimately, the cyclic order argument is based on the fact that a_{i+1}∉l_i for all i=1,2,…,p, i.e. on the fact that i⟼a_{i+1} is a system of distinct representatives for the family i⟼E∖l_i of subsets of E, where i=1,2,…,p.
Once this is realized, it is only natural to look for a system of distinct representatives of the full family l⟼E∖l of the complements of lines, i.e. for an injective map l⟼a(l) from L to E such that a(l)∈E∖l for all l∈L.
By the well-known marriage theorem of Ph. Hall, such a system of distinct representatives exists if and only if for every subset K⊂L the union
\[ \bigcup_{l\in K} (E\setminus l) \tag{4.1} \]
contains ⩾∣K∣ elements, where ∣X∣ denotes the number of elements of a set X. But the intersection of ⩾2 lines consists of ⩽1 points, and, almost obviously, this condition holds.
The message.
All this emerged in my mind in one instant as an irreducible revelation. My first thought after this revelation was that it cannot be true, because if it were true, then everybody writing about this topic would use systems of distinct representatives. Perhaps the right question is not how I came up with this idea, but why the experts missed it.
The rest of this section is devoted to the proof [I2] based on this revelation.
Proof of m⩾n.
We may assume that m⩽n. Let K be a subset of L. If ∣K∣=1, then (4.1) is the complement of a line and hence contains ⩾1 elements. If 2⩽∣K∣⩽m−1, then (4.1) is the complement in E of ⩽1 point and hence contains at least
\[ n-1 \geqslant m-1 \geqslant |K| \]
elements. If ∣K∣=m, then (4.1) contains n⩾m=∣K∣ elements. Therefore there exists a system of distinct representatives for the family l⟼E∖l, i.e. there exists an injective map l⟼a(l) such that a(l)∉l for every l. By the de Bruijn–Erdös inequalities
\[ s_l \leqslant k_{a(l)} \quad\text{for all}\quad l\in L . \tag{4.2} \]
By summing all these inequalities and using the injectivity of l⟼a(l) we see that
\[ \sum_{l\in L} s_l \leqslant \sum_{l\in L} k_{a(l)} \leqslant \sum_{z\in E} k_z . \tag{4.3} \]
Moreover, the second inequality is strict unless m=n (otherwise the last sum has more positive summands than the previous one). But (2.1) implies that both inequalities in (4.3) should actually be equalities. It follows that m=n. Moreover, in view of the inequalities (4.2), it follows that s_l=k_{a(l)} for every l∈L (under the assumption m⩽n).
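A system of distinct representatives for the complements of lines can be found explicitly by the standard augmenting-path algorithm for bipartite matchings. The Python sketch below is a minimal illustration, not the argument of this section: the function name `find_sdr` is hypothetical, and the near-pencil on 5 points is used only as a sample input.

```python
def find_sdr(lines, points):
    """Return a dict line -> point with a(l) not in l and a injective,
    or None if no such system of distinct representatives exists.
    Uses the augmenting-path algorithm for bipartite matching."""
    owner = {}  # point -> line currently represented by this point

    def try_assign(line, seen):
        for z in points:
            if z in line or z in seen:
                continue
            seen.add(z)
            # Either z is free, or its current line can move to another point.
            if z not in owner or try_assign(owner[z], seen):
                owner[z] = line
                return True
        return False

    for line in lines:
        if not try_assign(line, set()):
            return None
    return {line: z for z, line in owner.items()}

# Sample input: the near-pencil on 5 points, where m = n = 5.
points = list(range(1, 6))
lines = [frozenset({1, 2, 3, 4})] + [frozenset({i, 5}) for i in range(1, 5)]
a = find_sdr(lines, points)
k = {z: sum(1 for l in lines if z in l) for z in points}
print(all(a[l] not in l for l in lines),        # a(l) lies outside l
      len(set(a.values())) == len(lines),       # a is injective
      all(len(l) == k[a[l]] for l in lines))    # s_l = k_{a(l)} since m = n
```

In this example m = n, so the last check illustrates the equalities s_l=k_{a(l)} obtained at the end of the proof.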
The case m=n.
Suppose that a point z is contained in ⩾m−1 lines. Each of these lines contains at least one point in addition to z. Since m=n, there are no other points and z is contained in exactly m−1 lines. Since there are exactly m lines, only one line does not contain z. This line should contain all points ≠z. It follows that (E,L) is a near-pencil.
Suppose now that no point is contained in ⩾m−1 lines. Let K be a proper subset of L. If ∣K∣=1, then (4.1) is equal to E∖l for some line l. If E∖l consists of only one point z, then by the de Bruijn–Erdös inequalities z is contained in ⩾s_l=m−1 lines, contrary to the assumption. Therefore, (4.1) contains ⩾2=∣K∣+1 points. If 2⩽∣K∣⩽m−2, then (4.1) contains ⩾n−1=m−1⩾∣K∣+1 points. Finally, if ∣K∣=m−1, then (4.1) contains all n=m=∣K∣+1 points because no point is contained in ⩾m−1 lines.
We see that (4.1) contains ⩾∣K∣+1 elements for every proper subset K⊂L. This allows one to get from the marriage theorem more than just the existence of a system of distinct representatives. Let λ∈L and z∈E∖λ. Then there exists a system of distinct representatives l⟼a(l) such that a(λ)=z. This immediately follows from an application of the marriage theorem to the family of sets (E∖{z})∖l with l∈L∖{λ}.
Since m⩽n, the existence of a system of distinct representatives l⟼a(l) such that a(λ)=z implies that s_λ=k_{a(λ)}=k_z. Therefore, z∉l implies that s_l=k_z and hence every line containing z intersects l. It follows that every two lines intersect.
If E cannot be obtained as the union of two lines, then for every two lines l,l′ there exists a point z such that z∉l,l′ and hence s_l=k_z=s_{l′}. In this case all the numbers s_l,k_z are equal and hence (E,L) is a projective plane by Lemma 1 at the end of Section 2 (A solution of the N. Bourbaki exercise). If there exist two lines l,l′ such that E=l∪l′, then k_y=2, where y is the point of intersection of l and l′, and the proof is completed by applying the following lemma.
Lemma 2.
If m=n and k_y=2 for some point y, then (E,L) is a near-pencil.
Proof.
Let l,l′ be the lines containing y. Then E=l∪l′ and there are n=s_l+s_{l′}−1 points. In addition to the lines l,l′ there are (s_l−1)(s_{l′}−1) lines consisting of a point in l∖{y} and a point in l′∖{y}. If s_l⩾s_{l′}⩾3, then the number m of lines is
\[ m = 2 + (s_l-1)(s_{l'}-1) > s_l + s_{l'} - 1 = n , \]
contrary to the assumption m=n. Therefore one of the lines l,l′ consists of 2 points and hence (E,L) is a near-pencil. ■
5. Linear algebra and the inequality m⩾n
A proof of the inequality m⩾n based on the linear independence.
This proof was communicated to me by F. Petrov [P]. I believe that this is essentially the proof found by A. Suslin.
Let R^L be the vector space of maps L⟶R with the scalar product
\[ (f,g) = \sum_{l\in L} f(l)\,g(l) . \]
Every z∈E defines a map v_z:L⟶R by the rule v_z(l)=1 if z∈l and v_z(l)=0 otherwise. There are n maps v_z. Since the dimension of R^L is equal to m, it is sufficient to prove that the maps v_z are linearly independent as vectors of R^L.
The scalar product (v_z,v_z) is equal to the number of lines containing the point z, and hence (v_z,v_z)⩾2 for all z∈E. If z≠y, then (v_z,v_y) is equal to the number of lines containing both z and y, and hence (v_z,v_y)=1. If the vectors v_z are linearly dependent, then
\[ \sum_{z\in E} c_z v_z = 0 \]
for some real numbers c_z, z∈E, such that not all c_z are equal to 0. For every y∈E taking the scalar product of this equality with the vector v_y results in the equality
\[ \sum_{z\in E} c_z\,(v_z,v_y) = 0 . \]
Since (v_z,v_y)=1 for all z≠y, this equality implies that
\[ c_y\bigl((v_y,v_y)-1\bigr) + \sum_{z\in E} c_z = 0 . \]
Since (v_y,v_y)⩾2, it follows that the coefficient c_y and the sum
\[ \sum_{z\in E} c_z \]
have opposite signs. But since not all c_y are equal to 0, this cannot be true for all y∈E. Indeed, if the sum is positive, then all c_y are negative and the sum is negative; similarly, the sum cannot be negative; and if the sum is equal to 0, then all c_y are equal to 0. The contradiction shows that the vectors v_z are linearly independent and hence m⩾n. ■
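The linear independence of the vectors v_z can be checked on an example by exact Gaussian elimination. The sketch below is only an illustration: the helper `rank` and the sample configuration (5 points, one line of 3 points, seven lines of 2 points) are assumptions introduced here, not part of the proof.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix over the rationals by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# Sample configuration with n = 5 points and m = 8 lines.
points = range(5)
lines = [{0, 1, 2}, {0, 3}, {0, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}]
v = [[1 if z in l else 0 for l in lines] for z in points]   # the rows are v_z
gram = [[sum(x * y for x, y in zip(vz, vy)) for vy in v] for vz in v]

print(rank(v), len(lines))   # the rank n of the vectors v_z and the number m
```

The matrix `gram` of scalar products has the entries (v_z,v_z)=k_z on the diagonal and 1 elsewhere, as used in the proof.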
Standard linear algebra proofs of the inequality m⩾n.
In order to present the standard proofs it is convenient to return to the notation of the N. Bourbaki exercise. Let M be the incidence matrix of the points aj and sets Ai. Namely, M is an n×m matrix with entries mji=1 if aj∈Ai and mji=0 otherwise. Let us consider the product MM^T, where M^T is the transpose of M. It is an n×n matrix with all non-diagonal entries equal to 1 and with diagonal entries k1,k2,…,kn. The most classical linear algebra proofs, going back to the paper [Bo] by R.C. Bose, proceed with the computation of the determinant of MM^T. It is rarely presented in detail; apparently, it is expected that the readers enjoy computations of determinants. Curious readers may find a computation of det MM^T at the end of this section; in particular, the computation shows that this determinant is non-zero. The non-vanishing of det MM^T means that the rank of the matrix MM^T is equal to n, and this implies that the rank of M is ⩾n. Since M is an n×m matrix, this may happen only if m⩾n.
More modern expositions avoid the computation of the determinant det MM^T by observing that MM^T is equal to the sum of the diagonal matrix with the diagonal entries
$$ k_1 - 1,\ k_2 - 1,\ \ldots,\ k_n - 1 $$
and the n×n matrix J with all entries equal to 1. Since kj⩾2 and hence kj−1⩾1 for all j, the first matrix is positive definite. The matrix J is positive semi-definite, although it is not definite. In fact, the associated quadratic form xJx^T, where x=(x1,x2,…,xn) is a row vector, is equal to (x1+x2+…+xn)^2. It follows that the sum MM^T of these matrices is positive definite and hence has rank n. As above, this implies that m⩾n.
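The decomposition of the quadratic form can be illustrated by a short Python sketch. It is only a sanity check under stated assumptions: the near-pencil with 5 points is an example chosen here, and the names `gram`, `form` and `decomposition` are hypothetical. The sketch compares the value of the form associated with MM^T with the sum of (kj−1)xj^2 over j plus (x1+x2+…+xn)^2 on random integer vectors.

```python
import random

# Example geometry (an assumption of this sketch): the near-pencil on the
# points 0..4, i.e. one line with 4 points and four lines through the point 4.
NEAR_PENCIL = [{0, 1, 2, 3}] + [{i, 4} for i in range(4)]
N = 5


def gram(n, lines):
    """Entry (i, j) of M M^T: the number of lines containing both i and j."""
    return [[sum(1 for l in lines if i in l and j in l) for j in range(n)]
            for i in range(n)]


def form(g, x):
    """Value of the quadratic form x G x^T for a row vector x."""
    n = len(x)
    return sum(x[i] * g[i][j] * x[j] for i in range(n) for j in range(n))


def decomposition(g, x):
    """Sum of (k_j - 1) x_j^2 plus the square of the sum of the x_j."""
    k = [g[j][j] for j in range(len(x))]
    return sum((k[j] - 1) * x[j] ** 2 for j in range(len(x))) + sum(x) ** 2


G = gram(N, NEAR_PENCIL)
random.seed(0)
for _ in range(1000):
    x = [random.randint(-5, 5) for _ in range(N)]
    assert form(G, x) == decomposition(G, x)
    # Positive definiteness: the form vanishes only at the zero vector.
    assert form(G, x) > 0 or not any(x)
```

The first summand is a sum of squares with coefficients ⩾1, which is the reason why the form is positive on every non-zero vector.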
Comparing the proofs.
The standard proofs do not fit the “Kvant” description of the proof by A. Suslin: they use more advanced tools than the theorem about the linear dependence of more than n vectors in an n-dimensional vector space. One can find a proof based only on this theorem in the unpublished book draft [BF] by L. Babai and P. Frankl. But even in this remarkable book it is hidden in the exercises. See Exercise 4.1.5 and its solution on p. 184. The preference for using the matrix MM^T seems to be a part of the dominant culture. On the other hand, all proofs based on linear algebra more or less explicitly reduce the inequality m⩾n to the following lemma and then prove it.
Lemma.
*Let V be an m-dimensional vector space over R equipped with a scalar product (∙,∙). Let P be a set of n vectors in V. Suppose that there exists λ∈R, λ>0, such that*
$$ (u, u) > \lambda \quad \text{and} \quad (v, w) = \lambda $$
*for every u∈P and every two distinct vectors v,w∈P. Then m⩾n.* ■
A generalization.
The linear algebra proofs apply with only trivial changes to a more general situation. Namely, it is sufficient to assume that there exists a natural number λ⩾1 such that every two distinct points are contained in exactly λ lines and every point is contained in >λ lines. Then the conclusion m⩾n still holds. This is due to H.J. Ryser [R]. Apparently, no combinatorial proof of Ryser’s theorem is known. Ryser [R] also used linear algebra to provide a description of the case m=n similar to the de Bruijn–Erdös description in the case λ=1.
The determinant of MMT.
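For the benefit of the readers who do not like to compute determinants themselves, here is a computation of det MM^T following the textbook [HP].
Let mj=kj−1 for all j. Then
$$ MM^T = \begin{pmatrix} m_1 + 1 & 1 & 1 & \cdots & 1 \\ 1 & m_2 + 1 & 1 & \cdots & 1 \\ 1 & 1 & m_3 + 1 & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & m_n + 1 \end{pmatrix}. $$
Let us subtract the first row from every other one and get the matrix
$$ \begin{pmatrix} m_1 + 1 & 1 & 1 & \cdots & 1 \\ -m_1 & m_2 & 0 & \cdots & 0 \\ -m_1 & 0 & m_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -m_1 & 0 & 0 & \cdots & m_n \end{pmatrix}. $$
For j=2,3,…,n, let us multiply the j-th column by m1/mj (recall that kj⩾2 and hence mj⩾1>0) and add the result to the first column. We get the matrix
$$ \begin{pmatrix} D & 1 & 1 & \cdots & 1 \\ 0 & m_2 & 0 & \cdots & 0 \\ 0 & 0 & m_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & m_n \end{pmatrix}, $$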
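where
$$ D = m_1 + 1 + \sum_{j=2}^{n} \frac{m_1}{m_j} = m_1 + \sum_{j=1}^{n} \frac{m_1}{m_j}. $$
It follows that
$$ \det MM^T = D \cdot \prod_{j=2}^{n} m_j = \prod_{j=1}^{n} m_j \cdot \Bigl( 1 + \sum_{j=1}^{n} \frac{1}{m_j} \Bigr) \neq 0. $$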
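The resulting formula, namely that the determinant of the matrix with the diagonal entries mj+1 and all other entries equal to 1 is the product of all mj multiplied by 1 plus the sum of all 1/mj, can be tested with exact arithmetic. The Python sketch below is an illustration only; the function names `det`, `diag_plus_ones` and `closed_form` and the sample values of mj are assumptions made here.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if a[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap changes the sign
        result *= a[col][col]
        for i in range(col + 1, n):
            factor = a[i][col] / a[col][col]
            a[i] = [x - factor * y for x, y in zip(a[i], a[col])]
    return result


def diag_plus_ones(m):
    """Matrix with diagonal entries m_j + 1 and all other entries equal to 1."""
    n = len(m)
    return [[m[i] + 1 if i == j else 1 for j in range(n)] for i in range(n)]


def closed_form(m):
    """Product of the m_j times (1 + sum of 1/m_j)."""
    product = Fraction(1)
    for mj in m:
        product *= mj
    return product * (1 + sum(Fraction(1, mj) for mj in m))


# Sample values: m_j = 2 for the Fano plane, (1, 1, 1, 1, 3) for the
# near-pencil with 5 points, and one arbitrary choice.
for m in ([2] * 7, [1, 1, 1, 1, 3], [1, 2, 3, 4, 5]):
    assert det(diag_plus_ones(m)) == closed_form(m)
```

Since all mj are positive, the closed form is a product of positive numbers, which makes the non-vanishing of the determinant evident.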
6. Hanani’s theorem
Two papers of H. Hanani.
According to Th. Motzkin [M], the first proof of the inequality m⩾n and, it seems, of the full de Bruijn–Erdös theorem, was given in 1938 by H. Hanani. He published an outline of his proof [H1] only in 1951. Later on H. Hanani published a detailed exposition [H2] of a simplified proof. In fact, in [H2] he proved (at no extra cost) a stronger version of the de Bruijn–Erdös theorem. He also used his methods to prove a 3-dimensional analogue dealing with points, lines, and planes.
Hanani’s Theorem.
*Under the previous assumptions, let L∈ℒ be a line containing the maximal number of points among all lines, let P be the set of all lines intersecting L (in particular, L∈P), and let p be the number of elements of P. Then p⩾n, and if p=n, then P=ℒ and (E,ℒ) is either a near-pencil, or a projective plane.*
Suppose that n⩾p and (E,ℒ) is not a near-pencil. As usual, we assume that every line contains ⩾2 points. Let a=sL. Let K be the line with the maximal number of points among the lines different from L, and let b=sK. Then a⩾b. The strategy is to estimate n, or, what is the same, n−1 in terms of a and b both from below and from above.
Hanani’s Lemma.
If x∈L, then
$$ k_x \geqslant \frac{n - a}{b - 1} + 1. \tag{6.1} $$
Proof. Let us consider pairs (l,y) such that l is a line containing x and y is a point in l∖L. Such a pair is uniquely determined by the point y and hence there are n−a such pairs. A line l occurs in such a pair if and only if x∈l and l≠L. It follows that there are kx−1 choices of l. Given a line l, there are ⩽b−1 choices for the point y. Therefore the number n−a of such pairs is ⩽(kx−1)(b−1). The lemma follows. ■
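The counting in this proof yields the inequality n−a⩽(kx−1)(b−1) for every x∈L. The following Python sketch checks it on two small examples. It is an illustration only: the geometries `FANO` and `MIXED` are chosen here, and the function name `hanani_lemma_holds` is hypothetical.

```python
# Example geometries (assumptions of this sketch): the Fano plane on the
# points 0..6, and 5 points with one line of 3 points and all other lines
# consisting of 2 points.
FANO = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
MIXED = [{0, 1, 2}, {0, 3}, {0, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}]


def hanani_lemma_holds(n, lines):
    """Check n - a <= (k_x - 1)(b - 1) for every point x of a longest line L."""
    L = max(lines, key=len)                          # a line with the most points
    a = len(L)
    b = max(len(l) for l in lines if l != L)         # the longest line other than L
    for x in sorted(L):
        k_x = sum(1 for l in lines if x in l)        # number of lines through x
        if n - a > (k_x - 1) * (b - 1):
            return False
    return True


print(hanani_lemma_holds(7, FANO), hanani_lemma_holds(5, MIXED))
```

In both examples the inequality turns out to be an equality for every x∈L, so the estimate cannot be improved in general.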
An upper estimate of n−1.
By summing the inequalities (6.1) over all x∈L and adding 1 in order to account for the line L itself, we can estimate p from below and conclude that
$$ n \geqslant p \geqslant \frac{a(n - a)}{b - 1} + 1, \tag{6.2} $$
or, what is the same,
$$ a(a - 1) \geqslant (n - 1)(a - b + 1). \tag{6.3} $$
The inequality (6.3) provides an estimate of n−1 from above.
A lower estimate of n−1.
There is another way to estimate p from below. By a miracle, this other estimate of p from the same side leads to an estimate of n−1 from the other side. Let z be the point of L∩K if L∩K≠∅, and an arbitrary point of L otherwise. For every x∈L∖{z}, y∈K∖{z} there is a unique line l containing {x,y}. All these lines are distinct, not equal to L, and do not contain z. Since each of them contains a point of L, all of them belong to P. Clearly, there are at least (a−1)(b−1) such lines. A lower estimate of the number kz of lines containing z is provided by (6.1). It follows that
$$ n \geqslant p \geqslant (a - 1)(b - 1) + k_z \geqslant (a - 1)(b - 1) + \frac{n - a}{b - 1} + 1, $$
$$ n - 1 \geqslant (a - 1)(b - 1) + \frac{(n - 1) - (a - 1)}{b - 1}, $$
and hence (n−1)(b−1)⩾(n−1)−(a−1)+(a−1)(b−1)^2 and
$$ (n - 1)(b - 2) \geqslant (a - 1)\bigl( (b - 1)^2 - 1 \bigr) = (a - 1)\, b\, (b - 2). $$
Since b⩾2, it follows that either b=2, or n−1⩾(a−1)b. If b=2, then all lines except L consist of 2 points and the inequality (6.3) implies that a⩾n−1. But L≠E and hence a⩽n−1. It follows that a=n−1 and hence L contains all points of E except one and (E,ℒ) is a near-pencil, contrary to the assumption. Therefore
$$ n - 1 \geqslant (a - 1)\, b. \tag{6.4} $$
The inequality (6.4) provides an estimate of n−1 from below.
Combining the two estimates.
After multiplying the inequality (6.4) by (a−b+1) and combining the result with the inequality (6.3), we see that
$$ a(a - 1) \geqslant (n - 1)(a - b + 1) \geqslant (a - 1)\, b\, (a - b + 1) $$
and hence a⩾b(a−b+1)=b(a−b)+b, or, what is the same,
$$ (a - b)(1 - b) \geqslant 0. $$
Since b>1, this implies that b⩾a. On the other hand, b⩽a by the definition of a,b. It follows that a=b. By combining a=b with the inequalities (6.3) and (6.4) we conclude, respectively, that a(a−1)⩾n−1 and n−1⩾a(a−1). It follows that n−1=a(a−1) and hence n=a(a−1)+1. By combining this with (6.2) we see that
$$ n \geqslant p \geqslant \frac{a(n - a)}{a - 1} + 1 = a(a - 1) + 1 = n. $$
It follows that n=p. Therefore the inequality p⩾n holds even if the inequality p⩽n is not assumed.
The case p=n.
As we just saw, in this case a=b and n=a(a−1)+1.
Let us prove first that every line belonging to P consists of exactly a points. Consider all pairs (l,y) such that l∈P and y∈l∖L. The line l is uniquely determined by its point of intersection with L (which can be any point of L) and the point y. Therefore there are a(n−a)=an−a^2 such pairs. On the other hand, for every line l∈P∖{L} there are ⩽a−1 choices of the point y and hence the number of such pairs is ⩽(p−1)(a−1). Moreover, if at least one line l∈P∖{L} has <a points, then the number of such pairs is <(p−1)(a−1). But (p−1)(a−1)=(n−1)(a−1)=na−a^2. It follows that every line belonging to P∖{L}, and hence every line belonging to P, consists of exactly a points.
Now we are ready to prove that ℒ=P. By the definition, every line containing a point of L belongs to P. Let y∈E∖L. For every x∈L there is a unique line containing {x,y}. These lines are pairwise distinct, intersect only at y, and belong to P. Moreover, every line containing y and belonging to P is equal to one of these a lines. Each of these lines contains a−1 points different from y. It follows that the total number of points on these lines is equal to a(a−1)+1, i.e. to the number n of points in E. Therefore for every point z≠y there is a line belonging to P and containing {z,y}. Since there is only one line containing any two given points, it follows that all lines containing a point y∈E∖L belong to P. It follows that ℒ=P and every point in E∖L belongs to exactly a lines. In view of the previous paragraph, ℒ=P implies that every line consists of exactly a points.
By the previous paragraph ky=a if y∈E∖L. If y∈L, then by (6.1)
$$ k_y \geqslant \frac{n - a}{b - 1} + 1 = \frac{(a - 1)^2}{a - 1} + 1 = a. $$
If ky>a, then the arguments of the previous paragraph show that n>a(a−1)+1, contrary to n=a(a−1)+1. The contradiction shows that ky=a also for y∈L. It follows that (E,ℒ) is a projective plane. This completes the proof of Hanani’s theorem.
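The statement of Hanani’s theorem can be illustrated by counting, in small examples, the lines intersecting a line with the maximal number of points. The Python sketch below is only such an illustration; the three geometries and the function name `n_and_p` are assumptions made here.

```python
# Example geometries (assumptions of this sketch): the Fano plane, the
# near-pencil with 5 points, and 5 points with a single line of 3 points.
FANO = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
NEAR_PENCIL = [{0, 1, 2, 3}] + [{i, 4} for i in range(4)]
MIXED = [{0, 1, 2}, {0, 3}, {0, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}]


def n_and_p(lines):
    """Return (n, p): the number of points and of lines meeting a longest line."""
    points = set().union(*lines)
    L = max(lines, key=len)                  # a line with the maximal number of points
    p = sum(1 for l in lines if l & L)       # L itself is counted, since L meets L
    return len(points), p


for name, lines in (("Fano", FANO), ("near-pencil", NEAR_PENCIL), ("mixed", MIXED)):
    n, p = n_and_p(lines)
    assert p >= n
    print(name, n, p)
```

In the first two examples p=n, in agreement with the description of the case of equality, while in the third example p>n.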
Deducing the de Bruijn–Erdös theorem.
Suppose that m⩽n. Obviously, p⩽m and hence p⩽n. By Hanani’s theorem this implies that p=n and (E,L) is either a near-pencil, or a projective plane. Since p⩽m⩽n and p=n, it follows that m=n.
Remarks.
In contrast with [dB-E] and many papers written much later, Hanani’s proof of his version of the de Bruijn–Erdös theorem in [H2] is quite modern. The points and lines are not enumerated; in fact, there are no subscripts at all. But when he turns to the 3-dimensional case, he returns to the tradition of enumerating almost everything in sight …
Also, in contrast with almost every other proof, Hanani’s proof does not use the de Bruijn–Erdös inequalities, at least not directly. But the proof of the fact that P=ℒ includes a proof of the de Bruijn–Erdös inequalities sL⩽ky for y∉L.
7. A simpler proof of Hanani’s theorem
This proof follows the outline of Hanani’s proof, but brings into play the smallest number ku among all kz. Also, “the second largest” line is chosen not among all lines, but among the lines containing u. This allows one to avoid Hanani’s Lemma and to replace the “miraculous” estimates by rather straightforward ones. The proof was partially inspired by V. Napolitano [N]. If one is interested only in the de Bruijn–Erdös theorem, it can be simplified even further.
Suppose that n⩾p. Following de Bruijn–Erdös [dB-E], let us consider a point u such that ku is the smallest number among all numbers kz. Let a=sL and k=ku. There are two cases to consider: the case when k⩾a and the case when k<a. The arguments in both cases are similar and can be unified, but the first case is simpler and we will deal with it first.
The case k⩾a.
Every point is contained in one of the k lines containing u, and each of these lines contains ⩽a−1 points in addition to u. Therefore the total number of points
$$ n \leqslant 1 + k(a - 1). \tag{7.1} $$
For every point x∈L there are ⩾k−1 lines containing x and different from L. All these lines belong to P and are pairwise distinct. Therefore
$$ p \geqslant 1 + a(k - 1). \tag{7.2} $$
If n⩾p, then the inequalities (7.1) and (7.2) imply that
$$ 1 + k(a - 1) \geqslant n \geqslant p \geqslant 1 + a(k - 1) $$
and hence a⩾k. Together with k⩾a this implies that a=k and the inequalities (7.1) and (7.2) are actually equalities. It follows that n=p=1+a(a−1), every line containing u consists of exactly a points, and every point belonging to L is contained in exactly k lines. In other words, ky=k if y∈L. In particular, every point of L can be taken as u and hence every line intersecting L consists of exactly a points. In other words, sl=a if l∈P.
Let y∈E∖L. Then there are a lines containing y and belonging to P, and together they contain 1+a(a−1)=n points. It follows that for every point y′≠y there is a line belonging to P and containing {y,y′}. Since there is only one line containing {y,y′}, this implies that ℒ=P. This implies that sl=a for all l∈ℒ and ky=a for all y∈E∖L. Since we already proved that ky=k=a for all y∈L, we see that ky=a for all points y. It follows that (E,ℒ) is a projective plane.
The case k<a.
By the de Bruijn–Erdös inequalities in this case u∈L. Let M be a line containing u and such that sM is the largest number among all numbers sl for lines l containing u and different from L. Let a′=sM. Then a⩾a′. The strategy is to use the fact that u∈L to refine the inequalities (7.1) and (7.2) by using a′.
Every point is contained either in L or in one of the other k−1 lines containing u. Each of these lines contains ⩽a′−1 points in addition to u. Therefore the total number of points
$$ n \leqslant a + (k - 1)(a' - 1). \tag{7.3} $$
There are k lines containing u, and for every point x∈L different from u there are kx−1 lines containing x and different from L. All these lines belong to P and are pairwise distinct. If x∈L and x≠u, then x∉M and hence kx⩾sM=a′. It follows that
$$ p \geqslant k + (a - 1)(a' - 1). \tag{7.4} $$
The inequalities (7.3) and (7.4) together with the assumption n⩾p imply that
$$ a + (k - 1)(a' - 1) \geqslant n \geqslant p \geqslant k + (a - 1)(a' - 1). $$
By simplifying this inequality we see that a+k(a′−1)⩾k+a(a′−1) and hence
$$ (k - a)(a' - 2) \geqslant 0. $$
Since a′ is the number of points in a line, a′⩾2. It follows that either k⩾a, or a′=2. But k⩾a contradicts the assumption k<a, and hence a′=2.
The equality a′=2 means that M consists of 2 points. By the choice of M, this implies that every line containing u and different from L consists of 2 points. Since L and these other lines contain all points and pairwise intersect only in u, it follows that n=a+k−1.
One of the points of M is u. Let z be the other point. Then z∉L because M≠L, and hence there are a lines containing z and belonging to P. Among these lines only M contains u. There are also k−1 lines containing u and not equal to M, and all of them belong to P. It follows that p⩾k+a−1=n. Since n⩾p, this implies that p=n and every line belonging to P contains either u or z.
Suppose that there is a line l containing u and different from L,M. Let y∈l and y≠u. Then y∉L and hence there are a lines containing y and belonging to P. Among these lines only one contains u. By the previous paragraph, the other a−1 lines contain z. Since there is only one line containing {y,z}, it follows that a−1⩽1 and hence a=2. Since k<a, this implies that k⩽1, contrary to the fact that kx⩾2 for all x. The contradiction shows that only the lines L,M contain u.
It follows that E=L∪M and hence z is the only point not belonging to L. In turn, this implies that the set of lines ℒ consists of L and the lines of the form {x,z}, where x∈L. Therefore ℒ=P and (E,ℒ) is a near-pencil.
8. All the de Bruijn–Erdös inequalities
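The Basterfield–Kelly–Conway argument.
Suppose that m<n. Then
$$ \frac{n}{m(n - s_l)} < \frac{1}{m - s_l} $$
for every l∈ℒ. If z∉l, then sl⩽kz and hence m−kz⩽m−sl. It follows that
$$ n = \sum_{l \in \mathcal{L}} \sum_{z \notin l} \frac{n}{m(n - s_l)} < \sum_{l \in \mathcal{L}} \sum_{z \notin l} \frac{1}{m - s_l} \leqslant \sum_{z \in E} \sum_{l \not\ni z} \frac{1}{m - k_z} = n. $$
The contradiction leads to the conclusion that m⩾n.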
This argument is the main part of the proof of Theorem 2.1 (dealing with a more general situation) of the paper [BK] by J.G. Basterfield and L.M. Kelly. Basterfield and Kelly [BK] wrote that they are “indebted to J. Conway for the simplicity of the present formulation of the proof of Theorem 2.1.” For some reason this acknowledgment led to attributing this argument to J. Conway alone, even by some authors referring directly to [BK]. By replacing the strict inequalities < by the non-strict ones ⩽, one can use this argument also to deal with the case m=n along the lines of Sections 2–4. This observation is apparently due to P. de Witte [dW].
This is a sharp-witted, but also the most obscure and puzzling proof. It appears as a rabbit from a hat without any context or explanations and tells nothing about why the theorem is true. In the rest of this section I will explain a natural line of thinking which leads to such a proof. There is no evidence suggesting that it was discovered in this way, but it could have been.
Summing the de Bruijn–Erdös inequalities.
Summing the de Bruijn–Erdös inequalities and then comparing the result with (2.1) is the key step of both the de Bruijn–Erdös proof and the proof from Section 2. A natural idea is to use all de Bruijn–Erdös inequalities on an equal footing. One way to do this is to use systems of distinct representatives as in Section 4.
One may hope for a proof using all de Bruijn–Erdös inequalities in a way closer to the proof of inequalities (2.3) and (2.4) in Section 2 than to the cyclic order argument of de Bruijn–Erdös. Let us sum the inequalities sl⩽kz over all pairs (l,z)∈ℒ×E such that z∉l. Every sl appears n−sl times in the left hand side of these inequalities, and every kz appears m−kz times in the right hand side. Therefore, taking the sum results in the inequality
$$ \sum_{l \in \mathcal{L}} (n - s_l)\, s_l \leqslant \sum_{z \in E} (m - k_z)\, k_z. $$
This inequality does not lead to a proof of the desired sort, but it admits a natural generalization. Let F be an increasing function. Instead of sl⩽kz, one can sum the inequalities F(sl)⩽F(kz). In fact, there is no need to apply the same function to sl and kz.
Let F,G be a pair of functions such that s⩽k implies F(s)⩽G(k). Taking the sum of the inequalities F(sl)⩽G(kz) over all pairs l,z such that z∉l results in the inequality
$$ \sum_{l \in \mathcal{L}} (n - s_l)\, F(s_l) \leqslant \sum_{z \in E} (m - k_z)\, G(k_z). $$
It remains to realize that the functions F,G may depend on the numbers m,n and that one can get rid of the factors n−sl and m−kz by dividing by these factors.
A proof of the de Bruijn–Erdös-Hanani theorem.
As usual, we may assume that m⩽n. Let
$$ F(s) = \frac{s}{n - s}, \qquad G(k) = \frac{k}{m - k}. $$
Then s⩽k implies F(s)⩽G(k). Indeed, the latter inequality is equivalent to the inequality s(m−k)⩽k(n−s), and hence to the inequality sm⩽kn, which is obviously true if s⩽k and m⩽n. By summing the inequalities F(sl)⩽G(kz) over all pairs (l,z)∈ℒ×E such that z∉l we get the inequality
$$ \sum_{(l, z):\ z \notin l} \frac{s_l}{n - s_l} \leqslant \sum_{(l, z):\ z \notin l} \frac{k_z}{m - k_z}, \tag{8.1} $$
which is obviously equivalent to
$$ \sum_{l \in \mathcal{L}} s_l \leqslant \sum_{z \in E} k_z. $$
In view of (2.1) the sides of the latter inequality are actually equal, and hence the sides of the inequality (8.1) are also equal. Since the inequality (8.1) is obtained by summing the inequalities F(sl)⩽G(kz), it follows that if (l,z)∈ℒ×E and z∉l, then
$$ \frac{s_l}{n - s_l} = \frac{k_z}{m - k_z} $$
and hence slm=kzn. Since m⩽n and sl⩽kz, the last equality implies that m=n and sl=kz. In particular, z∉l implies that sl=kz and hence every line containing z intersects l. It remains to repeat the last paragraph of Section 4.
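The summation step can be traced on small examples with exact fractions. The Python sketch below is only an illustration. It assumes the choice F(s)=s/(n−s), G(k)=k/(m−k), which is the one consistent with the equivalence of F(s)⩽G(k) and s(m−k)⩽k(n−s) stated above. The example geometries and the function name `both_sides` are hypothetical.

```python
from fractions import Fraction

# Example geometries with m = n (assumptions of this sketch): the Fano plane
# on the points 0..6 and the near-pencil on the points 0..4.
FANO = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
NEAR_PENCIL = [{0, 1, 2, 3}] + [{i, 4} for i in range(4)]


def both_sides(n, lines):
    """Sum F(s_l) and G(k_z) over all pairs (l, z) with z not on l."""
    m = len(lines)
    s = [len(l) for l in lines]                                  # points per line
    k = [sum(1 for l in lines if z in l) for z in range(n)]      # lines per point
    pairs = [(i, z) for i in range(m) for z in range(n) if z not in lines[i]]
    left = sum(Fraction(s[i], n - s[i]) for i, z in pairs)
    right = sum(Fraction(k[z], m - k[z]) for i, z in pairs)
    # Each s_l occurs n - s_l times and each k_z occurs m - k_z times,
    # so the two sums collapse to the sum of all s_l and the sum of all k_z.
    assert left == sum(s) and right == sum(k)
    # Equality in every summand means s_l * m == k_z * n for z not on l.
    tight = all(s[i] * m == k[z] * n for i, z in pairs)
    return left, right, tight


print(both_sides(7, FANO), both_sides(5, NEAR_PENCIL))
```

In both examples the two sums coincide and every individual inequality is an equality, as the proof requires in the case m=n.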
This proof has the advantage of explicitly using the equality (2.1). The Basterfield–Kelly–Conway argument implicitly uses double sums and a change of the order of summation. This change of the order of summation is a stronger tool than the double counting proving (2.1).
A version of this proof.
Suppose that m⩽n. One can take as F,G the functions
$$ F(s) = \frac{n}{n - s}, \qquad G(k) = \frac{m}{m - k}. $$
They can be obtained by adding 1 to the previous choice of the functions F,G. Therefore s⩽k again implies F(s)⩽G(k). This can be also verified in the same way as before. By summing the inequalities F(sl)⩽G(kz) over all pairs l,z such that z∉l we get
$$ \sum_{(l, z):\ z \notin l} \frac{n}{n - s_l} \leqslant \sum_{(l, z):\ z \notin l} \frac{m}{m - k_z}, $$
which is obviously equivalent to mn⩽nm. But the sides of the latter inequality are equal. It follows that if (l,z)∈ℒ×E and z∉l, then
$$ \frac{n}{n - s_l} = \frac{m}{m - k_z} $$
and hence slm=kzn. The rest of the proof is exactly the same as above.
Dividing everything in this version of the proof by m, which amounts to taking
$$ F(s) = \frac{n}{m(n - s)}, \qquad G(k) = \frac{1}{m - k}, $$
and omitting the explanations turns this version into the Basterfield–Kelly–Conway argument.