Some proofs of the Poincaré–Birkhoff–Witt theorem and related matters
Gyula Lakos

Department of Geometry, Institute of Mathematics, Eötvös University, Pázmány Péter s. 1/C, Budapest, H–1117, Hungary
Abstract.
The first part of this article is concerned with proving the symmetric PBW theorem using Magnus commutators. Extensions of the Dynkin–Specht–Wever lemma and a general theorem of Nouazé–Revoy type are also obtained. The second part focuses on free Lie $K$-algebras and the basic PBW theorem. Appendices are provided in order to make the discussion self-contained and to put it into context.
Key words and phrases:
Poincaré–Birkhoff–Witt theorem, Magnus commutators, free Lie algebras
2010 Mathematics Subject Classification:
Primary: 17B35. Secondary: 17B01, 16S30.
The author would like to thank Balázs Csikós for his support.
1. Introduction
The objective of this paper is to give several alternative proofs of the (global versions of the) Poincaré–Birkhoff–Witt theorem. We also consider some consequences of these arguments.
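The local form. Assume that $K$ is a unital commutative ring, and $\mathfrak{g}$ is a $K$-module with a compatible Lie-ring structure; i. e. $\mathfrak{g}$ is a Lie $K$-algebra (also called: Lie ring over $K$). The universal enveloping algebra $\mathcal{U}\mathfrak{g}$ is the free $K$-algebra $\bigotimes\mathfrak{g}$ factorized by the ideal generated by the elements $X\otimes Y-Y\otimes X-[X,Y]$, the tensor products are taken over $K$. Consider the canonical homomorphism $\bigotimes\mathfrak{g}\rightarrow\mathcal{U}\mathfrak{g}$. The enveloping algebra is naturally filtered by the images $\mathcal{U}^{n}\mathfrak{g}$ of $\bigotimes^{\leq n}\mathfrak{g}$, and the construction implies the existence of natural (surjective) maps $\bigodot^{n}\mathfrak{g}\rightarrow\mathcal{U}^{n}\mathfrak{g}/\mathcal{U}^{n-1}\mathfrak{g}$. The (local form of the) Poincaré–Birkhoff–Witt theorem, whenever it holds, states that these maps are isomorphisms. This theorem is known to hold in the following cases: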
(i) $K$ is a field, or, more generally, $\mathfrak{g}$ is a free $K$-module, or, more generally, $\mathfrak{g}$ is a direct sum of cyclic $K$-modules (Poincaré [25]: $K$ is a field, ; Birkhoff [2], Witt [39]: $K$ is a field, but their methods work more generally; cf. also Bourbaki [4], Ton-That, Tran [36]);
(i’) $K$ is a principal ideal domain (Lazard [19]) or just a Dedekind ring (Cartier [6]); also see Higgins [16] for further results in this direction;
(ii) $\mathbb{Q}\subset K$ (Cohn [9]);
(ii’) but (Nouazé, Revoy [24]);
(cf. Grivel [14] for a review), but there are counterexamples (Širšov [31], Cartier [6], Cohn [9]). The most general approach is of Higgins [16], cf. Revoy [27].
The global form. In practice, mostly cases (i) and (ii) are considered, but they are typically formulated in global form:
(i) If is a sum of cyclic -modules, then we can choose a basis , and an ordering of . Then let be the submodule of spanned by with . Then the “basic” version of the PBW theorem states that
[TABLE]
(a restriction of ) is an isomorphism of -modules. (In fact, what we really need, in general, is a choice function , which transforms any finite multiset from into an ordered word . Then the statement is that the corresponding map is an isomorphism. For the sake of simplicity, we will use the ordered version.) This is equivalent to the local version. Indeed, it is easy to show that is surjective: Whenever we have an expression in (as an image of ), we can rearrange the formally top nonarranged degree term into an image of at the cost of generating formally lower order terms. Then we repeat this in formally lower orders. We call this the “basic rearrangement procedure”. Then the isomorphism (i. e. injectivity) is a consequence of the local PBW theorem, and in fact, equivalent to it.
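The basic rearrangement procedure can be illustrated on a small example. The following is only a sketch under stated assumptions: the Lie algebra $\mathfrak{sl}_2$ with ordered basis e < h < f and integer structure constants is chosen here for illustration (it does not appear in the text), and the names `ORDER`, `BRACKET` and `rearrange` are hypothetical. Each step replaces an adjacent out-of-order pair $ab$ by $ba+[a,b]$, which produces a term of formally lower order.

```python
# Illustrative data (an assumption, not taken from the text):
# sl2 with the ordered basis e < h < f.
ORDER = {"e": 0, "h": 1, "f": 2}

# Brackets [a, b] for a > b, written in the basis.
BRACKET = {
    ("h", "e"): {"e": 2},   # [h, e] = 2e
    ("f", "e"): {"h": -1},  # [f, e] = -h
    ("f", "h"): {"f": 2},   # [f, h] = 2f
}


def rearrange(element):
    """Rewrite a combination of words as a combination of ordered words.

    `element` maps words (tuples of basis letters) to integer coefficients.
    An adjacent pair a b with a > b is replaced by b a + [a, b]; the
    bracket term is one letter shorter, so the process terminates.
    """
    result = {}
    todo = dict(element)
    while todo:
        word, coeff = todo.popitem()
        for i in range(len(word) - 1):
            a, b = word[i], word[i + 1]
            if ORDER[a] > ORDER[b]:
                swapped = word[:i] + (b, a) + word[i + 2:]
                todo[swapped] = todo.get(swapped, 0) + coeff
                for c, k in BRACKET[(a, b)].items():
                    lower = word[:i] + (c,) + word[i + 2:]
                    todo[lower] = todo.get(lower, 0) + coeff * k
                break
        else:
            # No adjacent inversion: the word is already ordered.
            result[word] = result.get(word, 0) + coeff
    return {w: c for w, c in result.items() if c != 0}


# f e = e f - h in the universal enveloping algebra
print(rearrange({("f", "e"): 1}))
```

This only demonstrates surjectivity onto ordered words in one example; that the resulting ordered words are independent is the actual content of the PBW theorem and is not checked by the code.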
(ii) If , then one can consider the submodule of . This submodule can be interpreted either as the submodule of elements invariant under permutations in the order of tensor product or as the span of the elements . Then the “symmetric” version of the PBW theorem states that
[TABLE]
(a restriction of ) is an isomorphism of -modules. This is, again, equivalent to the local version. We can prove the surjectivity of , using the “symmetric rearrangement procedure” in , i. e. symmetrizing in the formally top nonarranged degree term at the cost of generating lower order terms, and repeating the process in formally lower orders. Then, again, isomorphism (injectivity) is a consequence of the local PBW theorem, and, in fact, equivalent to it.
In practice, one typically starts with the global cases (i), (ii), and then proceeds further to the general local ones. For the sake of reference, in Appendix A, we include the Witt–Lazard version of the proof of the global cases of the PBW theorem. (We formulate it to cover not only the original case (i) but also case (ii).)
The free case. One of the few cases where is easy to describe is when is the free Lie -algebra . Then is naturally isomorphic to the noncommutative polynomial algebra . This holds for purely universal algebraic reasons. Indeed, evaluates by commutators, which defines a map , which, by the universality of the enveloping algebra, gives rise to a map . Here the class maps to . Comparing this to the universality of , we see that is an isomorphism. What is not that obvious is that is an inclusion. This is the theorem of Magnus [20] (cf. Witt [39]; the critical case is ) on the representability of free Lie -algebras. But this is a consequence of the PBW theorem ( level). Now, the PBW theorem does, indeed, hold in the case when is a free Lie -algebra; this is due to the fact that free Lie -algebras are free -modules. This latter fact, however, is not entirely trivial if is not a field (but it can be derived easily from the theorem of Magnus…). In Appendix B, we include some elementary facts regarding free Lie -algebras, and we show how to prove that free Lie -algebras are free -modules using a sufficiently strong version of the PBW theorem itself. In Appendix C a more informative account is given, which shows how to do this without the use of the PBW theorem. An important point is, however, that free Lie -algebras are notable special cases of the PBW theorem.
The first part of this article is concerned with proving the symmetric PBW theorem using Magnus commutators (cf. Magnus [20]). Extensions of the Dynkin–Specht–Wever lemma and a general theorem of Nouazé–Revoy type are also obtained. The second part focuses on free Lie $K$-algebras and the basic PBW theorem. Appendices are provided in order to make the discussion self-contained and to put it into context. Arguments of this type were considered before; see Cartier [5], Bonfiglioli, Fulci [3], Ch. 6. A difference compared to them is that in the first part we use Magnus commutators instead of BCH series (the former is just more on target); and in the second part we consider general rings ( and are relatively easy). We keep our arguments elementary. We do not consider the several generalizations of the PBW theorem (but see Grivel [14] for some). Nor do we consider proofs which use higher algebraic or topological methods; for those we refer to the general literature.
2. The existence of I
For practical reasons we will use left-iterated higher commutators .
Proposition/Definition 2.1.
There is a series of Lie-polynomials , , over such that the following hold:
[TABLE]
where the generating function of the coefficients is
[TABLE]
[TABLE]
where the generating function of the coefficients is
[TABLE]
[TABLE]
for , where the generating function of the coefficients is
[TABLE]
Proof.
It is easy to see that has only rational coefficients, and the formal rational power series on the RHS of (cgen), line 1 and 2, expand function theoretically to line 3 (which therefore also allows a power series expansion). Thus, we have here three well-defined recursive definitions for , we just have to show that they give the same Lie-polynomials.
The three definitions are obviously the same for . By induction, assume that the are well-defined for , . Consider first the definition of according to (R). This is
[TABLE]
where we do not fill out the variables in the , but just note that we have to sum for all locally increasing deployments of the variables .
In each summand, we consider the containing , and expand it using (L). (We can do this according to the induction hypothesis). It yields
[TABLE]
There we have a commutator $\boldsymbol{[}\boldsymbol{[}\underbrace{\mu(\ldots),\ldots,\mu(\ldots)}_{s\ \mu},X_{1}\boldsymbol{]}_{\mathrm{L}},\boldsymbol{[}\underbrace{\mu(\ldots),\ldots,\mu(\ldots)}_{r-1-p\ \mu},X_{n}\boldsymbol{]}_{\mathrm{L}}\boldsymbol{]}$, which is commutated further by many . Let us distribute those, using the Leibniz rule, between the two terms of the commutators. We obtain
[TABLE]
This is a formally deterministic process, which gives nonzero contributions only for (because there are only many variables to distribute). According to our specific method, if , then
[TABLE]
We see that our manipulations yield for , which implies that the definitions of according to (R) and (C) are the same (again, terms with do not appear in either side, as there are only variables to distribute).
The argument that (L) and (C) give the same polynomials is analogous. ∎
Remark 2.2.
Some values of are given by
[TABLE]
( are the Bernoulli numbers.)
Some values of are given by
[TABLE]
(The numerators are more complicated for higher indices in both cases.) ∎
Proposition 2.3.
The Lie-polynomials , , satisfy the identities
[TABLE]
for , .
Proof.
It is easy to check that , which shows the statement for . By induction, assume that is well-defined for , . If is non-empty, then expand according to the (R)-expansions of the . Most of the terms cancel each other except those which contain and immediately next to each other. But then the induction hypothesis can be applied to show that it yields the (R)-expansion of . If is non-empty, then the (L)-expansion can be used to prove the identity in the same manner. ∎
3. From to the symmetric PBW theorem
Definition 3.1.
We define the map such that
[TABLE]
(and it acts trivially in the [math]th order).
Proposition/Definition 3.2.
descends to a map .
Proof.
It is sufficient to check that
[TABLE]
that is vanishes on the ideal generated by the elements . This vanishing, when expanded, however, is a consequence of identities (2). ∎
Lemma 3.3.
Suppose that , with , is a combination of Lie-monomials, such that in every Lie-monomial every variable appears exactly once. Then
[TABLE]
In particular, this holds for , .
Proof.
It is sufficient to prove this for Lie-monomials of . If is a non-trivial monomial, then it contains an inner Lie-commutator , . Now, the permutations from come in pairs and , which cancel each other in the permuted monomial. ∎
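The cancellation in Lemma 3.3 can be checked by brute force in small degrees. The sketch below is an illustration under assumptions: it works in the commutator evaluation in a noncommutative polynomial algebra with integer coefficients (a consequence of the statement, not the statement in an abstract Lie algebra), and the encoding of Lie-monomials as nested pairs together with the helper names `commutator`, `evaluate`, `relabel`, `symmetrized_sum` is hypothetical.

```python
from itertools import permutations


def commutator(p, q):
    """[p, q] = pq - qp for noncommutative polynomials stored as {word: coeff}."""
    out = {}
    for u, a in p.items():
        for v, b in q.items():
            out[u + v] = out.get(u + v, 0) + a * b
            out[v + u] = out.get(v + u, 0) - a * b
    return {w: c for w, c in out.items() if c != 0}


def evaluate(tree):
    """Evaluate a bracket tree: an int i stands for X_i, a pair (s, t) for [s, t]."""
    if isinstance(tree, int):
        return {(tree,): 1}
    left, right = tree
    return commutator(evaluate(left), evaluate(right))


def relabel(tree, sigma):
    """Substitute X_sigma(i) for X_i in a bracket tree."""
    if isinstance(tree, int):
        return sigma[tree]
    return (relabel(tree[0], sigma), relabel(tree[1], sigma))


def symmetrized_sum(tree, n):
    """Sum of the evaluated monomial over all permutations of X_1, ..., X_n."""
    total = {}
    for perm in permutations(range(1, n + 1)):
        sigma = dict(zip(range(1, n + 1), perm))
        for w, c in evaluate(relabel(tree, sigma)).items():
            total[w] = total.get(w, 0) + c
    return {w: c for w, c in total.items() if c != 0}


# [[X_1, X_2], [X_3, X_4]] summed over all permutations of the variables
print(symmetrized_sum(((1, 2), (3, 4)), 4))
```

An empty dictionary means that the symmetrized sum vanishes; the degree-one monomial `1` is the only case in which it does not.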
Proposition 3.4.
inverts .
Proof.
Then
[TABLE]
Indeed, according to the previous definition, , furthermore is just a restriction of ; this implies the first equality. The second equality is due to the fact that the higher () vanish under symmetrization (Lemma 3.3). The third one is true due to and that the symmetrization of the symmetrization is the symmetrization. This proves . In particular, is injective. Then, the surjectivity of implies bijectivity, and, in fact, the inverse relationship. ∎
This, in particular, proves the symmetric global version of the PBW theorem, i. e. that is an isomorphism.
The facts behind the proof above have been known for a long time: It is known that the canonical projections (the components of ) can be expressed by Magnus-commutators , see Solomon [33] and Mielnik, Plebański [23]; which satisfy rational Lie-recursions, see Magnus [21]. Cf. also Reutenauer [26].
The advantage of this proof is that it is constructive and explicit. On the other hand, the unmotivated nature of the definition of the is a disadvantage. However, we have not used the definition of the directly, but only that they satisfy
() ;
() is a linear combination of Lie-monomials where every variable has multiplicity ;
() identities (2) hold.
Thus, it might be useful to obtain a somewhat less explicit existence theorem for but which is motivated by simple universal algebraic principles. The best principle in that respect would be the global symmetric PBW theorem itself. From it we could obtain as a canonical projection. This is, of course, not the way we intend to follow (at this point). As it happens, some simpler arguments suffice:
4. The existence of II
We define a Lie-permutation of as the following data. It is a partition such that , and finite sequences such that , and .
Lemma 4.1.
The number of Lie-permutations of , , is .
Proof.
For any Lie-permutation write down the sequence
[TABLE]
This yields a permutation of . From this permutation the Lie-permutation can be reconstructed. Indeed, in the permutation sequence, the first couple of elements up to ‘’ form the last partition set with ordering. Then, from the rest, the first couple of elements up to the maximal element form the next partition set with ordering; etc. It is easy to see that we have a bijection between permutations and Lie-permutations. ∎
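The count in Lemma 4.1 can be verified numerically for small cases. The following sketch rests on an assumed reading of the definition, since the formulas are not reproduced here: a Lie-permutation is taken to be a set partition of $\{1,\ldots,n\}$ in which every block is written as a sequence ending in its maximal element, with the blocks ordered by their maxima. The function names are hypothetical. `split_permutation` implements the reconstruction step of the proof (cut after the maximum, then after the maximum of the rest, and so on), and `count_lie_permutations` counts the objects directly.

```python
from itertools import permutations
from math import factorial


def split_permutation(perm):
    """Cut a permutation after its maximum, then after the maximum of the rest, etc.

    Every returned block ends with its own maximum, and the maxima
    decrease from block to block.
    """
    blocks = []
    rest = list(perm)
    while rest:
        cut = rest.index(max(rest)) + 1
        blocks.append(tuple(rest[:cut]))
        rest = rest[cut:]
    return blocks


def set_partitions(items):
    """Yield all set partitions of a list, each as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield [[first]] + part
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def count_lie_permutations(n):
    """Count partitions with each block ordered so that its maximum comes last."""
    total = 0
    for part in set_partitions(list(range(1, n + 1))):
        ways = 1
        for block in part:
            # The maximum is fixed in last place; the rest is ordered freely.
            ways *= factorial(len(block) - 1)
        total += ways
    return total


print(split_permutation((2, 4, 1, 3)))   # [(2, 4), (1, 3)]
print(count_lie_permutations(4), factorial(4))
```

Under this reading the direct count agrees with $n!$, and distinct permutations give distinct block decompositions, which is the bijection used in the proof.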
In what follows let be the vector space spanned by the noncommutative monomials in the corresponding noncommutative polynomial ring over .
Proposition 4.2.
Any element of can uniquely be written in the form
[TABLE]
where . (Here we used ordinary commutators and symmetrized products.)
Proof.
Existence is a consequence of the standard symmetrization argument but applied in the non-commutative polynomial algebra. This proves that any element is a sum of symmetric products of Lie-monomials. Lie-monomials, on the other hand, can be brought into standard form (highest indices on the right in left-iterated Lie-commutators). Uniqueness follows from dimensional reasons, as the number of Lie-permutations of is , the same as the dimension of . ∎
Let us write the monomial into a form like above:
[TABLE]
the are concrete rational numbers.
Definition 4.3.
Then let us define
[TABLE]
where we now use Lie-commutators instead of commutators.
Now, the satisfy (), (), and we can also prove
Proposition 2.3.
The Lie-polynomials , , satisfy the identities
[TABLE]
for , .
Proof.
Using Lie algebra rules, both sides of (2) can be brought into form
[TABLE]
respectively. However, let us consider the expansion of
[TABLE]
with respect to (4) in each component separately (for , and terms, respectively), and apply formally the same standardization procedure (highest indices on the right in left-iterated commutators) to it. In the standardization process, the number of the components in the symmetric products does not change. All we ask of the standardization process is to proceed in the lowest formal symmetric rank (i. e. 1) with commutators as we did with Lie-commutators before. Then, due to the unicity of the description (Proposition 4.2), the formally lowest symmetric orders agree,
[TABLE]
and again, due to the unicity, , and this is what we wanted to prove. ∎
Then one can proceed with the proof of the symmetric global PBW theorem as in Section 3.
One can ask if the defined in Sections 2 and 4 are the same. Of course, they are, as they serve as components in (in particular, in the case of the free Lie algebra over ), and the inverse is unique.
From the content of Sections 3–4, one can simply develop several properties of the Magnus commutators. Some consequences are presented in Section 5 and Appendix D.
5. Related to I
Assume that expanded in the rational noncommutative polynomial algebra,
[TABLE]
For the moment, we are not interested in the actual values of the (but see Remark D.1). Let us fix an arbitrary element . Now, is a Lie-polynomial, thus, using standard commutator rules, we can write it as linear combination of terms , where . However, evaluated in the noncommutative polynomial algebra, such a commutator expression gives only one monomial contribution such that the last term is . Thus, the coefficient of can be read off from (5). We find that
[TABLE]
(Cf. Arnal, Casas, Chiralt [1].) Summing this for all possible , we obtain
[TABLE]
This allows us to prove the following general version of the Dynkin–Specht–Wever lemma:
Proposition 5.1.
() Suppose that the -submodule is closed for actions of inducing permutations in the order of tensor product. Also assume that is injective. Now, if
[TABLE]
such that and , then
[TABLE]
Proof.
means that in
[TABLE]
Due to the injectivity of , we find
[TABLE]
(both sides are in , because is permutation-invariant). Then, applying , and using (7), and, finally, , we find
[TABLE]
This is what we wanted to prove. ∎
The discussion extends to the weighted case. If we assign the weight to the variables (for accounting purposes), then we can sum (6) for all possible with weight respectively. Then we obtain
[TABLE]
Assume that is -graded as a -module (but not necessarily as a Lie -algebra). Then is also -graded naturally. Let be the map which acts as multiplication by on the component of grade .
Proposition 5.2.
() Suppose is closed for actions of all inducing permutations in the th tensor order. Also assume that is injective. Now, if
[TABLE]
such that and , then
[TABLE]
Proof.
We can assume that is of homogeneous grade . Then the previous proof works but using (9) instead of (7). ∎
The ordinary (weighted) Dynkin–Specht–Wever lemma is just the case when is the free Lie -algebra generated by the formal variables (with grade ), and is generated by tensor products (multiplicities are possible).
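For orientation, the classical unweighted Dynkin–Specht–Wever lemma can be checked on an example: if $P$ is a Lie-polynomial homogeneous of degree $n$, expanded in the noncommutative polynomial algebra, then replacing every monomial $x_{1}\cdots x_{n}$ by the nested commutator $[x_{1},[x_{2},\ldots,[x_{n-1},x_{n}]\ldots]]$ yields $nP$. The sketch below is an illustration of this classical statement only, not of the Magnus-commutator versions in Propositions 5.1 and 5.2; the sample polynomial and the function names `commutator`, `nested_bracket`, `dynkin_map` are assumptions made for the example.

```python
def commutator(p, q):
    """[p, q] = pq - qp for noncommutative polynomials stored as {word: coeff}."""
    out = {}
    for u, a in p.items():
        for v, b in q.items():
            out[u + v] = out.get(u + v, 0) + a * b
            out[v + u] = out.get(v + u, 0) - a * b
    return {w: c for w, c in out.items() if c != 0}


def nested_bracket(word):
    """Expand [x1, [x2, [..., [x_{n-1}, x_n]...]]] for the letters of `word`."""
    result = {(word[-1],): 1}
    for letter in reversed(word[:-1]):
        result = commutator({(letter,): 1}, result)
    return result


def dynkin_map(p):
    """Replace every monomial of p by the nested commutator of its letters."""
    out = {}
    for word, c in p.items():
        for w, d in nested_bracket(word).items():
            out[w] = out.get(w, 0) + c * d
    return {w: c for w, c in out.items() if c != 0}


x, y = {("x",): 1}, {("y",): 1}
# Sample Lie-polynomial of degree 5: P = [[x, y], [x, [x, y]]]
P = commutator(commutator(x, y), commutator(x, commutator(x, y)))

# The lemma predicts dynkin_map(P) == 5 * P.
print(dynkin_map(P) == {w: 5 * c for w, c in P.items()})
```

The same helper also illustrates the observation used in this section: in the expansion of such a nested commutator with distinct letters, exactly one monomial ends with the last letter.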
Corresponding statements also hold with respect to the right-iterated Lie-commutators .
The same arguments can also be carried out in the following way. Let us fix . The Lie-polynomial , using standard commutator rules, can be written as a linear combination of terms , where . However, evaluated in the noncommutative polynomial algebra, such a commutator expression gives only one monomial contribution such that is immediately followed by . Comparing this to (5), we find that
[TABLE]
Summing this for all possible pairs , we obtain
[TABLE]
Arguing in the same manner as before, in the setting of Proposition 5.1
[TABLE]
is also true. This implies the following version of the Dynkin–Specht–Wever lemma, which holds for arbitrary :
Proposition 5.3.
Suppose that is a Lie-polynomial, i. e. an element of . Assume that expands in the commutator-evaluation in the noncommutative polynomial algebra as
[TABLE]
Then
[TABLE]
Proof.
If , then it follows from the previous argument. Invoking Proposition C.4 from Appendix C, embeds to naturally. So, it is also true for . The general case follows by taking tensor products with an arbitrary . ∎
This is the ‘C’-bracketed version of the well-known statement. Weighted versions are also possible, but they are better formulated in a multigraded environment.
6. as a direct construction
Still assume . Let us define the maps for , such that
[TABLE]
(See formula (29) for motivation with respect to the notation.) Considering the natural correspondence between and , we obtain
Proposition 6.1.
is naturally isomorphic to endowed with a product rule such that
[TABLE]
Proof.
Indeed, we have linear isomorphisms / between the two modules. Regarding the product structure, if we resolve as , take the tensor product, evaluate by , and resolve back to , then we obtain the product rule as above. ∎
Thus, a direct construction for , in case , would simply be endowed with the product rule (12). (Cf. Cartier [5].) Checking well-definedness directly is not particularly hard, but checking the arithmetics for associativity is not that easy. Nevertheless, we know that the arithmetics works out, because the proposition above holds for the free Lie algebra over the rational numbers.
In particular, it works out in the case of the free -nilpotent Lie algebra, where the identity holds. In this case, we can consider the evaluator , as identically [math]. In particular, the associativity works out only using , and the -nilpotency rule. Now, , can be defined using only the ring ; indeed, in the “symmetric rearrangement procedure” leading to we use symmetrizations up to elements only, and also in the definitions of . (For a more quantitative argument regarding , see (5)–(7) and Remark D.1.) Now, the free -nilpotent Lie algebra over naturally embeds into the free -nilpotent Lie algebra over . In fact, the free -nilpotent Lie algebra (but not its universal enveloping algebra) naturally embeds to the -nilpotent noncommutative polynomial algebra by the commutator representation. This implies that the associativity computation works out in the free -nilpotent Lie algebra over . However, this implies that it works out in any -nilpotent Lie algebra with . Thus, in that case, yields an associative algebra. is generated by , thus we have a natural factorization map . Regarding the filtration induced by the image of , this induces a natural factor map . This however, implies that is [math]. In particular, we obtain
Proposition 6.2.
If is -nilpotent and , then
(o) can be defined formally;
(a) is naturally isomorphic to ; and
(b) the (local) PBW theorem holds for .
Proof.
(a) and (b) are both implied by . ∎
This is a generalization of the result of Nouazé, Revoy [24].
7. via
First, we give a direct proof of the PBW theorem for free Lie algebras over . (The argument works for any field of characteristic [math].) We will use the fact that free Lie algebras are multigraded. The proof will be sketchy as we rely on familiar arguments.
Proposition 7.1.
The PBW theorem holds for .
Sketch of proof.
We will prove the symmetric global formulation. Consider
[TABLE]
Both sides are naturally multigraded, and the map is compatible with them. Thus, it is sufficient to prove isomorphism (that is injectivity) between them in every multigrade separately. If in the multigrade every variable has multiplicity at most one, then the injectivity holds due to Proposition 4.2. Regarding higher multigrades, assume that in multigrade ,
[TABLE]
is such that it is not zero but evaluates to zero in . Here was written such that every variable appears according to the multiplicity (for a monomial decomposition). Then the polarization
[TABLE]
() is also not zero, because its depolarization is nonzero. On the other hand, it evaluates to the polarization of [math], i. e. to [math] as a noncommutative polynomial. This contradicts the injectivity of the multigrades without multiplicity .
Remark. The proof is not particular to the symmetric formulation. We could have used a variant of Proposition 4.2 with respect to ordinary products, not with symmetrized products. The only place where was used is in (13), where we made a polarization such that its depolarization is the original. ∎
The PBW theorem for yields as the component of degree 1 of , that is, as a canonical projection. Properties (1)–(3) are straightforward to develop. (“The existence of III”.)
We will use Proposition C.4 from Appendix C in order to prove
Proposition 7.2.
The PBW theorem holds for .
Proof.
First, let us consider the case . We know that is a free -module, and naturally. Thus, starting from a basis of , we see that embeds to . The latter one evaluates in injectively, thus evaluates (taking Lie-commutator into commutator, tensor product into ordinary product) in some ring injectively. This implies that must evaluate in the universal enveloping algebra injectively.
Let us now consider the general case. We know that the evaluation yields an isomorphism of free -modules. But then evaluation (i. e. the process which sends Lie-commutators to commutators and tensor product to products) gives an isomorphism . On the other hand, naturally, and compatibly with evaluation. That proves that evaluates to isomorphically. ∎
8. The case directly
We say that a free PBW word basis is the following data. We will consider words formed from an alphabet .
(A1) Some words should be called primitive.
(A2) To any primitive word a -monomial should be associated such that the variables of correspond to the in with the same multiplicity.
(A3) The Lie-polynomials should generate as a -module.
(A4) The primitive words should be endowed by an ordering such that every word uniquely decomposes to a concatenation of primitive words such that .
(A5) To any word decomposed as above we associate the noncommutative polynomial
[TABLE]
where is the commutator evaluation .
(A6) The noncommutative polynomials should be independent in .
Our first observation is that the existence of a free PBW word basis implies the basic PBW theorem for . Indeed, considering (A3) and (A6), we see that should be a basis of . Then, due to (A3), every element in can be brought into a combination of products such that , by the usual basic rearrangement process. Such products are then independent in the universal enveloping algebra, as they are independent in the noncommutative polynomial evaluation, due to (A6). This establishes the global form of the basic PBW theorem with respect to a specific ordering. But then the local form holds, which implies the global form in general.
Next, we will find a free PBW word basis. In order to do this, we will rely on the content of Appendix C. The argument is quite combinatorial; we will be somewhat sketchy. First we establish the case .
Firstly, we need a breaking pattern. We will use an ordering on the words made from which is monotone with respect to monotone maps of . (Lexicographic ordering suffices). We break a finite -word as follows. The definition is recursive with respect to the length of the words. We identify the greatest number in . Then reads as
[TABLE]
( many occurrences of ). If the word contains only , then we break the word to letters completely. If not, then we surely break after the last occurrence of , and we might break () after other occurrences of , and we might break after the last occurrence of ():
[TABLE]
Regarding the rule is simple: we break as we would break . Regarding the rule is more difficult: Consider the sequence of sequences
[TABLE]
Replace it by a sequence of integers
[TABLE]
such that the internal ordering pattern of (16) with respect to is the same as the ordering pattern of (15) with respect to . (We say that (15) is condensed to (16).) Then the breaking places of are defined to be the breaking places of (16). One can prove by induction that the breaking mechanism is well-defined, and it depends only on the internal ordering pattern of with respect to . A word is primitive if it does not break (so, in particular, the last cipher is the maximal). Then there is a natural ordering defined between primitive -words as follows: if the greatest (i. e. last) cipher of is smaller than the greatest (i. e. last) cipher of ; or if the last ciphers are equal, then the simultaneous condensations of and are in relation. By induction, one can prove that the decomposition is non-strictly -decreasing, so (A1) and (A4) are established.
Secondly, we need an evaluation pattern. We do not go into various possibilities, but we simply define . To every sequence as above we associate a noncommutative polynomial such that its multidegree is given by the coming from the ciphers of , with multiplicities. If contains only the cipher , in length , then . Otherwise we let
[TABLE]
where
[TABLE]
and is
[TABLE]
but is substituted by . (Remark: A recursive evaluation pattern would be , where indicates the replacement of by .) By induction, one can see that is a product of commutator monomials corresponding to primitive words. If is a primitive, then can be defined as a Lie-monomial. This establishes (A2) and (A5).
Then the generating statement (A3) follows by induction (on formal multigrade in ) using the standard fact that Lie-monomials containing are Lie-polynomials of . The independence statement (A6) follows by induction using Corollary C.2’. With that, the construction is finished for .
Now, is not necessarily the same as . For that reason we introduce an ordering on . Then word breaking and evaluation is induced by replacing the letters in by integers such that the order structure of the replacement with respect to is compatible with the order structure in with respect to ; then is replaced back to . (This is well-defined because the word breaking and evaluation structure over was invariant for monotone maps of .)
One can fine-tune the construction combinatorially by choosing various breaking patterns (e. g. one can make depend on condensation history) or evaluation patterns. In fact, such constructions were developed in great depth by Hall [15], Chen, Fox, Lyndon [8], Širšov [32], Schützenberger [29], Viennot [37], Melançon, Reutenauer [22], etc.; and it is recognized that these constructions imply the PBW theorem in the free case, cf. Širšov [32], Reutenauer [26]. For us, however, variety has little benefit, one construction is sufficient, and the PBW theorem works ultimately with respect to any basis.
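As a point of comparison for requirement (A4), here is the unique factorization property in the Chen–Fox–Lyndon setting cited above: every word factors uniquely into a lexicographically non-increasing sequence of Lyndon words. The sketch uses Duval's algorithm, a standard technique that is not the breaking pattern constructed in this section; in particular, Lyndon words need not end in their maximal letter. The function names are chosen for the example.

```python
def lyndon_factorization(word):
    """Duval's algorithm: factor `word` into non-increasing Lyndon words."""
    factors = []
    i, n = 0, len(word)
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it remains a prefix power of a Lyndon word.
        while j < n and word[k] <= word[j]:
            if word[k] < word[j]:
                k = i
            else:
                k += 1
            j += 1
        # Emit the repeated Lyndon factor of length j - k.
        while i <= k:
            factors.append(word[i:i + j - k])
            i += j - k
    return factors


def is_lyndon(word):
    """A nonempty word is Lyndon if it is strictly smaller than all its proper suffixes."""
    return len(word) > 0 and all(word < word[i:] for i in range(1, len(word)))


print(lyndon_factorization("banana"))  # ['b', 'an', 'an', 'a']
```

Here the Lyndon words play the role of the primitive words and the lexicographic order plays the role of the ordering in (A4); the evaluation pattern of (A2) and (A5) is not modelled.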
9. From to the basic PBW theorem
Here we assume it is known that free Lie -algebras are free -modules in every multigrade separately, and that the PBW theorem holds for them.
Proposition 9.1.
The PBW theorem holds if is a free -module.
Proof.
Consider a base for . Take . Let us extend by obtained from higher multigrades to a basis
[TABLE]
of . Assume that
[TABLE]
( is substituted by in ). Then let
[TABLE]
Now
[TABLE]
is still a basis. Take any ordering on that; and assume, say, that elements belonging precede the ones belonging to . The elements span an ideal in . Indeed, they span exactly the kernel of the evaluation map . If , then ; thus there is a homomorphism
[TABLE]
[TABLE]
where is the ideal generated by the image of . We claim that is -linearly generated by the elements
[TABLE]
such that , . Indeed, if we take an arbitrary product of base elements which contains at least one and we apply the basic rearrangement procedure, then at least one element in any formal product monomial will be from . A base element from is either unaffected in a step, or it gets commutated, but then the commutator is a -linear combination of elements of .
Now, the injectivity of with respect to means that the evaluation map given by
[TABLE]
[TABLE]
() is injective. Thus the evaluation map into must also be injective. (Remark: Actually, by universal algebraic reasons.) ∎
It seems to be a drawback that we obtained only the basic PBW theorem for free -modules. This can be remedied as follows. Due to the relatively transparent structure of , one can define free Lie algebras with variable coefficient structure. This means that in multigrade () the coefficient ring is (). This has the same monomial structure as ; except that some multigrades are deselected (where the coefficient ring is [math]), but this makes no essential difference. This evaluates in the noncommutative polynomial algebra , and the PBW theorem remains valid, as in every multigrade we have the same evaluation structure as in the free Lie algebra with with respect to the appropriate coefficient ring. But then the arguments of the previous proof can be modified in order to obtain the basic PBW theorem for sums of cyclic -modules.
10. Conclusions
If one is interested in the PBW theorem per se, then the approach of Witt and Lazard is rather satisfactory (as a starting point). If one is interested in Lie groups, then an existence argument for + Section 3 for the symmetric PBW theorem might be a good approach. Section 7 + Section 9 specialized to the case when is of characteristic [math] gives a relatively straightforward proof in the spirit of Poincaré. One interested in a deeper study of free Lie -algebras can obtain the basic PBW theorem essentially as a byproduct.
Appendix A The Witt–Lazard proof of the global PBW theorems
Although the classical proofs of the PBW theorem which work for general fields are quite similar to each other, the approach due to Witt [39] and Lazard [19] is characterized (as opposed to Birkhoff [2] and Bourbaki [4]) by (a) an emphatic appearance of the symmetric group, and (b) a more explicit description of the ideal structure of the universal factorization. In short, it algebraizes the combinatorics quite well. In fact, it allows one to formulate the proof of the PBW theorem simultaneously in (i) the basic case (sum of cyclic -modules) and (ii) the symmetric case ().
(I) Actions of symmetric groups. The symmetric group acts naturally on by the prescription
[TABLE]
Then is generated by where , . Let denote the permutation in . Then it is also true that is generated by where , . Let us define the by taking a Lie-commutator between the th and th positions. So,
[TABLE]
We can extend and to . In the first case the action is identity outside , and in the second case it is the zero map.
We define the action as (extended sense). Then vanishes if , . Let be the module generated by , . We see that . Let us extend to as follows. For let be the identity. For we choose an arbitrary (but fixed) decomposition , and we let . Now, still acts as identity outside , but it does not necessarily define an associative action of . However, it is not very far from it:
Lemma A.1.
acts trivially on (thus invariantly), and
[TABLE]
[TABLE]
[TABLE]
Proof.
The triviality property follows from . The equalities follow from the identities
[TABLE]
[TABLE]
if ;
[TABLE]
which, checked against , follow from the Lie-identities. ∎
Corollary A.2.
extends to an associative action of modulo .
Proof.
In (P1)–(P3) we recognize the semigroup presentation of based on (Cf. Dickson [10], P. 2, Ch. XIII). The relations are satisfied according to the previous lemma, thus the action descends to . ∎
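The presentation invoked in the proof above can be checked mechanically in small cases. The following is a minimal sketch, assuming that (P1)–(P3) are the usual relations among adjacent transpositions (each generator is an involution, distant generators commute, neighbouring generators satisfy the braid relation). The function names are illustrative and do not come from the text.

```python
def adjacent_transposition(i, n):
    """Permutation of range(n) that swaps positions i and i+1, as a tuple."""
    p = list(range(n))
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)


def compose(p, q):
    """Composition p after q: first apply q, then p."""
    return tuple(p[q[k]] for k in range(len(p)))


def check_relations(n):
    """Check the assumed relations (involution, commutation, braid) in S_n."""
    identity = tuple(range(n))
    s = [adjacent_transposition(i, n) for i in range(n - 1)]
    for i in range(n - 1):
        # Assumed (P1): every generator squares to the identity.
        if compose(s[i], s[i]) != identity:
            return False
        # Assumed (P2): generators at distance at least 2 commute.
        for j in range(n - 1):
            if abs(i - j) >= 2 and compose(s[i], s[j]) != compose(s[j], s[i]):
                return False
        # Assumed (P3): neighbouring generators satisfy the braid relation.
        if i + 1 < n - 1:
            lhs = compose(s[i], compose(s[i + 1], s[i]))
            rhs = compose(s[i + 1], compose(s[i], s[i + 1]))
            if lhs != rhs:
                return False
    return True


def generated_group_size(n):
    """Size of the subgroup of S_n generated by the adjacent transpositions."""
    s = [adjacent_transposition(i, n) for i in range(n - 1)]
    seen = {tuple(range(n))}
    frontier = list(seen)
    while frontier:
        new_elements = []
        for p in frontier:
            for g in s:
                q = compose(g, p)
                if q not in seen:
                    seen.add(q)
                    new_elements.append(q)
        frontier = new_elements
    return len(seen)
```

The sketch only confirms that the relations hold and that the generators produce the whole symmetric group. That the relations suffice to present the group is the classical statement cited from Dickson [10].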
(II) The tensorial splittings. We define the forgetting map such that
[TABLE]
and we define the evaluation map such that
[TABLE]
In case (i), we take a basis a , and introduce an ordering on . We define such that
[TABLE]
where and , implies . I. e. is the permutation which orders with the least number of involutions.
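The ordering permutation of case (i) can be illustrated computationally. The sketch below assumes that the condition above says that equal letters keep their relative order, and that "the least number of involutions" counts adjacent transpositions, which is the number of inversions of the permutation. Under these assumptions the permutation is the one produced by a stable sort. All names are illustrative.

```python
from itertools import permutations


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def stable_sorting_permutation(word):
    """sigma[i] is the position to which the letter at position i is moved."""
    order = sorted(range(len(word)), key=lambda i: word[i])  # sorted() is stable
    sigma = [0] * len(word)
    for new_position, old_position in enumerate(order):
        sigma[old_position] = new_position
    return tuple(sigma)


def sorts(word, sigma):
    """True if moving letter i to position sigma[i] puts the word in order."""
    out = [None] * len(word)
    for i, target in enumerate(sigma):
        out[target] = word[i]
    return out == sorted(word)


def minimal_sorting_permutations(word):
    """Brute force: all sorting permutations with the fewest inversions."""
    candidates = [p for p in permutations(range(len(word))) if sorts(word, p)]
    best = min(inversions(p) for p in candidates)
    return [p for p in candidates if inversions(p) == best]
```

For the word (2, 1, 2, 1) the brute-force search returns a single permutation, and it coincides with the stable one. This is the uniqueness that makes the definition well posed.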
In case (ii), we simply define
[TABLE]
Then
[TABLE]
It is easy to see from the definition that
[TABLE]
for any , . This is the same as saying that . Then
[TABLE]
Indeed, .
This idempotence yields the direct sum decomposition
[TABLE]
where the first factor is named so by definition, and regarding the identification of the second factor we note that .
(III) The PBW splittings. Note that in case (i), ; and in case (ii), . Thus, the statement of the PBW theorem is that and do not intersect each other (and, in fact, they are complementary spaces in ).
Also note that very little happens in (II). It only algebraizes familiar combinatorial content which is otherwise accepted without much ado. The point is that we can modify this content as follows:
We define the evaluation map such that
[TABLE]
Lemma A.3.
For ,
[TABLE]
Proof.
Let us note that , , acts trivially on ; so . This implies .
Case (i): Assume that . If , then , ,
[TABLE]
If , , then . Thus
[TABLE]
Case (ii):
[TABLE]
Remark A.4.
The elements with , still generate only . Indeed, if the canonical decomposition is , then
[TABLE]
Using similar arguments, can be replaced by an arbitrary in equation (20). Actually, the discussion yields constructive maps such that
[TABLE]
If we consider only the -part, then this yields , which simplifies to , cf. equation (19).
(This remark is not needed for the proof.) ∎
Let be the image of under . (Strictly speaking, depends not only on but also on .) Note that any element can be reconstructed from its projection to which is in . Indeed, the projection of is , and the projection of is [math]; and formula (20) implies .
Remark A.5.
One can show that . Indeed, the LHS is the image of under , while the RHS is the image of (cf. the beginning of the previous Remark). Now, projects to , thus yields an idempotent on . It is straightforward to see from Lemma A.3 that the corresponding inner direct sum decomposition is
[TABLE]
(E. g. .)
(This remark is not needed for the proof.) ∎
Let .
Corollary A.6.
The following inner direct sum decompositions hold:
(a) ;
(b) ;
(c) ;
(d) ;
(e) ;
(f) .
Remark: Here , , .
Proof.
(a) Lemma A.3 implies . Adding to both sides, we obtain . The two factors in the sum must be disjoint, as projected to , the first factor projects to [math], while the second factor projects faithfully.
(b) follows from (a) inductively.
(c) follows from (b) by taking increasing unions.
(d) The first equality is obvious. The second one follows from the fact that on the RHS, as projected to , the second factor projects to faithfully, and projects to faithfully.
(e) follows from (d) inductively, the labeling uses (b).
(f) follows from (e) by taking increasing unions. ∎
In particular, we find the statement of the PBW theorem in (f).
Appendix B About free Lie algebras. Version 1
Free Lie -algebras (or any other kinds of free algebras) do not really require specific constructions. Nevertheless, it is very useful to have some structure theorems which provide some control over them, even if minimal. Let us think about the free Lie -algebra as the free nonassociative -algebra factorized further by the -submodule (ideal) . Additively, is just the free -module generated by the -monomials of the .
Proposition B.1.
is generated by the elements
(F1) ,
(F2) ,
(F3) ;
where are monomials of the , and is a -bracketing with many positions (but not necessarily in the indicated order), and .
Proof.
Such elements are clearly in the ideal . Conversely, whenever we take elements from and apply the Lie-identities, then they expand to sums of cases (F1)–(F3) with trivial . (Notice that case (F2) cannot be omitted.) Thus the primary relations (coming from the Lie-identities) are generated. The secondary relations (coming from ) are also generated, due to linearity and the fact that nontrivial are allowed. ∎
Corollary B.2.
is multigraded by the number of various variables.
In any multigrade, corresponding to a finite multiset of the , is generated by finitely many monomials.
The structure in a given multigrade depends only on its multiplicity structure (independently of the presence of other variables, etc.).
Proof.
The multigradedness will be inherited from , because the relations from Proposition B.1 are multigrade-homogeneous. Furthermore, every finite multiset of the can be bracketed in only finitely many ways, so finite generation holds even in . The structure of also depends only on the multiplicity pattern. ∎
The following result, the Dynkin–Specht–Wever lemma (cf. Dynkin [13], Specht [34], Wever [38]) is a simple consequence of the gradedness of the free Lie -algebra. We present the weighted version. Suppose that we assign the weight to every variable . Let be the map which multiplies by in multigrade .
Proposition B.3.
(Weighted Dynkin–Specht–Wever lemma.) Suppose that is a Lie-polynomial, i. e. an element of . Assume that expands in the commutator-evaluation to the noncommutative polynomial
[TABLE]
Then
[TABLE]
Proof.
Consider , and extend the Lie bracket such that , and , . This yields a Lie -algebra. (It is sufficient to check , , when are Lie-monomials or ). Then
[TABLE]
The “unweighted” Dynkin–Specht–Wever lemma is when every weight is equal to . Similar statements hold with respect to right-iterated higher commutators.
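The unweighted lemma is easy to test on examples. The following is a minimal sketch, assuming the left-iterated convention, in which a monomial is sent to the commutator nested from the left. By the remark above, the right-iterated version behaves in the same way. Noncommutative polynomials are stored as dictionaries from words (tuples of letters) to integer coefficients. All names are illustrative.

```python
from collections import defaultdict


def add(p, q, scale=1):
    """Return p + scale*q, dropping zero coefficients."""
    out = defaultdict(int, p)
    for word, coeff in q.items():
        out[word] += scale * coeff
    return {w: c for w, c in out.items() if c != 0}


def mul(p, q):
    """Product in the free associative algebra (concatenation of words)."""
    out = defaultdict(int)
    for u, a in p.items():
        for v, b in q.items():
            out[u + v] += a * b
    return {w: c for w, c in out.items() if c != 0}


def bracket(p, q):
    """Commutator [p, q] = pq - qp."""
    return add(mul(p, q), mul(q, p), scale=-1)


def var(name):
    """The variable with the given name, as a polynomial."""
    return {(name,): 1}


def scale(p, k):
    """Multiply the polynomial p by the nonzero integer k."""
    return {w: k * c for w, c in p.items()}


def left_nested(word):
    """The left-iterated commutator [[...[x1, x2], x3], ..., xn] of a word."""
    result = var(word[0])
    for letter in word[1:]:
        result = bracket(result, var(letter))
    return result


def dynkin_map(p):
    """Replace every monomial of p by its left-iterated commutator."""
    out = {}
    for word, coeff in p.items():
        out = add(out, left_nested(word), scale=coeff)
    return out
```

For a homogeneous Lie-polynomial of degree n, expanded by commutator-evaluation, `dynkin_map` should return n times the input. The weighted version would replace the factor n by the total weight of the multigrade. That variant is not implemented here.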
The gradedness also allows one to apply the PBW theorem (for sums of cyclic submodules) to obtain the representability theorem of Magnus [20] (cf. also Witt [39]) for free Lie -algebras.
Proposition B.4.
(Theorem of Magnus about the representability of free Lie algebras.)
(a) is a free -module (in every multigrade). In fact, naturally (in every multigrade).
(b) embeds to the noncommutative polynomial algebra by the commutator-evaluation.
Proof.
Assume . Then is a finitely generated -module in every multigrade, thus it is a sum of cyclic -modules. Then the PBW theorem (for sums of cyclic submodules) can be applied to show that embeds into . It is immediate that (b) the image is the commutator subalgebra; and (a) the image of has no torsion, so is a free -module in every multigrade.
In fact, we observe that additively $\mathrm{F}^{\operatorname{n-a}}_{\mathbb{Z}}[X_{1},\ldots,X_{n}]\simeq(\text{a free }\mathbb{Z}\text{-module})\oplus\mathrm{I}^{\operatorname{Lie}}_{\mathbb{Z}}[X_{1},\ldots,X_{n}]$ (in every multigrade). This decomposition structure survives tensoring with , so the general case (a) follows. Then the general case (b) follows using the PBW theorem. ∎
Remark B.5.
The approach of the proof of Proposition B.4 is more or less minimal if one wants to amend the basic PBW theorem (case (i)) to free Lie -algebras, although it is not very informative regarding the possible bases of free Lie -algebras. However, a generalization of the techniques used in the proof of the Dynkin–Specht–Wever lemma can be applied as an alternative:
Elimination by derivations. (We only sketch this approach.) It is easy to see that derivations of are determined by arbitrary prescriptions . Then Proposition B.1 implies easily that these derivations descend to . In particular, its derivations are also given by arbitrary prescriptions . For any Lie -algebra , we can define the extension such that . We can apply this in the setting when is the set of words on an alphabet, , and the derivations are given by .
Assume that is a Lie-polynomial of multigrade , ; . Then, using the universal properties of Lie-polynomials, we can substitute to for . As is a Lie-polynomial of some (), we see that the result is a Lie-polynomial of some . Back substitution of into also works, so we obtain that Lie-polynomials of multigrade , , are in bijective correspondence with Lie-polynomials of satisfying some simple multigrade conditions. This allows one to clarify the structure of free Lie -algebras inductively. (In particular, Proposition C.4 can be proven.) We do not pursue this approach, because when it comes to elimination, it is simply easier to use noncommutative polynomials.
Regarding the pattern of eliminations, we remark that we could have left intact, but substituted into . As can be expressed as a Lie-polynomial of some (), the result is a Lie-polynomial of some , etc. In fact, this is the traditional Lazard–Shirshov elimination process, cf. Širšov [30], Reutenauer [26]. (Or, we could have eliminated other subsets of variables.) It is merely the preference of the author to eliminate not one variable but all but one. ∎
Appendix C About free Lie algebras. Version 2
Elimination by polynomials. Consider the noncommutative polynomial algebra . Let be the operation which sends the monomial
[TABLE]
into the polynomial
[TABLE]
Lemma C.1.
The map leaves the multigrading of invariant. It acts as an isomorphism in every multigrade.
Proof.
It is obvious that the multigrading is left invariant. If is an alphabet with ordering , then let be the ordering on the words of such that longer words are greater, and equally long words are ordered lexicographically. Now let be an arbitrary ordering on the alphabet . To any monomial (21) we assign the word of words
[TABLE]
Let us order the monomials (21) in the order (22) with respect to . Then it is easy to see that in that basis the action of is triangular with ’s in the diagonal, thus it is an isomorphism. ∎
Corollary C.2.
Suppose that is a noncommutative polynomial over , and assume that the noncommutative polynomial
[TABLE]
yet the monomials
[TABLE]
are different from each other. Then
[TABLE]
In fact, we can also prove the following stronger statement. A divided noncommutative polynomial is a noncommutative polynomial where the monomials are of shape .
Corollary C.2’.
Suppose that is a noncommutative polynomial over , and assume that the noncommutative polynomial
[TABLE]
yet the monomials
[TABLE]
are different from each other. Then
[TABLE]
Proof.
Let us apply the isomorphism . This gives
[TABLE]
But then the difference in the -monomials implies . ∎
Proposition C.3.
embeds to the noncommutative polynomial algebra by the commutator-evaluation.
Proof.
We have to prove that if evaluates to [math] in the commutator expansion in , then simplifies to [math] in . We can assume that is expanded to Lie-monomials, thus it is represented by a non-associative polynomial . In that viewpoint, we have to prove that if evaluates to [math] in the commutator expansion in , then can be simplified to [math] using Lie rules. We prove the statement by induction on the maximal length of the -monomials in . If , then the statement is obvious. Let us gather the terms of into groups corresponding to multigrades. The various expand to different multigrades in , thus the various must also expand to [math] in independently. Thus it is sufficient to consider the cases separately. We can assume that has monomials with variables with multiplicities respectively. If , then the statement is very easy: in the case the commutator expansion is identical, in the case obviously reduces to [math] using Lie rules. So, assume . Then by standard Lie rules we can expand to a Lie-polynomial of some () but so that formally the multiplicities of the variables remain. Thus
[TABLE]
where the sequences are different from each other, while the multiplicities of the variables on the two sides are the same. Nevertheless, the RHS of (23) must also expand to [math] in the commutator evaluation. But then according to Corollary C.2, also expands to [math] in the commutator expansion. Now due to the multiplicity structure, thus by induction we know that expands to [math] using Lie rules. But this implies that the RHS of (23) simplifies to [math] using Lie rules. So, consequently, also the LHS of (23). ∎
Then is multigraded induced from the multigrading in through commutator evaluation.
Proposition C.4.
(The uniformity of free Lie -algebras.)
(i) is a free -module (in every multigrade). In fact, we can choose a set of -monomials which acts as a basis (in every multigrade), independently from .
(ii) naturally.
Proof.
(i) We can assume that in a multigrade we have variables with multiplicities respectively. We proceed by induction on the degree . (Due to the previous statement we will use the terms Lie-polynomial and commutator polynomial synonymously.) If , then the statement is obvious. Assume that . If , then the statement is trivial. So assume . Using standard Lie rules, any Lie-polynomial of multigrade can be written in form
[TABLE]
such that is a Lie-polynomial, and the sequences run through every word of length at most made from .
Regarding the multigrade structure of , not every multigrade is allowed to appear nontrivially. But if an multigrade is allowed, then every commutator polynomial of multigrade is allowed to be used in . Thus we obtain that is of shape
[TABLE]
such that is of multigrade . On the other hand, this description is unique in terms of the commutator polynomials due to Corollary C.2. Hence, the situation decomposes in allowed multigrades. Thus, in particular, if
[TABLE]
form systems of base monomials in the allowed multigrades , then the elements
[TABLE]
form a system of base monomials for multigrade . However, for allowed multigrades, , so by induction we have monomial bases in allowed multigrades. It is also clear that this process can be made independent from the actual coefficients .
(ii) This is transparent from the fact that formally the same monomial base can be chosen, independently from . ∎
Thinking algorithmically, the method described by (24)–(25) allows one to construct bases rather easily. In fact, there are several choices due to the arbitrariness of the labeling of the variables . Another point is that we descended using simple -commutators, but even those can be twisted by some multidegree-compatible maps on noncommutative polynomials. Due to this wealth of possibilities, free Lie algebra bases are interesting only as long as they have some additional combinatorial properties, e. g., accountability with respect to the PBW theorem.
Appendix D Related to II
Considering , as inverts , we see that in the noncommutative polynomial algebra
[TABLE]
Taking this for any subsequences of , and summing up, we find that
[TABLE]
i. e. modulo monomials where some variable has multiplicity more than . Taking logarithm, we find
[TABLE]
i. e. modulo monomials where some variable has multiplicity more than . This implies that
[TABLE]
Formally,
[TABLE]
Remark D.1.
According to a simple combinatorial argument of Strichartz [35], which we do not reproduce here (cf. also [18]), (27) quickly implies that in (5)
[TABLE]
where denotes the number of ascents, i. e. the number of pairs such that ; and denotes the number of descents, i. e. the number of pairs such that . (This is originally a result of Solomon [33] and Mielnik, Plebański [23].) In conjunction with (6), (7), (10), (11), this results in several explicit formulas for . Taking (26) into account, this also allows one to obtain the coefficients in (4). ∎
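The ascent and descent statistics of Remark D.1 are simple to compute. The sketch below assumes that the coefficient attached to a permutation has the form commonly quoted for the Solomon and Mielnik–Plebański result, namely (-1)^des · asc! · des! / n!, where asc and des count consecutive positions. This closed form is an assumption of the example and should be checked against the displayed formula. All names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def ascents(perm):
    """Number of consecutive positions i with perm[i] < perm[i+1]."""
    return sum(1 for a, b in zip(perm, perm[1:]) if a < b)


def descents(perm):
    """Number of consecutive positions i with perm[i] > perm[i+1]."""
    return sum(1 for a, b in zip(perm, perm[1:]) if a > b)


def coefficient(perm):
    """Assumed coefficient (-1)^des * asc! * des! / n! of the permuted monomial."""
    n = len(perm)
    asc, des = ascents(perm), descents(perm)
    return Fraction((-1) ** des * factorial(asc) * factorial(des), factorial(n))


def coefficient_sum(n):
    """Sum of the assumed coefficients over all permutations of 1..n."""
    return sum(coefficient(p) for p in permutations(range(1, n + 1)))
```

For n = 2 the two coefficients are 1/2 and -1/2, which gives half of the commutator of the two variables. For n at least 2 the coefficients sum to zero, as they must for a commutator polynomial evaluated on commuting variables.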
Substituting to the first many variables, and to the last many variables, we find that
[TABLE]
Inspecting the power series in and , we can quickly identify the coefficients of and , respectively. This yields
[TABLE]
As a consequence, regarding the (formal) Taylor series of around , evaluated at , one finds
[TABLE]
In particular, we find that the Baker–Campbell–Hausdorff terms are commutator polynomials:
[TABLE]
(This is the viewpoint of Magnus [21], Chen [7], Cartier [5] on the BCH formula.) One obtains the full expansion of analogously.
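That the low-order Baker–Campbell–Hausdorff terms are commutator polynomials can be confirmed by direct expansion. The following is a minimal sketch that computes the logarithm of the product of two exponentials in the free associative algebra on two letters, truncated at degree 3, and compares it with the well-known commutator form up to that degree. The truncation degree, the letters `x`, `y` and all function names are choices of this example.

```python
from collections import defaultdict
from fractions import Fraction
from math import factorial

MAX_DEGREE = 3  # all products are truncated above this degree
ONE = {(): Fraction(1)}


def add(p, q, scale=1):
    """Return p + scale*q, dropping zero coefficients."""
    out = defaultdict(Fraction, p)
    for word, coeff in q.items():
        out[word] += scale * coeff
    return {w: c for w, c in out.items() if c != 0}


def mul(p, q):
    """Truncated product in the free associative algebra."""
    out = defaultdict(Fraction)
    for u, a in p.items():
        for v, b in q.items():
            if len(u) + len(v) <= MAX_DEGREE:
                out[u + v] += a * b
    return {w: c for w, c in out.items() if c != 0}


def bracket(p, q):
    """Commutator [p, q] = pq - qp."""
    return add(mul(p, q), mul(q, p), -1)


def exp_series(p):
    """Truncated exponential of a polynomial without constant term."""
    result, power = dict(ONE), dict(ONE)
    for k in range(1, MAX_DEGREE + 1):
        power = mul(power, p)
        result = add(result, power, Fraction(1, factorial(k)))
    return result


def log_series(p):
    """Truncated logarithm of a polynomial with constant term 1."""
    u = add(p, ONE, -1)
    result, power = {}, dict(ONE)
    for k in range(1, MAX_DEGREE + 1):
        power = mul(power, u)
        result = add(result, power, Fraction((-1) ** (k + 1), k))
    return result


def bch_truncated():
    """log(exp(x) exp(y)) up to degree MAX_DEGREE."""
    x, y = {("x",): Fraction(1)}, {("y",): Fraction(1)}
    return log_series(mul(exp_series(x), exp_series(y)))


def expected_commutator_form():
    """x + y + [x,y]/2 + [x,[x,y]]/12 + [y,[y,x]]/12."""
    x, y = {("x",): Fraction(1)}, {("y",): Fraction(1)}
    result = add(x, y)
    result = add(result, bracket(x, y), Fraction(1, 2))
    result = add(result, bracket(x, bracket(x, y)), Fraction(1, 12))
    result = add(result, bracket(y, bracket(y, x)), Fraction(1, 12))
    return result
```

Comparing `bch_truncated()` with `expected_commutator_form()` checks the degree 1, 2 and 3 components at once. Raising `MAX_DEGREE` extends the expansion, but the comparison polynomial would then need the higher commutator terms as well.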
Once we know that the components of are commutator polynomials (which can also be shown in other ways), we can apply the Dynkin–Specht–Wever lemma to (the homogeneous parts) of the power series expansion
[TABLE]
In this standard manner, commas in omitted, we obtain
[TABLE]
the formula of Dynkin [13].
Some works, e. g. Kolář, Michor, Slovák [17], or Duistermaat, Kolk [12] present
[TABLE]
as the BCH formula/ “Dynkin’s formula”, which they prove by differential equational/ geometric means, but formally just by using the old Schur(–Poincaré) argument
[TABLE]
(Cf. Schur [28], Poincaré [25], Duistermaat [11], Bonfiglioli, Fulci [3] Ch. 1, and references therein. This is also the line of reasoning which leads to the natural derivations of the (R) and (L) recursions in Proposition/Definition 2.1, cf. Magnus [21] and [18].)
Now, (32) can also be realized algebraically from (30) but by applying the weighted Dynkin–Specht–Wever lemma with weight prescription , : The part when the total weight is [math] can be seen to be easily. Then relabel to , and notice that only the , part survives weighting and commutatoring, respectively.
One can also apply the weight prescription . Another possibility is to apply , which also corresponds to the rewriting of the -version to -terminology. Altogether, this yields six formulas of Dynkin type. These formulas (in power series form) are all highly redundant, though. This is due to the particular inefficiency of expansion (30) and the general nature of commutators. Nevertheless, one can do (naive) convergence estimates as usual.
References
[1] Arnal, Ana; Casas, Fernando; Chiralt, Cristina: A general formula for the Magnus expansion in terms of iterated integrals of right-nested commutators. J. Phys. Commun. 2 (2018), 035024.
[2] Birkhoff, Garrett: Representability of Lie algebras and Lie groups by matrices. Ann. of Math. 38 (1937), 526–532.
[3] Bonfiglioli, Andrea; Fulci, Roberta: Topics in noncommutative algebra. The theorem of Campbell, Baker, Hausdorff and Dynkin. Lecture Notes in Mathematics, 2034. Springer-Verlag; Berlin, Heidelberg, 2012.
[4] Bourbaki, Nicolas: Groupes et algèbres de Lie. Chapitre 1: Algèbres de Lie. Hermann, Paris, 1960.
[5] Cartier, P.: Démonstration algébrique de la formule de Hausdorff. Bull. Soc. Math. France 84 (1956), 241–249.
[6] Cartier, P.: Remarques sur le théorème de Birkhoff-Witt. Ann. Scuola Norm. Sup. Pisa (sér. 3) 12 (1958), 1–4.
[7] Chen, Kuo-Tsai: Integration of paths, geometric invariants and a generalized Baker-Hausdorff formula. Ann. of Math. 65 (1957), 163–178.
[8] Chen, K.-T.; Fox, R. H.; Lyndon, R. C.: Free differential calculus, IV. The quotient groups of the lower central series. Ann. of Math. 68 (1958), 81–95.
