Tsallis entropy and generalized Shannon additivity
Sonja Jäckle, Karsten Keller

TL;DR
This paper examines the axiomatic foundations of Tsallis entropy, focusing on the generalized Shannon additivity axiom, and characterizes Tsallis entropy for most parameter values.
Contribution
It provides a simplified axiomatic characterization of Tsallis entropy based on generalized Shannon additivity, with the cases $\alpha=1,2$ treated separately.
Findings
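Generalized Shannon additivity alone characterizes Tsallis entropy for all parameters $\alpha>0$ except $\alpha=1,2$.
The cases $\alpha=1,2$ require a further weak assumption, such as boundedness, and are analyzed separately.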
The axiomatic approach clarifies the foundational differences from Shannon entropy.
Abstract
The Tsallis entropy given for a positive parameter $\alpha$ can be considered as a modification of the classical Shannon entropy. For the latter, corresponding to $\alpha=1$, there exist many axiomatic characterizations. One of them, based on the well-known Khinchin-Shannon axioms, has been simplified several times and adapted to Tsallis entropy, where the axiom of (generalized) Shannon additivity plays a central role. The main aim of this paper is to discuss this axiom in the context of Tsallis entropy. We show that it is sufficient for characterizing Tsallis entropy, with the exceptions of the cases $\alpha=1,2$, which are discussed separately.
1 Introduction
Some history.
In 1988 Tsallis [16] generalized the Boltzmann-Gibbs entropy
$$S = -k_B \sum_{i=1}^{n} p_i \ln p_i$$
describing classical thermodynamical ensembles with microstates of probabilities $p_i$, by the entropy
$$S_\alpha = k_B\,\frac{1-\sum_{i=1}^{n} p_i^\alpha}{\alpha-1}$$
for $\alpha\neq 1$ in the sense that $\lim_{\alpha\to 1} S_\alpha = S$. Here $k_B$ is the Boltzmann constant (which, being only a multiplicative constant, will not be considered in the following). Many physicists argue that this was a breakthrough in thermodynamics since the extension allows a better description of systems out of equilibrium and of systems with strong correlations between microstates, but there is also criticism of the application of Tsallis’ concept (compare [7, 15]). In information theory, pioneered by Shannon, the Boltzmann-Gibbs entropy is one of the central concepts. We follow the usual practice of calling it Shannon entropy. Also note that Tsallis’ entropy concept coincides up to a constant with the Havrda-Charvát entropy [8], given in 1967 in an information-theoretical context.
Many axiomatic characterizations of Tsallis’ entropy have been given, starting from those of the classical Shannon entropy (see below). One important axiom, called (generalized) Shannon additivity, is discussed extensively in this paper and shown to be sufficient in a certain sense.
Tsallis entropy.
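In the following, let $\Delta_n = \{(p_1,\ldots,p_n)\in(\mathbb{R}^+)^n \mid \sum_{i=1}^{n} p_i = 1\}$ for $n\in\mathbb{N}$ be the set of all $n$-dimensional stochastic vectors and $\Delta=\bigcup_{n\in\mathbb{N}}\Delta_n$ be the set of all stochastic vectors, where $\mathbb{N}$ and $\mathbb{R}^+$ are the sets of natural numbers and of nonnegative real numbers, respectively. Given $\alpha>0$ with $\alpha\neq 1$, the Tsallis entropy of a stochastic vector $(p_1,\ldots,p_n)$ of some dimension $n$ is defined by
$$H(p_1,\ldots,p_n) = \frac{1-\sum_{i=1}^{n} p_i^\alpha}{\alpha-1}.$$
In the case $\alpha=1$, the value $H(p_1,\ldots,p_n)$ is not defined, but its limit as $\alpha$ approaches $1$ is
$$H(p_1,\ldots,p_n) = -\sum_{i=1}^{n} p_i \ln p_i,$$
which provides the classical Shannon entropy. In this sense Tsallis entropy can be considered a generalization of the Shannon entropy, so it is not surprising that attempts have been made to generalize various axiomatic characterizations of the latter to the Tsallis entropy.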
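The limit relation between Tsallis and Shannon entropy can be checked numerically. The following is a minimal sketch, assuming the standard definition $(1-\sum_i p_i^\alpha)/(\alpha-1)$ with the natural logarithm in the limit case; the function name `tsallis_entropy` and the sample vector are illustrative choices, not taken from the paper.

```python
import math

def tsallis_entropy(p, alpha):
    """Tsallis entropy of a stochastic vector p for alpha > 0.

    For alpha == 1 the limit value (Shannon entropy, natural log) is returned.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if any(x < 0 for x in p) or abs(sum(p) - 1.0) > 1e-12:
        raise ValueError("p must be a stochastic vector")
    if alpha == 1:
        # Limit case: classical Shannon entropy.
        return -sum(x * math.log(x) for x in p if x > 0)
    # Zero components contribute nothing because alpha > 0.
    return (1.0 - sum(x ** alpha for x in p if x > 0)) / (alpha - 1.0)

p = [0.5, 0.3, 0.2]  # illustrative stochastic vector
shannon = tsallis_entropy(p, 1)
for alpha in (0.9, 0.99, 0.999, 1.001):
    # The values approach the Shannon entropy as alpha tends to 1.
    print(alpha, tsallis_entropy(p, alpha), shannon)
```

The explicit branch for `alpha == 1` avoids the division by zero in the defining formula and returns the limit value instead.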
Axiomatic characterizations.
One line of characterizations mainly followed by Suyari [14] and discussed in this paper has its origin in the Shannon-Khinchin axioms of Shannon entropy (see [13] and [10]). Note that other characterizations of Tsallis entropy are due to dos Santos [12], Abe [1] and Furuichi [6]. For some general discussion of axiomatization of entropies see [2].
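A map $H:\Delta\to\mathbb{R}$ is the Shannon entropy up to a multiplicative positive constant if it satisfies the following axioms:
- (S1) Continuity: $H$ is continuous on $\Delta_n$ for each $n\in\mathbb{N}$.
- (S2) Maximality: $H(\tfrac{1}{n},\ldots,\tfrac{1}{n}) = \max_{p\in\Delta_n} H(p)$ for each $n\in\mathbb{N}$.
- (S3) Expandability: $H(p_1,\ldots,p_n,0) = H(p_1,\ldots,p_n)$ for all $(p_1,\ldots,p_n)\in\Delta$.
- (S4) Shannon additivity: for all $(p_{11},\ldots,p_{1m_1},\ldots,p_{n1},\ldots,p_{nm_n})\in\Delta$ with $p_i=\sum_{j=1}^{m_i}p_{ij}$,
$$H(p_{11},\ldots,p_{nm_n}) = H(p_1,\ldots,p_n) + \sum_{i=1}^{n} p_i\, H\!\left(\frac{p_{i1}}{p_i},\ldots,\frac{p_{im_i}}{p_i}\right).$$
Axiom (S4), called Shannon additivity, plays a key role in the characterization of the Shannon entropy, and an interesting result given by Suyari [14] says that its generalization
$$H(p_{11},\ldots,p_{nm_n}) = H(p_1,\ldots,p_n) + \sum_{i=1}^{n} p_i^\alpha\, H\!\left(\frac{p_{i1}}{p_i},\ldots,\frac{p_{im_i}}{p_i}\right)$$
for $\alpha>0$ provides the Tsallis entropy for this $\alpha$.
More precisely, if $H$ satisfies (S1), (S2), (S3) and the generalized Shannon additivity, then $c(\alpha)\cdot H$ is the Tsallis entropy for some positive constant $c(\alpha)$. The full result of Suyari, which was slightly corrected by Ilić et al. [9], includes a characterization of the map $\alpha\mapsto c(\alpha)$ under the assumption that $H$ also depends continuously on $\alpha$. We do not discuss this characterization, but we note here that the results below also provide an immediate simplification of the whole result of Suyari and Ilić et al.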
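That Tsallis entropy itself satisfies the generalized Shannon additivity can be checked numerically. The sketch below assumes the usual form of the axiom, in which a vector is grouped into blocks with block sums $p_i$ and each block's normalized entropy is weighted by $p_i^\alpha$; the helper names `tsallis` and `additivity_gap` and the random test data are illustrative assumptions.

```python
import math
import random

def tsallis(p, alpha):
    """Tsallis entropy; Shannon entropy (natural log) for alpha == 1."""
    if alpha == 1:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** alpha for x in p if x > 0)) / (alpha - 1.0)

def additivity_gap(groups, alpha):
    """Left side minus right side of generalized Shannon additivity.

    groups[i] holds the probabilities p_i1, ..., p_im_i of block i;
    all entries together must sum to 1.
    """
    joint = [x for g in groups for x in g]
    block_sums = [sum(g) for g in groups]
    rhs = tsallis(block_sums, alpha)
    for g, p_i in zip(groups, block_sums):
        if p_i > 0:  # blocks of total mass zero are skipped
            rhs += p_i ** alpha * tsallis([x / p_i for x in g], alpha)
    return tsallis(joint, alpha) - rhs

random.seed(0)
raw = [[random.random() for _ in range(m)] for m in (2, 3, 1, 4)]
total = sum(sum(g) for g in raw)
groups = [[x / total for x in g] for g in raw]  # normalize to a stochastic vector

gaps = {alpha: additivity_gap(groups, alpha) for alpha in (0.5, 1, 2, 3.7)}
print(gaps)  # every gap should vanish up to rounding error
```

For $\alpha=1$ the weights $p_i^\alpha$ reduce to $p_i$, so the same check covers the classical Shannon additivity.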
The main result.
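In this paper, we study the role of generalized Shannon additivity in characterizing Tsallis entropy, where for $\alpha>0$ and $H:\Delta\to\mathbb{R}$ we also consider the slightly relaxed property that
$$H(p_1,\ldots,p_n) = H(p_1,\ldots,p_{j-1},\,p_j+p_{j+1},\,p_{j+2},\ldots,p_n) + (p_j+p_{j+1})^\alpha\, H\!\left(\frac{p_j}{p_j+p_{j+1}},\frac{p_{j+1}}{p_j+p_{j+1}}\right)$$
for all $(p_1,\ldots,p_n)\in\Delta$ with $n\geq 2$ and all $j\in\{1,\ldots,n-1\}$ with $p_j+p_{j+1}>0$.
It turns out that this property is essentially enough for characterizing the Tsallis entropy for $\alpha\neq 1,2$, and with a further weak assumption in the cases $\alpha=1,2$. The statement (iii) for $\alpha=1$ is an immediate consequence of a characterization of Shannon entropy by Diderrich [4] simplifying an axiomatization given by Faddeev [5] (see below).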
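Theorem 1.
Let $H:\Delta\to\mathbb{R}$ satisfy the generalized Shannon additivity, or the slightly weaker relaxed property, for some $\alpha>0$. Then the following holds:
- (i) If $\alpha\neq 1,2$, then
$$H(p_1,\ldots,p_n) = H(\tfrac12,\tfrac12)\,\frac{1-\sum_{i=1}^{n}p_i^\alpha}{1-2^{1-\alpha}} \quad\text{for all } (p_1,\ldots,p_n)\in\Delta.$$
- (ii) If $\alpha=2$, then the following statements are equivalent:
  - (a) It holds
$$H(p_1,\ldots,p_n) = 2\,H(\tfrac12,\tfrac12)\left(1-\sum_{i=1}^{n}p_i^2\right) \quad\text{for all } (p_1,\ldots,p_n)\in\Delta,$$
  - (b) $H$ is bounded on $\Delta_2$,
  - (c) $H$ is continuous on $\Delta_2$,
  - (d) $H$ is symmetric on $\Delta_2$,
  - (e) $H$ does not change the signum on $\Delta_2$.
- (iii) If $\alpha=1$, then the following statements are equivalent:
  - (a) It holds
$$H(p_1,\ldots,p_n) = -\frac{H(\tfrac12,\tfrac12)}{\ln 2}\sum_{i=1}^{n}p_i\ln p_i \quad\text{for all } (p_1,\ldots,p_n)\in\Delta,$$
  - (b) $H$ is bounded on $\Delta_2$.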
Note that statement (iii) is given here only for reasons of completeness. It follows from a result of Diderrich [4].
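Any constant multiple of the Tsallis entropy is determined by its single value at $(\tfrac12,\tfrac12)$, because the Tsallis entropy of that vector equals $(1-2^{1-\alpha})/(\alpha-1)$, or $\ln 2$ for $\alpha=1$. The following sketch illustrates this normalization numerically; the scaling constant `c`, the test vector and the helper names are hypothetical and chosen only for illustration.

```python
import math

def tsallis(p, alpha):
    """Tsallis entropy; Shannon entropy (natural log) for alpha == 1."""
    if alpha == 1:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** alpha for x in p if x > 0)) / (alpha - 1.0)

def from_half_value(h_half, p, alpha):
    """Rebuild H(p) from h_half = H(1/2, 1/2).

    Assumes H is a constant multiple of the Tsallis entropy for this alpha.
    """
    if alpha == 1:
        return -h_half * sum(x * math.log2(x) for x in p if x > 0)
    power_sum = sum(x ** alpha for x in p if x > 0)
    return h_half * (1.0 - power_sum) / (1.0 - 2.0 ** (1.0 - alpha))

c = 3.0                    # hypothetical scaling constant
p = [0.1, 0.2, 0.3, 0.4]   # illustrative stochastic vector
errors = {}
for alpha in (0.5, 1, 2, 3.5):
    h_half = c * tsallis([0.5, 0.5], alpha)
    errors[alpha] = abs(from_half_value(h_half, p, alpha) - c * tsallis(p, alpha))
print(errors)  # all differences should be at rounding-error level
```

For $\alpha=2$ the denominator $1-2^{1-\alpha}$ equals $\tfrac12$, which gives the factor $2$ in front of $H(\tfrac12,\tfrac12)$ in that case.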
The paper is organized as follows. Section 2 is devoted to the proof of the main result. It will turn out that most of the substantial work is related to stochastic vectors contained in and that the generalized Shannon additivity performs as a bridge to stochastic vectors longer than or . Section 3 completes the discussion. In particular, the Tsallis entropy for on rational vectors is discussed and an open problem is formulated.
2 Proof of the main result
We start with investigating the relationship of and for .
Lemma 2.
Let and satisfy (1). Then for all it follows
[TABLE]
in particular for
[TABLE]
and for
[TABLE]
Moreover it holds
[TABLE]
Proof.
First of all, note that (5) is an immediate consequence of (1) implying
[TABLE]
Further, two different applications of (1) to provide
[TABLE]
Therefore , and since similarly one gets , in the following we can assume that .
Applying (1) three times, one obtains
[TABLE]
and in the same way
[TABLE]
Transforming (7) to the term and then substituting this term in (6), provides
[TABLE]
which is equal to (2). Statements (3) and (4) follow immediately from equation (2). ∎
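In the case $\alpha=1$, condition (1), the (relaxed) generalized Shannon additivity, implies that the order of the components of a stochastic vector does not make a difference for $H$:
Lemma 3.
Let $H:\Delta\to\mathbb{R}$ satisfy (1) for $\alpha=1$. Then $H$ is permutation-invariant, meaning that $H(p_1,\ldots,p_n)=H(p_{\pi(1)},\ldots,p_{\pi(n)})$ for each $(p_1,\ldots,p_n)\in\Delta$ and each permutation $\pi$ of $\{1,\ldots,n\}$.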
Proof.
It suffices to show that
[TABLE]
For this has been shown in Lemma 2 (see (3)), for it follows directly from (1) and from Lemma 2. ∎
The following lemma provides the substantial part of the proof of Theorem 1.
Lemma 4.
For satisfying (1) with , the following holds:
- (i) If $\alpha\neq 1,2$, then
[TABLE]
- (ii) If $\alpha=2$, then the following statements are equivalent:
  - (a) It holds
[TABLE]
  - (b) $H$ is symmetric on $\Delta_2$, meaning that $H(p_1,p_2)=H(p_2,p_1)$ for all $(p_1,p_2)\in\Delta_2$,
  - (c) $H$ is continuous on $\Delta_2$,
  - (d) $H$ is bounded on $\Delta_2$,
  - (e) $H$ is nonnegative or nonpositive on $\Delta_2$.
Proof.
We first show (i). Let and . Changing the role of and in (2), by Lemma 2 one obtains
[TABLE]
Moreover, one easily sees that (2) transforms to
[TABLE]
[TABLE]
Since , it follows
[TABLE]
In order to show (ii), let and define maps and by
[TABLE]
and
[TABLE]
for .
By (4) in Lemma 2, (a) is equivalent both to (b) and to for all . (c) implies (d) by compactness of $\Delta_2$, and the validity of the implications (a) $\Rightarrow$ (c) and (a) $\Rightarrow$ (e) is obvious.
From
[TABLE]
for one obtains
[TABLE]
and by induction
[TABLE]
with
For it holds , hence maps the interval onto the interval . Since for all , the following holds:
[TABLE]
Moreover, applying (10) to yields , hence
[TABLE]
Assuming (d), by use of (11), (12) and (13) one obtains for all , hence (a). If (e) is valid, then by (4) in Lemma 2
[TABLE]
for all , providing (d). By what has already been shown, (a), (b), (c), (d), (e) are equivalent. ∎
We are able now to complete the proof of Theorem 1. Assuming (1), we first show (1) for , and for bounded and . This provides statement (i) and, together with Lemma 4 (ii), statement (ii) of Theorem 1.
Statement (1) is valid for all by Lemma 4. In order to prove it for , we use induction. Assuming validity of (1) for all with , where , let . Choose some with . Then by (1) and Lemma 4 we have
[TABLE]
So (1) holds for all with .
In order to see (iii), recall a result of Diderrich [4] stating that is a multiple of the Shannon entropy if is bounded and permutation-invariant on and satisfies
[TABLE]
which is weaker than (1) with $\alpha=1$. Since $H$ is permutation-invariant under (1) by Lemma 3, Diderrich’s axioms are satisfied, and we are done.
3 Further discussion
Our discussion suggests that the case is more complicated than the general one. In order to get some further insights, particularly in the case , let us consider only rational stochastic vectors. So in the following let with for and being the rationals. The following proposition states that for the ‘rational’ generalized Shannon additivity essentially provides the Tsallis entropy on the rationals, which in particular provides a proof of the implication (c) $\Rightarrow$ (a) in Theorem 1 (ii).
Proposition 5.
Let be given with (1) for instead of and . Then it holds
[TABLE]
Proof.
For the vectors with , we get from axiom (1)
[TABLE]
implying
[TABLE]
Now consider any rational vector with for satisfying . With (1) we get
[TABLE]
Using (15), we obtain
[TABLE]
Let us finally compare (ii) and (iii) in Theorem 1 and ask about the role of (c), (d) and (e) of (ii) in (iii). Symmetry is already given by Lemma 3 when only (1) is satisfied; (1) and nonnegativity are not sufficient for characterizing Shannon entropy, as shown in [3]. To our knowledge, there is no proof that (1) and continuity are enough, but (1) together with analyticity suffices. To show the latter, an argument reducing everything to the rationals as above has been used in [11].
We conclude with the open problem of whether the further assumptions for $\alpha=2$ in Theorem 1 are necessary.
Problem.
References
- [1] S. Abe, Tsallis entropy: How unique?, Contin. Mech. Thermodyn. 16 (2004), 237 – 244.
- [2] I. Csiszár, Axiomatic characterizations of information measures, Entropy 10 (2008), 261 – 273.
- [3] Z. Daróczy, D. Maksa, Nonnegative information functions, in: Analytic Function Methods in Probability and Statistics, Colloq. Math. Soc. J. Bolyai 21, Gyires, B., Ed., North Holland, Amsterdam 1979, 65 – 76.
- [4] G.T. Diderrich, The role of boundedness in characterizing Shannon entropy, Inf. and Control 29 (1975), 140 – 161.
- [5] D.K. Faddeev, On the concept of entropy of a finite probability scheme (in Russian), Uspehi Mat. Nauk 1956, 227 – 231.
- [6] S. Furuichi, On uniqueness theorems for Tsallis entropy and Tsallis relative entropy, IEEE Trans. Inf. Theory 51 (2005), 3638 – 3645.
- [7] J. Cartwright, Roll over, Boltzmann, Physics World 27 (2014), 31 – 35.
- [8] J. Havrda, F. Charvát, Quantification method of classification processes. Concept of structural $\alpha$-entropy, Kybernetika 3 (1967), 30 – 35.
