Jensen Inequality and the Second Law
P.D. Gujrati

TL;DR
This paper critically examines the use of Jensen's Inequality in microscopic thermodynamics, revealing potential misconceptions about its role in establishing the second law.
Contribution
It challenges the common reliance on Jensen's Inequality for proving the second law's consistency in fluctuation theorems, highlighting possible limitations.
Findings
Jensen's Inequality may be misleading in certain thermodynamic contexts
The paper questions the universal applicability of Jensen's Inequality in fluctuation theorems
Provides a critical perspective on the foundational tools used in microscopic thermodynamics
Abstract
Jensen's Inequality (JIEQ) has proved to be a major tool for establishing the consistency of various fluctuation theorems with the second law in microscopic thermodynamics. We show that the situation is far from clear and that reliance on the JIEQ may be quite misleading in general.
Department of Physics, Department of Polymer Science, The University of Akron, Akron, OH 44325
(August 20, 2018)
Preprint: UATP/1804
Introduction
The Jensen inequality (JIEQ) [Cover] has recently become popular in modern nonequilibrium (NEQ) thermodynamics as a tool for implying that various (integral) fluctuation theorems (FTs) [Searles; Mansour; Seifert; Harris] of the form
[TABLE]
such as the Jarzynski identity, Crooks theorem, Seifert’s entropy generation theorem, etc. are consistent with the second law. These FTs are determined by the trajectories and their probabilities ; see various reviews [Esposito; Campisi; Seifert; Jarzynski-Rev]. The collection forms the trajectory ensemble (TE) and defines the average . Jensen’s inequality, a purely mathematical result to obtain the inequality from Eq. (1), is extensively used to argue for the conformity of the FTs with the second law, a macroscopic result in physics. It thus follows that the only use of the JIEQ is to justify that the FTs can describe NEQ processes so as to make them quite suitable to gain insight into the second law. As the choice of is not unique, see below, the relationship of with its thermodynamic average, to be denoted simply by , is not clear, since the second law must only refer to thermodynamic averages such as for the entropy change of an isolated system for which the second law results in the inequality . Another consequence of the second law is the dissipated work in an isothermal process
[TABLE]
where is the thermodynamic average work (see Eq. (7) for a proper definition) done on the system [Note1; Landau] during a process, is the free energy change, and is the *dissipated work* [Note-R]. As an example of an FT, Jarzynski [Jarzynski] derived
[TABLE]
where denotes the work done on the system along the trajectory , the suffix [math] refers to a special averaging with respect to the initial equilibrium (EQ) microstate probabilities replacing , and refers to the initial microstate of . The above identity is known as the *Jarzynski identity* (JE). We will refer to as the Jarzynski average in this work.
It is implicitly assumed in the current literature that . What is the significance of if does not refer to a quantity that must obey the second law (see the discussion later for such a quantity)? Indeed, we will establish here that using the JIEQ can be misleading in suggesting that the FT applies to NEQ processes or that satisfies the second law, while in fact neither is the case. To the best of our knowledge, this issue has not been discussed in the literature despite the wide use of the JIEQ. We establish that (i) the trajectory ensemble average (TEA) may or may not be the same as the thermodynamic average , and (ii) even when the two are the same, may have nothing to do with the second law. Consequently, the inequality obtained from the JIEQ need not refer to the second law, which casts doubt on its utility for FTs.
For simplicity, we consider a “work-process” on a system as proposed by Jarzynski [Jarzynski]. It is an arbitrary process over between two EQ macrostates A and B at the same inverse temperature ; here, is the time needed to reach the EQ macrostate B. The system is driven (the driving stage) over , by and is then allowed to equilibrate (the reequilibration stage) due to interaction with only over . For simplicity, we assume that during , is not in thermal contact with . We denote by the combination and the combination by , which is an isolated system. All quantities pertaining to have no suffix, and those pertaining to () carry a tilde (suffix [math]). For concreteness, we assume the work process to change the volume of the system by applying an external pressure , but the arguments are valid for any external “work” process. The system-intrinsic (SI) [Note-SI] pressure for the th microstate will be denoted by , where is the microstate energy, an SI-quantity. The difference denotes the ubiquitous force imbalance (FI) between the external and induced internal forces that is normally nonzero even in equilibrium [Gujrati-GeneralizedWork; Gujrati-GeneralizedWork-Expanded]. Discarding the FI in an irreversible process is therefore counterproductive. We find it convenient to use Prigogine’s modern notation, which is highly suitable in NEQ thermodynamics [deGroot; Prigogine; Gujrati-II; Gujrati-Entropy2; Gujrati-Stat].
Jensen’s Inequality
Consider a convex function $\Phi(X)$ of a random variable $X$, and let E be an expectation operator such as , etc. Then, the inequality
$$E[\Phi(X)] \ge \Phi(E[X]) \qquad (4)$$
is known as Jensen’s inequality (JIEQ) for $\Phi$. For the JE, represents E so the JIEQ results in . By exploiting an ad hoc assumption without offering any justification, Jarzynski [Jarzynski] has argued that the JE results in in accordance with the second law; see Eq. (2). The JIEQ is now widely used to establish consistency with the second law by exploiting a similar ad hoc assumption, for example by Crooks [Crooks], Seifert [Seifert; Seifert-PRL; Seifert-EPJ; Jarzynski-EPJ], and many others. The argument is crucial since it indirectly “justifies” treating the results as nonequilibrium results. The assumption is never explicitly mentioned but seems to have been accepted by all workers without ever having been justified.
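As a purely mathematical illustration of the inequality just stated, the following Python sketch checks it numerically for the convex function exp(-x), the case relevant to exponential averages such as the JE. The three-point distribution, the numerical values, and the function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def expectation(values, probs):
    """Discrete expectation E[X] = sum_i p_i * x_i."""
    return sum(p * x for x, p in zip(values, probs))

def jensen_gap(values, probs, phi):
    """Return E[phi(X)] - phi(E[X]); this is nonnegative when phi is convex."""
    mean_of_phi = expectation([phi(x) for x in values], probs)
    phi_of_mean = phi(expectation(values, probs))
    return mean_of_phi - phi_of_mean

# Hypothetical three-point distribution (illustrative numbers only).
values = [-1.0, 0.5, 2.0]
probs = [0.2, 0.5, 0.3]

gap = jensen_gap(values, probs, lambda x: math.exp(-x))
print(gap >= 0.0)

# If an exponential average is written as E[exp(-X)] = exp(-c),
# Jensen's inequality implies E[X] >= c for the same operator E.
c = -math.log(expectation([math.exp(-x) for x in values], probs))
print(expectation(values, probs) >= c)
```

The second check holds for any distribution and any expectation operator E. It is a statement about E[X] only, so whether E[X] is a thermodynamic average has to be settled separately.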
Thermodynamic Ensemble Averages
In general, an EQ or NEQ ensemble average (EA) is defined instantaneously, and requires identifying (a) the elements (microstates ) of the ensemble and (b) their instantaneous probabilities . The average is uniquely defined over using at each instant, which we identify as the instantaneous ensemble average (IEA). Let be some extensive quantity pertaining to . The instantaneous thermodynamic average is defined [Prigogine; Landau] as
[TABLE]
We will usually not show the time unless clarity requires it. In thermodynamics, it is common to simply use for the average but it may cause confusion in some cases. We will use macroquantity for the average and microquantity for . The average energy is such an average system-intrinsic (SI) macroquantity. The infinitesimal thermodynamic work done on the system and the work done by the system represent such average instantaneous macroquantities; the former is a medium-intrinsic (MI) quantity and the latter an SI quantity. The first law during is expressed as a sum of two system-intrinsic (SI) contributions
[TABLE]
The first sum represents the generalized heat while the second sum represents , the generalized work [Gujrati-GeneralizedWork; Gujrati-Stat; Gujrati-GeneralizedWork-Expanded] in terms of the SI microwork done by . These generalized macroquantities should not be confused with the exchanged macroheat and macrowork and , respectively. Their differences are and , respectively, with an important identity of their magnitudes . We also observe that during generalized work, change but not ; during generalized heat, change but not . This allows us to treat work and heat separately.
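The separation described above can be sketched numerically. The Python snippet below splits the change of the average energy, sum_k p_k E_k, into a part where only the probabilities change and a part where only the microstate energies change. The midpoint weighting (which makes the finite-step split exact), the particle-in-a-box style levels, and all names are assumptions for illustration. Identifying the two parts with the paper's generalized heat and generalized work additionally depends on its sign convention for work done by the system.

```python
import math

def canonical_probs(energies, beta):
    """Boltzmann probabilities for the given levels at inverse temperature beta."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def split_energy_change(e_old, p_old, e_new, p_new):
    """Split the change of sum_k p_k E_k into two sums.

    prob_term : sum_k E_k dp_k  (probabilities change, heat-like part)
    level_term: sum_k p_k dE_k  (energies change, work-like part)
    Midpoint values are used so that the two terms add up exactly.
    """
    rows = list(zip(e_old, e_new, p_old, p_new))
    prob_term = sum(0.5 * (eo + en) * (pn - po) for eo, en, po, pn in rows)
    level_term = sum(0.5 * (po + pn) * (en - eo) for eo, en, po, pn in rows)
    return prob_term, level_term

def box_levels(length, n_levels=4):
    """Hypothetical levels E_k = k^2 / L^2 in reduced units."""
    return [k * k / (length * length) for k in range(1, n_levels + 1)]

e_old, e_new = box_levels(1.0), box_levels(1.1)
p_old = canonical_probs(e_old, beta=1.0)
p_new = canonical_probs(e_new, beta=1.2)

prob_term, level_term = split_energy_change(e_old, p_old, e_new, p_new)
total_change = (sum(p * e for p, e in zip(p_new, e_new))
                - sum(p * e for p, e in zip(p_old, e_old)))
print(prob_term, level_term, total_change)
```

Because the levels drop when the box expands, the energy-shift term is negative here, while the probability term can have either sign depending on how the distribution is changed.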
Trajectory Ensemble Averages
The uniqueness inherent in Eq. (5) may not hold for the TEA , which we now discuss. Let denote the trajectory followed by during its evolution along . The average cumulative change along a process (we suppress the suffix TE for simplicity) is obtained by integrating over the process between and :
[TABLE]
we will use or simply or for unless clarity is needed. We note that retains its identity during its evolution along as indicated by the sum; no transition between different microstates is allowed. We can also introduce the cumulative change along over the interval , and rewrite the above equation as
[TABLE]
where we have introduced the trajectory probability in terms of :
[TABLE]
here . We note from Eq. (8) that can also be treated as the thermodynamic average with respect to the trajectory probability set . Using and for , we obtain the average accumulated work done on and by the system, respectively, in terms of the respective trajectory probabilities:
[TABLE]
[TABLE]
This average (over time) probability is determined by alone and can be identified as the *intrinsic* trajectory probability. We observe that . It should be evident that the three probabilities are not the same. In other words, there is no unique trajectory probability, as stated earlier.
The trajectory is determined by a single microstate , and proves useful in the thermodynamic macroworks or . We can also consider a mixed trajectory (mT) as a sequence of microstates starting at at and terminating at at time ; the microstate appears at time . Consider the time interval , which we divide into an earlier interval and a later interval . During , does not change as microwork is performed by ; no microheat is transferred. During , no microwork is performed by but microheat is transferred, which changes to . For , reduces to . The probability is given by
[TABLE]
in terms of the multistate conditional probability and the initial probability . The corresponding with respect to is obtained by replacing and by and , respectively, in Eq. (7) and summing over all . It is very common to assume that the sequence forms a (memoryless) Markov (M) chain so that can be expressed as a product of two-state transition probabilities to determine the Markov approximate . Using , the Markov average external work over is
[TABLE]
where and is the external work done on . Thus, in the Markov chain approximation, gives a discrete approximation of the macrowork in Eq. (10a) for which we require to be extremely short. Otherwise, and are very different as we have stated earlier. For a non-Markovian process, cannot be expressed as a product of two-state transition probabilities and we must resort to the generalization noted above of Eq. (8).
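The Markov-chain construction can be illustrated with a small enumeration. The sketch below assigns each mixed trajectory the probability p0[k_0] times a product of two-state transition probabilities, accumulates the work along it at fixed microstate within each interval, and averages over all trajectories. The two-state system, the transition matrices, the level schedule, and the function names are hypothetical and chosen only to keep the arithmetic transparent.

```python
import itertools

def trajectory_prob(path, p0, transitions):
    """Markov probability of a mixed trajectory k_0 -> k_1 -> ... -> k_n."""
    prob = p0[path[0]]
    for step, (k, k_next) in enumerate(zip(path, path[1:])):
        prob *= transitions[step][k][k_next]
    return prob

def trajectory_work(path, levels):
    """Work done on the system: in interval `step` the microstate k is fixed
    while its energy moves from levels[step][k] to levels[step + 1][k]."""
    return sum(levels[step + 1][k] - levels[step][k]
               for step, k in enumerate(path[:-1]))

def markov_average_work(p0, transitions, levels):
    """Average of the trajectory work over all mixed trajectories."""
    n_states, n_steps = len(p0), len(transitions)
    total = 0.0
    for path in itertools.product(range(n_states), repeat=n_steps + 1):
        total += trajectory_prob(path, p0, transitions) * trajectory_work(path, levels)
    return total

# Hypothetical two-state, two-interval example (illustrative numbers only).
p0 = [0.6, 0.4]
transitions = [
    [[0.7, 0.3], [0.2, 0.8]],   # transition matrix after interval 0
    [[0.5, 0.5], [0.1, 0.9]],   # transition matrix after interval 1
]
levels = [[0.0, 1.0], [0.0, 0.8], [0.0, 0.6]]

print(markov_average_work(p0, transitions, levels))
```

In this example the result coincides with summing, interval by interval, the instantaneous probabilities times the level shifts (0.4 × (−0.2) for the first interval and 0.5 × (−0.2) for the second), which is the discrete form of an instantaneous ensemble sum.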
The Jarzynski Equality
As our first example of a TEA different from in an FT, we consider the one proposed by Jarzynski [Jarzynski] noted above. Jarzynski uses the external microwork done on during to prove the JE in Eq. (3). Here, , and only during the driving stage ; over . If the system at is out of equilibrium, we denote it by b. The interaction with during is to ensure that b turns into B.
The use of the JIEQ with for E in Eq. (3) immediately results in . Jarzynski assumes that and argues that the JE represents a NEQ result so that for a reversible process and for an irreversible process.
We now consider a reversible process between A and B, for which the thermodynamic macrowork is the reversible macrowork , and demonstrate by a simple example that is not the same as ,
[TABLE]
the Jarzynski average, except when .
For the calculation, we consider an ideal gas in a -dimensional box of length , which expands quasistatically from to ; we let between A and B. As there are no interparticle interactions, we can treat each particle by itself. The microstates in the exclusive approach are those of a particle in the box with energies determined by an integer . Let denote the inverse temperature of the heat bath. The gas remains in equilibrium at all times and . The partition function at any is given by
[TABLE]
for any ; in the last equation, we have made the standard integral approximation for the sum. We then have
[TABLE]
We can now compute the two work averages with . For the Jarzynski average, we have
[TABLE]
where we have used . For the thermodynamic average, we use in Eq. (7) to obtain
[TABLE]
It should be clear that it is the thermodynamic average work that satisfies the condition of EQ and not , which is evidently different from . This contradicts the conventional assumption . We evaluate the difference . Introducing for expansion, we have
[TABLE]
The Jensen inequality is satisfied as expected, but the above nonnegative difference makes no statement about any dissipation in the system, which is most certainly absent. Thus, the JIEQ makes no statement about the second law and casts doubts on the usefulness of the indiscriminate application of the JIEQ in FTs.
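The comparison in this example can be reproduced numerically. The sketch below assumes the standard particle-in-a-box levels E_k(L) = ALPHA k^2 / L^2 in reduced units, with illustrative values for ALPHA, BETA, and the cutoff K_MAX, none of which come from the paper. It evaluates (i) the average of the work done on the gas weighted by the initial equilibrium probabilities, (ii) the quasistatic work obtained by integrating with the instantaneous equilibrium probabilities, and (iii) the free energy change from the partition functions.

```python
import math

# Illustrative reduced units; these values are assumptions, not from the paper.
ALPHA = 1.0e-4   # stands in for pi^2 hbar^2 / (2 m)
BETA = 1.0       # inverse temperature of the heat bath
K_MAX = 2000     # cutoff on k; Boltzmann weights beyond it are negligible here

def energy(k, length):
    """Particle-in-a-box level E_k(L) = ALPHA * k^2 / L^2."""
    return ALPHA * k * k / (length * length)

def eq_probs(length):
    """Canonical probabilities and partition function at box length `length`."""
    weights = [math.exp(-BETA * energy(k, length)) for k in range(1, K_MAX + 1)]
    z = sum(weights)
    return [w / z for w in weights], z

def initial_weighted_work(l_in, l_fin):
    """Work done on the gas, averaged with the INITIAL equilibrium probabilities."""
    probs, _ = eq_probs(l_in)
    return sum(p * (energy(k, l_fin) - energy(k, l_in))
               for k, p in enumerate(probs, start=1))

def quasistatic_work(l_in, l_fin, steps=200):
    """Work done on the gas using the instantaneous probabilities (midpoint rule)."""
    h = (l_fin - l_in) / steps
    total = 0.0
    for i in range(steps):
        a, b = l_in + i * h, l_in + (i + 1) * h
        probs, _ = eq_probs(0.5 * (a + b))
        total += sum(p * (energy(k, b) - energy(k, a))
                     for k, p in enumerate(probs, start=1))
    return total

def free_energy_change(l_in, l_fin):
    """Delta F = -(1/BETA) ln(Z_fin / Z_in)."""
    _, z_in = eq_probs(l_in)
    _, z_fin = eq_probs(l_fin)
    return -math.log(z_fin / z_in) / BETA

def exponential_average(l_in, l_fin):
    """-(1/BETA) ln of the initial-probability average of exp(-BETA * work)."""
    probs, _ = eq_probs(l_in)
    avg = sum(p * math.exp(-BETA * (energy(k, l_fin) - energy(k, l_in)))
              for k, p in enumerate(probs, start=1))
    return -math.log(avg) / BETA

w_initial = initial_weighted_work(1.0, 2.0)
w_quasistatic = quasistatic_work(1.0, 2.0)
delta_f = free_energy_change(1.0, 2.0)
print(w_initial, w_quasistatic, delta_f)
```

With these numbers the quasistatic work agrees with the free energy change to within the integration error, while the initial-probability average lies above it by roughly (1/2)(1/4 − 1) + ln 2 ≈ 0.32 in units of 1/BETA, even though the process modelled is reversible. The exponential average reproduces the free energy change, as the JE requires.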
We now consider two more FTs, where the JIEQ has been used to justify consistency with the second law.
Crooks’ Approach
Crooks [Crooks] assumes the evolution along as a Markov process satisfying the principle of detailed balance and divides into intervals as described above. Microwork is performed during and microheat is exchanged during . We will not follow Crooks’ derivation of the JE, which we have carried out elsewhere [Gujrati-Crooks], but follow the consequences of the detailed balance here. The transition probability matrix in takes a very simple form under detailed balance, which we denote by . From the Fundamental Limit Theorem or Doeblin’s theorem about Markov chains [Strook; Gujrati-Crooks], we know that such a transition matrix is uniquely determined with its matrix elements corresponding to given by
[TABLE]
where is the EQ probability at fixed of the th microstate at time and ensures that the end-microstate at the end of belongs to an EQ macrostate. Thus, at the end of , the EQ-microstate turns into an EQ-microstate but the microwork done on is precisely . By induction, we have a sequence of EQ-microstates and microworks and the total microwork is given by used in Eq. (11). Using the above transition matrix , we can easily evaluate :
[TABLE]
so that
[TABLE]
We see that the Crooks process during is no different from the Jarzynski process and the quantity within the parentheses denotes a Jarzynski averaging of the exponential microwork distribution over the probabilities of the initial EQ-microstates in the interval . In other words, the Crooks process is a sequence of non-overlapping mini-Jarzynski processes , each over . For each mini-Jarzynski process, we have
[TABLE]
where the suffix denotes averaging over the initial microstate probabilities and is the change over between EQ-macrostates. It is obvious now that is not the same as the thermodynamic average , just as it was for the Jarzynski process, unless is extremely small. This means that the application of the JIEQ on does not give an inequality involving thermodynamic averages so no connection with the second law is possible.
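The picture of a sequence of mini-Jarzynski processes can be checked on a toy model. The sketch below assumes, following the description above, a transition matrix whose every row equals the equilibrium distribution at the current level values, so that each interval starts from equilibrium. It then verifies that the exponential work average of each interval equals the ratio of successive partition functions and that the product over intervals telescopes. The three-level schedule, BETA, and all names are hypothetical.

```python
import math

BETA = 1.0  # inverse temperature (illustrative value)

def partition_function(energies):
    return sum(math.exp(-BETA * e) for e in energies)

def eq_probs(energies):
    """Equilibrium (Boltzmann) probabilities for the given levels."""
    z = partition_function(energies)
    return [math.exp(-BETA * e) / z for e in energies]

def equilibrating_matrix(energies):
    """Transition matrix with T[k][k'] = p_eq[k']: every row is the EQ distribution."""
    p_eq = eq_probs(energies)
    return [list(p_eq) for _ in p_eq]

def satisfies_detailed_balance(matrix, p_eq, tol=1e-12):
    """Check p_eq[i] * T[i][j] == p_eq[j] * T[j][i] for all pairs."""
    n = len(p_eq)
    return all(abs(p_eq[i] * matrix[i][j] - p_eq[j] * matrix[j][i]) < tol
               for i in range(n) for j in range(n))

def mini_jarzynski_factor(e_start, e_end):
    """Average of exp(-BETA * w_k) over the EQ probabilities at the start of an
    interval, with microwork w_k = E_k(end) - E_k(start) at fixed microstate k."""
    p_eq = eq_probs(e_start)
    return sum(p * math.exp(-BETA * (b - a))
               for p, a, b in zip(p_eq, e_start, e_end))

# Hypothetical level schedule: row l holds the energies at the start of interval l.
schedule = [
    [0.0, 1.0, 2.0],
    [0.0, 0.8, 1.7],
    [0.0, 0.6, 1.5],
    [0.0, 0.5, 1.2],
]

product = 1.0
for e_start, e_end in zip(schedule, schedule[1:]):
    product *= mini_jarzynski_factor(e_start, e_end)

ratio = partition_function(schedule[-1]) / partition_function(schedule[0])
print(abs(product - ratio) < 1e-12)
```

The product of interval factors reproduces the overall partition-function ratio, but each factor is an average over the probabilities at the start of its interval, not over the instantaneous probabilities within it.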
Seifert’s Approach
Here, we will continue to use a discrete formulation for simplicity. According to Seifert [Seifert], denotes the microscopic entropy and its thermodynamic average gives the (average) entropy, commonly written as . Seifert defines the average change (S indicating Seifert) in terms of ,
[TABLE]
and conjectures that is nothing but . One can also determine as the integral of along the trajectory and introduce . Similarly, we also have and for the isolated system and where denotes its set of trajectories. Seifert then derives the following equality . The use of the JIEQ then results in , which is interpreted using the above conjecture that denotes . Using this interpretation, the inequality is considered a statement of the second law by taking to mean , see Eq. (2). With this, Seifert provides another proof of the JE in terms of the mixed trajectory average
[TABLE]
Since
[TABLE]
which is simply a statement of the conservation of probability, we conclude that . To determine , we follow Eqs. (7)-(9). Since is the integral of , it is clear that . Thus, the above JIEQ conclusion does not prove that it encodes the second law. The second law requires considering the differentials and . Recalling that , compare with Eq. (6), and , we have
[TABLE]
Thus, is not the entropy differential . Unfortunately, this point has been overlooked.
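The distinction between the two differentials can be sketched numerically under one reading of the argument: take the microscopic entropy of microstate k to be −ln p_k (an assumption about the stripped notation), so that its probability-weighted change is sum_k p_k d(−ln p_k) = −sum_k dp_k, which vanishes by conservation of probability, whereas the entropy differential is −sum_k dp_k ln p_k. The distribution, the perturbation, and the function names below are illustrative.

```python
import math

def gibbs_entropy(p):
    """S = -sum_k p_k ln p_k."""
    return -sum(pk * math.log(pk) for pk in p)

def mean_micro_entropy_change(p_old, p_new):
    """sum_k p_k (s_k' - s_k) with s_k = -ln p_k, weighted by the old probabilities."""
    return sum(po * (math.log(po) - math.log(pn)) for po, pn in zip(p_old, p_new))

# Hypothetical distribution and a perturbation direction that sums to zero.
p = [0.5, 0.3, 0.2]
v = [1.0, -0.4, -0.6]
eps = 1.0e-4
p_new = [pk + eps * vk for pk, vk in zip(p, v)]

entropy_change = gibbs_entropy(p_new) - gibbs_entropy(p)
mean_change = mean_micro_entropy_change(p, p_new)
first_order = -eps * sum(vk * math.log(pk) for pk, vk in zip(p, v))

print(entropy_change, first_order, mean_change)
```

For this small perturbation the entropy change is first order in eps and matches −sum_k dp_k ln p_k, while the probability-weighted change of the microscopic entropy is only second order, so the two quantities cannot be identified.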
Conclusions
In summary, we have shown that the application of the Jensen inequality makes no statement at all about the second law. It should be pointed out that while there is a consequence of the second law for , there is no second law statement about . Thus, while in the former case the use of the JIEQ may provide a statement of the second law, its application to has no relationship to the second law. The conclusion is that care must be exercised in drawing any conclusion about the second law by applying the JIEQ in general, a point that does not seem to have been appreciated.
References
- (1) T.A. Cover and J.A. Thomas, Elements of Information Theory, Second Edition, John Wiley & Sons, Hoboken, N.J. (2006).
- (2) R.J. Harris and G.M. Schütz, J. Stat. Mech. P07020 (2007).
- (3) E.M. Sevick, R. Prabhakar, S.R. Williams, and D.J. Searles, Ann. Rev. Phys. Chem. 59, 603 (2008).
- (4) U. Seifert, Eur. Phys. J. B 64, 423 (2008); Rep. Prog. Phys. 75, 126001 (2012).
- (5) M. Malek Mansour and F. Baras, Chaos 27, 104609 (2017).
- (6) M. Esposito, U. Harbola, S. Mukamel, P. Talkner, Rev. Mod. Phys. 81, 1665 (2009).
- (7) M. Campisi, P. Hänggi, P. Talkner, Rev. Mod. Phys. 83, 771 (2011).
- (8) C. Jarzynski, Annu. Rev. Condens. Matter Phys. 2, 329 (2011).
