Information transmission and criticality in the contact process
Marzio Cassandro, Antonio Galves, Eva Löcherbach

TL;DR
This paper investigates how information transmission in the one-dimensional contact process varies with the infection parameter λ, showing that transmission is maximized not at the critical value λ_c but at a strictly larger value, contrary to a common belief.
Contribution
It demonstrates that information transmission, measured by sensitivity, continues to increase beyond the critical point, providing a counterexample to the idea that maximal information occurs at criticality.
Findings
Sensitivity increases for λ < λ_c
Sensitivity continues increasing after λ_c
Maximum information transmission occurs away from criticality
Abstract
In the present paper, we study the relation between criticality and information transmission in the one-dimensional contact process with infection parameter λ. To do this we define the sensitivity of the process to its initial condition. This sensitivity increases for values of λ < λ_c, the value of the critical parameter. The main point of the present paper is that we show that it actually continues increasing even after λ_c and only starts decreasing for sufficiently large values of λ. This provides a counterexample to the common belief that associates maximal information transmission with criticality.
Information transmission and criticality in the contact process.
M. Cassandro, A. Galves, E. Löcherbach
M. Cassandro: Gran Sasso Science Institute, L’Aquila, Italy.
A. Galves: Universidade de São Paulo, Instituto de Matemática e Estatística, São Paulo, Brazil.
E. Löcherbach: Université de Cergy-Pontoise, AGM, CNRS-UMR 8088, 95000 Cergy-Pontoise, France.
(Date: October 30, 2017)
Abstract.
In the present paper, we study the relation between criticality and information transmission in the one-dimensional contact process with infection parameter λ. We introduce a notion of sensitivity of the process to its initial condition and prove that it increases not only for values of λ < λ_c, the value of the critical parameter, but keeps increasing even after λ_c before finally starting to decrease for values of λ sufficiently above λ_c. This provides a counterexample to the common belief that associates maximal information transmission with criticality.
Key words and phrases:
Contact process. Criticality. Information transmission. Duality and coupling.
2010 Mathematics Subject Classification:
60K35; 82B27
1. Introduction
From swarms of birds to neuronal activity, an increasing number of recent papers claim that the ability of a complex system to transmit information is maximized at the critical point. Just to cite a few examples, [15] advertise that in swarms of cooperative units such as bird flocks “the information transfer is made possible by the nonlocal nature of the criticality condition.” See also [6] who claim that “flocks behave as critical systems, poised to respond maximally to environmental perturbations.” Similar ideas have emerged in neurobiology.
It is reasonable to conjecture that these ideas originate in the influential work of Per Bak, see for instance [1] where he devotes a whole chapter to the question “Why Should the Brain Be Critical?” [3] suggest the following answer. “The fact that the critical state […] maximized information transmission in these networks is consistent with an intuitive understanding of how a branching process would work in the context of a highly parallel network. If the network were subcritical, an input signal would attenuate, causing most output units to be inactive, thus leaving little evidence of the input. If the network were supercritical, any input signal would eventually lead to most output units being active, again leaving little information as to what the input was.” This interpretation echoes Per Bak’s own answer given in [1]: “The brain must operate at the critical case where the information is just able to propagate”.
A preliminary step in this direction is to clarify what we mean by information transmission in a complex system and how to measure it. From a neuroscientific point of view, [12] suggested the following answer to this question. They propose to use the notion of dynamical range, borrowed from acoustics, to measure the sensitivity of the system to external stimuli. More precisely, they consider a stochastic system of interacting neurons exposed to an external stimulus modeled by a Poisson point process. In their model, the graph of interactions is given by an undirected Erdős-Rényi random graph. For this model, the authors are able to define the notion of criticality precisely, using the average branching ratio as the relevant parameter. In fact they show, by numerical simulations, that a critical parameter value exists such that the dynamical range increases monotonically below this parameter and decreases monotonically above it.
In the present paper, we formulate the problem of information transmission in the following way. We study to what extent a system is able to discriminate between two different initial stimuli. We do this for the one-dimensional contact process, which is probably the simplest non-trivial complex system one might think of. More precisely, we compare two coupled time evolutions starting from Bernoulli product measures on a finite set of points having different densities. We then define the sensitivity of the model with respect to the initial signal as the total variation distance between the two associated processes at a given single site, at some fixed time. We prove that the sensitivity of the system to the initial condition, as a function of λ, keeps increasing even after the critical point, before finally starting to decrease. This contradicts the common belief that information transmission is maximized at the critical point.
In Section 4, we discuss our results in comparison with those, already present in the literature, where similar features are pointed out by numerical simulations.
This paper is organized as follows. In Section 2, we recall the definition of the one-dimensional contact process, we give the definition of our measure of sensitivity in (2.3) and state Theorem 1 which is our main result. The proofs are collected in Section 3.
2. Definitions and main results
In the following, we briefly recall the definition of the one-dimensional contact process introduced in [9]. Let and write for the elements of The contact process on is the continuous time Markov process taking values in having generator
[TABLE]
for any cylinder function In the above formula, the configuration defined by
[TABLE]
and
[TABLE]
Here, is a fixed constant. We shall write for a version of the above process starting from for any fixed initial configuration
Recall that [9] proved that there exists a critical value λ_c with 0 < λ_c < ∞ such that for all λ < λ_c there is only one invariant measure, the Dirac measure supported by the all-zero configuration, while for λ > λ_c a second, non-trivial extremal invariant measure appears. This result was completed by [5], who proved that at the critical point only one invariant measure, the trivial one, exists. In particular, in the subcritical case, starting from any initial configuration, the process converges to the zero-configuration, while in the supercritical case it converges to a convex combination of the two extremal invariant measures. In the supercritical case, the influence of the initial configuration appears in the weighting factor defining this mixture.
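The dichotomy between extinction and survival can be explored numerically. The following is a minimal simulation sketch, not taken from the paper: it replaces the infinite lattice by a finite ring (on which the process always dies out eventually, so only moderate time horizons are informative) and assumes the usual rate convention in which an infected site recovers at rate 1 and infects each nearest neighbour at rate λ. The function name and its parameters are hypothetical.

```python
import random


def simulate_contact_process(initial, lam, t_max, rng):
    """Simulate the contact process on a ring of len(initial) >= 3 sites.

    Assumed convention (not confirmed by the source): an infected site
    recovers at rate 1 and tries to infect each of its two nearest
    neighbours at rate lam. Returns the configuration at time t_max.
    """
    state = list(initial)
    n = len(state)
    t = 0.0
    while True:
        infected = [i for i in range(n) if state[i] == 1]
        if not infected:
            return state  # the all-zero configuration is absorbing
        # Every infected site carries three clocks: recovery (rate 1),
        # infect-left (rate lam) and infect-right (rate lam).
        total_rate = len(infected) * (1.0 + 2.0 * lam)
        t += rng.expovariate(total_rate)
        if t > t_max:
            return state
        i = rng.choice(infected)
        u = rng.random() * (1.0 + 2.0 * lam)
        if u < 1.0:
            state[i] = 0                 # recovery
        elif u < 1.0 + lam:
            state[(i - 1) % n] = 1       # infection attempt to the left
        else:
            state[(i + 1) % n] = 1       # infection attempt to the right


if __name__ == "__main__":
    rng = random.Random(1)
    for lam in (0.5, 3.0):
        final = simulate_contact_process([1] * 50, lam, 20.0, rng)
        print(lam, sum(final))
```

Running it for a small and a large value of λ gives a rough picture of the two regimes; the finite ring and the chosen horizon are simplifications, so the output should not be read as an estimate of λ_c.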
Here, to measure the sensitivity to the initial condition we must adopt a non-asymptotic point of view and measure, at a given time, how well the system discriminates between two different initial states. The precise definition is as follows. We suppose that the system starts with an initial configuration with a given density within a finite subset of ℤ. This process will be denoted where λ is the intensity of the infection rate appearing in (2.1). We suppose that for all where are i.i.d. Bernoulli random variables with parameter We also suppose that for all
We consider two intensities and the associated processes and We then define the sensitivity of the process with parameter with respect to the initial condition by
[TABLE]
where and where the infimum is taken over all possible couplings of and The quantity measures the minimal distance of the two processes in a given position (here, the position [math]) at time under two initial Bernoulli configurations of density and
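Since the state of a single site takes values in {0, 1}, the infimum over couplings in this definition reduces to a statement about two Bernoulli laws. The sketch below illustrates the standard fact that, for Bernoulli(a) and Bernoulli(b), the smallest achievable probability of disagreement over all couplings equals the total variation distance |a - b|. The function names and the grid search are illustrative choices, not part of the paper.

```python
def disagreement(a, b, c):
    """P(X != Y) for a coupling of Bernoulli(a) and Bernoulli(b)
    in which P(X = 1, Y = 1) = c."""
    return (a - c) + (b - c)


def min_disagreement(a, b, grid=1000):
    """Minimise P(X != Y) over a grid of admissible couplings.

    A coupling of two Bernoulli laws is determined by c = P(X = 1, Y = 1),
    which must lie between the two Frechet bounds below.
    """
    lo = max(0.0, a + b - 1.0)
    hi = min(a, b)
    candidates = [lo + (hi - lo) * k / grid for k in range(grid + 1)]
    candidates.append(hi)  # make sure the upper bound itself is included
    return min(disagreement(a, b, c) for c in candidates)


def total_variation(a, b):
    """Total variation distance between Bernoulli(a) and Bernoulli(b)."""
    return 0.5 * (abs(a - b) + abs((1.0 - a) - (1.0 - b)))


if __name__ == "__main__":
    print(min_disagreement(0.3, 0.7), total_variation(0.3, 0.7))
```

The minimum is attained at c = min(a, b), which is the maximal coupling: the two variables agree as often as their marginals allow.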
In the following we choose for a fixed position and we pose for any
[TABLE]
The quantity measures the sensitivity variation with respect to increasing values of the intensity at time We show that this sensitivity variation is non-decreasing for all λ < λ_c and continues increasing even after λ_c before finally decreasing. This is the content of our main theorem, which we now present.
Theorem 1.
For any fixed there exist such that the following holds.
For there exist and such that for all and
[TABLE]
For there exist and such that for all and
[TABLE]
3. Proof of Theorem 1
The proof of Theorem 1 uses the self-duality of the contact process. For the sake of completeness we recall here this property. We start by introducing some notation. First of all, in the following we will not distinguish between the configuration of the contact process at time and the associated subset of given by that is, depending on the context, we will interpret as element of or as element of the set of all subsets of Moreover, for any subset we write for the contact process starting from the initial configuration if and only if We observe that if is finite, then is just a pure jump Markov process, taking values in the set of finite subsets of If then we write simply for
The duality property of the contact process can be stated as follows. For any finite subset and any initial configuration
[TABLE]
For more on duality see [4] and [10].
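As a sanity check of self-duality, one can compute both sides exactly on a very small system. The sketch below assumes the usual form of the identity, P(ξ_t^A ∩ B ≠ ∅) = P(ξ_t^B ∩ A ≠ ∅), and replaces ℤ by a finite segment with nearest-neighbour infection at rate λ and recovery at rate 1, where the same identity holds because the infection rates are symmetric. Configurations are encoded as bitmasks, the law at time t is obtained by uniformisation, and all names are hypothetical.

```python
import math


def transition_rates(state, n, lam):
    """Yield (new_state, rate) pairs for the contact process on sites 0..n-1."""
    for i in range(n):
        if (state >> i) & 1:
            yield state & ~(1 << i), 1.0           # recovery at site i
            for j in (i - 1, i + 1):               # infection of a healthy neighbour
                if 0 <= j < n and not (state >> j) & 1:
                    yield state | (1 << j), lam


def law_at_time(start, n, lam, t, terms=200):
    """Law of the process at time t, computed by uniformisation."""
    big = n * (1.0 + 2.0 * lam)                    # upper bound on every exit rate
    v = [0.0] * (1 << n)
    v[start] = 1.0
    weight = math.exp(-big * t)                    # Poisson(big * t) weight for k = 0
    law = [weight * x for x in v]
    for k in range(1, terms):
        nxt = [0.0] * (1 << n)
        for s, mass in enumerate(v):
            if mass == 0.0:
                continue
            stay = 1.0
            for s2, rate in transition_rates(s, n, lam):
                nxt[s2] += mass * rate / big
                stay -= rate / big
            nxt[s] += mass * stay
        v = nxt
        weight *= big * t / k
        law = [a + weight * b for a, b in zip(law, v)]
    return law


def hit_probability(a, b, n, lam, t):
    """P(xi_t^A intersects B), with the sets A and B given as bitmasks."""
    return sum(p for s, p in enumerate(law_at_time(a, n, lam, t)) if s & b)


if __name__ == "__main__":
    n, lam, t = 4, 1.5, 1.0
    A, B = 0b0001, 0b0110
    print(hit_probability(A, B, n, lam, t), hit_probability(B, A, n, lam, t))
```

The two printed numbers should agree up to floating-point error. The truncation at 200 terms is ample for the small value of the uniformisation rate times t used here, but would need to grow for larger systems or longer times.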
Our proof of Theorem 1 relies on the following result.
Proposition 1.
Let be a finite subset of and assume that are i.i.d. Bernoulli random variables with parameter respectively, for and that for all Then for all
[TABLE]
where denotes all possible couplings of and
Remark 1.
Notice that in our definition of sensitivity in (2.2) above, to make explicit the relationship with and we wrote for and for
Proof.
We take the maximal coupling of and that is, for all Moreover, we use the canonical monotone coupling of and that is, (in the sense of for all ) for all Then
[TABLE]
We obtain by (3.4) and since
[TABLE]
Therefore,
[TABLE]
We now give a lower bound, following Lemma 6.1 of [8]. In the following, write for short and We have
[TABLE]
This expression is minimized by the optimal coupling of and given by
[TABLE]
for
[TABLE]
and
[TABLE]
In this way, for any possible coupling,
[TABLE]
which, due to (3.4), equals
[TABLE]
implying the assertion. ∎
A second important ingredient for the proof of Theorem 1 is the following monotone coupling construction of and for where is some finite subset of We associate to each site five independent Poisson processes having jump times with rate with rate with rate with rate and finally with rate We assume that the processes attached to different sites are all independent. We then construct and in the following way. Firstly, both processes start from the same initial configuration at time Then we update the configurations according to the following rules.
- Every time that rings, both processes simultaneously upgrade the value at site to
- Every time that rings, both processes simultaneously try to upgrade the position at site to provided that at site or at site there is a symbol
- Every time that rings, both processes simultaneously try to upgrade the position at site to provided that at site or at site there is a symbol
- Every time that rings, only the process tries to upgrade the position at site to provided that at site or at site there is a symbol
- Every time that rings, only the process tries to upgrade the position at site to provided that at site or at site there is a symbol
With this construction, we obtain the following proposition.
Proposition 2.
For the above coupled construction of and the following holds.
- for all
- For any fixed site is conditionally independent of conditionally on
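A coupling of this kind can be sketched in simulation. The code below is an illustration in the spirit of the five-clock construction, not a transcription of it: the precise clock rules are an assumption. It uses a finite ring, a recovery clock of rate 1 shared by both processes, and for each direction an infection clock of total rate λ₂, of which a fraction λ₁/λ₂ is shared and the remainder is used only by the process with the larger parameter. It then checks the coordinatewise ordering one expects from a monotone coupling. All names are hypothetical.

```python
import random


def coupled_contact_processes(initial, lam1, lam2, t_max, rng):
    """Run two contact processes with rates lam1 <= lam2 on shared clocks.

    Returns (low, high, ordered): the final configurations of the lam1 and
    lam2 processes, and whether low <= high held after every event.
    """
    assert 0.0 <= lam1 <= lam2
    n = len(initial)
    low, high = list(initial), list(initial)
    per_site = 1.0 + 2.0 * lam2        # total clock rate attached to one site
    t = 0.0
    ordered = True
    while True:
        t += rng.expovariate(n * per_site)
        if t > t_max:
            return low, high, ordered
        i = rng.randrange(n)
        if rng.random() * per_site < 1.0:
            low[i] = 0                 # recovery clock: both processes
            high[i] = 0
        else:
            j = (i + rng.choice((-1, 1))) % n      # neighbour looked at
            shared = rng.random() * lam2 < lam1    # shared or extra clock
            if high[j] == 1:
                high[i] = 1
            if shared and low[j] == 1:
                low[i] = 1
        ordered = ordered and all(a <= b for a, b in zip(low, high))


if __name__ == "__main__":
    rng = random.Random(2)
    low, high, ordered = coupled_contact_processes([1, 0] * 10, 1.0, 2.5, 5.0, rng)
    print(sum(low), sum(high), ordered)
```

The ordering is preserved by construction: recoveries hit both processes, and whenever the smaller process uses an infection clock with an infected neighbour, the larger process sees the same clock with the same neighbour infected.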
Finally, we will rely on the following well-known result. We define
[TABLE]
Theorem 2 (Theorems 1.6 and 2.28 in Chapter VI of [13]).
The following properties hold.
- for and for all
- The function is continuous and non-decreasing in and as
- For all there exists a unique probability measure on such that for any cylinder function
[TABLE]
where [math] denotes the configuration “all-zero”.
- The measure has exponentially decaying correlations, that is, there exist with the following property. For all cylinder functions and such that depends only on and only on for some fixed which are finite subsets of
[TABLE]
The monotonicity of follows from the construction presented in Proposition 2 above. For the remaining results, we refer the interested reader to [13] for a proof and references.
We are now able to prove Theorem 1.
Proof of Theorem 1.
Step 1. We start with the case
Define for any We rely on the coupled construction of and of Proposition 2 above. By Proposition 1, we may therefore write
[TABLE]
where denotes the expectation with respect to this monotone coupling of and Then, by the monotonicity and since
[TABLE]
Notice that since by assumption, Therefore,
[TABLE]
By symmetry,
[TABLE]
and
[TABLE]
We put By monotonicity we obtain
[TABLE]
We now use the fact that the event is conditionally independent of conditionally on Thus
[TABLE]
Analogously,
[TABLE]
implying that
[TABLE]
Now, if we have that
[TABLE]
and
[TABLE]
as Therefore, there exists depending on and on such that for all
[TABLE]
Since for all finite this implies the first assertion in the subcritical case
Let us now consider values of which are slightly above Relying on Theorem 2, we have
[TABLE]
and
[TABLE]
By Theorem 2, as Observe that
[TABLE]
Using (3.5) we deduce that
[TABLE]
with analogous formulas for the other terms appearing in (3.6) and (3.7) above.
Therefore, fix some and choose sufficiently close to such that
[TABLE]
for all
Now, fix any Thanks to the above convergence results (3.6)–(3.8), it is possible to choose first and then such that for all and
[TABLE]
implying that
[TABLE]
This concludes the proof of the first assertion.
Step 2. We finally consider the case where is sufficiently larger than Let be the monotone coupling between and induced by the construction of Proposition 2. Using this coupling and the fact that we obtain thanks to Theorem 2 that
[TABLE]
We want to show that this expression is negative for sufficiently large values of and We put Since by assumption we have (this will be important in (3.11) below).
Then and Writing for short
[TABLE]
it is clear that
[TABLE]
Applying the last item of Theorem 2, we have that
[TABLE]
Moreover,
[TABLE]
Putting these results together, we conclude that
[TABLE]
Since it is possible to choose such that for all
[TABLE]
for some (sufficiently small) Recall that Since we may choose sufficiently large such that for all As a consequence, for all
[TABLE]
which implies the assertion. ∎
4. Final discussion
In the present article, we have proved that for the contact process the information transmission, as defined in (2.2) and (2.3), is maximized at a value of the control parameter which is strictly larger than the critical value λ_c. Similar issues have been discussed by many other authors, using different measures of information transmission and considering different models. In the present section, we give an overview of these results and compare our findings to the ones already established in the literature.
A commonly used measure to quantify information transfer is a recent information theoretic measure introduced by [14], the so-called transfer entropy. This transfer entropy quantifies “the statistical coherence between systems evolving in time” (cf. [14]). It is “able to distinguish driving and responding elements and to detect asymmetry in the coupling of subsystems” (cf. [14]). An important point is that this quantity measures “to which extent the individual components contribute to information production and at what rate they exchange information among each other”, when an external perturbation is absent (cf. [14]).
In the case of ferromagnetic Ising models, [2] show numerically that this transfer entropy is maximized in the disordered phase, that is, in the region where only one invariant measure exists and which would correspond to the subcritical regime for the contact process. This result is confirmed by the findings of [16]. [2] argue that their result could be related to a subtle interplay between sites inside and outside the boundaries of same-spin domains, whose probability distributions are a function of the temperature. On the other hand, [7] consider a Susceptible-Infected-Susceptible (SIS) epidemic model on a homogeneous network and provide simulations showing that the transfer entropy is maximized in the supercritical regime, in agreement with our result. They argue that “once the disease dynamics reach criticality, we observe strong effects of one individual on a connected neighbor (measured by the transfer entropy). However, as the dynamics become supercritical, the target neighbor becomes more strongly bound to all of its neighbors collectively, and it becomes more difficult to predict its dynamics based on a single source neighbor alone; as such, the transfer entropy begins to decrease” (see [7]). These results are very close to the ones we have found in the present paper for the contact process.
Our paper presents two main differences with respect to the above cited ones. First of all, to the best of our knowledge, our result is the first one available using analytical methods instead of numerical simulations. The second difference is that instead of relying on the transfer entropy, we measure how much a system discriminates between different external inputs to which the system is initially exposed. To do so, we have introduced the notion of sensitivity of the model with respect to the initial signal, given by the total variation distance between the two associated processes at a given single site, at some fixed time.
Let us briefly comment on this choice. Two points of view are commonly adopted to describe the influence of external stimuli in neuronal systems. On the one hand, one might think of external stimuli which are permanently influencing the system, acting as an external field. This is the point of view adopted by [12]. The second approach is to think of an initial configuration, coming from another region of the brain, which is presented as an initial stimulus to the region one is interested in, and to see how this initial stimulus is propagated by the system. This is the point of view we have adopted in the present paper, leading to our definition of sensitivity.
Although our measure of information transfer is different from those used by [2], [7] and also [12], the fact that it is maximized in the supercritical region, that is, in the ordered phase where several invariant measures coexist, is due to the very nature of the system we consider (in the very same way as what was observed in [7]). This is because both the SIS epidemic model and the contact process describe the evolution and the spread of an epidemic. More precisely, our result can be related to the specific features of the stationary states of our model where is strictly larger than zero only for λ > λ_c (cf. Theorem 2). In other words, to convey, at large times, a non-trivial amount of information, λ has to be larger than λ_c. In contrast to these results, in the case of the 2d Ising model, [2] observe a peak on the disordered side, that is, in the region where only one invariant measure exists. This result is characteristic of the very nature of the Ising model and shows that, in terms of its information theoretic structure, the Ising model displays different features than the contact process or any other model of epidemic spread.
Acknowledgements
Many thanks to Errico Presutti and Antonio Carlos Roque da Silva Filho for stimulating discussions about this subject. We also thank two anonymous referees for helpful comments and suggestions. We thank the Gran Sasso Science Institute (GSSI) for hospitality and support. This research has been conducted as part of the project Labex MME-DII (ANR11-LBX-0023-01), USP project Mathematics, computation, language and the brain and FAPESP project Research, Innovation and Dissemination Center for Neuromathematics (grant 2013/07699-0). AG is partially supported by CNPq fellowship (grant 311 719/2016-3).
- [1] Per Bak. How nature works. Springer New York, 1996.
- [2] Lionel Barnett, Joseph T. Lizier, Michael Harré, Anil K. Seth, and Terry Bossomaier. Information flow in a kinetic Ising model peaks in the disordered phase. Phys. Rev. Lett., 111:177203, Oct 2013.
- [3] John M. Beggs and Dietmar Plenz. Neuronal avalanches in neocortical circuits. Journal of Neuroscience, 23(35):11167–11177, 2003.
- [4] Françoise Bertein and Antonio Galves. Une classe de systèmes de particules stable par association. Z. Wahr. Verw. Gebiete, 41:73–85, 1977.
- [5] C. Bezuidenhout and G. R. Grimmett. The critical contact process dies out. Ann. Probab., 18:1462–1482, 1990.
- [6] Andrea Cavagna, Alessio Cimarelli, Irene Giardina, Giorgio Parisi, Raffaele Santagati, Fabio Stefanini, and Massimiliano Viale. Scale-free correlations in starling flocks. Proceedings of the National Academy of Sciences, 107(26):11865–11870, 2010.
- [7] E. Yagmur Erten, Joseph T. Lizier, Mahendra Piraveenan, and Mikhail Prokopenko. Criticality and information dynamics in epidemiological models. Entropy, 19(5), 2017.
- [8] Antonio Galves, Nancy L. Garcia, and Clémentine Prieur. Perfect simulation of a coupling achieving the d̄-distance between ordered pairs of binary chains of infinite order. Journal of Statistical Physics, 141(4):669–682, 2010.
