Why collective behaviours self-organize to criticality: a primer on information-theoretic and thermodynamic utility measures

Qianyang Chen; Mikhail Prokopenko

PMC · DOI:10.1098/rsos.241655·June 25, 2025

Why collective behaviours self-organize to criticality: a primer on information-theoretic and thermodynamic utility measures

Qianyang Chen, Mikhail Prokopenko

PDF

Open Access

TL;DR

This paper explains how collective behaviors in systems self-organize to a critical state using concepts from information theory and thermodynamics.

Contribution

The paper introduces the Principle of Super-efficiency to explain why systems self-organize to criticality.

Findings

01

Thermodynamic efficiency peaks at the critical regime, balancing predictability and energy costs.

02

The Ising model helps compare how different utilities influence collective behavior.

03

Optimizing intrinsic utilities leads to distinct characteristics in collective systems.

Abstract

Collective behaviours are frequently observed to self-organize to criticality. Existing proposals to explain these phenomena are fragmented across disciplines and only partially answer the question. This primer compares the underlying, intrinsic, utilities that may explain the self-organization of collective behaviours near criticality. We focus on information-driven approaches (predictive information, empowerment and active inference), as well as an approach incorporating both information theory and thermodynamics (thermodynamic efficiency). By interpreting the Ising model as a perception-action loop, we compare how different intrinsic utilities shape collective behaviour and analyse the distinct characteristics that arise when each is optimized. In particular, we highlight that thermodynamic efficiency—measuring the ratio of predictability gained by the system to its energy…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Species2

Rattus norvegicus(brown rat · species)Homo sapiens(human · species)

Diseases2

disorder infection

Figures18

Click any figure to enlarge with its caption.

Schematic representations for second-order phase transition and critical phenomena. The control parameter is a variable that influences the state of the system, such as temperature or pressure, while the order parameter quantifies the degree of order within the system, with non-zero value only in the ordered phase. (a) Order parameter changes continuously in response to changes in the control parameter during a second-order phase transition. (b) Scale-invariance observed at the critical regime indicates a power-law distribution of event sizes, e.g. avalanche or earthquake magnitudes. (c) Correlation length diverges at the critical point, facilitating long-range correlation of fluctuations between constituent parts of the system. (d) Susceptibility diverges at the critical point, reflecting the system's increased sensitivity to external perturbations.

Causal structure of perception-action loop of an agent with memory, traced over time.

Causal structure of perception-action loop of a memory-less agent traced over time. Coloured blocks represent the two components that mutual information is calculated for predictive information.

The Venn diagram of predictive information shown as the mutual information (the overlap area) between past and future sensory states: it represents how useful the past is for predicting the future.

The causal structure of the perception-action loop of a memory-less agent traced over time. Coloured blocks represent the components of mutual information that determines empowerment.

The Venn diagram of mutual information I(St+n;Atn). Empowerment is the maximum of this mutual information for a given action channel.

The diagram illustrates interactions between elements in the active inference framework. Solid lines represent influences between components. Dash lines represent directed influence from sensory to internal or from action to external, which correspond to the two stages of active inference. Figure is adapted from [49].

Thermal efficiency for a system undergoing a specific process. It is generally defined as the dimensionless ratio between the total work output and the total energy input. Adapted from [72].

Thermodynamic efficiency for a system. It is defined as the ratio between the increase in order and change in the generalized work carried out to generate the order.

A lattice of atoms with dipole magnetic moments. Links in red represent higher energy bonds (where two adjacent atoms have opposite spins), and blue represents lower energy bonds (where two adjacent atoms are aligned).

Average predictive information plotted against different values of J, computed from 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions. Predictive information maximizes when J is small (weak coupling), where an average site exhibits explorative behaviour.

Decomposition of predictive information into richness (dotted line) and unpredictability (dash line) components; predictive information is the difference between the two curves. Results obtained from the average over 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions.

A noiseless binary channel. Channel capacity C = 1 bit.

A single output channel. Channel capacity C = 0 bit.

Average empowerment plotted against different values of J, computed from 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions. Empowerment optimizes at strong coupling, where the collective stabilizes to a uniformly aligned configuration at equilibrium. Such configuration maximizes an average site's ability to retrieve the impact of its action through future sensory states.

Average negative free energy plotted against different values of J, computed from 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions. Negative free energy maximizes at strong coupling, where the spins are aligned at equilibrium, and the site experiences minimum surprise comparing its approximate distribution of the world state (based on local sensory information) to the actual distribution that generates the world states (global property).

Average thermodynamic efficiency plotted against different values of J, computed from 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions. Thermodynamic efficiency optimizes at the critical regime, where a significant portion of the work expended in tuning the parameter J contributes to reducing configurational entropy; that is, the collective is most energetically efficient in creating internal order.

Finite-size analysis near the critical coupling Jc for both Glauber and Metropolis dynamics. Results show that as lattice size increases, the peak of thermodynamic efficiency approaches the critical coupling strength Jc.

Equations31

Funding1

—University of Sydneyhttp://dx.doi.org/10.13039/501100001774

Keywords

collective behaviourintrinsic utilitycriticalityphase transitionself-organization

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Thermodynamics and Statistical Mechanics · Neural dynamics and brain function · Complex Systems and Time Series Analysis

Full text

Introduction

Self-organization is a process where a system spontaneously develops new structured patterns or functions, without being explicitly controlled by an external force. This process is observed in a wide range of natural and artificial systems, where local interactions among components generate global order. As a fundamental concept in complexity science, self-organization is extensively studied in various disciplines, including systems theory, condensed matter physics, systems biology, and social sciences. From a physics perspective, self-organization is generally viewed as entropy reduction or increase in order in an open system ‘without specific interference from outside’ [1,2]. From a biological perspective, self-organization is typically defined as a pattern-formation process that relies entirely on interactions among the lower-level components of the system [3]. There are three key aspects to self-organization [1–5]:

(i) Spontaneous order: the system evolves into a more organized state without external control.(ii) Emergence of coherent global behaviour: there is an observable transition to a more coherent collective behaviour.(iii) Local interactions and long-range correlations: system components operate on local information but exhibit long-range interaction and connectivity.

One of the underlying principles for the spontaneous order created in self-organization, as suggested by Kauffman [4], is the ‘constraint closure’, which means that the system carries out some work to create constraints on the release of energy, and those constraints, in turn, channel the energy to perform more useful work. Thus, a successful framework describing self-organization needs to account for thermodynamic characteristics of the spontaneous order, capturing the corresponding energy flows and costs.

Typically, self-organized collective behaviours, such as magnetization, ant colony foraging, swarming, slime mould aggregation, flocking of birds and neural processing in the brain, exhibit critical phenomena [6–14]. These phenomena occur at the critical point of a continuous phase transition and include scale-invariance [15], divergence of correlation length and divergence of the response function [16]. These hallmarks are observed in physical [17,18], biological [7,11,19], social [20,21] and hybrid systems [22].

Scale-invariance means that the system near the critical regime does not exhibit a typical length scale, i.e. patterns appear similar on many magnification levels. Consequently, the size of events at criticality follows a power-law distribution (figure 1). The correlation length measures the scale on which fluctuations or changes at one point in the system affect those at another point, and the divergence of this quantity implies the long-range interaction between constituent components of the system. In the context of collective systems such as groups of biological organisms, long-range interactions may generate more coherent global behaviour for the group. The response function characterizes the system’s response to perturbations. For example, magnetic susceptibility represents the change of magnetization of a material in response to an applied magnetic field and is known to diverge at criticality, as even a small field can induce large changes in magnetization. At the critical regime, systems typically become highly sensitive to small changes in parameters, showing large responses to minor perturbations. Another implication to collective behaviour in biological systems is that the groups may become more sensitive to stimuli from the external environment, such as detection of predators.

Schematic representations for second-order phase transition and critical phenomena. The control parameter is a variable that influences the state of the system, such as temperature or pressure, while the order parameter quantifies the degree of order within the system, with non-zero value only in the ordered phase. (a) Order parameter changes continuously in response to changes in the control parameter during a second-order phase transition. (b) Scale-invariance observed at the critical regime indicates a power-law distribution of event sizes, e.g. avalanche or earthquake magnitudes. (c) Correlation length diverges at the critical point, facilitating long-range correlation of fluctuations between constituent parts of the system. (d) Susceptibility diverges at the critical point, reflecting the system's increased sensitivity to external perturbations.

Physical systems, such as fluids or magnets, can be driven to criticality by adjusting a control parameter, e.g. temperature or pressure, that influences the state of the system. As the control parameter reaches a critical value, the order parameter, which measures the degree of order or organization within the system, undergoes a transition, e.g. from zero (disordered phase) to non-zero (ordered phase). However, for biological systems, there are typically no well-defined protocols to adjust the control parameters. Nevertheless, nature somehow finds its way to poise the system at or near criticality.

A canonical framework describing the mechanism behind such dynamics is the theory of self-organized criticality (SOC), initially introduced by Bak et al. [23] based on a mathematical model known as the Bak–Tang–Wiesenfeld (BTW) Sandpile Model. Central to this theory is the interplay of two opposing forces that push a system to criticality. The first force is the driving force, characterized by gradual, incremental changes that increase the system’s energy, disorder or stress (e.g. adding sand to a sand pile). When the accumulated stress or energy reaches a certain threshold, a stabilizing force comes into play, triggering a response that dissipates or redistributes the energy, typically in a sudden and possibly widespread manner (e.g. sand avalanche). Under the influence of these two opposing forces, the system ‘evolves’ to criticality and remains there. The concept of SOC inspired a series of studies that applied it to develop an understanding of the underlying mechanism that generates critical phenomena in various complex systems such as forest fires [24–26], earthquakes [27] and brain activities [7,10,28,29]. However, it can be argued that SOC provides a possible explanation for how criticality occurs, rather than why it benefits the system.

We are interested in exploring the intrinsic utility for a self-organizing system approaching and operating at the critical regime. Here, an intrinsic utility is understood broadly, as the inherent benefit or value gained by the system from its own organization, independently of external rewards and objectives [30,31]. Recent research on intrinsic utilities shaping self-organizing behaviours primarily examined how systems, especially autonomous robots and biological entities, utilize task-independent objectives in order to optimize and adapt their behaviours. Notable strategies include predictive information maximization [32–38], empowerment maximization [39–44] and free energy minimization [45–49] (which encompasses both intrinsic and extrinsic utilities). A consistent feature of these approaches is their employment of information theory in quantifying the intrinsic motivation for the spontaneous order and emergence of collective behaviours. Informally, one identifies a change in suitably defined entropic quantities with relevant pattern formation at macro-level. Although these approaches can sometimes induce critical behaviours [36–38,50], this outcome is not invariably guaranteed. Thus, to understand the fundamental drivers of critical phenomena in collective behaviours, it is essential to give a thermodynamic account of the intrinsic motivation.

The three frameworks mentioned above predominantly focus on the informational benefit (e.g. increase in predictability, order or potential influence, reduction in uncertainty or surprise) without explicitly addressing the associated energy costs. Although the free energy minimization incorporates the term ‘energy’ in its name, the utilized concept is an information-theoretic construct which does not align with the thermodynamic free energy (more details provided in §2). As a result, the trade-offs between informational benefits and thermodynamic costs are not captured explicitly. Furthermore, the lack of a common example that could be used to directly compare these intrinsic utilities makes understanding these trade-offs more challenging.

In this work, we compare several intrinsic utilities applied to the same system and explore whether they attain optimality near the critical regime. In doing so, we highlight the salient features of these approaches, providing a concise primer on information-theoretic and thermodynamic utility measures in the context of self-organization. We then formulate a unifying principle connecting (i) the intrinsic functional benefits of collective behaviour (measured as entropy reduction or gained predictability) and (ii) the associated thermodynamic costs. Studies of various complex dynamical systems, such as urban growth [51], self-propelled particles [52], contagion network [53] and the canonical Curie–Weiss model for magnetization [54], have shown that systems at critical points exhibit maximum thermodynamic efficiency defined as a ratio of the gained predictability (i.e. reduction in uncertainty, or the increase in the internal order) to the amount of work required to change the underlying control parameter. These studies strongly suggested that the rate of entropy reduction relative to the carried-out work diverges (peaks in finite systems) at critical points. We further strengthen the argument by analytically demonstrating that the thermodynamic efficiency diverges at the critical point of the canonical two-dimensional Ising model.

We argue that these studies exemplify a general principle of super-efficiency: at the critical regime, a self-organizing system of interacting agents achieves the optimal thermodynamic efficiency by gaining maximal predictability of collective behaviour per unit of the expended work. Informally, one can say that a complex system finds the regime where the cost of ‘keeping it together’ is justified. On one hand, given some available energy to change the control parameter, the system identifies the control parameter value where the gain in predictability maximizes. On the other hand, given a required predictability gain, the system finds the point (the value of the control parameter) where the energy cost associated with the change would be minimal. The principle of super-efficiency suggests that this point aligns with the critical point.

This principle encapsulates the intrinsic utility of self-organizing collective behaviour, elucidating why some systems gravitate towards criticality. Our discussion will begin with an overview of established information-based intrinsic utility measures (§2). Section 3 describes thermodynamic efficiency, which uses both information theory and thermodynamics. Section 4 presents a common example using the two-dimensional Ising model to compare the self-organizing behaviours driven by different intrinsic utilities. Section 5 offers a more in-depth discussion on the principle of super-efficiency, and §6 summarizes the findings. This study offers insights into the distinct characteristics of collective behaviour derived within each framework and emphasizes how the principle of super-efficiency captures the intrinsic utility for collective behaviour at the critical regime. Technical preliminaries and details of the simulation model are provided in the electronic supplementary material.

Information-driven self-organization

Information-driven self-organization is an active area of research that applies information theory to study the behaviours of an agent or a group of agents. The information-theoretic utility functions used to derive the behaviours have the advantage of being universal and domain-invariant. These measures are considered strong candidates for capturing the informational benefit of increased order in collective systems. Two broad categories can be identified: purely intrinsic, and ‘hybrid’ measures which incorporate, in addition, an extrinsic target or preference.

In this work, we adopt the following notation:

—Information theoretic quantities (definitions provided in electronic supplementary material, §S1 Information-theoretic quantities):

—entropy $[eqn]$
—conditional entropy $[eqn]$
—mutual information $[eqn]$
—Kullback–Leiber divergence $[eqn]$ —Capital letters $[eqn]$ ... for random variables;—Small letters $[eqn]$ ... for a realization of the corresponding random variable;—Letters $[eqn]$ for quantities computed using the information-theoretic approaches, corresponding to predictive information, empowerment and variational free energy, respectively;—Blackboard bold font $[eqn]$ for thermodynamic quantities or statistical quantities, corresponding to energy, entropy, thermodynamic free energy, work, heat and Fisher information, respectively. Technical preliminaries are provided in electronic supplementary material, §S2 Thermodynamic preliminaries and Fisher Information.

Additionally, we adopt the notation where subscript $[eqn]$ denotes the state at time $[eqn]$ in a time series, and superscript $[eqn]$ indicates the $[eqn]$ instance in the population.

The perception-action loop

2.1.

Approaches formalizing information-driven self-organization typically assume an underlying model of agent-world interaction. This interaction is generally modelled with a perception-action (or sensorimotor) loop, using random variables to reflect the probabilistic nature of the dynamic. Figure 2 illustrates the causal network of the perception-action loop traced over time, where $[eqn]$ , $[eqn]$ , $[eqn]$ , $[eqn]$ represent the state of the world, the sensor, the actuator and memory (the controller) at time $[eqn]$ . The perception-action loop captures the following dynamics:

—At any given time $[eqn]$ , the world state $[eqn]$ leads to an update of the agent’s sensory state $[eqn]$ . The mapping from $[eqn]$ to $[eqn]$ is specified by kernel $[eqn]$ , representing the agent’s sensory mechanism;—The agent’s memory (or controller) $[eqn]$ is influenced by both memory from the previous time step $[eqn]$ and the current sensory $[eqn]$ , a relationship represented by kernel $[eqn]$ ;—Depending on the memory state, the agent updates its action $[eqn]$ according to the policy $[eqn]$ . $[eqn]$ also carries through to the future $[eqn]$ ;—The action $[eqn]$ and the world state $[eqn]$ jointly update the next world state $[eqn]$ . The mapping is specified by kernel $[eqn]$ , representing the agent’s actuation mechanism.

Causal structure of perception-action loop of an agent with memory, traced over time.

It is worth noting that $[eqn]$ and $[eqn]$ are kernels that capture the agent’s embodiment in terms of the agent’s sensor and actuator capabilities. They set constraints to how the agent may explore the environment, act and learn [55,56].

In the next subsections, we will elaborate on the variations of causal network representations within different intrinsic utility frameworks. A common example will be presented in §4.

Intrinsic utility approaches

2.2.

The notion of ‘intrinsic utility’ suggests that the utility provided to the agent is internal and task-independent [30,57]. Predictive information maximization [32–38] and empowerment maximization [39–44] are two important approaches that utilize information-theoretic measures as intrinsic motivation for inducing self-organizing behaviours.

Predictive information

2.2.1.

Predictive information [58], also known as effective measure complexity [59] or excess entropy [60], measures how much the observed history reduces uncertainty about the future. In the context of robotic behaviour development, predictive information in the sensor space may serve as an objective function for behaviour learning. Predictive information maximization has been implemented for memory-less agents [32–38] but potentially can be adapted to incorporate external memory. Figure 3 illustrates a reduced causal network for a simple memory-less agent (reactive control) on which predictive information is applied.

Causal structure of perception-action loop of a memory-less agent traced over time. Coloured blocks represent the two components that mutual information is calculated for predictive information.

Predictive information (used as intrinsic utility) is defined as the mutual information between past and future sensory states [32]. This can be further decomposed into components that represent the diversity and predictability [5,32]:

[eqn]

Considering only one time step into the future, predictive information is defined as:

[eqn]

and measured in bits. Equation (2.1) indicates that predictive information is large when the entropy of the future sensory states $[eqn]$ is large, corresponding to a rich future experience, and/or when conditional entropy $[eqn]$ is small, representing a more predictable future. In both extremes, where there is complete order (no diversity) or complete randomness (no predictability), the predictive information will be zero.

The Venn diagram in figure 4 illustrates the relationship between the time series of past and future sensory states. We note that conditional entropy $[eqn]$ represents the remaining entropy of historical sensory states given the future states, which is the part of history that we are unable to reconstruct using information from the future. For example, reconstructing the question given the answer to that question.

The Venn diagram of predictive information shown as the mutual information (the overlap area) between past and future sensory states: it represents how useful the past is for predicting the future.

Behavioural rules can be derived using the predictive information maximization approach, forming a policy $[eqn]$ mapping sensory states to actions. Policy $[eqn]$ can be either deterministic, such as a simple mapping from sensor values to actions, or stochastic, which is represented by a conditional probability distribution $[eqn]$ . The general form of the objective function for a one-step predictive information-driven agent is then expressed as:

[eqn]

where $[eqn]$ denotes the optimal policy.

An agent motivated to maximize predictive information chooses policies that result in more diverse and, at the same time, predictable outcomes. In the context of collective behaviour, maximizing predictive information has also been shown to induce cooperative behaviour under decentralized control [34,36,61]. The increase of predictive information differs from merely reducing randomness in the system; it enhances the richness of structure in the collective system. As shown in §4, collective behaviour resulting from maximizing predictive information for each individual may appear random, but locally, it maintains a high level of diversity, aligned with the predictability of an individual’s sensory states.

Empowerment

2.2.2.

Alternatively, we can focus on a specific segment of the causal network that captures the influence of actions on subsequent sensory states through the external world (figure 5). Empowerment measures this influence as the maximum amount of information an agent can inject from its actuators ( $[eqn]$ ) to its sensors ( $[eqn]$ ) at a future time via the environment.

The causal structure of the perception-action loop of a memory-less agent traced over time. Coloured blocks represent the components of mutual information that determines empowerment.

The n-step empowerment is defined as the Shannon channel capacity $[eqn]$ or maximum mutual information between the current sequence of actions $[eqn]$ and future sensor value $[eqn]$ [39]:

[eqn]

Empowerment is measured in bits. Considering only the most immediate future, one-step empowerment can be computed by:

[eqn]

The definition (2.5) is referred to as general or context-free empowerment since it measures only the agent’s general ability to inject information into its future sensory states. In order to use empowerment as the driver for an agent’s action, one needs to distinguish between different states of the environment, so that the agent can make decisions accordingly. This is achieved by context-dependent empowerment. The context refers to the state of the environment $[eqn]$ that affects the perception-action loop characteristic $[eqn]$ . More specifically, the future sensory state of the agent is affected by both its past actions and the historical states of the world. In other words, the same actions can lead to different distributions of future sensory states when the external environment has changed. Instead of considering a general action-perception characteristic $[eqn]$ , empowerment should be considered for a specific world state or context [39,40,43,62]:

[eqn]

Given that an action $[eqn]$ stochastically leads to a collection of possible future world states $[eqn]$ , the resulting average context-dependent empowerment is computed as:

[eqn]

This quantity can be used as an objective function for an empowerment-driven agent to make decisions on which action to take. More commonly, the state of the world $[eqn]$ would be replaced by some context $[eqn]$ that approximates it if the full world information is not available.

The general empowerment, as defined in equations (2.4) and (2.5), is different from the average context-dependent empowerment [43]. General empowerment does not consider the varying influence of actions in different states, since the channel capacity is computed using only $[eqn]$ . In contrast, the average context-dependent empowerment, as defined in equation (2.7), captures the nuanced ways in which different states can affect the actuation-sensing channel by computing $[eqn]$ , and then average over all possible states.

The objective function for an n-step empowerment-driven agent is:

[eqn]

where $[eqn]$ denotes the optimal action under which average context-dependent empowerment is maximized.

Referring to the maximization expression in equations (2.4)–(2.6) and (2.8), we emphasize that $[eqn]$ is assumed to be chosen without constraints, that is, an empowerment-driven agent is free to act, so that the channel capacity can potentially be achieved. This needs to be distinguished from predictive information maximization, where the agent’s action is mapped to the sensory input via a policy $[eqn]$ and hence, is constrained.

Furthermore, equation (2.8) indicates that the action selected at time $[eqn]$ is such that the potential empowerment is maximized at time $[eqn]$ . Therefore, the chosen action $[eqn]$ is different from the action distribution $[eqn]$ that maximizes the mutual information [43,62]. As pointed out in [43], ‘Empowerment considers only the potential information flow, so the agent will only calculate how it could affect the world, rather than actually carry out its potential’.

Equation (2.8) also implies dependence on the current state of the world $[eqn]$ , as the context of the empowerment ( $[eqn]$ ) is a combined result of the current world state $[eqn]$ and the evaluated action $[eqn]$ .

Similar to predictive information, a decomposition of the mutual information in equation (2.4) is shown in figure 6. To intuitively understand two conditional entropies, we utilize the box-pushing example presented in [39]: a grid world with a robot that can move anywhere except where the box is. If the box is pushable but the robot’s sensors cannot capture the box’s location, then the robot cannot perceive its box-pushing actions. This is captured in $[eqn]$ , the unperceivable actions. On the other hand, if the robot can see where the box is but cannot move it, then this information is reflected in $[eqn]$ , the unactionable sensory information. Only the amount of information that is both actionable and perceivable contributes to empowerment.

The Venn diagram of mutual information I(St+n;Atn). Empowerment is the maximum of this mutual information for a given action channel.

In summary, an empowerment-driven agent takes actions that maximize its ability to influence the external world in ways that are perceivable by its own sensors. In multi-agent settings, it has been shown that empowerment-maximization for individual agents leads to spontaneous coordination among the collective [40,63,64]. This coordination arises because shared information enhances an individual’s empowerment, or informally, its ability to make an influence.

Examples of predictive information and empowerment in collective systems, along with their comparisons, are presented in §4.

Beyond intrinsic motivation

2.3.

Another prominent approach to derive behaviours based on fundamental principles is the free-energy principle, which offers a formal account for the representational capacities of physical systems [46]. The free-energy principle was initially proposed by Friston et al. [45] as an attempt to explain embodied perception-action loops in neuroscience, thus providing an understanding of the dynamics of the brain and decision-making. Adoption of this principle led to wide applications in the study of learning [47,48,65], evolutionary dynamics [66], social interactions [67] and collective intelligence [50,68]. The principle centres on the idea that self-organizing biological agents have a natural inclination to resist disorder. It is argued that, as a result, the brain attempts to minimize uncertainty or surprise.

The mechanism derived from the free-energy principle is commonly referred to as active inference. Similar to predictive information and empowerment, active inference can be conceptualized under the perception-action loop representation, although based on different relationships between state variables (figure 7).

The diagram illustrates interactions between elements in the active inference framework. Solid lines represent influences between components. Dash lines represent directed influence from sensory to internal or from action to external, which correspond to the two stages of active inference. Figure is adapted from [49].

An underlying assumption in active inference is that the brain makes Bayesian inference over the external (world) states. Bayesian inference relies on some prior probability distribution over the unknown world and updates the distribution when more information is available. The main ingredients in the formulation of active inference are the generative model $[eqn]$ and the approximate posterior distribution $[eqn]$ . The generative model $[eqn]$ maps causes (external states $[eqn]$ ) to consequences (sensory $[eqn]$ , action $[eqn]$ and internal state $[eqn]$ ). It encodes the dynamics of the external world and integrates the agent’s prior preferences of behaviour [49]. While Bayesian inference relies on updating the prior distribution $[eqn]$ to the posterior $[eqn]$ given the observed data, the posterior is notoriously costly to compute. To reduce the computational difficulty, a parameterized distribution $[eqn]$ is employed as an approximation to the true posterior $[eqn]$ . The approximation distribution $[eqn]$ is parameterized by the internal states $[eqn]$ , which supplies the sufficient statistics of the conditional distribution. The Bayesian inference process with an approximated posterior distribution is referred to as variational Bayesian inference. It is worth noting that the integration of external goals in the generative model sets active inference apart from the other pure intrinsic motivation approaches.

The variational free energy is defined as the Kullback–Leibler divergence between the approximate posterior distribution $[eqn]$ and the generative model $[eqn]$ . The expression can be expanded in terms of the difference between a term that resembles expected energy and an entropy term, hence the name ‘free energy’ [49]:

[eqn]

However, the quantity called in this approach ‘free energy’ is different from the thermodynamic free energy. In active inference, it is instead the variational free energy formulated in terms of information-theoretic quantities, relating to the Bayesian inference process [45,49]. Informally, anything that can be represented in the form:

[eqn]

can be interpreted as ‘free energy’ [69].

Active inference involves two alternating stages: belief update and action selection. During belief update, the agent optimizes the internal representation of the generative model $[eqn]$ given the sensory samples; in action selection, the agent’s action ensures that it samples sensory data that align with its current representation. The belief update stage addresses uncertainty about the current generative model, while the action selection stage addresses uncertainty about the future (including future hidden states and future observable outcomes) [48,70].

The active inference approach has been shown to generate collective behaviour in a group of individuals, each driven by the free-energy minimization scheme [50]. Collective dynamics are influenced by the individual’s belief about uncertainty and can also be tuned to the changing environment by parameter learning over a slower timescale.

Equation (2.9) can be rearranged in terms of the complexity of the internal model and the accuracy of its representation. In this configuration, minimizing free energy is equivalent to reducing complexity, consequently resulting in optimized energy consumption [49,71]:

[eqn]

While equation (2.11) implies a connection between minimizing free energy and reducing energy cost under Landauer’s principle [71], the relationship is not explicitly formulated as a ratio of informational gain to energetic costs.

A more detailed example of free-energy minimization and the comparison with other approaches is provided in §4.

Thermodynamic efficiency

At this stage, we point out that the three information-theoretic approaches reviewed in the previous section do not explicitly account for the corresponding energy costs. Thermodynamic efficiency, on the other hand, takes into consideration both the benefits and the associated costs of maintaining order within the system. Before presenting a formal definition of thermodynamic efficiency, it is important to differentiate between thermal and thermodynamic efficiency.

Thermal vs thermodynamic efficiency

3.1.

Let us consider a system undergoing a non-ideal process in which it receives energy and performs useful work. Not all the received energy is converted into work; some is inevitably lost as heat, which does not contribute to work output (figure 8). Thermal efficiency measures the system’s efficiency of converting energy to work and is defined as the ratio of useful work output to total energy input, both measured in joules, rendering it a dimensionless quantity. In a non-ideal process, the second law of thermodynamics implies that this ratio is less than one.

Thermal efficiency for a system undergoing a specific process. It is generally defined as the dimensionless ratio between the total work output and the total energy input. Adapted from [72].

In contrast, thermodynamic efficiency assesses the conversion of work into the system order, measured during a quasi-static change in the underlying control parameter. It pertains to systems involving interactions among multiple components and considers the benefit of increasing order within a collective system against the thermodynamic cost incurred. A system may transition from a disordered to an ordered state by altering a control parameter according to a specific protocol. Thermodynamic efficiency evaluates how efficiently the system converts the carried out work into order, at each specific value of the control parameter (figure 9). It is quantified as the ratio of the reduction in the system’s configuration entropy (predictability gain) to the generalized work performed during the control parameter adjustment (subject to the unit of the Boltzmann constant $[eqn]$ , e.g., see the expression for entropy, defined in the context of thermodynamics in electronic supplementary material, equation (7)):

[eqn]

Thermodynamic efficiency for a system. It is defined as the ratio between the increase in order and change in the generalized work carried out to generate the order.

where $[eqn]$ is the control parameter, $[eqn]$ denotes the configuration entropy of the system, and $[eqn]$ denotes the generalized work performed to change the control parameter.

Perspectives on thermodynamic efficiency

3.2.

The thermodynamic efficiency of interactions can be expressed in two different ways [52,54] (technical details are provided in electronic supplementary material, §S2 Thermodynamic preliminaries and Fisher Information):

[eqn]

where $[eqn]$ denotes Fisher information.

Thermodynamic efficiency offers a dual perspective on the energy dynamics within systems, encompassing both thermodynamic and computational dimensions. From the thermodynamic viewpoint, this quantity captures the gain in internal order within a collective system of interacting agents (e.g., a swarm) relative to the overall work required to adjust the agent interactions. From a computational viewpoint, thermodynamic efficiency measures the increase in predictability (reduction of uncertainty) of collective action gained by accumulating additional sensitivity to changes in the control parameter along the path $[eqn]$ . For example, a swarm may gain predictability of a collective response by adjusting the individual’s alignment strength or the number of effective neighbours influencing an individual. This, however, may come at the expense of additional sensitivity to changes in these parameters, so that coherent motion may be disrupted by a change of alignment strength or a reduced number of effective neighbours.

A common example

In previous sections, we explored different approaches to quantifying the intrinsic utility of collective behaviour. In this section, we compare the four considered approaches—predictive information, empowerment, active inference and thermodynamic efficiency—using the canonical Ising model as a common example. This example considers a system at equilibrium, in order to provide a direct comparison of the collective behaviours resulting from optimizing utility functions in the absence of external fluxes.

The 2D-Ising model

4.1.

The 2D-Ising model offers a simplified representation of ferromagnetism in statistical mechanics. It models a collection of sites that can each exhibit either an up-spin or down-spin configuration while interacting with their neighbours to create a complex aggregate dynamic. The 2D-Ising model is particularly relevant to our comparison due to its characteristic phase transition in the collective dynamics.

The 2D-Ising model considers a lattice of atoms with magnetic spins oriented either up or down (figure 10). The vertices of the lattice are referred to as ‘sites’ and the edges as ‘links’. Assuming the absence of an external magnetic field, the energy of a site is determined by the total energies in the links with its neighbours. Each site prefers to be in a lower-energy state. For ferromagnetic materials, maintaining a link between two sites with opposite spins requires additional energy, hence there is a natural tendency for a site to align its spin with those of its neighbours. The susceptibility of a site to neighbouring influences depends on the coupling strength $[eqn]$ . A high value of $[eqn]$ indicates strong coupling between sites, leading to a greater tendency for spins to align with neighbouring sites.

A lattice of atoms with dipole magnetic moments. Links in red represent higher energy bonds (where two adjacent atoms have opposite spins), and blue represents lower energy bonds (where two adjacent atoms are aligned).

In this example, the dynamics of the Ising model are interpreted from a perception-action loop perspective: each site acts as an agent that senses the energy of its neighbourhood and ‘decides’ whether to flip its spin or maintain its current state. The agency of each site is determined by the choice of the coupling strength $[eqn]$ . With a high $[eqn]$ value a site is more prone to align with its neighbours, and vice versa. This parameter governs the strength of the site’s response to the neighbourhood’s energy landscape, influencing its decision to align with neighbouring spins.

To draw a clearer connection between the Ising model and the perception-action loop, we consider that each of the four elements of the perception-action loop has a corresponding representation in the Ising model:

— $[eqn]$ (world): the magnetization (average spin) of the lattice;— $[eqn]$ (sensory): the energy state of a site (defined in equation (4.1));— $[eqn]$ (action): flipping the spin or remaining unchanged, with flip = $[eqn]$ and no-flip = $[eqn]$ ;

To quantify the energy, we define:

— $[eqn]$ : the number of sites in the lattice;— $[eqn]$ : the coupling strength between adjacent sites;— $[eqn]$ : the spin of site $[eqn]$ , with $[eqn]$ representing up spin and $[eqn]$ is down;— $[eqn]$ : the configuration of the lattice $[eqn]$ .

Let $[eqn]$ , $[eqn]$ be two sites connected by a link, then:

[eqn]

Considering the interactions between a site and its nearest four neighbours only, the total energy of this site is:

[eqn]

where $[eqn]$ denotes the set of neighbouring sites of $[eqn]$ .

We simulate the process for both Glauber dynamics [73] and Metropolis [74,75] dynamics. At each time step, a site was selected uniformly at random. Under Glauber dynamics, the site’s spin flips with the probability:

[eqn]

Alternatively, using Metropolis dynamics, the probability to flip is:

[eqn]

where $[eqn]$ is the inverse of temperature and $[eqn]$ is the change in energy after a flip. We assume $[eqn]$ for the purpose of this experiment. The energy change $[eqn]$ is computed as:

[eqn]

The simulation setting is detailed in electronic supplementary material, §S3 Simulation of Ising model, and the source code is available in [76]. For the analysis, we computed all four intrinsic utility measures for systems at different values of control parameter $[eqn]$ . The goal is to compare the range of control parameter $[eqn]$ that optimizes each utility and discuss the implications of the different optimal ranges for each utility measure and the associated characteristics of self-organized behaviours.

Computational results

4.2.

During the simulations, we hold the coupling strength $[eqn]$ constant and run the simulation until the system reaches equilibrium. We then calculate the corresponding predictive information, empowerment, free energy (active inference) and thermodynamic efficiency. To eliminate the effects of initial conditions, we average these quantities across multiple simulations for each $[eqn]$ . This process is repeated for a range of $[eqn]$ values. We aim to identify the optimal range of $[eqn]$ values under each approach in order to answer the question: ‘If the coupling strength $[eqn]$ evolves independently using each of these quantities as the fitness function, what behaviour should we expect when fitness is optimized?’

We collect the following data for the selected site and the lattice at time $[eqn]$ :

— $[eqn]$ : the action of flip (−1) or no-flip (+1) at time $[eqn]$ ;— $[eqn]$ : spin of the selected site before and after the action is performed;— $[eqn]$ : the magnetization of the lattice before and after the action is performed;— $[eqn]$ : the selected site’s sensory state before and after the action is performed.

The data form time series $[eqn]$ , $[eqn]$ , $[eqn]$ , $[eqn]$ , $[eqn]$ , $[eqn]$ , $[eqn]$ , using which we compute the intrinsic utility measures. Leveraging the homogeneity of the lattice sites, we can aggregate the random samples from different sites to compute the measures. This approach ensures that the results represent the intrinsic utility values corresponding to the coupling strength $[eqn]$ as experienced by an average site within the lattice.

Predictive information

4.2.1.

For a given coupling strength $[eqn]$ , the corresponding one-step predictive information is the mutual information between the pre-action sensory state ( $[eqn]$ ) and post-action sensory state ( $[eqn]$ ):

[eqn]

where the probability distributions are parameterized by $[eqn]$ .

Figure 11 shows predictive information $[eqn]$ for each coupling strength $[eqn]$ , with two main observations highlighted. Firstly, predictive information is higher under weak coupling and decreases to nearly zero under strong coupling; this observation applies to both Glauber and Metropolis dynamics. Recall that predictive information can be decomposed into two terms:

[eqn]

Average predictive information plotted against different values of J, computed from 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions. Predictive information maximizes when J is small (weak coupling), where an average site exhibits explorative behaviour.

The overall trend is driven by the diversity component of the equation. Weak coupling promotes more exploratory behaviours, leading to diverse sensory states. In contrast, strong coupling (as $[eqn]$ ) prevents sites from flipping after settling into the lower-energy state, resulting in a predictable system with little sensory diversity.

Secondly, predictive information optimizes at different coupling strengths for Glauber and Metropolis dynamics—a phenomenon driven by the unpredictability term (figure 12). Due to different behaviours of the conditional entropies in the sub-critical regime, predictive information maximizes close to the critical point in Glauber dynamics, while peaks as $[eqn]$ under Metropolis dynamics. A detailed comparison of the two dynamics is provided in electronic supplementary material, §S3 Simulation of Ising model.

Decomposition of predictive information into richness (dotted line) and unpredictability (dash line) components; predictive information is the difference between the two curves. Results obtained from the average over 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions.

Empowerment

4.2.2.

To compute the average empowerment of a site at equilibrium, we first analytically derive the channel capacity of the action channel. In this model, the action channel is defined by the conditional probability $[eqn]$ . The state of pre-action sensory $[eqn]$ determines this conditional probability, and thus, two cases must be considered separately: $[eqn]$ and $[eqn]$ .

When $[eqn]$ , meaning that the up and down spins of the neighbours are not perfectly balanced, a flipping action will result in the next sensory state becoming the opposite of what it was before the flip. The action channel, in this case, resembles the one shown in figure 13. This is a noiseless binary channel and, by definition, has channel capacity $[eqn]$ bit. Full capacity is achieved when the site follows action distribution $[eqn]$ .

A noiseless binary channel. Channel capacity C = 1 bit.

If $[eqn]$ , the channel simply reduces to the one shown in figure 14, that is, a channel that carries no information as the output is always the same. This means that the channel capacity is zero, $[eqn]$ bit.

A single output channel. Channel capacity C = 0 bit.

Combining these two cases, we obtain:

[eqn]

The average one-step empowerment, averaged over the distribution of channels, is computed as follows:

[eqn]

Empowerment optimizes at a strong coupling (figure 15), where the lattice stabilizes with most atoms aligned at the equilibrium. Empowerment measures an agent’s ability to inject information into the environment via current actions and later retrieve the information via its sensors. A site’s action is most perceivable when all its neighbours align in the same direction, in which case the action of flip or no-flip leads to distinct sensory outputs ( $[eqn]$ or $[eqn]$ , $[eqn]$ ). Conversely, if four neighbours have an equal split between up and down spins, flipping the spin of a site does not change its sensory state. That is, the site will not be able to perceive the impact of its action. A large positive $[eqn]$ value increases the probability of an average site being at the configuration where all its neighbours have the same spin, thereby maximizing the site’s empowerment by ensuring its actions produce noticeable changes to its future sensory inputs. Empowerment is not affected by the choice of spin-flip dynamics (Glauber or Metropolis).

Average empowerment plotted against different values of J, computed from 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions. Empowerment optimizes at strong coupling, where the collective stabilizes to a uniformly aligned configuration at equilibrium. Such configuration maximizes an average site's ability to retrieve the impact of its action through future sensory states.

Variational free energy (active inference)

4.2.3.

For the purpose of this study, the active inference framework is adopted from [77], thus focusing solely on its intrinsic component. A negative sign is placed before the expression, effectively transforming the minimization problem into maximizing the action value $[eqn]$ . For each possible action $[eqn]$ , we compute the one-step negative free energy following the derivation from [77]:

[eqn]

where the probability distribution $[eqn]$ is parameterized by coupling strength $[eqn]$ .

The average negative free energy at equilibrium is computed by weighted average across the proportion of $[eqn]$ (flip) and $[eqn]$ (no-flip) actions:

[eqn]

For a given coupling strength $[eqn]$ , this measure represents how much discrepancy an average site should expect between its internal model (based on the local sensory history) and the external world (described by the lattice magnetization); in other words, how well the site’s local information aligns with the underlying global situation.

The negative free energy plot (figure 16) reveals a similar optimal region for $[eqn]$ as empowerment. Maximizing negative free energy effectively minimizes surprise. When $[eqn]$ , all the sites are actively flipping their spins. A chosen site’s local sensory states can take on all possible values, while the global magnetization is averaged to zero. This mismatch results in large surprise for an average site. On the other hand, more atoms are aligned at large $[eqn]$ , increasing the likelihood that an average site correctly predicts the overall spin direction, hence reducing surprise. The same trend holds for both Glauber and Metropolis dynamics.

Average negative free energy plotted against different values of J, computed from 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions. Negative free energy maximizes at strong coupling, where the spins are aligned at equilibrium, and the site experiences minimum surprise comparing its approximate distribution of the world state (based on local sensory information) to the actual distribution that generates the world states (global property).

Thermodynamic efficiency

4.2.4.

Thermodynamic efficiency $[eqn]$ for each corresponding $[eqn]$ is computed as:

[eqn]

The numerator is the derivative of the configuration entropy $[eqn]$ of the lattice with respect to the control parameter $[eqn]$ . It represents the reduction of uncertainty in the lattice’s configuration as a result of a small variation in the coupling strength $[eqn]$ . Using the Kikuchi approximation, the configuration entropy is [78,79]:

[eqn]

where $[eqn]$ is the entropy of size k sub-lattices.

The denominator in this calculation is the integral of Fisher information with respect to the control parameter $[eqn]$ , representing the work required by the system to instigate the change $[eqn]$ . The integration limit extends from $[eqn]$ , the point of evaluation, to $[eqn]$ , the zero-response point. Ideally, $[eqn]$ , but in this numerical experiment, setting $[eqn]$ is sufficient, ensuring that the system reaches perfect order at equilibrium and no further work can be done. The method for numerically computing Fisher information is detailed in electronic supplementary material, §S4.

Thermodynamic efficiency reaches optimum when $[eqn]$ is near the critical value $[eqn]$ [80], as shown in figure 17. At the vicinity of the critical point, even a small increase in the control parameter $[eqn]$ results in a significant reduction in the system’s disorder. Consequently, the work performed to establish order in the system achieves the highest efficiency. This observation indicates that the collective systems that optimize thermodynamic efficiency at the same time operate at the critical regime. The same result holds for both Glauber and Metropolis dynamics.

Average thermodynamic efficiency plotted against different values of J, computed from 100 simulations, each of 20 million time steps on a 50×50 square lattice with periodic boundary conditions. Thermodynamic efficiency optimizes at the critical regime, where a significant portion of the work expended in tuning the parameter J contributes to reducing configurational entropy; that is, the collective is most energetically efficient in creating internal order.

It is important to note that the numerical values of thermodynamic efficiency tend to exhibit more noise than other metrics. This is due to the computation of the entropy derivative and Fisher information. Thermodynamic efficiency peaks slightly to the right of the critical value due to finite-size effect, as shown in figure 18. Despite these numerical nuances, the presented computational results suggest that optimizing thermodynamic efficiency within a collective system is achieved at the critical regime. This argument frames the thermodynamic efficiency as an intrinsic utility and provides an explanation why collective behaviours induced by this utility often exhibit critical phenomena.

Finite-size analysis near the critical coupling Jc for both Glauber and Metropolis dynamics. Results show that as lattice size increases, the peak of thermodynamic efficiency approaches the critical coupling strength Jc.

Principle of super-efficiency

Many natural systems with a large number of interacting components exhibit self-organization, forming larger structures or coherent collective behaviours without external coordination. Locally interacting neurons collectively perform complex brain functions while processing diverse stimuli [7,10]. Active matter comprising self-catalytic colloidal particles produces polar collective motion [81,82]. Starlings form flocks that move in intricate patterns in response to environmental changes [11,83]. These self-organizing collective behaviours are often observed at critical regimes which seem to balance the fluidity (or adaptability) offered by amorphous, disordered groups (e.g. granular materials or liquids) and the stability (or persistence) provided by rigid, ordered structures (e.g. crystalline materials).

The ubiquity of collective behaviours that tend to self-organize near or at critical regimes suggests that there is an underlying principle governing such behaviour across different systems. By abstracting from the specific details of each system, we may uncover not only why collective systems self-organize but also why self-organization often brings the collective close to the critical regime. A possible underlying principle would need to interpret these behaviours in terms of generic intrinsic utilities.

Motivated by these observations, we propose the principle of super-efficiency:

Self-organizing collective systems strive to maximize the thermodynamic efficiency of interactions by gaining maximal predictability of collective behaviour per unit of the expended work. This efficiency is maximized at the critical regime.

In general, the gained predictability enhances coordination within the system, facilitating efficient interactions. Systems with high or maximal thermodynamic efficiency tend to operate at or near the critical regime where long-range correlations and scale invariance bring collective benefits. Thus, thermodynamic efficiency may provide an intrinsic utility to the system by balancing energy costs with group coherence. This would explain the ubiquity of collective animal behaviours, such as swarming, herding, flocking, which self-organize to criticality.

Formally we express the principle of super-efficiency via maximization of the ratio between the reduction in configuration entropy and the incurred generalized work (equation (3.2)). Given a protocol adjusting the interactions among the constituent components of the system, maximization of the thermodynamic efficiency occurs through tuning the corresponding control parameter $[eqn]$ 1:

[eqn]

The expression above quantifies the abstract notion of super-efficiency: a collective system maximizes the ratio of predictability gain to the additional work required to tune the interactions among its components. In the following subsection, we justify this conjecture by taking evolutionary, cognitive and social perspectives.

Maximizing thermodynamic efficiency as an intrinsic utility

5.1.

Thermodynamic efficiency can be optimized in two ways: either by reducing the additional work (i.e. energy consumption) for a fixed predictability gain or by increasing the predictability gain given a fixed energy budget to carry out the additional work.

From an evolutionary perspective, minimizing energy consumption when establishing structural or functional order preserves resources for survival and reproduction [84,85]. Organisms that achieve goals with less energy per adaptation gain a selective advantage. On the other hand, given energy constraints on behavioural change, increasing predictability enhances group coordination and collective survival [86–88].

In terms of cognitive economy [89,90], individuals use heuristics and mental shortcuts to reduce the energy needed for policy and behavioural changes, freeing resources for other cognitive tasks. Under a fixed energy budget, maximizing predictability improves coordination in complex cognitive and social tasks [91,92].

For social dynamics, minimizing the energy required for a social transformation conserves resources for the long-term benefit of society, for both current and future generations [93,94]. Alternatively, given energy limitations, maximizing predictability enhances strategic planning and policy implementation, particularly in times of crisis [95–97].

Super-efficiency in canonical models

5.2.

An earlier study by Nigmatullin et al. [54] derived an analytical expression for the thermodynamic efficiency of interactions for the canonical Curie–Weiss model (a fully connected ferromagnetic model), showing that it diverges at the critical point as the temperature $[eqn]$ :

[eqn]

that is:

[eqn]

We can also show that thermodynamic efficiency diverges at the critical point for the canonical Ising model with nearest neighbour interactions. The thermodynamic efficiency $[eqn]$ as the coupling strength $[eqn]$ is given by (see electronic supplementary material, §S5):

[eqn]

that is:

[eqn]

Evidence from simulated systems

5.3.

Several previous studies of thermodynamic efficiency [51–54], as well as the computer simulation presented in §4, exemplify the general principle of super-efficiency.

Crosato et al. [52] explored this relationship near criticality using a model of self-propelled particles. They defined thermodynamic efficiency $[eqn]$ as the reduction in entropy relative to the work done on the system and demonstrated that as particles undergo a kinetic phase transition from disordered to coherent motion, $[eqn]$ peaks at the critical regime. In other words, the collective motion becomes coherent at the critical point, where the system is most energetically efficient in reducing its configurational entropy. Hence we can argue that at this point, the system of self-propelled particles is super-efficient (i.e. maximally efficient) in coordinating its collective behaviour, offering a clear utility to the group.

This concept was also applied for urban transformations [51], where the maximum entropy principle coupled with Lotka–Volterra dynamics was used to study shifts in population and income distribution in urban areas. The study considered the thermodynamic efficiency $[eqn]$ expressed as the increase in predictability of the population income flows relative to the thermodynamic work required to adjust the social disposition (i.e. the factor balancing the suburbs’ attractiveness). The study identified a phase transition in urban dynamics, where the number of affluent (i.e. service-abundant) suburbs would change abruptly in response to a small change in the underlying social disposition parameter. Importantly, the thermodynamic efficiency was observed to peak at the phase transition, which occurred precisely at the point balancing monocentric and polycentric urban configurations, providing an intrinsic utility for social dynamics.

In the context of epidemic modelling, Harding et al. [53] examined the thermodynamic efficiency $[eqn]$ of contagions diffusing on a network, defining $[eqn]$ as the ratio of uncertainty reduction in the system to work expenditure required to quasi-statically change the control parameter (e.g. the infection transmission rate). Their numerical analyses identified a phase transition between sub-critical (non-epidemic) and super-critical phases (epidemic) as the infection transmission rate increased, with the highest thermodynamic efficiency observed at the critical regime. In this case, the intrinsic utility of an intervention process, considered as social and public health dynamics, is offered by the reduction of pathogen transmission probability: this utility is represented by thermodynamic efficiency maximized at the transition from super-critical to sub-critical phase. Alternatively, the intrinsic utility for the pathogen evolution, considered as a biological phenomenon, would be provided by the increase in the transmission probability: this utility is captured by thermodynamic efficiency maximized at the transition from sub-critical to super-critical phase.

In summary, the principle of super-efficiency is supported by the observed peak in thermodynamic efficiency at the critical regime in various simulated systems, including self-propelled particles, urban transformations and epidemic modelling. Despite differences in the application settings, the intrinsic utility could be characterized by maximization of thermodynamic efficiency achieved at the critical regime.

Evidence from empirical systems

5.4.

The principle of super-efficiency is further supported by empirical studies of collective behaviours in natural systems. While no single empirical study fully justifies the principle, evidence from multiple studies shows that collective systems tend to maximize predictability (through internal order) while minimizing energy consumption.

Empirical studies of European starling show that their flocks exhibit scale-free correlations in velocity and speed fluctuations—a signature of criticality [11,19,83]. Long-range correlations enhance collective responsiveness to external stimuli (e.g. predators, food resources), allowing flocks to achieve higher coordination of group behaviour (i.e. gain predictability). Friman et al. [98] found that starlings flying in small groups save metabolic energy compared to flying solo, especially when the individuals maintain consistent follower positions behind leaders in V-like formations. These energy savings result from structural coordination, reflecting group-level energy optimization through positional ordering. Together, these findings suggest that starling flocks operate near a critical regime, balancing predictability gain with energy efficiency. Importantly, these studies contrasted different flocking configurations (ranging between extremes such as solo and coherent groups), so that higher coordination or lower energy consumption can be interpreted as a gained predictability or additional work required to change the configuration. The thermodynamic efficiency is precisely the ratio of these two quantities.

Experiments with ant colonies [6,99,100] also show evidence of disorder-to-order phase transition and signatures of criticality (scale-invariant dynamics) in ant foraging activities. These studies suggest that ant colonies benefit from operating at the vicinity of critical point where they maintain a certain level of coordination. These studies show that ants move faster in larger groups, indicating efficiency gains from social coordination. Porfiri et al. [101] explained how ant colonies achieve energy savings through social interactions balancing positive and negative feedbacks. Such balance keeps metabolic cost growing sublinearly with colony size, permitting lower energy consumption per individual for larger colonies. Collectively, these findings show that ant colonies self-organize near criticality for maximum efficiency of interactions, where group order is established with low energy consumption. Again, the comparison between different group sizes allows for the interpretation of the coordination increase and energy savings as the predictability gain and additional work, respectively. The ratio of these changes is captured by the thermodynamic efficiency.

The organization and function of brain network also show evidence of balancing predictability gain and energy consumption. Empirical analysis of rat cortex data shows that sensory representation is achieved by reducing entropy (and conditional entropy) of the neuronal responses to stimulus [102]. In other words, sensory adaptation maximizes predictability gain in the neuronal activities. Takagi [103] modelled brain network formation by minimizing the ratio of activity cost over wiring cost and found structural similarities—such as hubs and clusters—between the simulated network and empirical brain networks from various species. By contrasting different network configurations, that is, from completely random networks to small-world networks to fully connected networks, these studies suggest that the brain network evolves to maximize predictability gain while minimizing energy consumption per adaptation.

Implication: why collective systems self-organize to criticality

5.5.

The principle of super-efficiency aims to explain why it is beneficial for a group of interacting agents to operate at the critical regime, rather than how the system could self-organize to this point—the latter question is pursued by SOC models. The underlying rationale is that for a self-organizing system with many interacting components, being energetically efficient in reducing disorder and creating internal coordination is advantageous, as has been demonstrated by the considered simulation and empirical studies.

In short, the thermodynamic efficiency offers an intrinsic utility for a collective. When this utility is maximized at a specific configuration (e.g. coupling strength, network connectivity, social disposition), it optimizes evolutionary fitness, cognitive economy or social welfare. In the considered studies, the super-efficiency of collective behaviour has occurred at the critical regime, which is expected, as confirmed by the divergence of thermodynamic efficiency at the critical point in canonical models. Thus, super-efficiency may provide a general principle for understanding why collective systems self-organize to the critical regime.

To re-iterate, a super-efficient self-organizing system approaches the point where it can gain maximal predictability of its collective behaviour, given the amount of additional work available to change the control parameter. Alternatively, given a desired predictability gain, a super-efficient system seeks the point where the energy cost of changing the control parameter is minimal.

Discussion

Is there an intrinsic utility for self-organizing collective systems to operate at the critical regime? In attempting to explore this question, we overviewed notable intrinsic utility measures, using both information-theoretic and thermodynamic perspectives. The considered measures were directly compared using a common example that we constructed in order to connect the canonical 2D-Ising model to the perception-action loop.

The connection is established by conceptualizing each site in the Ising lattice as an agent possessing sensory-motor capabilities, thereby linking the model to the perception-action loop framework. The choice of spin-flip dynamic is analogous to the embodiment of the agents. Optimization of the control parameter—the coupling strength $[eqn]$ —may be considered analogous to choosing the sensory channels [104] that maximize specific intrinsic utility given the embodiment.

Optimal $[eqn]$ values were computed for different approaches, including predictive information maximization, empowerment maximization, free energy minimization and thermodynamic efficiency maximization. For the considered example, each approach exhibited a distinct optimal range of parameter values, offering intuitive insights into the underlying driver shaping collective behaviour:

—Predictive information maximizes at sub-critical coupling strength for Metropolis dynamics and near-critical regime for Glauber dynamics, balancing sensory richness with predictability;—Empowerment maximizes at super-critical coupling strength, where the individuals have maximal influence over the environment;—Free energy minimization (with intrinsic component only) also leads to super-critical coupling strength, where local observations align most closely with the global configuration, hence surprise is minimized;—Thermodynamic efficiency maximization optimizes near the critical regime, achieving maximum entropy reduction per unit of work expended.

Thus, thermodynamic efficiency, measured by the entropy reduction or predictability gain relative to the associated thermodynamic work carried out, might be a candidate for the intrinsic utility of criticality.

In the Ising model example, each measure exhibits the same characteristic behaviour reported in previous studies—balancing diversity and predictability for predictive information [32,33], maximizing perceivable influence for empowerment [39,40], aligning the internal model with external generative model for active inference [45,46], and optimizing entropy reduction relative to work for thermodynamic efficiency [51,52]. This consistency indicates that the findings we observed in our simulations reflect the measures’ underlying properties rather than artefacts of a specific model. We also tested different Ising model dynamics, both yielding similar results. The considered model represents a broad class of collective systems; this establishes an adequate range of applicability for our comparative analysis.

One limitation of the considered model is its equilibrium dynamics. In contrast, many biological systems operate far from equilibrium, continuously exchanging energy, matter and information with the external environments. Here, we restricted our analysis to equilibrium thermodynamics for two main reasons: (i) previous studies on thermodynamic efficiency are based on equilibrium models, making this a logical starting point for a primer, and (ii) we sought a simplified setting that highlights the characteristics of self-organizing behaviours driven by each intrinsic utility without introducing too many additional assumptions.

Extending the study to non-equilibrium systems is an important direction for future research. For example, a previous study on non-equilibrium flocking [105] has shown that as velocities of the birds become more aligned, the entropic force can break the flock apart unless it is finely counterbalanced by a cohesive force. Hence, an intrinsic utility such as thermodynamic efficiency might serve as a candidate fitness function for achieving such balance. This would establish whether the principle of super-efficiency is applicable to non-equilibrium systems when significant entropic forces are present.

Informed by this analysis, as well as the relevant studies of thermodynamic efficiency [51–54], we proposed a general principle, the principle of super-efficiency, that may explain why collective systems self-organize to the critical point. The principle of super-efficiency states that at the critical point, a self-organizing system achieves an optimal entropy reduction relative to the thermodynamic costs. The ability to reduce entropy efficiently grants the collective system an advantage, offering an intrinsic motivation to operate near the critical point. We believe that the principle of super-efficiency has implications for the broader field of guided self-organization, informing the design of intelligent, adaptive systems that achieve superior coordination, decision-making and resource management.

Bibliography107

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Haken H. 1983 Synergetics: an introduction. In Springer series in synergetics (ed. H Haken), 3rd edn. Heidelberg, Germany: Springer Berlin. (10.1007/978-3-642-88338-5) · doi ↗
2Haken H. 2006 Information and self-organization: a macroscopic approach to complex systems. In Springer series in synergetics (ed. H Haken), 3rd edn. Heidelberg, Germany: Springer Berlin. (10.1007/3-540-33023-2) · doi ↗
3Camazine S, Deneubourg JL, Franks NR, Sneyd J, Theraulaz G, Bonabeau E. 2003 Self-organization in biological systems. In Princeton studies in complexity (eds SA Levin, S Strogatz). Princeton, NJ, USA: Princeton University Press.
4Kauffman SA. 2000 Investigations. New York, NY, USA: Oxford University Press.
5Prokopenko M, Boschetti F, Ryan AJ. 2009 An information‐theoretic primer on complexity, self‐organization, and emergence. Complexity 15, 11–28. (10.1002/cplx.20249) · doi ↗
6Beekman M, Sumpter DJT, Ratnieks FLW. 2001 Phase transition between disordered and ordered foraging in Pharaoh’s ants. Proc. Natl Acad. Sci. USA 98, 9703–9706. (10.1073/pnas.161285298)11493681 PMC 55516 · doi ↗ · pubmed ↗
7Beggs JM, Plenz D. 2003 Neuronal avalanches in neocortical circuits. J. Neurosci. 23, 11167–11177. (10.1523/jneurosci.23-35-11167.2003)14657176 PMC 6741045 · doi ↗ · pubmed ↗
8Shmulevich I, Kauffman SA, Aldana M. 2005 Eukaryotic cells are dynamically ordered or critical but not chaotic. Proc. Natl Acad. Sci. USA 102, 13439–13444. (10.1073/pnas.0506771102)16155121 PMC 1224670 · doi ↗ · pubmed ↗