Endogenous labour flow networks
Kathyrn R. Fair, Omar A. Guerrero

TL;DR
This paper introduces a new model for understanding how job transitions change over time, adapting to shifts in the labor market.
Contribution
The novel model generates labor flow networks from agent-level behavior without assuming static network structures.
Findings
The model accurately generates empirical labor flow networks using UK microdata.
It explores how shocks to job and wage distributions alter network topology.
The framework provides a foundation for modeling future labor market dynamics.
Abstract
In the last decade, the study of labour dynamics has led to the introduction of labour flow networks (LFNs) as a way to conceptualise job-to-job transitions, and to the development of mathematical models to explore the dynamics of these networked flows. To date, LFN models have relied upon an assumption of static network structure. However, as recent events (increasing automation in the workplace, the COVID-19 pandemic, a surge in the demand for programming skills, etc.) have shown, we are experiencing drastic shifts in the job landscape that are altering the ways individuals navigate the labour market. Here we develop a novel model that emerges LFNs from agent-level behaviour, removing the necessity of assuming that future job-to-job flows will be along the same paths where they have been historically observed. This model, informed by economic theory and microdata for the United…
Funding: Engineering and Physical Sciences Research Council (http://dx.doi.org/10.13039/501100000266)
Taxonomy
Topics: Regional resilience and development · Regional Economics and Spatial Analysis · Labor market dynamics and wage inequality
Introduction
Labour markets are highly heterogeneous complex systems that shape the economy of every country in the world. Recently, technological changes such as the automation of work and worldwide shocks like the COVID-19 pandemic have produced structural changes that are reshaping labour mobility in new ways. For example, remote working has opened new employment opportunities to people who, previously, may not have applied or been considered for those positions due to geographical constraints. A decade ago, a body of literature that employs network-science methods emerged and grew under the umbrella of labour flow networks (LFNs) [1–12]. Many of these works, however, can only address short-term dynamics, as they operate under the assumption of constant labour market conditions (no structural changes). Thus, new modelling frameworks that can explain the endogenous formation of LFNs from agent-level behaviour (as opposed to historical data on labour flows) are necessary. This paper develops one such framework by combining insights from the LFN literature with well-accepted microfoundations of economic labour market models. We calibrate our model to the UK labour market and conduct a systematic analysis of how the UK LFN responds to changes in the distribution of job positions and wages, two factors that are susceptible to global shocks, free trade agreements, technological change, global supply chain disruptions, etc. We find that LFN structure is more sensitive to changes in the job distribution than in the wage distribution, and that the extent of impacts depends not only on how many positions are affected but on which industries these positions belong to. To the best of our knowledge, this is the first model that is able to reproduce the topology of LFNs with high fidelity (i.e. it is not limited to reproducing stylised facts like degree distributions) through agent-level behaviour and, thus, it represents a milestone towards creating models that can address a broad class of dynamical problems about labour and, more generally, the future of work.
The rest of the paper is organised in the following way. First, we review recent research concerning labour flow networks to situate our own work within this literature. Then, we present the main results from our modelling exercise. Following this, we provide conclusions and discuss avenues for advancement. Finally, we detail our modelling framework, including the data used to inform the model and the methodology used for calibration.
On the study of labour flow networks
In recent years, network science methods have helped improve our understanding of labour dynamics at a highly disaggregated level. This strand of research can be encompassed under the umbrella of LFNs. By using data on employment histories recorded through surveys, administrative databases, and online recruitment platforms, LFN studies characterise the labour market as a complex network where nodes represent jobs or sets of jobs (e.g., firms, industries, occupations, etc.) and edges indicate observed flows between them. These networks reveal structural properties of the labour market with potential implications for its dynamics. Importantly, this literature differentiates itself from the sociological [13] and economic [14] traditions focusing on the diffusion of vacancy information through social networks (see [15, 16] for comprehensive surveys). Instead, LFN studies view realised labour flows as a source of information to infer structural properties of the labour market. For example, a core-periphery structure in a firm-level LFN reveals highly compartmentalised dynamics in the sense that workers need to gain employment in a core firm at some point in their career in order to flow to a different part of the periphery.
While LFN studies have become increasingly popular, most of this work focuses on descriptive analysis of network topologies. For example, using employee-employer matched records from the entire economy of Finland, [1] are the first to apply a systematic analysis of LFNs by looking at metrics such as degree distributions, clustering coefficients, assortativity, and their relationship to firm properties such as size and profits. [2] use a similar dataset from Sweden to determine if certain flows can be predicted by the fact that workers were previously employed by the same organisation. [3] applies community detection algorithms to improve the definition of industrial and occupational groups using US survey data. [4] deploy configuration models to demonstrate that the matching function of aggregate economic models cannot account for the topology of LFNs. [7] use large-scale LinkedIn data to identify geo-industrial clusters at a global scale.
Recently, interest in LFNs has shifted towards a generative modelling point of view. In particular, the focus has been on understanding the dynamics of labour flows when constrained by an LFN. For instance, in the same spirit as in econophysics, [5] develop a random-walk model to study the effect of different network topologies and firm-level parameters on the concentration and dissipation of unemployment after a shock. [8] generalise such a model and show that it can predict empirical firm size distributions while enabling the inference of firm-specific unemployment. [9] applies a similar model to analyse the mobility of workers across occupations. [10] advance upon traditional labour market models by accounting for occupations as sub-markets and updating the job matching function accordingly, using this model to relate the economic resilience of cities to their job connectivity. [6] go beyond the econophysics approach and provide economic microfoundations to the parameters describing the firms’ hiring rates. They find that, in equilibrium, firm behaviour correlates and amplifies the impact of economic shocks on unemployment. [11] utilise a model based on [6] to explore the relationship between labour mobility, savings, wages, and debt. To the best of our knowledge, the studies by [6, 11] provide the only models of worker dynamics on LFNs with economic microfoundations.
Unfortunately, both the data analysis and the existing models in the LFN literature rely on one crucial assumption: that the network is fixed or exogenous. In other words, existing modelling ([5, 6, 8–11]) relies on LFNs constructed from historical data on labour market movements. This is a reasonable assumption in the study of short-run dynamics,1 or when one can discard structural transformations. In other situations, such as aggressive technological changes or the shock of a global pandemic, the job and salary landscape may transform in such a way that historical labour flows are no longer a reliable source of information to explain future labour dynamics. Thus, it is necessary to develop new modelling frameworks that generate observed LFNs endogenously (emerging LFN structure rather than assuming it) and to go beyond pure stochastic processes by providing economic microfoundations. This is a challenging task because, in coupling agent-level economic behaviour, one risks losing parsimony and, hence, empirical usability.
In this paper, we develop a model that fits within the proposed framework and overcomes the associated challenges. In addition, we provide an efficient calibration algorithm, fit the model to comprehensive microdata of UK labour mobility, and systematically analyse the sensitivity of the network topology to restructuring of the job and wage distributions across industries, regions, and occupations. Our approach balances the parsimony of econophysics models against the insights into causal mechanisms provided by agent-level labour market models, in order to achieve economic meaningfulness and empirical reliability.2
This work advancing the LFN literature comes at a time when the economics community is more receptive to the potential of data-driven agent-based economic models [20–22]. More recently, mainstream labour economists have opened up to the idea of including LFNs as part of the rational equilibrium apparatus [23, 24]. However, economists are usually concerned with adjustments in wages and how positions are destroyed and created due to technological changes. None of these are the focus of this paper, as both wages and positions are exogenous. While, in the past, we have analysed the problem of wage formation and hiring strategies ([6]), studying these features requires modelling firms explicitly, and this is something that we decided to leave out for this paper as it is challenging enough to generate mobility patterns endogenously purely from the perspective of workers’ behaviour (and it has not been done before).
The labour economics community that is closest to our work is the one specialised in search and matching models. While previous works have built bridges between matching models and LFNs through network analysis [1, 4], agent-based models [5], Markov processes [8], data science [25], and neoclassical modelling [6], LFN studies have evolved as a separate field for more than a decade now. In fact, our contribution in this paper builds almost entirely on the LFN literature. Should the reader be interested in a comparison of the advantages and disadvantages of this approach, we refer them to [1, 5].
This work also produces an advancement upon the co-occurrence literature that has adopted metrics of economic complexity [26] to analyse the co-occurrence of skills in occupations [12, 27–31] (or industries [30], or regions [31]). Our model, while it employs between-occupation skill similarity, goes beyond this idea to present a more comprehensive picture of the factors that impact labour mobility (e.g. the decision-making of individual workers, the importance of their age in discounting behaviour, their expectations about employment prospects, the impact of geography and industry).3
Materials and methods
Data
The main data source is the UK’s Labour Force Survey (LFS), consisting of longitudinal information about British households and individuals [32]. We utilise data from 2012-2020, collected on a quarterly basis, with individuals being sampled for 5 consecutive waves, and a fifth of the sample being replaced every wave. Currently, each quarterly dataset (covering 5 waves) contains information on approximately 37,000 households (90,000 individuals) from Great Britain and Northern Ireland [33].
The LFS identifies the region, industry, and occupation of the respondent’s job at the time of the interview. It can thus be used to track changes in these variables over time such that regional, industrial, and occupational mobility can be measured and modelled. These data allow us to identify job-to-job flows, where employed individuals move to a new position. Several individual characteristics are also present, for example, sociodemographic ones (age) and employment-related variables (net weekly earnings, hours worked, etc.).
The flows identified in the LFS data characterise the complex mobility patterns that emerge in the labour market through time. LFNs have become a standard way to encode such information in an intuitive and analytically tractable way [1, 4, 6]. We construct three LFNs by accumulating observed labour flows from the LFS data, and grouping these flows by geographical regions (based on the GORWKR variable from LFS, hereafter referred to as regions), industries (UK Standard Industrial Classification sections), or occupations (UK Standard Occupation Classification major groups). Therefore, these LFNs are directed weighted graphs where each node represents a region (or industry, or occupation) and an edge between two nodes represents a flow of labour from one region (or industry, or occupation) to another. If a node were to represent a combination of these three characteristics, these data would yield LFNs that are too sparse, making them unsuitable for model calibration. Nevertheless, with more comprehensive administrative data such as employee-employer matched records–like the ones held by social security agencies and treasuries–this sparsity problem could be overcome.
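The construction described above can be illustrated with a minimal sketch. It assumes each observed move is available as an (origin, destination) pair of node labels; the function name `build_lfn` and the toy records are hypothetical and are not LFS data. Whether within-node moves are kept as self-loops is also an assumption of this sketch.

```python
from collections import Counter


def build_lfn(transitions):
    """Accumulate job-to-job moves into a directed, weighted edge list.

    `transitions` is an iterable of (origin, destination) labels, one per
    observed move. Labels could be regions, industries, or occupations.
    Returns a Counter mapping (origin, destination) -> number of moves.
    """
    edges = Counter()
    for origin, destination in transitions:
        # Assumption: moves within the same node are kept as self-loops.
        edges[(origin, destination)] += 1
    return edges


# Hypothetical toy records: industry of the old job and of the new job.
moves = [
    ("Manufacturing", "Construction"),
    ("Manufacturing", "Construction"),
    ("Construction", "Manufacturing"),
    ("Education", "Education"),
]
lfn = build_lfn(moves)
```

Running the same function on region or occupation labels instead of industry labels would give the other two networks.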
Model
Our model builds on the housing market model developed by [34, 35]. It aims to achieve a balance between the parsimony of stochastic processes à la econophysics and the microfoundations of economic models. This balance is important because the econophysics approach allows highly disaggregated analysis of systems with heterogeneous and rationally-bounded agents, while agent-level microfoundations set socially meaningful causal mechanisms through which these agents respond to the economic environment. In particular, we construct a model where agents optimise leisure and consumption in a myopic fashion. Myopia, as opposed to solving infinite horizon inter-temporal optimisation problems, produces disequilibrium at the micro-level as agents continuously find themselves in situations with incentives for unilateral deviations from the status quo (triggering a stream of labour flows).4 Despite micro-level disequilibrium, the aggregate dynamics reach steady-state behaviour, and this facilitates model calibration. We develop a multi-output stochastic gradient descent algorithm for this purpose, and show (Figure S4) that the calibration method is robust to variation in the number of agents simulated (except in the trivial case of a very low number of agents), and to the number of Monte Carlo simulations performed. We also show that calibration results are able to recover the true parameters of the model when used on simulated data (Figure S5).
Our model consists of N heterogeneous agents representing individuals within the labour market. Each agent has an age, consumption preferences, and budgetary constraints, which they consider when deciding whether to apply or not to vacant job positions. These characteristics are drawn from LFS microdata. With each simulation period, agents age and may die with a probability $1-\omega$, where ω is a function of their current age. Dead agents are replaced by new ones, aged 18, with all other characteristics randomly drawn from their empirical distributions. Positions are individually modelled as objects with characteristics such as wage, industry, region, and occupation. They are exogenous and are created and destroyed at a constant rate (such that filled positions that are destroyed generate unemployed agents). The agents search for and apply to new positions (both while being unemployed and employed). The probability of applying to a position depends on certain similarity properties (that we explain ahead) between the new position and the current one (or the previous one in the case of unemployment), as well as the wage differential. The likelihood of being hired, on the other hand, depends on the affinity of competing applicants. The number of agents and positions remains constant and we focus on studying the impact of two sources of structural change in the LFN topology: a redistribution of job positions and a change to the mean wage, across industries, regions, and occupations. This focus is informed by the large body of evidence suggesting that shocks to the economy (e.g. advances in artificial intelligence and automation, the COVID-19 pandemic [40–43]) alter the types of jobs that are available, where they are located, and the wages associated with them. Next, we provide the full details of each model component.
Job creation and destruction
Suppose there are P job positions in the economy. Each position has the following attributes: a wage, a region, an industry, an occupation, and a state (vacant or filled). If vacant, the position is not associated with any agent. If filled, the position becomes associated with the agent that performs the job. Every step, each position (regardless of whether it is filled or vacant) can be destroyed with a probability λ. Parameter λ is both the job destruction and the job creation rate so, in the same step, on average λP positions are destroyed and subsequently replaced with new ones. The job creation/destruction rate and wages are assumed to be exogenous.
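As a rough illustration of this step, the sketch below destroys each position with probability λ and replaces it one-for-one with a fresh vacant position. The dictionary layout, the helper `draw_position`, and the wage range are hypothetical; in the paper, the attributes of positions are informed by data rather than drawn uniformly.

```python
import random


def churn_positions(positions, lam, new_position, rng):
    """One step of job destruction and creation at rate `lam`.

    Each position (filled or vacant) is destroyed with probability `lam`
    and replaced in place by a fresh vacant one, so the total number of
    positions stays constant. Returns the ids of agents made unemployed.
    """
    displaced = []
    for idx, pos in enumerate(positions):
        if rng.random() < lam:
            if pos["agent"] is not None:
                displaced.append(pos["agent"])
            positions[idx] = new_position(rng)
    return displaced


def draw_position(rng):
    # Hypothetical generator of a vacant position with a uniform wage.
    return {"wage": rng.uniform(15_000, 60_000), "agent": None}


rng = random.Random(0)
positions = [draw_position(rng) for _ in range(1000)]
for agent_id, pos in enumerate(positions[:800]):
    pos["agent"] = agent_id  # fill 800 of the 1000 positions
displaced = churn_positions(positions, lam=0.05, new_position=draw_position, rng=rng)
```

With λ = 0.05 and P = 1000, about 50 positions are replaced per step on average, in line with the λP figure in the text.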
While the reader may wish to make λ or the wages endogenous variables, this would require additional assumptions (e.g. about firm behaviour) and data. Such adjustments complicate the model in ways that we consider unnecessary for the analysis conducted in this paper, and demand further data which are not commonly available in labour force surveys. Additionally, there is plenty of evidence suggesting that firms rarely change wages and, if they do, they do it for very different reasons to those assumed in traditional economic models (e.g. worker loyalty). For instance, in an extensive book, [44] shows through comprehensive employer interviews that wages rarely change during crises, and that employers are reluctant to adapt to the new economic situation in terms of wages. Rather, what tends to be more common is a change in real wages due to shifts in prices, something that requires a macroeconomic model much larger and more complicated than the one presented in our paper. Finally, deciding to keep wages exogenous is a model-closing assumption that is well justified in the scope of our question of interest: can the model generate the nuanced mobility patterns observed in the empirical LFN?
Agents
Agents can be in one of two states: employed or unemployed. They are always associated to the region, industry, and occupation of their current (if employed) or last (if unemployed) position (following [5, 6]). Agents whose positions are destroyed become unemployed.
The agent behavioural component follows the canonical leisure-consumption model. In period t, agent i benefits from l time units of leisure (where $l \in [0,1]$, and $1-l$ denotes the time units devoted to work) and c units of consumption through a utility function
$$ U_{i,t}\left(c_{i,t}, l_{i,t}\right) = c_{i,t}^{\alpha_{i}} l_{i,t}^{1-\alpha_{i}}, \tag{1} $$
where $\alpha_{i}$ is a consumption preference parameter. The intuition behind this standard model is that agents receive utility from the consumption that they can afford through their labour income, and from the leisure time they can spend by not working. The trade-off between working to consume and not working to gain more leisure time is partly determined by the preferences of the agent, which are modelled through parameter $\alpha_{i}$. Hence, a higher $\alpha_{i}$ means that the agent would prefer to work more, as this would enable higher consumption levels. As is customary, we assume that $0 < \alpha_{i} < 1$.
This leisure-consumption model has a more general form that considers inter-temporal choices. For simplicity, we assume that these choices do not involve savings, so all the income earned in period t is spent. This allows us to write the inter-temporal utility function as the series
$$ U_{i} = \sum_{t=0}^{\infty} \gamma^{t} U_{i,t}(c_{i,t}, l_{i,t}), \tag{2} $$
where γ is a discounting factor. This discount factor operates on consumption and is consistent with existing literature. Since consumption choices are time independent, we can rewrite the previous series as
$$ U_{i} = \frac{\gamma^{L_{i}}}{1-\gamma} c_{i}^{\alpha_{i}} l_{i}^{1-\alpha_{i}}, \tag{3} $$
where $L_{i}$ is the age of the agent, so utility is age-dependent as individuals tend to discount future utility differently as they age.
Equation (3) may exaggerate the cognitive capabilities of agents since such calculations in an infinite horizon may seem unrealistic. However, it offers the benefit of replacing computationally intensive (simulation-wise) numerical algorithms with a single calculation, which is important for the computational efficiency of the model.
Next, let us introduce the budget that constrains the consumption choices of the agent. Let $w_{i}$ denote the wage (per time unit) received by an employed agent (remember that this is a feature of the position). Every period, all this income is used for consumption purposes, so we say that the identity $c_{i} = (1-l_{i}) w_{i}$ holds. Then, given the utility function and the budget constraint, the agent solves the maximisation problem
$$ \begin{aligned} \max_{c_{i},l_{i}} \quad & U_{i} = \frac{\gamma^{L_{i}}}{1-\gamma} c_{i}^{\alpha_{i}} l_{i}^{1-\alpha_{i}} \\ \text{s.t.} \quad & c_{i} = (1-l_{i}) w_{i} \end{aligned} \tag{4} $$
While we may sacrifice realism by assuming utility-maximising agents, we emphasise it in that these decisions are difficult to couple perfectly with employment choices. Agents are rationally bounded in the sense that they do not produce sophisticated expectations on the trajectories of future utility such as considering different employment scenarios (i.e. calculating probability distributions on all the potential job opportunities that may arise and the corresponding employment outcomes). Instead, we assume that the leisure-consumption problem is separable from the one of choosing whether to stay in the same job or to take on a new position.5 Thus, when solving the utility maximisation problem, agents assume their wages are fixed. However, in the potential case of being presented with a job offer, they are able to form a counterfactual scenario with the potential new wage and to compare the utility differential.6
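As a numerical sanity check on the maximisation problem stated above, the following sketch substitutes the budget constraint into the per-period utility and searches a grid over leisure. The discount term is a positive constant for a given agent, so it is omitted without changing the maximiser. The function names and the values of α and the wage are illustrative assumptions, not calibrated quantities.

```python
def utility(l, wage, alpha):
    """Per-period Cobb-Douglas utility after substituting c = (1 - l) * wage."""
    c = (1.0 - l) * wage
    return c ** alpha * l ** (1.0 - alpha)


def best_leisure(wage, alpha, grid_size=10_001):
    """Grid search for the leisure share that maximises utility on [0, 1]."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=lambda l: utility(l, wage, alpha))


alpha, wage = 0.6, 30_000.0  # illustrative values only
l_star = best_leisure(wage, alpha)
```

The first-order condition of the substituted problem gives l = 1 − α and c = α w, which the grid search should reproduce up to the grid spacing.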
To finalise the agent component of the model, the solution of the agent’s optimisation problem in period t is
$$ U_{i,L_{i,t}}^{*} = \frac{\gamma^{L_{i,t}}}{1-\gamma} w_{i,t} \alpha_{i}^{\alpha_{i}} \left(\frac{1-\alpha_{i}}{w_{i,t}}\right)^{(1-\alpha_{i})}. \tag{5} $$
Matching, application, and hiring
The probability of a match between a position and an agent depends on the industrial, regional, and occupational similarity between the vacant position and the agent’s current position (or previous position in case of an unemployed worker). These similarity metrics or scores are weighted through free parameters that we calibrate to match the empirical LFN. The idea is that these similarities and their weights capture more fundamental or structural elements of the labour market than the idiosyncratic factors that may be reflected in observed flow data. Thus, should a substantial change in the distribution of jobs or wages take place, this approach would allow for a new network topology to emerge.
Let us consider an agent employed in position k. At any given period t, this agent may be matched to another position g with a probability
$$ \xi(k,g) \propto S(k,g), \tag{6} $$
where $S(k,g)$ is the degree of similarity between the two positions.
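The proportionality above can be turned into a sampling rule in several ways. The sketch below assumes the similarity scores are non-negative and normalises them over a given set of candidate positions; that normalisation, the function names, and the toy scores are assumptions made for illustration, since the text only states proportionality.

```python
import random


def match_probabilities(similarities):
    """Normalise non-negative similarity scores S(k, g) into probabilities."""
    total = sum(similarities.values())
    if total <= 0:
        raise ValueError("at least one similarity score must be positive")
    return {g: s / total for g, s in similarities.items()}


def draw_match(similarities, rng):
    """Sample one candidate g with probability proportional to S(k, g)."""
    candidates = list(similarities)
    weights = [similarities[g] for g in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical scores between the current position k and three vacancies.
scores = {"g1": 0.5, "g2": 0.3, "g3": 0.2}
probs = match_probabilities(scores)
match = draw_match(scores, random.Random(1))
```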
Once a match takes place, the agent applies to position g only if they expect to gain more utility from working in the new job. The agent’s choice is determined by a utility comparison. Let $w_{k}$ and $w_{g}$ denote the wages associated to positions k and g respectively. To make a decision, the agent assesses their future utility by calculating the counterfactual with a new salary and comparing it to the current utility outcome. This choice is expressed through the condition
$$ U_{i}^{*}(g) > U_{i}^{*}(k). \tag{7} $$
Thus, if the future utility stream under the new job is larger than the one from the current job, then the agent has incentives to switch jobs.7 For simplicity, we assume that wage $w_{g}$ is public information and that, if Equation (7) is satisfied, the agent will apply to position g.
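A minimal sketch of this application rule is given below. It evaluates the closed-form optimal utility at the current and the offered wage and applies only when the offer is strictly better. The function names and the numerical values are illustrative assumptions.

```python
def indirect_utility(wage, alpha, age, gamma):
    """Discounted optimal utility, following the closed form given earlier."""
    discount = gamma ** age / (1.0 - gamma)
    return discount * wage * alpha ** alpha * ((1.0 - alpha) / wage) ** (1.0 - alpha)


def will_apply(wage_current, wage_offer, alpha, age, gamma):
    """Apply only if the counterfactual utility exceeds the current one."""
    current = indirect_utility(wage_current, alpha, age, gamma)
    offer = indirect_utility(wage_offer, alpha, age, gamma)
    return offer > current


# Illustrative numbers only.
applies = will_apply(wage_current=28_000, wage_offer=31_000, alpha=0.6, age=35, gamma=0.95)
declines = will_apply(wage_current=28_000, wage_offer=25_000, alpha=0.6, age=35, gamma=0.95)
```

Because age and α are the same on both sides of the comparison, the closed form is proportional to the wage raised to the power α, so for a single agent the condition amounts to the offered wage exceeding the current one.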
After all job searchers have submitted applications, each vacant position ranks its applicants; first, according to the similarity score $S_{k,g}$ and, then, by the order of submission. Following this, a sequential process of making offers takes place. Here, the vacant positions (in random order) hire the best-ranked applicant. Therefore, there may be positions that remain vacant because they are unable to attract suitable candidates (generating frictional unemployment). For agents who are currently unemployed, the position from their most recent employment spell is used to calculate $S_{k,g}$ for both the matching protocol and the applicant ranking process.
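The ranking and sequential offer process can be sketched as follows. The data layout and names are hypothetical. The sketch also assumes that a vacancy whose top applicant has already been hired elsewhere moves on to its next-ranked applicant, a detail the text does not specify.

```python
import random


def hire(applications, rng):
    """Sequentially fill vacancies from ranked applicant lists.

    `applications` maps vacancy id -> list of (similarity, order, agent id),
    where `order` is the submission order. Vacancies are visited in random
    order and each hires its best-ranked applicant who is still available.
    """
    vacancies = list(applications)
    rng.shuffle(vacancies)
    hired = {}
    taken = set()
    for vacancy in vacancies:
        # Higher similarity first; earlier submission breaks ties.
        ranked = sorted(applications[vacancy], key=lambda a: (-a[0], a[1]))
        for _, _, agent in ranked:
            if agent not in taken:  # assumption: skip agents hired elsewhere
                hired[vacancy] = agent
                taken.add(agent)
                break
    return hired


# Hypothetical applications: (similarity score, submission order, agent id).
apps = {
    "v1": [(0.9, 0, "a"), (0.9, 1, "b")],
    "v2": [(0.4, 2, "a")],
    "v3": [],
}
result = hire(apps, random.Random(0))
```

In this toy case vacancy "v3" receives no applications and stays vacant, which corresponds to the frictional vacancies mentioned in the text.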
Job search intensity
Consider that, with probability \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}\mu_{i,t}\end{document} , agent i participates in the matching protocol in period t. We define this probability in terms of the agent’s current employment status, assuming that the likelihood of actively searching differs based on this status, such that
$$ \mu_{i,t} = \begin{cases} \theta_{e} & \text{if agent $i$ is employed in period $t$}, \\ \theta_{ue} & \text{if agent $i$ is unemployed in period $t$}. \end{cases} $$
This search intensity is also known in the agent-computing literature as the activation rate: the probability that an agent is available to engage in interactions during a given period [18].
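As a minimal sketch of this activation step, the function below draws whether an agent takes part in the matching protocol in a given period. The function name and arguments are hypothetical, and any rates passed in are placeholders rather than the values imputed from the microdata.

```python
import random


def participates(employed, theta_e, theta_ue, rng):
    """Return True if the agent joins the matching protocol this period."""
    mu = theta_e if employed else theta_ue  # activation rate mu_{i,t}
    return rng.random() < mu
```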
Model summary
Here we provide a brief overview of the model parameters (Table 1) and of the dynamic processes occurring during each period within a simulation (Algorithm 1). For parameters representing agent attributes, when a new agent is created, a random value is drawn for each attribute from the empirical marginal distributions. Overall, the model is quite parsimonious since its equations are directly interpretable, the causal channels are explicit, and there are few free parameters.

Algorithm 1. Model pseudocode

Table 1. Model parameters. Variables with a partially exogenous nature are those where the initial value is exogenously determined through a random draw from the marginal distribution, but its evolution is endogenously determined by the model. Unless specified otherwise, heterogeneity is measured with respect to the agents.

| Parameter | Description | Nature | Source | Diversity |
|---|---|---|---|---|
| N | number of agents | endogenous | [45] | NA |
| P | number of positions | endogenous | [46] | NA |
| ω | survival probability (age-specific) | exogenous | [47] | heterogeneous |
| α | consumption preference | exogenous | [32] | heterogeneous |
| γ | discount rate | exogenous | [48] | homogeneous |
| w | net wage (annual) | partially exogenous | [32] | heterogeneous |
| L | agent age | partially exogenous | [32] | heterogeneous |
| λ | job creation/destruction rate | exogenous | [49] | homogeneous |
| S | similarity metric | exogenous | calibration, [47, 50–52] | heterogeneous across nodes |
| $\theta_{e(ue)}$ | activation rate for employed (respectively unemployed) individuals | exogenous | [32] | homogeneous |
| U | utility | endogenous | NA | heterogeneous |
Calibration
To instantiate the agent population and run the model, it is necessary to assign values to the model parameters. This is done via a combination of (1) direct imputation from microdata and (2) calibration.
Direct imputation
For the macro-level parameters, we have the number of agents N and of positions P. In 2019 there were approximately 35,000,000 individuals within the UK labour force [45], and the number of job positions in the UK was roughly 36,000,000 with around 800,000 of these standing vacant [46]. Due to the computational costs of calibrating the model at full scale, we specify N and P at a 1:10,000 scale. We select this scale because moving to a 1:1,000 (or finer) scale does not improve the accuracy attained during calibration (Figure S4). The job creation/destruction rate is set to $\lambda = 0.0463$ by averaging the job creation and destruction rates in the UK from 2011-2019 [49]. Age-stratified survival probabilities are obtained from the UK Office for National Statistics’ Life Tables for 2017-2019 [47]. The discount rate $\gamma = 0.9662$ is taken from HM Treasury’s Green Book [48]. The Green Book is a document providing guidance to ministers within the UK government as to how to achieve policy objectives [53].
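To make the 1:10,000 scale concrete, the short calculation below applies it to the approximate figures quoted above. The variable names are illustrative, and the text does not state how vacancies are initialised or rounded in the scaled model, so the vacancy count here is only indicative.

```python
SCALE = 10_000  # one simulated agent or position per 10,000 real ones

labour_force = 35_000_000  # approximate UK labour force in 2019 [45]
positions = 36_000_000     # approximate number of UK job positions [46]
vacancies = 800_000        # approximate number of vacant positions [46]

n_agents = labour_force // SCALE
n_positions = positions // SCALE
n_vacant = vacancies // SCALE

print(n_agents, n_positions, n_vacant)  # 3500 3600 80
```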
The initial age $L_{i,0}$ of each agent is taken directly from the LFS microdata [32]. Several other parameter values are imputed from these microdata. The preference parameter $\alpha_{i}$ is imputed through the identity $\alpha_{i}=\frac{c_{i}}{w_{i,t} l_{i}+c_{i}}$ from the first-order conditions of the utility maximisation problem presented in Equation (4). Activation rates for employed (respectively unemployed) agents are determined by computing the fraction of employed (respectively unemployed) individuals who are actively searching for a new job. Every time a new position is created, a random wage is generated from a normal distribution with mean equal to the empirical mean wage of the industry-region-occupation of the position and with standard deviation equal to the empirical standard deviation of wages associated with the industry-region-occupation of the position.^8^
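The two imputation rules in this paragraph are simple enough to write down directly. The sketch below is illustrative only: the function names are hypothetical, the arguments `c`, `w` and `l` stand for the quantities $c_{i}$, $w_{i,t}$ and $l_{i}$ as they appear in the identity, and the wage draw assumes an untruncated normal distribution because the text does not mention truncation.

```python
import random


def impute_alpha(c, w, l):
    """Preference parameter from the identity alpha = c / (w * l + c)."""
    return c / (w * l + c)


def draw_wage(cell_mean, cell_sd, rng):
    """Draw a wage for a new position from its industry-region-occupation cell.

    cell_mean and cell_sd are the empirical mean and standard deviation of
    wages for that cell; rng is a random.Random instance.
    """
    return rng.gauss(cell_mean, cell_sd)
```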
Similarity metrics
The similarity ($S_{k,g}$) between positions k and g is determined by their geographical proximity, industrial affinity, and occupational closeness. To measure the geographical proximity between two regions, we employ the inverse of the physical (great-circle) distance between their major urban centres [50]. Within the regional similarity matrix R, entry $R_{i,j}$ gives the geographical proximity between regions i and j:
$$ R_{i,j} = 1 - \frac{d_{i,j} - \min \left (d_{i,j}\right )_{\forall i, j}}{\max \left (d_{i,j}\right )_{\forall i, j} - \min \left (d_{i,j}\right )_{\forall i, j}}, $$
where $d_{i,j}$ denotes the physical distance between i and j. This specification is consistent with the literature on urban economics and human geography, where gravity models are built on the same principle.
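A minimal sketch of this min-max rescaling is given below. It assumes that the minimum and maximum are taken over every cell of the distance matrix (including the zero diagonal) and that the distances are not all equal; the function name is hypothetical.

```python
def regional_similarity(dist):
    """Map a matrix of distances d[i][j] to proximities R[i][j] in [0, 1]."""
    values = [d for row in dist for d in row]
    lo, hi = min(values), max(values)
    # The closest pair maps to 1 and the most distant pair maps to 0.
    return [[1 - (d - lo) / (hi - lo) for d in row] for row in dist]
```

With illustrative distances of 0, 100 and 300 between a region and itself and two others, the corresponding proximities are 1, 2/3 and 0.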
The affinity between two industries is determined by the relative importance of one of the sectors as a supplier of the other. To quantify this, we employ the UK’s Input-Output Analytical Tables from 2018 [54]. Within the industrial similarity matrix I, the affinity between industries i and j is defined by
$$ I_{i,j} = \frac{x_{i,j}}{\sum _{k=1}^{n_{i}} x_{i,k}}, $$
where $x_{i,j}$ is the value of the resources produced by industry j that are being used as inputs for industry i and $n_{i}$ is the total number of industries. This metric implies that the affinity scores are not symmetric. Thus, the direction of the employment path of an agent influences their future mobility, which is consistent with the directed nature of LFNs.
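The row normalisation, and the asymmetry it produces, can be illustrated with a small sketch. The function name and the toy input-output values are made up for the example, and the code assumes every industry uses a positive total value of inputs.

```python
def industry_affinity(x):
    """Row-normalise an input-output matrix.

    x[i][j] is the value of resources produced by industry j and used as
    inputs by industry i, so each row of the result sums to 1.
    """
    return [[x_ij / sum(row) for x_ij in row] for row in x]


toy = [[2.0, 6.0],
       [1.0, 1.0]]
affinity = industry_affinity(toy)
# affinity[0][1] is 0.75 while affinity[1][0] is 0.5: the score is directed.
```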
Finally, we determine how close two occupations are by considering their skill composition. The O*NET^®^ system quantifies the level of a given skill required to perform a given occupation [51]. While O*NET^®^ is built for the US, the Labour Market Information for All API provides a mapping from the SOC codes to their O*NET^®^ counterparts [52]. To construct a closeness score between two occupations we apply the following steps:
- Collect data on the skills associated with each SOC code. If the SOC code covers more than one O*NET^®^ code, we take the average skill level across all O*NET^®^ codes.
- Group the skills associated with these SOC codes by their SOC major group and calculate mean values for each skill’s level to generate a skill-vector for each occupation (at the SOC major group level).
- Calculate $O_{i,j}$, the closeness between occupations i and j, given by the cosine similarity of their skill-vectors as follows:
$$ O_{i,j} = \frac{\sum_{s=1}^{n_{s}} \pi^{i}_{s} \pi^{j}_{s}}{\lVert \pi^{i} \rVert \lVert \pi^{j} \rVert}, $$
where $\pi^{i}$ is the skill-vector of occupation i; $n_{s}$ is the number of different skill categories (i.e. the length of the skill-vectors); and $\lVert \cdot \rVert$ denotes the $L^{2}$ norm.
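The closeness score is the standard cosine similarity of two skill-vectors, which can be sketched as follows. The function name is hypothetical and the vectors used in any call would be placeholders, not O*NET skill levels; the code assumes neither vector is all zeros.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length skill-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Because skill levels are non-negative, the score lies in [0, 1]: proportional skill-vectors score 1 and vectors with no skills in common score 0.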
All three matrices are then normalised such that their entries fall within [0,1]. Using these matrices, we construct $S_{k,g}$, the measure of similarity between positions k and g, as
$$ S_{k,g} = R_{k,g}^{\nu ^{R}_{k,g}} I_{k,g}^{\nu ^{I}_{k,g}} O_{k,g}^{ \nu ^{O}_{k,g}}, $$
where $\nu^{R}$, $\nu^{I}$, $\nu^{O}$ are matrices of parameters for weighting the importance of geographical proximity, industrial affinity, and occupational closeness in defining the similarity between two positions. These are free parameters and must be calibrated.
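As a small sketch of how the three components combine, the function below takes the region, industry and occupation scores for one pair of positions together with their exponents. The function and argument names are hypothetical; looking up the right matrix entries for the pair (k, g) is left to the caller.

```python
def position_similarity(r, i, o, nu_r, nu_i, nu_o):
    """Similarity S_{k,g} as a product of powers of the three components.

    r, i, o: regional, industrial and occupational scores in [0, 1].
    nu_r, nu_i, nu_o: the corresponding calibrated exponents.
    """
    return (r ** nu_r) * (i ** nu_i) * (o ** nu_o)
```

For a component strictly between 0 and 1, a larger exponent shrinks its contribution, so raising an exponent makes the two positions less similar. This is the lever the calibration uses.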
Calibration algorithm
Calibration is performed to determine values for the parameters contained in the matrices $\nu^{R}$, $\nu^{I}$, $\nu^{O}$. This calibration procedure is performed using only data generated in the steady state of the model. First, each simulation is run for a sufficient length of time to reach the steady state (where the network of flows has stabilised).^9^ Following this, the simulation runs for an additional period of time at its steady state (enough for the network of flows occurring at the steady state to stabilise), from which the data on labour flows are retrieved and used to inform the calibration.
The transition density matrices corresponding to labour flows observed in the steady state are denoted by $\mathbb{R}_{*}$, $\mathbb{I}_{*}$, and $\mathbb{O}_{*}$. Their counterparts generated from the observed LFNs are defined as $\mathcal{R}$, $\mathcal{I}$, and $\mathcal{O}$.
The steady state labour flows inform a multi-objective gradient descent algorithm (Algorithm 2) proposed by [55] and fully developed by [56]. Within each iteration of the algorithm we run a set of M Monte Carlo simulations (i.e. independent simulations run with the same set of parameters). We define error matrices for each of the LFNs (region, industry, occupation) by $e^{R} = \mathcal{R} - \overline{\mathbb{R}_{*}}$, $e^{I} = \mathcal{I} - \overline{\mathbb{I}_{*}}$, and $e^{O} = \mathcal{O} - \overline{\mathbb{O}_{*}}$ where, for example, $\overline{\mathbb{R}_{*}}$ indicates a matrix of flow density values averaged across the M Monte Carlo simulations. The mean is representative of the underlying flows as, for a large enough agent population, flows will not vary substantially across the M simulations. Thus, we are aiming to minimise the difference between the observed flow densities (indicated within each cell of the flow density matrix) and their point estimates from the model.

Algorithm 2. Calibration pseudocode
We describe the parameter updating protocol taking the regional LFN as an example. For each cell of the error matrix $e^{R}$, if $e_{j,k}^{R} < 0$ then the density of flows between regions j and k in the simulation is higher than in the observed regional LFN. To reduce the magnitude of this error, we would like to decrease the density of flows between regions j and k in the simulation. We may be able to do so by making regions j and k less similar (from the agent’s subjective point of view). This reduces the probability that an agent who currently holds a position in region j (and is actively searching for a new position) is matched with a position in region k. It also decreases the probability that, if the agent from region j is matched with and applies to a position in region k, they will subsequently be hired.
For all pairs of regions where the geographical proximity value $R_{j,k}$ (calculated in Equation (9)) satisfies $R_{j,k} \in (0,1)$, such a reduction can be achieved by multiplying $\nu_{j,k}^{R}$ by a factor $1 + \delta^{R}_{j,k}$. Here, $\delta^{R}_{j,k} = |e^{R}_{j,k}|/\mathcal{R}_{j,k}$ and is thus proportional to the magnitude of the error $e^{R}_{j,k}$.
Increasing the value of $\nu^{R}_{j,k}$ reduces the value of $R_{j,k}^{\nu^{R}_{j,k}}$, making regions j and k less similar to each other. For any $R_{j,k} \in \{0,1\}$, updating $\nu^{R}_{j,k}$ has no effect on the value of $R_{j,k}^{\nu^{R}_{j,k}}$.
In these cases, we rely upon adjustments made to other cells of $\nu^{R}$ in response to the errors in the corresponding cells of $e^{R}$. A similar logic holds for the case where $e_{j,k}^{R} \geq 0$ and we wish to reduce the magnitude of the error by increasing the density of flows between regions j and k.
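One pass of this updating rule over a whole exponent matrix might look like the sketch below. It is an illustration under stated assumptions, not the authors' Algorithm 2: the function name is hypothetical, the factor $1 - \delta$ for non-negative errors is one reading of "a similar logic", and cells whose observed density is zero are left unchanged because $\delta$ is undefined there.

```python
def update_exponents(nu, observed, simulated_mean):
    """Return an updated copy of one exponent matrix (e.g. nu^R).

    nu, observed and simulated_mean are equally sized lists of lists:
    the current exponents, the observed flow densities, and the flow
    densities averaged over the M Monte Carlo runs.
    """
    updated = []
    for nu_row, obs_row, sim_row in zip(nu, observed, simulated_mean):
        new_row = []
        for nu_jk, obs, sim in zip(nu_row, obs_row, sim_row):
            if obs == 0:
                new_row.append(nu_jk)  # assumption: skip undefined delta
                continue
            error = obs - sim  # e_{j,k} = observed - simulated mean
            delta = abs(error) / obs
            # Too much simulated flow: raise the exponent (less similar).
            # Too little simulated flow: lower it (more similar).
            factor = 1 + delta if error < 0 else 1 - delta
            new_row.append(nu_jk * factor)
        updated.append(new_row)
    return updated
```

All cells are updated in the same pass, so a single batch of simulations informs every exponent at once.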
We track the behaviour of the combined mean absolute error (CMAE) metric, given by $\text{CMAE} = \frac{1}{3} \left( \overline{|e^{R}|} + \overline{|e^{I}|} + \overline{|e^{O}|} \right)$, as we iterate over this error-reduction procedure, and select the threshold for this value such that the number of iterations is sufficient for the CMAE value to plateau at a minimum (Figure S4). Similarly to [55], we bound our adjustment factor $(1 \pm \delta^{\cdot})$ by 3/2 (respectively 1/2) as this substantially increases the rate at which we achieve error stabilisation. The calibration error is robust to the number of Monte Carlo simulations (M), and to the simulation scale (i.e. the number of agents, N) except in the case of a very low N-value (Figure S4). We note that, since we update all $\nu^{R}_{j,k}$ values simultaneously, we achieve good efficiency in comparison to a case where parameters need to be updated one-by-one.
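The CMAE and the bounds on the adjustment factor can be sketched as below. The helper names are hypothetical, the overline is read as a mean over all cells of a matrix, and the bounds are read as an upper limit of 3/2 on $1 + \delta$ and a lower limit of 1/2 on $1 - \delta$.

```python
def mean_abs(matrix):
    """Mean absolute value over all cells of an error matrix."""
    cells = [abs(v) for row in matrix for v in row]
    return sum(cells) / len(cells)


def cmae(e_r, e_i, e_o):
    """Combined mean absolute error across the three LFNs."""
    return (mean_abs(e_r) + mean_abs(e_i) + mean_abs(e_o)) / 3


def bounded_factor(factor, lower=0.5, upper=1.5):
    """Clamp an adjustment factor (1 +/- delta) to [1/2, 3/2]."""
    return max(lower, min(upper, factor))
```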
Additionally, the fact that we are able to directly estimate $\nu^{R}_{j,k}$ values is something that is often not possible in models with such a large number of parameters (where indirect inference and surrogate models are common approaches).
Shock experiments
Once the model has been calibrated, we run Monte Carlo simulations including a shock to the underlying job or wage distribution to determine how alterations to these inputs impact network structure. Shocks manifest as the homogenisation of characteristic(s) (i.e. region, occupation) associated with all positions in one or more industries, or changes to the wages associated with these positions (Fig. 1). It is possible to implement less generic shock scenarios (e.g. ones that specify a plausible change to the job and/or wage distribution driven by a specific shock, such as a pandemic). However, we utilise a systematic and stylised approach as we are seeking a more general understanding of how wages and job characteristics impact the topology of the LFN. This avenue of inquiry motivates the need for models in which network structure emerges endogenously.

Figure 1. Shock implementations. Subplots indicate the method of implementing shocks that impact the a) characteristics and b) wages associated with all jobs within an industry.
For a shock impacting positions in n industries the methods described in Fig. 1 are applied to each industry individually. For example, we do not pool unique values for region and occupation across industries when applying a positional shock. If a shock impacts wages, the actual change in mean wage will vary for positions within the industry according to their associated region and occupation. This is because each set of positions (defined according to their region, occupation, and industry) has an associated mean and standard deviation for wage.
In all shock scenarios, the impact of the shock on the LFN structure is determined by comparing the edges in the unshocked network (A) to the edges in the shocked network (B). We compare the vectors of flow densities associated with these edges ($d^{A}$, $d^{B}$ respectively) using the weighted Jaccard distance, given by
$$ 1 - J_{W}(A,B) = 1 - \frac{\sum _{i=1}^{N} \min{\left (d^{A}_{i}, d^{B}_{i} \right )}}{\sum _{i=1}^{N} \max{\left (d^{A}_{i}, d^{B}_{i} \right )}} $$
where N is the total number of flows. This weighted metric allows us to move beyond a consideration of the presence/absence of flows (as would be captured by the unweighted Jaccard distance) to understand how densities are redistributed under a shock. We also examine how shocks impact the average clustering coefficient, which we interpret as the extent to which flows are localised, for shocked and unshocked networks. We limit ourselves to considering these features in order to focus on aspects of the network structure that have intuitive real-world interpretations.
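The weighted Jaccard distance is straightforward to compute from two aligned vectors of flow densities. The sketch below uses a hypothetical function name and assumes both vectors list the same edges in the same order, with at least one positive density.

```python
def weighted_jaccard_distance(d_a, d_b):
    """Weighted Jaccard distance between two vectors of flow densities."""
    overlap = sum(min(a, b) for a, b in zip(d_a, d_b))
    total = sum(max(a, b) for a, b in zip(d_a, d_b))
    return 1 - overlap / total
```

Identical vectors give a distance of 0 and vectors with no shared flow give 1. Two networks that use the same edges with different densities fall in between, which the unweighted form would not detect.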
Finally, we examine shock impacts at the level of individual flows (e.g. job-to-job transitions from region i to j). We consider whether the distribution of differences in flow densities between shocked and unshocked networks differs significantly (according to a two-sided Mann-Whitney U test at a 5% significance level) from the distribution of differences in flow densities between two sets of unshocked networks. In other words, we ask ourselves: do the flow densities under a simulation including a shock differ significantly from those under a simulation without a shock, if we account for the natural variation in flow density due to the stochastic nature of the model? By focusing on flows where the density does differ significantly, we can understand how shocks re-distribute job-to-job transitions across the LFNs.
Results
Our analysis focuses on UK labour flows between 2012 and 2020. Due to the coarseness of these data,^10^ we present our results in terms of relatively aggregate LFNs: one for industries, another for geographical regions, and one more for occupational groups (all part of the same system, not independent of each other). First, we show how the model is able to reproduce these three LFNs (simulated LFNs are shown in Fig. 2 and Figure S3). Then, we present results on counterfactual analyses where two types of shocks are introduced: one on the distribution of jobs and another on wages.

Figure 2. Simulated labour flows between geographical regions within the UK. The plot shows inter-region flows only, as intra-region flows tend to be substantially higher due to the localised nature of job search. The colour of a flow corresponds to the region from which that flow originated. Labels and descriptions for regions are provided in Table S2.
Calibration
The full results from our parameter calibration (refer to Materials and Methods: Calibration) are shown in Table 2 and Table 3. This calibration method estimates all parameters simultaneously and is robust when subjected to changes in scale and to the number of Monte Carlo simulations; it produces consistent results across multiple runs of the calibration method (Figure S4).^11^ Here, we summarise the goodness of fit obtained from our calibration procedure across the three LFNs using alternative measures: the Pearson correlation between the job-to-job flows in the empirical and the simulated LFN, the Frobenius norm (again calculated on flows), and the p-value for a permutation test on the Spearman’s ρ for the PageRank of the nodes in the observed and simulated LFN.^12^

Table 2. Correspondence between observed and simulated LFNs. Values presented first in each cell correspond to calculations comparing observed LFNs to those generated from Monte Carlo simulations. Values subsequently presented in brackets correspond to calculations performed to compare observed LFNs with their associated similarity matrix (see Materials and Methods: Calibration - Similarity metrics). Smaller values for the Frobenius norm indicate better agreement between the two matrices considered. The row labelled “Total” corresponds to a weighted average value taken across the LFNs for region, industry, and occupation.

| Network | Pearson correlation coefficient | Frobenius norm |
|---|---|---|
| Region | 0.98 (0.32) | 0.05 (1.85) |
| Industry | 0.96 (0.39) | 0.06 (0.19) |
| Occupation | 0.93 (0.42) | 0.09 (0.89) |
| Total | 0.96 (0.38) | 0.07 (0.98) |

Table 3. Comparison of the significance of the Spearman’s ρ for the PageRank of nodes in the observed and simulated LFNs. p-values presented in each cell correspond to calculations comparing observed LFNs to those generated from Monte Carlo simulations.

| Network | p-value for Spearman’s ρ |
|---|---|
| Region | 0.000002^∗∗∗^ |
| Industry | 0.000002^∗∗∗^ |
| Occupation | 0.003067^∗∗∗^ |

^∗∗∗^p<0.01, ^∗∗^p<0.05, ^∗^p<0.1
Across all the LFNs, values for the Pearson correlation coefficient and the Frobenius norm (displayed in Table 2) indicate a high level of agreement between the observed and simulated job-to-job flows (i.e. the LFNs). We additionally calculate the same metrics to compare observed LFNs and observed similarity matrices (refer to Materials and Methods: Calibration - Similarity metrics). In summary, similarity matrices reflect more fundamental constraints that, arguably, would not change so easily with a shock, for example, the essential (most basic) skills required to perform the job. If there were a high level of agreement between these similarity matrices and the observed LFNs, one could say that labour flows are fully explained by these data, and that our model would have little to contribute. We show the Pearson coefficients and the Frobenius norm for this comparison in brackets in Table 2, and confirm that the simulated flows (i.e. the model output) provide a substantially better fit to the observed flows than the similarity matrices do.
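The two flow-based measures in Table 2 can be illustrated with a minimal sketch. It assumes that both LFNs are stored as square matrices of flow counts and that they are first normalised to flow densities (entries summing to one); the normalisation, the function names, and the three-node example counts are illustrative assumptions, not taken from the calibration procedure itself.

```python
import math

def to_densities(flows):
    """Scale a square matrix of flow counts so that its entries sum to 1."""
    total = sum(sum(row) for row in flows)
    return [[value / total for value in row] for row in flows]

def pearson(a, b):
    """Pearson correlation between the flattened entries of two matrices."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def frobenius_distance(a, b):
    """Frobenius norm of the element-wise difference a - b."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

# Hypothetical 3-node flow counts (rows: origin, columns: destination).
observed = to_densities([[50, 5, 2], [4, 60, 3], [1, 6, 40]])
simulated = to_densities([[48, 6, 3], [5, 58, 2], [2, 5, 42]])

print(round(pearson(observed, simulated), 3))
print(round(frobenius_distance(observed, simulated), 3))
```

A correlation close to one together with a Frobenius distance close to zero corresponds to the pattern reported for the simulated LFNs, whereas the bracketed similarity-matrix values in Table 2 show lower correlations and larger distances.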
Finally, the p-values for all permutation tests on the Spearman’s ρ values for the PageRank of the observed and simulated LFNs are significant at the 1% level (Table 3). This indicates that we can reject the null hypothesis that the PageRank values of the nodes in the observed and simulated LFNs are not positively correlated. From these analyses, we conclude that the model provides useful information to explain the structure of these LFNs.^13^
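The permutation test on the PageRank ordering can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for Table 3: PageRank is computed by plain power iteration with a damping factor of 0.85, the test is one-sided (matching the null of no positive correlation), and the five-node flow matrices are hypothetical.

```python
import random

def pagerank(weights, damping=0.85, iterations=100):
    """Power-iteration PageRank; weights[i][j] is the flow from node i to node j."""
    n = len(weights)
    out = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Rank held by nodes without outflows is spread evenly over all nodes.
        dangling = sum(rank[i] for i in range(n) if out[i] == 0)
        rank = [(1 - damping) / n + damping * (dangling / n + sum(
                    rank[i] * weights[i][j] / out[i] for i in range(n) if out[i] > 0))
                for j in range(n)]
    return rank

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            result[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (inputs must not be constant)."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    scale = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return cov / scale

def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """One-sided p-value: share of shuffles whose rho is at least the observed rho."""
    rng = random.Random(seed)
    observed_rho = spearman(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if spearman(x, shuffled) >= observed_rho:
            hits += 1
    return (hits + 1) / (n_permutations + 1)

# Hypothetical 5-node flow matrices for an observed and a simulated LFN.
observed_flows = [[0, 9, 1, 0, 2], [3, 0, 4, 1, 0], [1, 2, 0, 6, 1],
                  [0, 1, 2, 0, 1], [5, 0, 0, 1, 0]]
simulated_flows = [[0, 8, 2, 0, 2], [3, 0, 5, 1, 0], [1, 2, 0, 5, 1],
                   [0, 1, 2, 0, 2], [4, 0, 0, 1, 0]]
print(permutation_p_value(pagerank(observed_flows), pagerank(simulated_flows)))
```

With only a handful of nodes the smallest attainable p-value is bounded by the number of distinct permutations, so the very small p-values in Table 3 depend on the LFNs having more nodes than this toy example.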
Shocks
We introduce shocks to our simulations to gain an understanding of how changes to the underlying job and wage distributions (such as could result from rapid technological advancement, or a global financial crisis) might alter the job-switching decisions made by individuals, and thus the structure of the LFN. We consider two types of shocks: one where we alter the occupation and region associated with all jobs within a given industry (or set of industries), and another where we alter the wages of all positions within an industry (or set of industries) by shifting the mean wage associated with a group of positions up/down by two standard deviations. As all positions within a shocked industry are subject to the perturbation, the proportion of total positions shocked depends on how many positions are associated with the perturbed industry (or industries). We measure the impact of a shock using the weighted Jaccard distance, which indicates the dissimilarity between the flow densities (i.e. the proportion of job-to-job transitions occurring along a given edge of the LFN) in the shocked and unshocked LFNs. The weighted form is used instead of the unweighted to capture information regarding what fraction of all job-to-job transitions occur along any given edge (i.e. the weight of that edge), not only along which edges job-to-job transitions occur. In other words, the volume of flow matters, not only the path.
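To make the distinction between the weighted and unweighted forms concrete, the following sketch uses the standard weighted Jaccard distance, one minus the ratio of summed element-wise minima to summed element-wise maxima. The text does not spell out the formula, so treating this as the definition used is an assumption, as are the function names and the two-node example counts.

```python
def flow_densities(flows):
    """Convert flow counts into the share of all transitions on each edge."""
    total = sum(sum(row) for row in flows)
    return [[value / total for value in row] for row in flows]

def weighted_jaccard_distance(a, b):
    """1 - sum(min(a_ij, b_ij)) / sum(max(a_ij, b_ij)) over all edges."""
    pairs = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    denominator = sum(max(x, y) for x, y in pairs)
    if denominator == 0:
        return 0.0  # two empty networks are treated as identical
    return 1 - sum(min(x, y) for x, y in pairs) / denominator

def unweighted_jaccard_distance(a, b):
    """Jaccard distance between the sets of edges that carry any flow."""
    edges_a = {(i, j) for i, row in enumerate(a) for j, v in enumerate(row) if v > 0}
    edges_b = {(i, j) for i, row in enumerate(b) for j, v in enumerate(row) if v > 0}
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1 - len(edges_a & edges_b) / len(union)

# Hypothetical counts: the same edges are used, but the volumes differ.
unshocked = flow_densities([[60, 20], [10, 10]])
shocked = flow_densities([[30, 40], [20, 10]])

print(unweighted_jaccard_distance(unshocked, shocked))
print(round(weighted_jaccard_distance(unshocked, shocked), 3))
```

In this example both networks use exactly the same edges, so the unweighted distance cannot distinguish them, while the weighted distance registers the redistribution of flow volume.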
Shock size
When a shock impacts the job distribution, the pattern of flows that emerges as individuals switch jobs differs substantially from the pattern generated from a simulation where no shock has occurred. This is evident from the value of the weighted Jaccard distance, which indicates the dissimilarity between the flow densities (i.e. the proportion of job-to-job transitions occurring along a given edge) in the shocked and unshocked LFNs (Fig. 3). However, even shocks that impact a single industry (and thus only a small proportion of jobs) will, in some cases, cause the Jaccard distance value to rise above the range of values that we would expect to observe when comparing two unshocked networks (Fig. 4). We treat this additional distance (i.e. difference between the weighted edge sets) as a result of the shock restructuring the network. We also note that shocks increase the weighted average clustering coefficient of the LFNs; i.e. shocks increase the locality of job-to-job transitions, particularly in the case of the occupation LFN.
Figure 3. Relationship between the size of a shock and its impact on labour flows. Subplots indicate weighted (a-c) Jaccard distance and (d-f) average clustering coefficient values for the region, industry, and occupation LFNs. Each point corresponds to the average value taken across a suite of Monte Carlo simulations that all include a shock to the same set of industries. The fraction of positions shocked is calculated as the fraction of all positions that are contained within the set of industries shocked, based on the underlying job distribution. Dashed lines indicate the range of values obtained from simulations where no shock has occurred. This variation between simulations in the absence of shocks is a result of the stochastic nature of the model.
Figure 4. Changes to LFN structure resulting from a single-industry shock. Subplots indicate weighted Jaccard distance values for (a) region, (b) industry, and (c) occupation LFNs obtained by simulating shocks on an industry-by-industry basis. Each point corresponds to the average value taken across a suite of Monte Carlo simulations using the same shocked industry, with the size of the point indicating the number of positions contained within the industry. Dashed lines indicate the range of values obtained from simulations where no shock has occurred. This variation between simulations in the absence of shocks is a result of the stochastic nature of the model.
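The two quantities plotted in Figs. 3 and 4, the fraction of positions shocked and whether a shocked value falls outside the no-shock range, can be sketched as below. The position counts, the "Other" category, and the distance values are hypothetical placeholders, and the helper names are illustrative.

```python
def fraction_shocked(positions_by_industry, shocked_industries):
    """Share of all positions that belong to the shocked set of industries."""
    total = sum(positions_by_industry.values())
    shocked = sum(positions_by_industry[name] for name in shocked_industries)
    return shocked / total

def outside_baseline(value, baseline_values):
    """True if value lies outside the range spanned by the unshocked runs."""
    return value < min(baseline_values) or value > max(baseline_values)

# Hypothetical job distribution (number of positions per industry).
positions = {"Education": 130, "Motor trade": 120, "Real estate": 20, "Other": 730}

# Hypothetical weighted Jaccard distances between pairs of unshocked runs,
# i.e. the band marked by the dashed lines in the figures.
baseline_distances = [0.021, 0.034, 0.028, 0.030]

print(fraction_shocked(positions, ["Education"]))
print(outside_baseline(0.12, baseline_distances))
```

Comparing against the full range of unshocked values is a conservative reading of the dashed lines: a shocked suite is only treated as restructured when its average distance exceeds every value that stochasticity alone produced.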
Shock location
The extent of shock impacts depends not only on the proportion of jobs that have been shocked, but also on which industries have been shocked. Industry-by-industry shocks (Fig. 4) confirm this result. Health & social, Motor trade, and Education are the three largest industries, by number of positions, accounting for roughly 38% of positions spread relatively evenly between the three (approx. 12-13% of positions within each). While shocks to these industries have substantial impacts on the LFNs, in the industry and occupation LFNs shocks to Motor trade have smaller impacts than shocks to Education and Health & social. Returning to multi-industry shocks, we observe similar results for the occupation LFN. Shocks impacting roughly 40-50% of positions and those impacting 80% or more lead to similar values for the weighted Jaccard distance (roughly spanning 0.15-0.30) and the average weighted clustering coefficient (0.025-0.040) (Fig. 3c,f). However, in general, there is a positive relationship between the fraction of positions impacted and the values of the weighted Jaccard distance and average clustering coefficient.
Distribution of shock impacts
The way that alterations to job-to-job transition densities (resulting from shocks) are distributed across the LFNs also depends on the specific industry that is being shocked (Fig. 5). For some industries (e.g. education), the change in flow density within that industry in response to a shock (in the case of education, a substantial decrease) is not accompanied by a substantial increase in flow density elsewhere (Fig. 5a-c). This suggests that numerous industries are “sinks” that absorb the workers who, prior to the shock, would have remained within the education industry when switching jobs. In contrast, when other industries (e.g. public, encompassing public administration, defence, and compulsory social security) are shocked, certain industries (e.g. motor trade) act as the primary “sinks” for those transitioning from jobs in the public industry (Fig. 5d-f). This pattern of multiple sources and sinks is evident across all three LFNs, though for the region and occupation LFNs we primarily see changes to movements within regions (respectively occupations), not between them. In addition to these differences in how shocks are distributed depending on the industry being shocked, we also observe differences in the distribution of shock impacts across suites of simulations where a given industry has been shocked.
Figure 5. Changes in labour flow densities resulting from a shock. Subplots indicate changes to labour flows between/within regions, industries, and occupations for a shock to (a-c) education and (d-f) the public sector. Values are calculated by obtaining the mean density across a set of shocked LFNs and a set of unshocked ones, both generated by running suites of Monte Carlo simulations, and taking the difference between these two quantities (change = shocked - baseline). Only changes to flow densities that are larger in magnitude than we would expect to result from model stochasticity (and thus are assumed to be a consequence of the shock) are shown.
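The calculation described in the Figure 5 caption (change = shocked - baseline, keeping only changes beyond model stochasticity) can be sketched as follows. The caption does not state how the stochasticity threshold is defined, so the rule used here, the largest deviation of any unshocked run from the unshocked mean on that edge, is an assumption; the two-node density matrices are hypothetical.

```python
def mean_matrix(matrices):
    """Element-wise mean across a list of equally sized square matrices."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]

def significant_changes(shocked_runs, baseline_runs):
    """Shocked-minus-baseline mean densities; edges within baseline noise are set to 0."""
    base_mean = mean_matrix(baseline_runs)
    shock_mean = mean_matrix(shocked_runs)
    n = len(base_mean)
    changes = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            change = shock_mean[i][j] - base_mean[i][j]
            # Assumed noise level: largest deviation of an unshocked run from the mean.
            noise = max(abs(run[i][j] - base_mean[i][j]) for run in baseline_runs)
            if abs(change) > noise:
                changes[i][j] = change
    return changes

# Hypothetical flow densities from two unshocked and two shocked simulation runs.
baseline_runs = [[[0.40, 0.10], [0.10, 0.40]],
                 [[0.42, 0.08], [0.10, 0.40]]]
shocked_runs = [[[0.30, 0.10], [0.20, 0.40]],
                [[0.30, 0.09], [0.21, 0.40]]]

for row in significant_changes(shocked_runs, baseline_runs):
    print([round(value, 3) for value in row])
```

Negative entries mark edges that lose flow density after the shock and positive entries mark edges that gain it, which is how sources and sinks would be read off such a matrix.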
Other shocks
We also consider shocks impacting the wages associated with positions in a given set of industries (see Figure S6, Figure S7). The impact of these shocks on LFN structure is negligible in comparison to the effect of shocks impacting position characteristics. In general, the weighted Jaccard distance and average clustering coefficient values for shocked networks do not vary substantially from those values obtained from unshocked networks. This is not unexpected; even if an increase (respectively decrease) to the wage associated with a position means that an agent currently receiving a high wage would now apply to (respectively no longer apply to) that position, there will still likely be some lower-paid agent on the same node as this high-earning agent who will be willing to apply. As such, we are likely to observe a very similar set of flow densities (but not necessarily micro-level trajectories). This is mainly due to the broad categories used for our region/industry/occupation characteristics, which may mean that individuals with dissimilar wages are grouped together on the same node. However, this highlights an aspect of the model not leveraged within this study: the ability to track micro-level trajectories. If we were to track these paths, it would be possible to gain further insight into the impact of wage shocks by observing which agents (e.g. in terms of their wages) alter their trajectories in response to a shock. While this falls outside the scope of our current analysis, this avenue of exploration will be pursued in future work.
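A wage shock of the kind described (shifting the mean wage of a group of positions up or down by two standard deviations) can be sketched as below. Several details are assumptions made for illustration: positions are represented as dictionaries with hypothetical `industry` and `wage` keys, the standard deviation is taken over the wages within each shocked industry, and every wage in the industry is shifted by the same amount.

```python
import statistics

def shock_wages(positions, shocked_industries, direction=1, n_sd=2.0):
    """Return a copy of positions with wages in shocked industries shifted by n_sd SDs.

    direction is +1 for an upward shock and -1 for a downward one. A downward
    shift is not clamped, so a real implementation may need a wage floor.
    """
    shifts = {}
    for industry in shocked_industries:
        wages = [p["wage"] for p in positions if p["industry"] == industry]
        # Population SD of the wages currently offered within this industry.
        shifts[industry] = direction * n_sd * statistics.pstdev(wages)
    return [dict(p, wage=p["wage"] + shifts.get(p["industry"], 0.0))
            for p in positions]

# Hypothetical positions; the wage values are placeholders.
positions = [
    {"industry": "Education", "wage": 20.0},
    {"industry": "Education", "wage": 30.0},
    {"industry": "Real estate", "wage": 25.0},
]

for position in shock_wages(positions, ["Education"], direction=1):
    print(position)
```

Because every wage in a shocked industry moves by the same amount, the ordering of wages within that industry is preserved, which is consistent with the observation that node-level flow densities change little even when individual application decisions do.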
Our results demonstrate that the UK’s LFNs can be replicated with a high level of accuracy using our individual-level model. This is the first instance in which a model has been used to emerge LFN structure from agent-level behaviour. Furthermore, we show how changes to the nature of positions (their occupation and region, or the associated wage) impact LFN topology. These experiments, where LFN topology evolves in response to a shock, move beyond previous methodologies where ad hoc assumptions needed to be made about how LFN structure would be altered by the introduction of a shock.
Discussion
This paper advances upon previous studies of LFNs by introducing an agent-level model that emerges LFNs from micro-level behaviour. The model, once calibrated, is used to explore changes to LFN structure resulting from the restructuring of underlying job and wage distributions, something that demands attention at a time of substantial restructuring in many labour markets. As alterations to the wage distribution have little impact on LFN structure (though they may affect worker income and other micro-level outcomes), we focus on positional restructuring. These changes alter job-to-job flow densities and increase the average clustering within the LFN. The magnitude of the resulting impacts increases with the proportion of positions affected. However, the magnitude and distribution of impacts are not solely dependent on the number of positions affected, with considerable variation in outcomes when changes are applied to industries containing similar numbers of positions. For example, shocking the Health & social sector has a much larger impact on job-to-job flow densities across the region, industry, and occupation LFNs than shocking the Real estate sector, despite both containing a similar number of positions. These results highlight the need for models that endogenously emerge network structures.
This modelling framework opens up numerous possibilities for exploring labour market dynamics. By removing the need to assume a static network structure, it can be used to provide insights into plausible trajectories along which the labour market might evolve under different scenarios (e.g. a period of rapid technological change, the establishment of new international trade agreements, etc.). Scenarios where multiple shocks impact the labour market could also be explored. By exploring alternatives where one or more of these shocks are not present, an understanding of how these shocks interact to impact labour mobility could be achieved. Additionally, as the model is built on agent-level behaviour and leverages large-scale microdata, it provides highly granular insights into career trajectories to inform evidence-based policymaking. Due to the flexible nature of the modelling framework, it has broad applicability beyond the UK context we have focused on. Furthermore, because of its micro-level economic specification, it allows the analysis of shocks and interventions different from the ones discussed in this piece, e.g. income taxes, skill-up policies, social transfers, unemployment and matching programmes, etc. The model, which is already quite parsimonious, can be adapted to accommodate whatever data are available. These data could come from the labour force surveys administered in other countries. In particular, there is great flexibility in terms of the level of aggregation of the characteristic variables (geography, industry, occupation). As such, a broad spectrum of data can be admitted into the model, from the highly disaggregate data found in the labour force surveys of Nordic countries, to the more aggregate data collected in developing nations. Additionally, model structure and agent behaviour can be altered should the user wish to explore some aspect of labour markets (e.g. job precarity, or the impact of skills on career trajectories) that the model does not currently account for.
Supplementary Information
Below is the link to the electronic supplementary material. (PDF 11.7 MB)
