Urban retail dynamics: insights from percolation theory and spatial interaction modelling
Duccio Piovani, Carlos Molinero, Alan Wilson

TL;DR
This paper explores the relationship between retail activity clustering and road network structure in cities, using percolation theory and spatial interaction models to reveal their interdependence.
Contribution
It demonstrates a strong link between retail dynamics and the hierarchical structure of urban road networks through comparative analysis.
Findings
High agreement between retail clustering and road network hierarchy
Evidence of interdependence between retail activity and road structure
Insights into city spatial organization from combined models
Urban retail dynamics: insights from percolation theory and spatial interaction modelling
D. Piovani
Centre for Advanced Spatial Analysis (CASA), University College London (UCL), 90 Tottenham Court Road, London, W1T 4TJ
C. Molinero
Centre for Advanced Spatial Analysis (CASA), University College London (UCL), 90 Tottenham Court Road, London, W1T 4TJ
A. Wilson
Centre for Advanced Spatial Analysis (CASA), University College London (UCL), 90 Tottenham Court Road, London, W1T 4TJ
The Alan Turing Institute, British Library, 96 Euston Road, London NW1
Abstract
The study of the properties and structure of a city’s road network has for many years been the focus of much work, as has the mathematical modelling of the location of its retail activity and of the emergence of clustering in retail centres. Despite these two phenomena strongly depending on one another and their fundamental importance in understanding cities, little work has been done in order to compare their evolution and their local and global properties. The contribution of this paper aims to highlight the strong relationship that retail dynamics have with the hierarchical structure of the underlying road network. We achieve this by comparing the results of the entropy maximising retail model with a percolation analysis of the road network in the city of London. We interpret the great agreement in the hierarchical spatial organisation outlined by these two approaches as new evidence of the interdependence of these two crucial dimensions of a city’s life.
Introduction
A well-known fact in the study of cities is that retail activities tend to agglomerate. Understanding and describing this phenomenon has interested scientists from different backgrounds for many years [1, 2, 3, 4, 5], but despite this multidisciplinary effort, fresh approaches have struggled to emerge in recent decades. Recent advances in spatial networks [6, 7], and in road networks in particular [8, 9, 10, 11], together with the large increase in available data on urban systems, have renewed interest in the field, with efforts aimed at modelling and measuring the formation of retail agglomerations [12, 13, 14] and relating it to centrality measurements of the city’s road network [15, 16].
The retail model introduced in [17], a benchmark in its field, describes flows of spending power, or money, from population centroids to retail centres. For decades this model has proven successful in predicting the dynamics of retail centres [18]. In the model, retail centres compete for a limited amount of resources, represented by the population, and only the more attractive and better positioned ones manage to survive. This is elegantly done through an entropy maximising model [19], which quantifies the aggregate flow from population centroid to retail centre with only two parameters: one that sets the scaling between a retailer’s attractiveness and its floorspace, and another that defines the cost of moving. Highly visited retail centres grow proportionally to the number of visits, while poorly visited centres shrink in size and are eventually removed from the system. The identity, number and position of the retail centres that survive depend strongly on the values of the two parameters, and it has been shown that the model undergoes a phase transition [20, 21, 22] from a diverse and heterogeneous retail landscape to one where only the most attractive centre, by defeating all other competition, manages to survive. In between these two extreme cases the model describes the formation of retail clusters.
Moreover, in [23] it has been shown how the road network contains footprints of the socio-economic and cultural evolution of a country and its regions. This has been done by applying percolation theory to the network of street intersections in the UK, which made it possible to clearly uncover regional economic patterns in relation to their infrastructure. In this approach clusters are the outcome of a thresholding process and reveal a hierarchical organisation, which is in outstanding agreement with the historical evolution of the same regions and country.
In this paper we repeat the analysis done in [23], but at the city level, on London’s street network, in order to study and compare the road clusters with the retail clusters that emerge from the model. At a macroscopic level the way road clusters merge is very similar to what happens between retail clusters in the model. A low threshold scenario, where many small clusters appear scattered through the system, corresponds to the parameter values that form a heterogeneous and varied retail landscape, while the formation of the giant road cluster corresponds to the configuration consisting of only one large retail centre. It is therefore tempting to bridge these two formalisms and to interpret the formation of retail clusters, described by the model, as a fingerprint of a hierarchical organisation in the economic activities of a city. As we will show in detail, the two approaches describe a very similar urban hierarchical structure, in that the spatial distribution and size of the clusters are in close agreement. This result seems to bring new evidence on the polycentric organisation of the city, and sheds new light on the relationship between the road network of a city and the economic activities that develop on it.
1 Material and Methods
In this section we will go through the details of the methodologies we want to compare, and the data used both for calibration and testing. We will start by defining the retail model, and analysing its main results, to then present the application of the percolation process on London’s road network. Finally, we will comment on the data sources and the calibration procedures used.
1.1 The Retail Model
By following the procedure outlined in [22], we can define the flows $T_{ij}$ from population centroids $i$ to retail centres $j$ as described by the equation
$$T_{ij} = A_i P_i F_j^{\alpha} e^{-\beta c_{ij}} \qquad (1)$$
where $P_i$ is the population in the origin $i$, $F_j$ is the aggregated floorspace of retail centre $j$, and $c_{ij}$ is the cost of moving from $i$ to $j$, which we will simply quantify as the distance. These flows are defined by two parameters, namely $\alpha$, which sets the scaling between the attractiveness and the floorspace of a retailer, and $\beta$, which tunes the cost of moving. $A_i$ is the normalisation factor, which under the constraint of the total outflow being equal to the population, i.e. $\sum_j T_{ij} = P_i$, becomes
$$A_i = \frac{1}{\sum_k F_k^{\alpha} e^{-\beta c_{ik}}} \qquad (2)$$
As one can see in [22], the form of the flows in eq.(1) comes out of an entropy maximising process, and is obtained under the constraints that come from the observed data: the spatial distributions of the population $P_i$, of the aggregated floorspace $F_j$ and of the cost matrix $c_{ij}$. This means that the set of flows $\{T_{ij}\}$ is an equilibrium configuration that depends on the input data as well as on the values of the parameters $\alpha$ and $\beta$. Any small change in the input data would yield a rapid reconfiguration to a new attractor state. We can therefore interpret this process as a fast dynamics.
Moreover, by exploiting eq.(1) and eq.(2), we can predict the evolution of the floorspace distribution $\{F_j\}$, considered constant during the fast dynamics. By calculating the total inflow to retail centre $j$ as $D_j = \sum_i T_{ij}$, we define the dynamics equation as
$$\frac{dF_j}{dt} = D_j - k F_j \qquad (3)$$
which tells us that $F_j$ will increase if $D_j > k F_j$ and shrink in the opposite case. The constant $k$ is there to make sure that all quantities are measured in commensurate units, and relates the flow of people to floorspace. Its value must be calibrated on the data.
The equilibrium solution of eq.(3) is given by the set of equations $D_j = k F_j$, which explicitly become
$$\sum_i \frac{P_i F_j^{\alpha} e^{-\beta c_{ij}}}{\sum_k F_k^{\alpha} e^{-\beta c_{ik}}} = k F_j \qquad (4)$$
The equations in eq.(4) are complicated nonlinear equations that can only be solved iteratively. This is because every variation in any retailer’s floorspace $F_j$ modifies all the other equations.
The model we have just defined has a rich behaviour and describes different types of retail structures $\{F_j\}$, according to the two parameters $\alpha$ and $\beta$. For larger values of $\alpha$, larger shops will be more attractive, and a small $\beta$ implies higher probabilities of interaction over longer distances to achieve the benefits of size. Hence combinations of large $\alpha$ and small $\beta$ generate structures with a small number of large centres, and vice versa. In the bottom panels of fig.(1) we can see how, by fixing $\beta$ and increasing $\alpha$, the number of retail centres decreases, with the ones remaining becoming larger and larger.
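The iterative solution described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an exponential cost term, a dynamics of the form $dF_j/dt = D_j - kF_j$ integrated with explicit Euler steps, and $k$ set to total population over total floorspace. The function names `flows` and `evolve`, the step size and the toy data are all hypothetical.

```python
import numpy as np

def flows(P, F, c, alpha, beta):
    """Flows T_ij = A_i P_i F_j^alpha exp(-beta c_ij); each row sums to P_i."""
    attraction = (F ** alpha)[None, :] * np.exp(-beta * c)  # shape (origins, centres)
    A = 1.0 / attraction.sum(axis=1)                        # normalisation factor A_i
    return (A * P)[:, None] * attraction

def evolve(P, F0, c, alpha, beta, step=0.1, n_iter=5000, tol=1e-10):
    """Relax the floorspace towards D_j = k F_j with explicit Euler steps."""
    F = np.asarray(F0, dtype=float).copy()
    k = P.sum() / F.sum()  # assumed calibration: total population / total floorspace
    for _ in range(n_iter):
        D = flows(P, F, c, alpha, beta).sum(axis=0)  # total inflow to each centre
        F_next = F + step * (D / k - F)              # rescaled step; total floorspace is conserved
        if np.abs(F_next - F).max() < tol:
            return F_next
        F = F
        F = F_next
    return F

# Toy system (made-up numbers): three population centroids, two retail centres.
P = np.array([100.0, 200.0, 150.0])
F0 = np.array([10.0, 20.0])
c = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])  # distance matrix c_ij
F_eq = evolve(P, F0, c, alpha=1.2, beta=0.5)
print(F_eq, F_eq.sum())
```

Because every update recomputes all inflows from the current floorspace vector, a change in one centre feeds back on all the others, which is why the equilibrium has to be reached iteratively. Scanning `alpha` at fixed `beta` with such a loop is one way to explore the transition from many centres to a single winner.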
1.2 Percolation on London’s road network
Percolation processes [24] are a highly studied field of research given their applications to many very distinct fields. We can find percolation-like processes in fields that range from oil extraction [25] to the study of the electrical conductivity of materials [26], from polymerization processes [27] to fire spreading [28], and from epidemiology [29] to other health aspects such as obesity [30]; they are also used to study the structure of brain networks [31].
In this work we apply a percolation process to London’s road network in order to uncover its hierarchical structure, following the procedure used in [23]. In this approach, the nodes of the weighted network are the road intersections, while the links are the roads joining two intersections. These are weighted by their lengths, so two intersections connected by a long road will have a link with a high weight connecting them, and vice versa. The approach undertaken to calculate the percolation of London’s road network consists of the following steps: we begin by setting a threshold $\tau$, then we select every link whose weight falls below that threshold, and extract the subgraph formed by those links. The weakly connected components of the subgraph are the clusters of the network generated by the percolation process for a given threshold. By construction, every node in a cluster is joined to the rest of the cluster by at least one link with a weight smaller than the given threshold. These clusters form a tree structure, given that for two thresholds $\tau_1$ and $\tau_2$, if $\tau_1 < \tau_2$, a cluster generated using $\tau_1$ will be completely contained in a cluster obtained using $\tau_2$. This allows us to construct a hierarchical tree that follows the ordering of the regions induced by the percolation, which uncovers the intrinsic structure of the system.
Percolation is a critical process that presents a phase transition at a critical probability (in our case, threshold). This phase transition coincides with the maximum entropy of the distribution of the cluster sizes, just before the giant cluster takes over and the sizes of the secondary clusters drop to a negligible size. As we can see in fig.(2e) the threshold that generates the maximum entropy configuration is and that is the point where the clusters are simultaneously maximizing their sizes while equilibrating their differences. Below it, the clusters are small, while above it the giant cluster starts to take over the whole distribution. It is therefore the threshold that carries the most information in terms of the distribution of the cluster sizes. The clusters of the percolation separate the network into regions that have a similar density of intersections. Considering that the population is located in buildings, which are located on streets that meet at intersections, it is easy to imagine that those clusters actually correspond to concentrations of population. The study undertaken in this paper, that of finding a correspondence between the retail model and the percolation of the network structure, is therefore one that equates the clusters of the percolation with the location of the population that in turn attracts retail centres to the centroid of their masses.
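The thresholding steps above can be sketched with a small union-find routine. This is an illustrative sketch only: the function name `percolation_clusters`, the `(node, node, length)` link format and the toy lengths are assumptions, and the road network is treated as undirected.

```python
from collections import defaultdict

def percolation_clusters(links, threshold):
    """Cluster the subgraph made of links shorter than `threshold`.

    links: iterable of (node_u, node_v, length) tuples.
    Returns a list of node sets, largest cluster first.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for u, v, length in links:
        if length < threshold:             # keep only links below the threshold
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_u] = root_v    # merge the two components

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted(clusters.values(), key=len, reverse=True)

# Toy chain of intersections with made-up road lengths in metres.
links = [("a", "b", 10), ("b", "c", 30), ("c", "d", 12), ("d", "e", 80)]
low = percolation_clusters(links, 20)    # two small clusters
high = percolation_clusters(links, 50)   # they merge into a single cluster
print(low, high)
```

Nodes whose links are all longer than the threshold do not appear in any cluster, matching the extraction of the subgraph formed by the selected links. Running the routine for an increasing sequence of thresholds and recording which clusters merge gives the hierarchical tree.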
1.3 Data and Calibration
We have gathered data on the retailers in London from the Valuation Office Agency 2010 dataset (https://www.gov.uk/government/organisations/valuation-office-agency). There we have found information on more than retail activities scattered over 20707 different post-codes; for each post-code we have summed the floorspace of the retailers that belong there, and assigned it to its position. This results in different retail centres, which is a level of detail never tested before using the retail model in [17]. Furthermore we use census data at the LSOA level for the population centroids. As mentioned in the previous section, the cost $c_{ij}$ is just a distance matrix, which we were able to fill in given the data on the spatial distribution of the population centroids and the retail centres at the post-code level.
Furthermore, by knowing the total amount of floorspace and the total population in the city, we can calibrate $k$ as
$$k = \frac{\sum_i P_i}{\sum_j F_j} \qquad (5)$$
where we have assumed the wealth to be uniformly distributed among the population.
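One way to see why the two totals are enough to fix the constant is to sum the equilibrium condition over all retail centres. The derivation below is a sketch under stated assumptions: the symbols $T_{ij}$, $D_j$, $P_i$, $F_j$ and $k$ are as in the retail model, with an equilibrium of the form $D_j = k F_j$, inflows $D_j = \sum_i T_{ij}$ and the outflow constraint $\sum_j T_{ij} = P_i$.

```latex
\begin{align}
\sum_j D_j &= \sum_j \sum_i T_{ij} = \sum_i \sum_j T_{ij} = \sum_i P_i
  && \text{(every unit of outflow arrives somewhere)} \\
\sum_j D_j &= k \sum_j F_j
  && \text{(equilibrium condition summed over } j\text{)} \\
\Rightarrow \quad k &= \frac{\sum_i P_i}{\sum_j F_j}
\end{align}
```

With uniform wealth, each person contributes one unit of flow, so $k$ is simply the number of people per unit of floorspace in the city.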
Furthermore we have obtained the road network of London and its surroundings from the OS OpenRoads dataset [32]. This dataset is ideal, not only because it is open-source while maintaining the correct topology of the network thanks to the effort of the people at Ordnance Survey, but also because it comes pre-simplified, in the sense that lanes are collapsed into the centre-lines of roads. The network has been further simplified by collapsing details such as roundabouts and removing all nodes of degree 2, leaving purely the structure of the network. The final network contains 365967 nodes and 438375 links.
2 Results
As explained in the introduction, our aim is to compare the evolution and the spatial distribution of the road clusters emerging from the percolation process with those described by the model we have just presented. This will then allow us to study the extent of the agreement of the hierarchical structure described by these two different formalisms. We start with a qualitative observation of the two evolutions: in Fig.(1) we show the evolution of the spatial distribution of the clusters in the two approaches. The figures in the top panel show the evolution of the percolation clusters, where nodes of the same colour belong to the same cluster, and where the colour indicates the rank of the cluster. In the bottom panels, following the same logic, we present the configuration that comes out of eq.(3), where we have fixed the parameter $\beta$ to and where we vary the value of $\alpha$. At first glance we can see how, by increasing the threshold $\tau$, which will always be measured in metres, in the road percolation approach, and by fixing $\beta$ and increasing $\alpha$ in the retail model, the behaviour is very similar: in both cases the number of clusters decreases while their size tends to increase, and high ranked clusters tend to position themselves in similar positions (see fig.(4a) for an overlap of the two configurations).
For a more quantitative comparison of the macroscopic properties of the two evolutions, we study the size of the giant cluster and the entropy of the cluster sizes for increasing values of $\tau$ and $\alpha$. To calculate the entropy we have used Shannon’s formula
$$H = -\sum_s p_s \log p_s \qquad (6)$$
where $s$ runs over the cluster sizes and $p_s$ is the probability of finding a cluster of size $s$, for each value of $\tau$ and $\alpha$. In Fig.(2) we can see how the evolution of these two quantities follows the same behaviour in both approaches. In the percolation on the road network (bottom panels), fig.(2d) and fig.(2e), for low values of $\tau$, increases in the threshold imply increases in the entropy. This corresponds to a slow increase in the size of the giant cluster. Around the entropy reaches its maximum and we can see a change in the curvature of the giant cluster size, which starts a steeper increase. From then on, as one may expect, the entropy of the system decays to zero and the giant cluster spreads to the whole network.
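The two macroscopic quantities can be computed directly from a list of cluster sizes. The sketch below is illustrative: the function name `cluster_size_entropy` and the example sizes are hypothetical, and $p_s$ is taken to be the fraction of clusters that have size $s$.

```python
import math
from collections import Counter

def cluster_size_entropy(sizes):
    """Shannon entropy of the cluster-size distribution.

    p_s is estimated as the fraction of clusters whose size equals s.
    """
    counts = Counter(sizes)
    n_clusters = len(sizes)
    return -sum((m / n_clusters) * math.log(m / n_clusters)
                for m in counts.values())

# Made-up cluster sizes for one threshold (or one parameter value).
sizes = [120, 45, 45, 12, 7, 7, 7, 3]
print("giant cluster size:", max(sizes))
print("entropy:", cluster_size_entropy(sizes))
```

The entropy is zero when all clusters share one size, including the limit of a single giant cluster, and grows as the sizes become more varied, which is the behaviour used to locate the transition.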
The top panels, fig.(2a) and fig.(2b), show how the clusters that form during the dynamics of the retail model follow very similar dynamics. In this case, for all values of $\beta$, both the decrease in entropy and the increase in the giant cluster size are very slow for . One can see a clear transition happening at for any value of $\beta$ we have tested, although the type of transition depends on its specific value. Higher values of $\beta$ yield smoother transitions, or in other words converge for higher values of $\alpha$, while for lower values of $\beta$ we get sharper transitions. However, no matter the value of $\beta$, the system always ends up with the same winner: the same retail cluster manages to outplay the rest of the competitors and have all the flows in the system directed towards it. This means that for each value of the parameter $\beta$ we are always observing the same transition, which begins from the same initial condition and ends up in the same ground state. We could think of $\beta$ as setting the scale of the transition in terms of $\alpha$: the system always explores the same states, but in a low $\beta$ scenario one needs more coarse grained values of $\alpha$ than in a high $\beta$ system to actually observe them all. This allows us to fix the value of $\beta$ (we will arbitrarily set to ) and study the behaviour of the system only varying the values of $\alpha$. In the two figures on the right, fig.(2c) and fig.(2f), we show the distribution of the sizes of the 10 largest clusters, i.e. the aggregated floorspace for the retail centres and the number of nodes for the road network clusters. In the insets we can see how the distribution has an exponential form in both cases, and how increasing $\tau$ and $\alpha$ has the same effect on the distribution, namely increasing its steepness.
In this section we have seen how at a macroscopic level the two approaches describe very similar dynamics in the formation of clusters, and eventually of a giant cluster. We have also shown how we can fix $\beta$ without loss of generality in the results and how, in the retail model, the parameter $\alpha$ plays the same role as the threshold $\tau$ plays in the percolation. In the following we will measure the spatial similarity of the clusters’ distributions.
2.1 Retail distribution on percolation road clusters
Before moving to a more detailed comparison and analyse their local distribution, we must take a step back, and see how the retail centres we found in the data are distributed on the percolation road clusters bearing in mind that it represents the initial condition for the model’s evolution. This step is interesting for two reasons: on one hand it will tell us if we can learn something on the real retail distribution by analysing its relationship to the road clusters, and on the other hand it will serve as a benchmark to then better quantify the effects of the retail model’s dynamics. In [23] the authors have shown how starting from the road network of the whole of the UK, the cities emerged as clusters of the road network for . Given that our analysis is applied at the city level in London, we will take that as our maximum threshold. Furthermore we have considered as the minimum size of a cluster to be allowed in the system as .
We consider the spatial distribution of retail floorspace aggregated at the post code level, and assign it to the closest node in the road network; for each threshold $\tau$ we then study the fraction of floorspace assigned to the emerging clusters. To do so we compare the fraction of nodes of the road network that are in the system at a given threshold, namely
$$\phi_N(\tau) = \frac{N_c(\tau)}{N} \qquad (7)$$
where $N_c(\tau)$ is the sum of nodes that form all clusters in the system and $N$ the total number of nodes, to the amount of retail floorspace contained in them:
$$\phi_F(\tau) = \frac{F_c(\tau)}{F} \qquad (8)$$
where once again $F_c(\tau)$ is the amount of floorspace assigned to the nodes that belong to the percolation’s clusters and $F$ the total amount of floorspace in the system. We can then study their difference
$$\Delta(\tau) = \phi_F(\tau) - \phi_N(\tau) \qquad (9)$$
A case $\Delta(\tau) = 0$ would indicate a random spatial distribution of the retailers on the road network, meaning that the distribution of retail floorspace on the road network would be independent of the hierarchy indicated by $\tau$, and one would obtain the same $\phi_F$ by selecting the same fraction of nodes, $\phi_N$, using any other criterion. On the other hand, $\Delta(\tau) < 0$ would indicate a tendency of retailers to be on roads that are not yet in the system, while the opposite case, $\Delta(\tau) > 0$, would unveil a spontaneous tendency of retailers to position themselves on highly connected clusters.
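The comparison between node share and floorspace share can be sketched as follows. This is an illustration under assumptions: the function name `cluster_fractions`, the dictionary-based inputs and the toy numbers are hypothetical, and the difference is taken as floorspace fraction minus node fraction.

```python
def cluster_fractions(clusters, floorspace_by_node, n_nodes_total):
    """Return (node fraction, floorspace fraction, difference) for one threshold.

    clusters: list of node sets produced by the percolation at that threshold.
    floorspace_by_node: retail floorspace assigned to its closest road node.
    n_nodes_total: number of nodes in the full road network.
    """
    in_clusters = set().union(*clusters)
    node_fraction = len(in_clusters) / n_nodes_total
    total_floorspace = sum(floorspace_by_node.values())
    clustered_floorspace = sum(area for node, area in floorspace_by_node.items()
                               if node in in_clusters)
    floorspace_fraction = clustered_floorspace / total_floorspace
    return node_fraction, floorspace_fraction, floorspace_fraction - node_fraction

# Toy example: 3 of 10 nodes are clustered but they hold 80% of the floorspace.
clusters = [{1, 2}, {3}]
floorspace_by_node = {1: 50.0, 2: 30.0, 7: 20.0}
print(cluster_fractions(clusters, floorspace_by_node, n_nodes_total=10))
```

A positive difference means the clustered nodes hold more floorspace than a random selection of the same number of nodes would be expected to hold.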
In Fig.(3) we show the behaviour of the three quantities, the fraction of nodes $\phi_N$, the fraction of floorspace $\phi_F$ and their difference $\Delta = \phi_F - \phi_N$, in fig.(3a), (3b), (3c) respectively. We have measured the quantities on the full network (black curve), the network without the giant cluster (yellow curve) and only considering the giant cluster (blue curve). This has been done to make sure the results were not being dominated by the giant cluster. By comparing the figures we can see how $\phi_F$ grows much faster in $\tau$ than $\phi_N$, and we constantly get $\Delta > 0$. Furthermore the red curve shows that up to , this is true even if we exclude the giant cluster from the analysis. While it is clear from these results that retailers tend to position themselves in central locations, what also emerges is a tendency to choose highly connected clusters, or in other words clusters formed by a dense grid of alleys and road intersections. An in-depth analysis of these results would require a study of its own and we leave it to future research; we will now exploit these results to understand the effects introduced by the dynamics described in eq.(1)-(4).
2.2 Comparing the spatial distribution of retail and road clusters.
We have seen how the retailers are more likely to be found on roads that belong to very connected clusters, but we have not yet applied the model’s dynamics to the retailers’ spatial distribution. We have also seen that, just as in the road network percolation the threshold $\tau$ shows the hierarchical relationships of the roads, in the model $\alpha$ describes the formation of retail clusters described in . Now we want to study the analogies and differences of these emerging structures. We do this by repeating the same analysis we have done on the full data set, this time on the equilibrium configurations that come out of eq.(3). Of course, the introduction of $\alpha$ adds a new degree of freedom, and now and .
In Fig.(4) we start by showing an overlap of the clusters obtained with and . At a first glance we can see how big retail clusters tend to lay on big road clusters, and vice versa, and how this is true even for clusters appearing at the periphery of the city. Some clusters that did not exactly overlap lie one next to the other. To quantify this impression, in Fig.(LABEL:fig:tas) and Fig.(LABEL:fig:ratio_ta) we show and for values of ranging from to . For greater than that, the floorspace is mainly contained in the giant cluster which dominates any analysis. Perhaps surprisingly and , indicating that retailers belonging to the clusters have survived the dynamics more than those not belonging to the clusters, and have grown in size. This is true both if we include the giant cluster and if we leave it out. For however of the retail floorspace is concentrated in the retail giant cluster which lies on the road’s giant cluster, therefore not considering it ruins the results. Finally, in Fig.(LABEL:fig:size_corr) we show the correlation between the log of the amount of floorspace on a road cluster and its size. The high levels of correlation imply that big retail clusters tend to position themselves on big road clusters, and this effect is improved by the model.
3 Discussion and Conclusion
Cities have been successfully characterised both by analysing the structure of their road networks [9, 11, 33, 8] and the distribution and nature of their retail activity [17, 34, 35]. Some work has been done to relate these two approaches [15, 16] and the contribution presented in this paper goes in that direction. The presence of a hierarchical structure in road networks has been shown [23], as has the mechanism that leads to it [7]. Percolation theory has proven itself a useful tool to study urban areas [23, 36], but to our knowledge no work has been done to export these concepts to analyse the organisation of retail activities in cities. To do so we have used the singly constrained entropy maximising retail model [17], perhaps the most widely used model in the field, and characterised its results by embedding them on the city’s road network. We have used the city of London as a test case and, given the quality of our results, now plan to extend our research to the whole of the UK.
In other words, we have presented an attempt to relate the different configurations of clusters obtained through purely geometrical means based on the road network with the evolution described by the retail model. We have quantified their agreement by measuring the amount of retail floorspace contained within the clusters and studied how far this is from a random distribution. Furthermore we have studied the correlation between the size of the clusters and the amount of retail floorspace they contain. As we have seen, the configurations obtained by using these two formalisms are very similar, with the spatial distribution of retail clusters being very close to the distribution of road clusters, and the correlation levels we have found are very high. By comparing these results with those obtained using the spatial distribution of retailers in London aggregated at the post code level, we have shown how the model improves the agreement, which therefore does not depend on the original distribution. We believe that the results presented in this paper are important for a number of reasons. We have bridged the results of a model first presented many years ago with an approach that has only recently been applied to study urban spaces, and by comparing the results we have interpreted them in a new way. Indeed we have brought evidence for the existence of a hierarchical spatial organisation of retail activity in London, and believe this result can be very useful in future modelling.
4 Acknowledgements
The authors wish to thank Elsa Arcaute and Michael Batty for insights and comments, and Stanislao Gualdi for precious suggestions on the presentation of the results. The authors wish to acknowledge the support of the EPSRC grant: EP/M023583/1.
- [1] D. L. Huff. A programmed solution for approximating an optimum retail location. Land Economics, 42(3):293–303, 1966.
- [2] W. Christaller. Central places in southern Germany. Prentice-Hall, 1966.
- [3] S. Brown. A perceptual approach to retail agglomeration. Area, pages 131–140, 1987.
- [4] A. G. Wilson. The use of entropy maximising models, in the theory of trip distribution, mode split and route split. Journal of Transport Economics and Policy, pages 108–126, 1969.
- [5] D. McFadden. Econometric models for probabilistic choice among products. Journal of Business, pages S13–S29, 1980.
- [6] M. Barthélemy. Spatial networks. Physics Reports, 499(1-3):1–101, 2011.
- [7] R. Louf, P. Jensen, and M. Barthélemy. Emergence of hierarchy in cost-driven growth of spatial networks. Proceedings of the National Academy of Sciences, 110(22):8824–8829, 2013.
- [8] M. Barthélemy and A. Flammini. Modeling urban street patterns. Physical Review Letters, 100(13):138702, 2008.
