Analyzed the data: CR SMK M. Batty M. Barthélemy. Wrote the paper: CR SMK M. Batty M. Barthélemy.

The authors have declared that no competing interests exist.

The spatial arrangement of urban hubs and centers, and how individuals interact with these centers, is a crucial problem with many applications ranging from urban planning to epidemiology. Here we use, in an unprecedented manner, the large-scale, real-time ‘Oyster’ card database of individual person movements in the London subway to reveal the structure and organization of the city. We show that patterns of intraurban movement are strongly heterogeneous in terms of volume, but not in terms of distance travelled, and that there is a polycentric structure composed of large flows organized around a limited number of activity centers. For smaller flows, the pattern of connections becomes richer and more complex and is not strictly hierarchical, since it mixes different levels consisting of different orders of magnitude. This new understanding can shed light on the impact of new urban projects on the evolution of the polycentric configuration of a city and the dense structure of its centers, and it provides an initial approach to modeling flows in an urban system.

The structure of a large city is probably one of the most complex spatial systems that we can encounter. It is made of a large number of diverse components connected by different transportation and distribution networks. In this respect, the popular conception of a city with one center and pendular movements going in and out of the business center is likely to be an audacious simplification of what actually happens. The most prominent and visible effects of such spatial organization of economic activity in large and densely populated urban areas are severe traffic congestion, uncontrolled urban sprawl, and the strong possibility of viruses, biological and social, spreading rapidly through the dense underlying networks

World cities

The main results that we will discuss in this section are that (i) flows are generally of a local nature, (ii) they are also organized and aggregated around polycenters, (iii) the examination and decomposition of these flows lead to the description of entangled hierarchies, and (iv) hence one likely structure describing this large metropolitan area is based on polycentrism. This perspective thus draws new insights from data that have become available from electronic sources and that have so far not been used to analyze urban spatial structure; in this sense, they are unprecedented in the field.

To get a preliminary grasp on the data, we observe that the flow distribution (normalized histogram of flows of individuals) is fitted by a power law with exponent

Loglog plot of the histogram of the number of trips between two stations of the tube system. The line is a power law fit with exponent

Spatial separation is another primary feature of movement and we show in

(a) Superimposition of the distance distribution of rides (circles) and of the distance distribution between stations (squares). The distribution of the observed rides can be fitted by a negative binomial law of parameters

While this graph exhibits actual commuting patterns, it does not tell us much about commuter behavior, all other things being equal. Indeed, the geographical constraints are important and the distance distribution between stations (shown superimposed in

In addition to being strongly heterogeneous, rides are therefore, to some extent, essentially local. At a more aggregated level, and in order to infer the city structure at a larger scale, we can study the distribution of incoming (or outgoing) flows for a given station. We show in the

Zipf plot for the total inflows (

The exponential decay of these plots demonstrates that most of the total flows are concentrated on a few stations. Indeed, an exponential decay of the form

To examine this polycentric structure further, we aggregate different stations if their inflow is large and they are spatially close to one another. Various clustering methods could be used, and we choose one of the simplest, described in the section

Breakdown of centers in terms of underlying stations and inflows. We gather stations by descending order of total inflow and we aggregate the stations to centers when taking into account more and more stations. In this process, all stations within
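The aggregation procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the distance threshold `radius`, the tuple layout, and the rule "join the earliest-created center that already has a member within `radius`" are assumptions made here for illustration, and the station data are invented.

```python
import math

def aggregate_centers(stations, radius):
    """Greedily aggregate stations into centers.

    stations: list of (name, x, y, inflow) tuples in any planar unit.
    radius:   hypothetical distance threshold, same unit as x and y.
    """
    centers = []  # each center is a list of station tuples; the first is its seed
    # Visit stations from the largest to the smallest total inflow.
    for st in sorted(stations, key=lambda s: s[3], reverse=True):
        _, x, y, _ = st
        for members in centers:
            # Join the earliest-created center with a member within `radius`.
            if any(math.hypot(x - mx, y - my) <= radius for _, mx, my, _ in members):
                members.append(st)
                break
        else:
            centers.append([st])  # no center nearby: this station seeds a new one
    return centers

def center_inflow(members):
    """Total inflow of a center, summed over its stations."""
    return sum(s[3] for s in members)

# Invented toy data: five stations on a line.
demo_stations = [
    ("A", 0.0, 0.0, 100),
    ("B", 1.0, 0.0, 80),
    ("C", 10.0, 0.0, 90),
    ("D", 10.5, 0.0, 10),
    ("E", 30.0, 0.0, 5),
]
demo_centers = aggregate_centers(demo_stations, radius=1.5)
demo_names = [[s[0] for s in c] for c in demo_centers]
```

Varying `radius` in such a sketch is one way to check how sensitive the resulting set of centers is to the threshold.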

We represent the ten most important polycenters defined in the dendrogram of

In the inset, we show the entire tube network while in the main figure, we zoom in on the central part of London. We represent the ten most important polycenters defined in the dendrogram of

We now examine how the flows are distributed into and outside centers, focusing on the morning peak hours. We first aggregate the flows by centers by computing the total flow incoming to a certain center

We then rank all flows

When considering the most important flows from stations to centers such that their sum represents

At this scale, it is clear that we have three main centers and sources (with various outdegree values), which mostly correspond to intermodal rail-subway connections. Adding more links, we reach a fraction

We can summarize this result with the graph shown in

Proportion of links going from sources to centers of a certain group (I, II, III), considering links of decreasing importance for each given source, when raising

We can quantify in a more precise way how the structure of flows evolves when we investigate smaller flows by exploring the list of flows
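The step of ranking flows and keeping the largest ones until they account for a given share of the total can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function names, the dictionary layout, and the toy weights are assumptions.

```python
from collections import Counter

def dominant_flows(flows, fraction):
    """Return the largest flows whose sum reaches `fraction` of the total.

    flows:    dict mapping (source, center) -> flow weight.
    fraction: target share of the total flow, in (0, 1].
    """
    total = sum(flows.values())
    ranked = sorted(flows.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for link, weight in ranked:
        if running >= fraction * total:
            break  # the target share is already covered
        selected.append((link, weight))
        running += weight
    return selected

def outdegrees(selected):
    """Number of selected links leaving each source."""
    return Counter(source for (source, _center), _weight in selected)

# Invented toy flows from sources to centers labelled I, II, III.
demo_flows = {
    ("s1", "I"): 50,
    ("s2", "I"): 30,
    ("s1", "II"): 15,
    ("s3", "III"): 5,
}
top_75 = dominant_flows(demo_flows, 0.75)
top_90 = dominant_flows(demo_flows, 0.90)
deg_90 = outdegrees(top_90)
```

Raising `fraction` step by step and comparing successive selections shows which sources and links appear as smaller flows are included.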

World cities such as London have tended to defy understanding hitherto because simple hierarchical subdivision has ignored the fact that their polycentricity subsumes a pattern of nested urban movements. Using the Oyster data we can identify multiple centers in London, then describe the traffic flowing into these centers as a simple hierarchic decomposition of multiple flows at various scales. In other words, these movements define a series of subcenters at different levels where the complex pattern of flows can be unpacked using our simple iterative scheme based on the representation of ever finer scales defined by smaller weights. Casual observation suggests that this kind of complexity might apply to other world cities such as Paris, New York or Tokyo where spatial structure tends to reveal patterns of polycentricity considerably more intricate than cities lower down the city size hierarchy. Our approach needs to be extended of course to other modes of travel, which will complement and enrich the analysis of polycentricity. The Oyster card is already used on buses and has just expanded beyond the tube system to cover other modes of travel such as surface rail in Greater London. With GPS traffic systems monitoring, in time, all such movements will be captured, extending our ability to understand and plan for the complexity that defines the contemporary city.

Our analysis of individual movements is based on a dataset describing the entire underground service between

The subway infrastructure imposes a number of physical constraints that can affect various distributions. This is the case, for example, for the ride distribution, where rides between two stations with large outflow and inflow, respectively, are likely to be over-represented. As such, the ride distribution could simply be a result of the peculiar spatial structure of the subway. In order to eliminate this type of bias, we use for comparison a null model constructed in the following way. We randomize rides in such a way that the total outflow and total inflow of each station are conserved while the actual ride extremities are reshuffled. This model is basically a configuration model
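One simple way to realize such a randomization is to keep every ride's origin and shuffle the destinations across rides. The sketch below assumes rides are stored as (origin, destination) pairs, one per trip; the function name and the toy data are hypothetical, and the authors' exact procedure may differ (this version can, for instance, produce rides that start and end at the same station).

```python
import random
from collections import Counter

def randomize_rides(rides, seed=None):
    """Shuffle ride destinations while keeping origins fixed.

    rides: list of (origin, destination) pairs, one entry per trip.
    Each station keeps its total outflow (origins are untouched) and its
    total inflow (the multiset of destinations is only permuted).
    """
    rng = random.Random(seed)
    origins = [o for o, _ in rides]
    destinations = [d for _, d in rides]
    rng.shuffle(destinations)
    return list(zip(origins, destinations))

# Invented toy rides between three stations.
demo_rides = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("B", "A")]
null_rides = randomize_rides(demo_rides, seed=0)

# Station-level totals before and after the shuffle.
outflow_real = Counter(o for o, _ in demo_rides)
outflow_null = Counter(o for o, _ in null_rides)
inflow_real = Counter(d for _, d in demo_rides)
inflow_null = Counter(d for _, d in null_rides)
```

Averaging the flow between each pair of stations over many such shuffles gives a baseline against which observed flows can be compared.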

We can then divide the real values of flows

We used the null model in order to extract the part due to the behavior of the commuters in their ride distribution. We can also study the relative orientation of the incoming flow normalized by its corresponding quantity given by the null model which gives the anisotropy

Clustering methods for points in space have been the subject of many studies and are used in many different fields. In particular, in computational biology and bioinformatics, clustering is used to build groups of genes with related expression patterns. Many different methods have been developed, and the most common ones are hierarchical clustering methods (such as those based on K-means and their derivatives, see for example

We varied the value of

We face here a difficult problem: we have a complete weighted directed network featuring flows from stations to centers, and the goal is to extract some meaningful information. We started with the analysis of the dominant flows and we would like to understand how the flows are structured when we explore smaller values. In order to do this, we introduce a ‘transition’ matrix

As an example, when we go from

The matrix

Typical form of the outdegree transition matrix

In the case of the transition

(a) Number of new sources (

The Oyster card data was collected by Transport for London (TfL), and we are grateful for their permission to use it in this paper. We also thank Cecilia Mascolo for access to TfL and the Oyster card data, and Andrew Hudson-Smith for providing the London underground map.