The authors have declared that no competing interests exist.

Wrote the paper: KJS RRW. Performed the research: KJS RRW.

Understanding models which represent the invasion of network-based systems by infectious agents can give important insights into many real-world situations, including the prevention and control of infectious diseases and computer viruses. Here we consider Markovian susceptible-infectious-susceptible (SIS) dynamics on finite strongly connected networks, applicable to several sexually transmitted diseases and computer viruses. In this context, a theoretical definition of endemic prevalence is easily obtained via the quasi-stationary distribution (QSD). By representing the model as a percolation process and utilising the property of duality, we also provide a theoretical definition of invasion probability. We then show that, for undirected networks, the probability of invasion from any given individual is equal to the (probabilistic) endemic prevalence, following successful invasion, at the individual (we also provide a relationship for the directed case). The total (fractional) endemic prevalence in the population is thus equal to the average invasion probability (across all individuals). Consequently, for such systems, the regions or individuals already supporting a high level of infection are likely to be the source of a successful invasion by another infectious agent. This could be used to inform targeted interventions when there is a threat from an emerging infectious disease.

‘Compartmental’ models

To reflect the probabilistic nature of the real-world process of invasion, and disease transmission in general, stochastic descriptions are required. Moreover, the probability of invasion from a single infectious individual clearly depends on that individual's particular relationships with others, e.g. some individuals may be better connected than others. In order to capture such heterogeneity, the population can be represented as a contact network

In the next section we will introduce the network-based SIS stochastic model and explain, following Harris

In Markovian SIS dynamics, an individual is able to flip repeatedly between two states: susceptible and infectious. This happens via a locally influenced time-homogeneous Poisson transmission process and an individual-specific time-homogeneous Poisson recovery process (on recovery an individual returns to the susceptible state). In the context of individuals interacting in this way on a regular square lattice the SIS model is also known as the contact process
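As an illustration of these dynamics, the following is a minimal Gillespie-style sketch in Python. The function name `simulate_sis`, the matrix convention (`T[i][j]` is taken to be the rate at which an infectious individual j infects a susceptible individual i) and the parameter names are assumptions made for this example, and all recovery rates are assumed to be strictly positive.

```python
import random


def simulate_sis(T, gamma, infected, t_max, rng):
    """Simulate Markovian SIS dynamics with the Gillespie algorithm.

    T[i][j]  : rate at which infectious j infects susceptible i (assumed convention)
    gamma[i] : recovery rate of individual i (assumed > 0)
    infected : iterable of initially infectious individuals
    t_max    : time at which the simulation is stopped if the infection persists
    rng      : a random.Random instance

    Returns (time, state), where state[i] is True if i is infectious.
    """
    n = len(gamma)
    state = [False] * n
    for i in infected:
        state[i] = True
    t = 0.0
    while any(state):
        # Each individual has exactly one possible next event:
        # recovery if infectious, infection if susceptible.
        rates = []
        for i in range(n):
            if state[i]:
                rates.append(gamma[i])
            else:
                rates.append(sum(T[i][j] for j in range(n) if state[j]))
        total = sum(rates)
        # Exponentially distributed waiting time until the next event.
        dt = rng.expovariate(total)
        if t + dt > t_max:
            return t_max, state
        t += dt
        # Choose the individual whose state flips, proportionally to its rate.
        i = rng.choices(range(n), weights=rates)[0]
        state[i] = not state[i]
    return t, state  # extinction: every individual is susceptible


if __name__ == "__main__":
    # Hypothetical undirected three-node path with unit recovery rates.
    T = [[0.0, 2.0, 0.0], [2.0, 0.0, 2.0], [0.0, 2.0, 0.0]]
    print(simulate_sis(T, [1.0, 1.0, 1.0], [0], 10.0, random.Random(0)))
```

Recomputing all rates after every event keeps the sketch short; for large networks one would update only the rates of the neighbours of the individual that changed state.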

Assuming the system is in a specific stochastic state, let

This framework, which represents the dynamics of several sexually transmitted diseases

Harris

The vertical lines are the time lines corresponding to each individual. The short diagonal lines indicate the points of cure and the horizontal arrows are the arrows of infection. A path from 0 on

Assume that we have a network

The property of ‘duality’ pertaining to Markovian SIS dynamics (and the contact process) can be expressed as follows:

To understand

When the network

An immediate problem with finite systems is that there are no genuine stationary distributions corresponding to endemic infection because the long-term behaviour is always guaranteed extinction (of the infectious agent). However, from a practical point of view, it is sometimes possible to obtain something like the endemic stationary distribution;

(a) is a plot of the total number of infected individuals against time in a simulation where the outbreak was initiated on a single infectious individual. (b) is a histogram of the number of infection events in 100 simulations of an outbreak, which were allowed to run up to a maximum of 300 infection events, initiated on the same individual each time. In both cases, the weighted network matrix was multiplied by 0.01 and the recovery rate was set to unity for all individuals.

From a theoretical perspective, the situation can be made precise. Our system is finite and Markovian with a single absorbing state (extinction of the infection). Also, if the network under consideration is strongly connected, such that infection can be transmitted, via some route, from any individual to any other individual, then the transient states form a single communicating class. In this case, if we initiate the system in a transient state and condition on the survival of the infection, then the system tends to a unique distribution referred to as the quasi-stationary distribution (QSD)
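For a very small network, the QSD can be computed directly rather than estimated by simulation. The sketch below is one possible approach, not necessarily the one used in the paper: it enumerates all system states as bitmasks, uniformises the generator restricted to the transient states, and applies power iteration with renormalisation, which corresponds to conditioning on survival. The function names, the convention that `T[i][j]` is the rate at which j infects i, and the iteration count are assumptions made for illustration; the network is assumed to be strongly connected with positive recovery rates.

```python
def transitions(T, gamma, s):
    """Yield (next_state, rate) for every transition out of bitmask state s."""
    n = len(gamma)
    for i in range(n):
        if s >> i & 1:
            yield s & ~(1 << i), gamma[i]      # individual i recovers
        else:
            rate = sum(T[i][j] for j in range(n) if s >> j & 1)
            if rate > 0:
                yield s | (1 << i), rate       # individual i is infected


def quasi_stationary(T, gamma, iters=5000):
    """Approximate the QSD over the 2**n - 1 transient states.

    Returns a list v indexed by bitmask state, with v[0] == 0
    (state 0 is extinction) and sum(v) == 1 up to rounding.
    """
    n = len(gamma)
    size = 1 << n
    out = [list(transitions(T, gamma, s)) for s in range(size)]
    exit_rate = [sum(r for _, r in tr) for tr in out]
    # Uniformisation rate, strictly above the largest exit rate so that
    # every transient state keeps a positive self-transition probability.
    lam = 1.1 * max(exit_rate)
    v = [0.0] + [1.0 / (size - 1)] * (size - 1)
    for _ in range(iters):
        w = [0.0] * size
        for s in range(1, size):
            w[s] += v[s] * (1.0 - exit_rate[s] / lam)
            for s2, r in out[s]:
                if s2:  # mass absorbed at extinction is discarded
                    w[s2] += v[s] * r / lam
        total = sum(w)
        v = [x / total for x in w]  # renormalise: condition on survival
    return v


def prevalence(v, i):
    """Probability that individual i is infectious under the distribution v."""
    return sum(p for s, p in enumerate(v) if s >> i & 1)


if __name__ == "__main__":
    # Hypothetical undirected three-node path with heterogeneous recovery.
    T = [[0.0, 2.0, 0.0], [2.0, 0.0, 2.0], [0.0, 2.0, 0.0]]
    qsd = quasi_stationary(T, [1.0, 0.5, 1.0])
    print([prevalence(qsd, i) for i in range(3)])
```

The number of states doubles with every added individual, so this exact approach is only practical for networks of a handful of individuals; larger systems require simulation.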

It can be argued that the quasi-stationary distribution has practical relevance (i.e. is a good ‘representation’ of the endemic situation

Let us consider the following quantities for an arbitrary strongly connected network

Note that in the limit as

In

If

We have defined

In the context of finite contact networks, invasion probability has not been given a rigorous theoretical definition (see Nåsell

In this section we show that the quantifier of invasion probability, which is as meaningful and relevant as our quantifier for endemic quasi-prevalence, for outbreaks initiated on the members of a subset

Let us now consider the following quantities:

It follows from duality that the three quantities, 4, 5 and 6, are all equal respectively to the three quantities, 1, 2 and 3, provided that we transpose

Quantity 4 denotes the survival probability up to time

Our quantifier of invasion probability can be generalised as:

Our main result can be stated as the following (prevalence-invasion) relationship:

Note that for a single individual we have:

For the case where

The probability

In practice, we run the simulation for a sufficiently long time such that the system is likely to be described by the QSD before we start to compute

In measuring the probability of quasi-invasion through stochastic simulation, the key requirement is separating major outbreaks from minor outbreaks. Therefore, we look for dichotomised behaviour in relation to the time until extinction by carrying out large numbers of simulations, each initiated in the same stochastic state. For example, we can measure

In general, so long as this kind of dichotomised behaviour is found, the practical issues which emerge in measuring quasi-invasion probability by stochastic simulation for the SIS framework are minor. In other words, this method of measurement is valid, and easily carried out, in systems where

By varying a scalar multiplier of a network matrix we can attempt to investigate infections of varying transmissibility spreading through the same population.

The recovery rate was set to unity for all individuals while the multiplier of the network matrix was varied. In (a), these two quantifiers are plotted against each other for each of 20 different multipliers of the network matrix. The faint dashed line indicates equality. On this scale it is not possible to determine any deviation from the equality of the two quantities. (b) is a ‘zoomed-in’ view of the perpendicular deviation of each of the data points from the straight line (equality), in the bottom right to top left direction.

To obtain measurements of

To obtain measurements of

An undirected and homogeneously weighted square lattice

For (a), the global transmission parameter (

In

Through duality, we can approximate

By considering the unique QSD associated with Markovian SIS dynamics on finite strongly connected networks, along with its implications under duality, we have provided meaningful mathematical definitions for both endemic prevalence (quasi-prevalence) and invasion probability (quasi-invasion). Utilising these definitions, we have also provided a general statement of the exact relationship between invasion probability and endemic prevalence at the individual and population level, for any finite undirected network of arbitrary heterogeneity (including undirected networks with weighted links and individual-specific recovery parameters). The relationship also has implications in the context of directed networks.
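The duality on which these definitions rest can be checked numerically on a small undirected network. The sketch below computes exact finite-time state distributions by uniformisation and compares two quantities: the probability that an outbreak started from a single individual i is still alive at time t, and the probability that i is infectious at time t when everyone is initially infectious. For a symmetric network matrix the two should agree. The function names, the example network and the truncation of the series at a fixed number of terms are assumptions made for illustration.

```python
import math


def transitions(T, gamma, s):
    """Yield (next_state, rate) for every transition out of bitmask state s."""
    n = len(gamma)
    for i in range(n):
        if s >> i & 1:
            yield s & ~(1 << i), gamma[i]      # individual i recovers
        else:
            rate = sum(T[i][j] for j in range(n) if s >> j & 1)
            if rate > 0:
                yield s | (1 << i), rate       # individual i is infected


def distribution_at(T, gamma, start, t, terms=200):
    """State distribution at time t from bitmask state `start` (uniformisation)."""
    size = 1 << len(gamma)
    out = [list(transitions(T, gamma, s)) for s in range(size)]
    exit_rate = [sum(r for _, r in tr) for tr in out]
    lam = max(exit_rate)
    v = [0.0] * size
    v[start] = 1.0
    weight = math.exp(-lam * t)            # Poisson weight for k = 0 jumps
    p = [weight * x for x in v]
    for k in range(1, terms):
        w = [0.0] * size
        for s in range(size):
            w[s] += v[s] * (1.0 - exit_rate[s] / lam)
            for s2, r in out[s]:
                w[s2] += v[s] * r / lam
        v = w                              # distribution after k uniformised jumps
        weight *= lam * t / k              # Poisson weight for k jumps
        p = [a + weight * b for a, b in zip(p, v)]
    return p


def duality_sides(T, gamma, i, t):
    """Return (survival probability from {i}, P(i infectious | all start infectious))."""
    everyone = (1 << len(gamma)) - 1
    survival = 1.0 - distribution_at(T, gamma, 1 << i, t)[0]
    p_all = distribution_at(T, gamma, everyone, t)
    infected_i = sum(p for s, p in enumerate(p_all) if s >> i & 1)
    return survival, infected_i


if __name__ == "__main__":
    # Hypothetical weighted, undirected path with individual-specific recovery.
    T = [[0.0, 0.8, 0.0], [0.8, 0.0, 0.5], [0.0, 0.5, 0.0]]
    gamma = [1.0, 0.7, 1.3]
    for i in range(3):
        print(i, duality_sides(T, gamma, i, 2.0))
```

Dividing both quantities by the probability that the infection survives to time t from the fully infectious state turns the second one into the prevalence at i conditioned on survival. For a directed network, the comparison would instead use the transposed matrix on one side.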

We note that for two specific homogeneous networks (infinite square lattice and infinite ‘great circle’), invasion probability (in these cases, the probability of indefinite persistence) from a single initially infected individual has been shown to be equal to the fraction of the population infected in the upper invariant measure

It is generally easier to collect empirical data on endemic prevalence than directly on invasion risk. In the case of undirected networks, prevalence data can thus be used to inform estimates of invasion risk. This method echoes Anderson and May's

The authors thank Megan Selbach-Allen for discussions, Jane Rees for comments on the final manuscript, Art Jonkers for assistance in producing the example network and Ian Smith for assistance with high throughput computing. We thank two anonymous reviewers for comments which enhanced the clarity of the manuscript.