
The authors have declared that no competing interests exist.

Conceived and designed the experiments: DF ALG TH ART. Performed the experiments: TH DF. Analyzed the data: DF ALG TH ART. Contributed reagents/materials/analysis tools: TH DF. Wrote the paper: DF ALG TH ART.

Communicable disease outbreaks of novel or existing pathogens threaten human health around the globe. It would be desirable to rapidly characterize such outbreaks and develop accurate projections of their duration and cumulative size even when limited preliminary data are available. Here we develop a mathematical model to aid public health authorities in tracking the expansion and contraction of outbreaks with explicit representation of factors (other than population immunity) that may slow epidemic growth.

The Incidence Decay and Exponential Adjustment (IDEA) model is a parsimonious function that uses the basic reproduction number R_{0}, along with a discounting factor to project the growth of outbreaks using only basic epidemiological information (e.g., daily incidence counts).

When compared against simulated data, IDEA provides highly accurate estimates of total size and duration for a given outbreak when R_{0} is low or moderate, and also identifies turning points or new waves. When tested with an outbreak of pandemic influenza A (H1N1), the model generates estimated incidence at the (i+1)^{th} serial interval using data from the i^{th} serial interval within an average of 20% of actual incidence.

This model for communicable disease outbreaks provides rapid assessments of outbreak growth and public health interventions. Further evaluation in the context of real-world outbreaks will establish the utility of IDEA as a tool for front-line epidemiologists.

Outbreaks of novel emerging pathogens such as the SARS coronavirus

Mathematical models provide a useful framework for characterization and quantification of ecological processes, including outbreaks of infectious diseases

Here we propose a simple phenomenological model derived from observations that estimates of the basic reproduction number R_{0} fail to accurately project the contours of outbreaks when control interventions are put into place, and in a manner that cannot be attributed simply to misspecification of depletion of susceptible individuals. We propose that this simple model could find application early in the course of an outbreak for provision of credible and easily interpreted projections on outbreak timing, control, and final size.

The study was approved by the Research Ethics Board, University of Toronto. The Incidence Decay and Exponential Adjustment (IDEA) model is based on the concept of the basic reproduction number, R_{0}, defined by Vynnycky and White as “the (average) number of successful transmissions per infected person”, and the average serial interval, which is defined as the time between symptoms developing in an index case and symptoms developing in a secondary case.

The basic reproduction number thus describes initial exponential growth of an outbreak or epidemic. As this process continues, the effective reproduction number (R_{0} × S/N) declines, through depletion of susceptible individuals or through changes in the components of R_{0} itself (disease duration, contact rate, and infectiousness of cases), either because of public health interventions or due to concern about disease among members of the public. As a decline in

In order to more fully understand the model's performance based on varying disease and disease control characteristics, we created a difference equation model with discrete time steps, each representing a single disease generation. In this model, Re_{t} is the time-varying effective reproductive number: the number of new infectious cases in a given generation created by each infective individual in the last generation. Re_{t} is a function of the basic reproductive number, R_{0}. Typically, Re_{t} is expressed as R_{0}S_{t}/N, but such a formulation fails to account for control activities and dynamic changes in population behavior that may reduce transmissibility of infection.

We defined Re_{t} as: Re_{t} = R_{0} κ_{t} S_{t}/N, where κ is a function of time and represents the proportionate reduction of risk of transmission via control activities. κ_{t} is defined as the relative risk of disease transmission (RR) raised to some power, such that κ_{t} = RR^{x}. Here x is a power function of t such that x = t^{n}, and n is an integer ≥ 0. We refer to n as the order of control. With 0^{th} order control, the impact of control does not change over time, and Re is simply reduced by a constant fraction throughout the epidemic. For first order control, disease risk is reduced in a manner that accelerates with time; second and third order control represent “accelerating acceleration of control”, and so on.
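The definition of Re_{t} above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Berkeley Madonna code: the update rule (each generation's infectives are replaced by Re_{t} times their number, capped by the remaining susceptibles), the population size, and the function names `kappa` and `simulate_sir` are assumptions introduced here, because the original difference equations did not survive in this text.

```python
def kappa(rr: float, t: int, n: int) -> float:
    """Control multiplier kappa_t = RR ** (t ** n), where n is the order of control."""
    # n = 0 gives a constant reduction RR; n = 1 gives RR ** t; and so on.
    return rr ** (t ** n)


def simulate_sir(r0: float, rr: float, n: int,
                 pop: float = 1_000_000.0, i0: float = 1.0,
                 generations: int = 40) -> list:
    """Generation-by-generation case counts under Re_t = R0 * kappa_t * S_t / N.

    Assumed update rule (hypothetical, for illustration only):
    new cases = Re_t * current infectives, capped at the remaining susceptibles.
    """
    susceptible = pop - i0
    infective = i0
    cases = [i0]
    for t in range(generations):
        re_t = r0 * kappa(rr, t, n) * susceptible / pop
        new_cases = min(re_t * infective, susceptible)
        susceptible -= new_cases
        infective = new_cases
        cases.append(new_cases)
    return cases


if __name__ == "__main__":
    uncontrolled = simulate_sir(r0=2.0, rr=1.0, n=0)
    first_order = simulate_sir(r0=2.0, rr=0.95, n=1)
    print("final size, no control:", round(sum(uncontrolled)))
    print("final size, first order control:", round(sum(first_order)))
```

Setting `rr=1.0` recovers the conventional R_{0}S_{t}/N formulation, which makes it easy to compare epidemics with and without control under otherwise identical assumptions.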

We used this simple difference equation model to evaluate the fit of the IDEA model to simulated epidemics under different assumptions about infectiousness (R_{0}), varying orders of control, under-reporting of cases, and multiple waves of infection. Models were fit by minimizing root-mean-squared differences (RMSD) between generation-specific case counts by adjustment of the R_{0} and d parameters of the IDEA model. When evaluating the performance of the IDEA model as applied to an SIR difference model under different assumptions about the order of κ, we normalized RMSD by dividing by total case counts, as higher order control resulted in smaller epidemics (and consequently smaller RMSD).
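The fitting step described above can be illustrated with a small grid search. The sketch assumes the form in which the IDEA model is usually published, I(t) = [R_{0}/(1+d)^{t}]^{t}, since the model equation itself is missing from this text; the grid search stands in for whatever optimizer the authors used, and the function names and the convention that `observed[0]` is generation 0 are hypothetical.

```python
import math


def idea_incidence(r0: float, d: float, t: int) -> float:
    """Assumed IDEA form: I(t) = (R0 / (1 + d) ** t) ** t."""
    return (r0 / (1.0 + d) ** t) ** t


def rmsd(observed: list, r0: float, d: float) -> float:
    """Root-mean-squared difference between observed and IDEA case counts.

    observed[k] is taken to be the case count in generation k (assumption).
    """
    squared = [(obs - idea_incidence(r0, d, t)) ** 2
               for t, obs in enumerate(observed)]
    return math.sqrt(sum(squared) / len(squared))


def normalized_rmsd(observed: list, r0: float, d: float) -> float:
    """RMSD divided by total case count, so epidemics of different size compare."""
    return rmsd(observed, r0, d) / sum(observed)


def fit_idea(observed: list, r0_grid: list, d_grid: list) -> tuple:
    """Return (best_rmsd, best_r0, best_d) over a grid of candidate values."""
    best = None
    for r0 in r0_grid:
        for d in d_grid:
            err = rmsd(observed, r0, d)
            if best is None or err < best[0]:
                best = (err, r0, d)
    return best


if __name__ == "__main__":
    r0_grid = [round(1.0 + 0.1 * k, 1) for k in range(31)]   # 1.0 to 4.0
    d_grid = [round(0.01 * k, 2) for k in range(21)]         # 0.00 to 0.20
    synthetic = [idea_incidence(2.0, 0.05, t) for t in range(10)]
    print(fit_idea(synthetic, r0_grid, d_grid))
```

A grid search is slow compared with a gradient-based optimizer, but with only two parameters it is transparent and easy to reproduce in a spreadsheet.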

In addition to generating empirical estimates of R_{0} and d, the model can be used to estimate t_{max}, the generation where the number of new cases is <1, such that the outbreak is effectively over. Multiplication of t_{max} by serial interval duration in calendar time provides an approximate estimate of outbreak duration. By manipulating [2.0] it can be seen that:

Integration of [2.0] over

To test the ability of the model to describe simple epidemic dynamics in an actual outbreak, we applied the model to an outbreak of pandemic influenza A (H1N1) from the territory of Nunavut, Canada, using an empirically derived serial interval of 5 days

We obtained the daily number of laboratory-confirmed cases of pandemic H1N1 influenza (in which the cases were reported based on the earliest date of symptom onset, initial care, specimen collection, hospital admission, or ICU admission) for each community under study. A laboratory-confirmed case was reported as an individual with influenza-like illness or severe respiratory illness who tested positive for pandemic H1N1 influenza A virus by real-time reverse-transcriptase PCR (RT-PCR) or viral culture as is typical for Canadian influenza surveillance. As such, cases likely represent a subset of total true influenza cases

Cases were normalized to the first day of the outbreak (day 1). The definition of an outbreak was based on the Ontario Ministry of Health and Long Term Care (MOHLTC) guidelines

Simulations were performed using the Berkeley Madonna dynamic systems modeling package (University of California, Berkeley;

Normalized sum of squares fits of the IDEA model to simulated data were best with first order control (i.e., κ = RR^{t}), and were better for systems with low or moderate R_{0} (i.e., R_{0} ≤ 5) than for those with higher R_{0}. Final size estimates remained accurate across a broad range of R_{0} values; however, as R_{0} increased beyond 5.5, model-projected end dates for epidemics were later than those seen in simulated data.

Relationship between final-size-normalized root-mean squared differences (RMSD, Y-axis) between SIR model outputs and IDEA model fits, for R_{0} ranging from 1.5 to 7 (legend), with variation in order of control in SIR models (X-axis). It can be seen that for all R_{0} best-fits are achieved with first order control. Model fits were however better with low R_{0} simulations than with higher R_{0} simulations.

For systems with low or moderate R_{0}, and assuming first order control, stable parameters were identified for the IDEA model within 3–4 generations, and the use of these parameter values accurately projected the full extent of the epidemic curve. Best-fit R_{0} values identified for the IDEA model tended to be slightly higher than true R_{0} values, and the proportionate degree of over-estimation increased as the true R_{0} increased.

Comparison of prevalent infections and cumulative infections from data generated using the SIR difference equation model described in the text (gray curves), and an IDEA model fitted to the first four generations of the simulated SIR epidemic (dashed curves). The true R_{0} used in the SIR model was 3.0. It can be seen that the IDEA model projections reproduce future case counts in the SIR model almost perfectly.

In simulated epidemics with high R_{0}, initial convergence occurred rapidly as the epidemic grew, with best-fit values of d of approximately 0.054 or 0.055 and accurate estimation of true R_{0} values in approximately 4 generations. However, as the simulated epidemic peak occurred, best-fit estimates of both R_{0} and d for the IDEA model increased sharply, diverging from initial estimates and allowing the IDEA model to reproduce epidemic peaks and subsequent declines. For high-R_{0} systems, R_{0} estimates obtained via fitting after the epidemic had peaked were far higher than true R_{0} values and than values estimated prior to peaks.

Concordance between simulated data from an SIR difference model for a higher-R_{0} system (R_{0} = 6) (solid gray curves) and IDEA fits based on early (T ≤ 10) generations (gray dashed curves), and based on fits from generation 15 onwards (black dashed curves). Prevalent infections are shown in the left hand panel while cumulative infections are shown on the right. Fits from generations prior to the epidemic peak (T ≤ 10) reproduce the initial growth of the epidemic well, and also provide accurate estimates of the true R_{0} (R_{0} ≈ 6.34, d = 0.054); however, these parameters result in IDEA projections of far larger epidemics than actually occur. Once IDEA models are fit using generations that include and follow the epidemic peak (i.e., T ≥ 15), projections of both prevalent and cumulative infections become fairly accurate (black dashed curves); however, estimated R_{0} is much larger than the true value (R_{0} ≈ 7.56) and the best-fit value for d increases as well (from 0.054 to 0.069).

Under-reporting of cases is expected to occur for a variety of diseases of public health importance; we evaluated IDEA fits to SIR model outputs where increasing fractions of cases were unobserved and consequently unavailable for fitting. In fact, we found parameter estimates and final-size-normalized RMSD model fits to be quite stable as long as case reporting fractions exceeded 5% (

Many outbreaks are characterized by sequential “waves” that may either signify the impact of seasonal or behavioural influences on disease transmission

As the IDEA model appeared to provide a reasonable means of modeling epidemics, especially for R_{0} ≤ 5, we evaluated the expected relationship between R_{0}, d, t_{max} and I_{total} mathematically, using formulae 4.0 and 4.1 for a range of possible R_{0} and d values. The IDEA model generates an estimate of R_{0} and d at each point in an outbreak, and it is then possible to rapidly project the estimated duration and total cases of the outbreak. These results are presented graphically in

The overall behaviour of the IDEA model based on a range of possible R_{0} and d values: (a) the variation of t_{max}, or outbreak duration, as a function of R_{0} and d; (b) the variation of I_{total}, or the final cumulative incidence, as a function of R_{0} and d.
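The relationship between R_{0}, d, t_{max} and I_{total} summarized in this figure can be explored numerically. The sketch below again assumes the commonly published IDEA form I(t) = [R_{0}/(1+d)^{t}]^{t}; setting I(t) = 1 under that assumption gives t_{max} = ln(R_{0})/ln(1+d). Because the integral expression for I_{total} is not recoverable from this text, the code substitutes a plain sum over generations, which is an approximation chosen here and not the authors' formula. All function names are hypothetical.

```python
import math


def idea_incidence(r0: float, d: float, t: float) -> float:
    """Assumed IDEA form: I(t) = (R0 / (1 + d) ** t) ** t."""
    return (r0 / (1.0 + d) ** t) ** t


def t_max(r0: float, d: float) -> float:
    """Generation at which projected incidence falls back to 1 case.

    Follows from I(t) = 1, i.e. R0 = (1 + d) ** t, under the assumed form.
    """
    if r0 <= 1.0 or d <= 0.0:
        raise ValueError("t_max is finite and positive only for r0 > 1 and d > 0")
    return math.log(r0) / math.log(1.0 + d)


def total_cases(r0: float, d: float) -> float:
    """Discrete stand-in for I_total: sum incidence over generations 0..ceil(t_max)."""
    last_generation = math.ceil(t_max(r0, d))
    return sum(idea_incidence(r0, d, t) for t in range(last_generation + 1))


def outbreak_duration_days(r0: float, d: float, serial_interval_days: float) -> float:
    """Approximate calendar duration: t_max multiplied by the serial interval."""
    return t_max(r0, d) * serial_interval_days


if __name__ == "__main__":
    for r0 in (1.5, 2.0, 3.0):
        for d in (0.02, 0.05, 0.10):
            print(r0, d, round(t_max(r0, d), 1), round(total_cases(r0, d)))
```

Tabulating these values over a grid of R_{0} and d shows how strongly projected duration and size depend on small changes in d, which is why the stability of d across successive fits matters for interpretation.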

The Nunavut, Canada data illustrate the behaviour of the model in a real outbreak situation. Early in the outbreak, the model projected a t_{max}, or outbreak duration, of 74 serial intervals. By SI = 6 (the model fit with 6 serial intervals), the projected t_{max} had fallen sharply to 15 generations. In these early stages of the outbreak, the IDEA model is able to rapidly determine whether the outbreak is growing or stabilizing, based on the change in t_{max} and on Δd.

The IDEA model applied to an outbreak of influenza A (H1N1) in Nunavut, Canada, with the model parameters R_{0}, d, t_{max}, I_{total} and Δd. (a) the early stages of the outbreak, with largely exponential growth, (b) dampened growth with reduced projected t_{max} values by serial interval 7, (c) a second wave in the outbreak and (d) the fit of the model at 24 out of 27 generations.

In later stages of the outbreak (shown in

Estimating the impact of public health interventions and the degree of control over an outbreak is a considerable challenge while an outbreak is ongoing. The IDEA model was therefore used to compare actual versus projected cases as a means of judging whether the outbreak was under control.

(a) The utility of the IDEA model in evaluating the level of control over the outbreak. Each projection is based on the outbreak up to i intervals, projected to the (i+1)^{th} interval. With the exception of serial intervals 6 and 7, illustrated in the figure, the projected case counts were lower than the actual case counts, implying that at each of the other serial intervals the outbreak grew more than would be expected from its previous course. (b) Percent error between the projection for the next generation and actual case counts, according to generation.
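The one-step-ahead comparison described in this caption can be sketched as follows. The code refits the model to the first i intervals, projects interval i+1, and reports the percent error together with the change in d between successive fits (Δd). It assumes the commonly published IDEA form I(t) = [R_{0}/(1+d)^{t}]^{t}, fits per-generation incidence by grid search, and numbers generations from 0; these choices and all function names are assumptions made for illustration, not details confirmed by the text.

```python
import math


def idea_incidence(r0: float, d: float, t: int) -> float:
    """Assumed IDEA form: I(t) = (R0 / (1 + d) ** t) ** t."""
    return (r0 / (1.0 + d) ** t) ** t


def fit_idea(observed: list, r0_grid: list, d_grid: list) -> tuple:
    """Grid search for the (r0, d) pair minimizing RMSD against observed counts."""
    best = None
    for r0 in r0_grid:
        for d in d_grid:
            squared = [(obs - idea_incidence(r0, d, t)) ** 2
                       for t, obs in enumerate(observed)]
            err = math.sqrt(sum(squared) / len(squared))
            if best is None or err < best[0]:
                best = (err, r0, d)
    return best[1], best[2]


def one_step_ahead(cases: list, r0_grid: list, d_grid: list,
                   min_points: int = 3) -> list:
    """For each i, fit to cases[:i] and project generation i.

    Returns dicts with the fitted parameters, projection, percent error
    (positive means over-projection) and delta_d relative to the previous fit.
    """
    rows = []
    previous_d = None
    for i in range(min_points, len(cases)):
        r0, d = fit_idea(cases[:i], r0_grid, d_grid)
        projected = idea_incidence(r0, d, i)
        actual = cases[i]
        pct_error = 100.0 * (projected - actual) / actual if actual else None
        delta_d = None if previous_d is None else d - previous_d
        rows.append({"generation": i, "r0": r0, "d": d,
                     "projected": projected, "actual": actual,
                     "pct_error": pct_error, "delta_d": delta_d})
        previous_d = d
    return rows


if __name__ == "__main__":
    r0_grid = [round(1.0 + 0.1 * k, 1) for k in range(31)]
    d_grid = [round(0.01 * k, 2) for k in range(21)]
    synthetic = [idea_incidence(2.0, 0.05, t) for t in range(10)]
    for row in one_step_ahead(synthetic, r0_grid, d_grid):
        print(row)
```

On real surveillance data, a run of negative percent errors would correspond to the pattern described above, in which the outbreak grows faster than its previous course would predict, and a jump in `delta_d` would flag a possible change in outbreak dynamics.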

With the development of the IDEA model, we have demonstrated a simple, versatile model for emerging communicable disease outbreaks that has the capacity to provide short term projections of outbreak growth and contraction. To the best of our knowledge, this is the first application of this particular descriptor to epidemic growth, though other fitting methods of varying complexity are well described. Fits to simulated epidemics with low or moderate R_{0} were exceedingly good, with parameters derived within 3–4 generations able to project the full extent of simulated epidemics with remarkable accuracy. If validated, the implications of such a finding may be profound (e.g., the ability to project, with a high degree of accuracy, the final size and duration of a seasonal influenza outbreak within 2 weeks of onset).

The application of the model to simulated epidemics with higher R_{0} (>5) was more challenging, as best-fit parameters derived from early outbreak generations, while close to true R_{0} values, resulted in epidemic curves that dramatically overshot true epidemics (a difficulty similar to that often encountered when attempting to fit an SIR model to early outbreak data). Nonetheless, the application of this technique to high R_{0} epidemics may be useful for a variety of reasons. First, early (pre-epidemic peak) IDEA estimates of R_{0} closely matched true R_{0} values in simulations, suggesting that the use of this technique for early R_{0} estimation when novel diseases emerge may be reasonable regardless of whether R_{0} is low or high. Furthermore, the Δd metric, and the abrupt shift in R_{0} estimates that occurs with the epidemic peak, would provide a helpful signal to epidemiologists that the epidemic is peaking or changing. Finally, as parameter estimates stabilize again for high R_{0} systems, the IDEA model remains a useful tool for projecting the total size and duration of an outbreak. It is also possible that challenges in fitting the IDEA model to simulated data represent not a limitation of the IDEA model, but rather an artefact of our use of SIR difference equation models, which tend to peak and collapse suddenly at high R_{0}.

The utility of this model was evaluated further with data from a large outbreak of pandemic influenza A (H1N1) and the potential of the IDEA model to begin to understand the impact of public health interventions and structural and human behavioural factors in outbreaks was also explored. Although the IDEA model can provide no hypothesis about which factors caused a sudden acceleration or deceleration of the outbreak, it provides a fast barometer of the situation, based on all known cases.

Further testing and development in real-time outbreak situations will be needed before the IDEA model can be used in public health practice for nearcasting (short term outbreak projection), for assessing the impact of public health interventions, and for separating the impact of such interventions from spontaneous behavioural changes. The model's main asset is its simplicity and the fact that it does not require consideration of population immune status for parameterization. The model is constructed entirely on a case count time series that is likely to be available to public health professionals charged with outbreak control. IDEA requires no sophisticated knowledge of mathematics or computing, and can be realized using commonly available spreadsheet programs. The model's outputs, which include both cumulative case counts under best-fit conditions and cumulative outbreak duration, would be valuable to front-line public health professionals seeking to budget material and human resources needed to see an outbreak through to its conclusion. This simplicity may make the model especially useful in resource-limited settings where rapid assessment of both outbreak behaviour, and

Nevertheless, the simplicity of the IDEA model is also a limitation, as it cannot provide insight into the fundamental workings of outbreaks. The factors driving contraction of growth are non-specific and could include the impact of public health interventions, changes in population behaviour, saturation of sub-populations with infection, and changes in the physical environment that speed or slow epidemic spread (e.g., rainfall or change of season).

In situations where limited public health resources must be allocated to one region at the expense of another, this model may aid in deciding which region is experiencing an outbreak that is growing more rapidly, and which region has stabilized, while using minimal data. Moreover, the model may aid in the assessment of public health interventions. If a drastic intervention is implemented, such as the closing of schools, the model may be able to rapidly identify (by means of a sudden reduction in the expected length of the outbreak t_{max}) that the intervention is having a positive impact on slowing the outbreak.

Our application of this simple model to influenza outbreak data in an isolated Canadian population has been encouraging, and it is our hope that other groups will assess the usefulness of this model in the context of other diseases and demographic groups. We also hope to translate knowledge regarding this model to front-line public health professionals who may be able to assess its usefulness in real-time. Given the ceaseless emergence of novel communicable disease threats that challenge current public health professionals, we expect no shortage of opportunities for such applications.

Across a broad range of values of R_{0}, final size estimates from the IDEA model remained accurate. However, when R_{0} exceeded a threshold of ∼6, there was an increasing tendency for the IDEA model to project the epidemic to end later than was in fact the case. This may represent a limitation of the IDEA model, but may also be an artifact of the sudden “collapse” of epidemics with high R_{0} in SIR simulations.

Estimates of R_{0} and d by generations of data available. Estimates of R_{0} derived via IDEA model fits, according to generations of data available, with varying R_{0}, from SIR model simulations with first order control. True R_{0} values are presented in the legend; fitted R_{0} estimates are presented on the Y-axis. It can be seen that for R_{0} ≤ 5, best-fit R_{0} values and true R_{0} values agree closely. High R_{0} models demonstrate similar concordance prior to epidemic peaks (which occur for high R_{0} models in generations highlighted by the shaded rectangle). However, in order to reproduce peaks and subsequent declines, IDEA model fits to simulated epidemic curves required higher R_{0} values than true R_{0} values, or R_{0} estimates obtained prior to the epidemic peak.

Estimates of R_{0} and d by generations of data available. Estimates of d derived via IDEA model fits, with varying R_{0}, from SIR model simulations with first order control. True R_{0} values are presented in the legend; estimates of d are presented on the Y-axis. It can be seen that for R_{0} ≤ 5, d stabilizes at a value of around 0.054 in fewer than 5 generations and remains stable. High R_{0} models demonstrate similar stability in d (and empiric values of d) prior to epidemic peaks (which occur for high R_{0} models in generations highlighted by the shaded rectangle). However, in order to reproduce peaks and subsequent declines, IDEA model fits to simulated epidemic curves required extremely high d values; the greater the true R_{0}, the higher the value of d required to reproduce the epidemic curve in its totality.
Whatever the R_{0} utilized in the SIR model, fits remained good except where under-reporting resulted in extremely small absolute case numbers. This is reflected in the fact that low-R_{0} fits are more sensitive to under-reporting than high-R_{0} fits. The legend presents R_{0} values used in SIR models.

Estimates of R_{0} generated using the IDEA model. Best-fit values of R_{0} are fairly stable; notably, as under-reporting increases, the best estimate of R_{0} for the high-R_{0} SIR outputs actually becomes a progressively better approximation of the true R_{0}.

Biphasic epidemic simulated with an R_{0} of 3, as described in the text (solid black curve). IDEA model fits, based on early generations (pale gray curve, for generations up to generation 16, prior to the onset of the second wave) and on all generations up to and including the peak of the second wave (dashed curve), are superimposed on the biphasic epidemic curve. The IDEA model's structure makes fitting to multiple peaks impossible; the best-fit IDEA model is based on parameters that create a single peak epidemic with a duration similar to that seen with the biphasic epidemic.

Estimates of R_{0} and d for a biphasic epidemic, according to the number of generations available for model fitting. It can be seen that fits are perturbed by the onset of a second peak. The difference in d between sequential fits increases with the second wave (denoted by the shaded area), such that increases in this Δd parameter represent a potentially useful indicator of the onset of a second epidemic wave.


The authors acknowledge the assistance of Dr. Geraldine Osborne, Chief Medical Officer of Health and Mike Ruta, Epidemiologist, from the Nunavut Department of Health and Social Services, for their kind provision of pandemic H1N1 influenza data. The authors also gratefully acknowledge the mathematical assistance of Dr. Ian D. Leroux. Finally, we are grateful for the insights of an Anonymous Reviewer who astutely pointed out the similarity between our approach to epidemic modeling and that previously described by Wu and Huberman for the evaluation of growth and decay of popularity of Internet news items.
