The authors have declared that no competing interests exist.

Estimation of the effective reproductive number R_t is important for detecting changes in disease transmission over time. During the Coronavirus Disease 2019 (COVID-19) pandemic, policy makers and public health officials are using R_t to assess the effectiveness of interventions and to inform policy. However, estimation of R_t from available data presents several challenges, with critical implications for the interpretation of the course of the pandemic. The purpose of this document is to summarize these challenges, illustrate them with examples from synthetic data, and, where possible, make recommendations. For near real-time estimation of R_t, we recommend the approach of Cori and colleagues, which uses data from before time t. We advise caution with the method of Bettencourt and Ribeiro, as the resulting R_t estimates may be biased if the underlying structural assumptions are not met. Two key challenges common to all approaches are accurate specification of the generation interval and reconstruction of the time series of new infections from observations occurring long after the moment of transmission. Naive approaches for dealing with observation delays, such as subtracting delays sampled from a distribution, can introduce bias. We provide suggestions for how to mitigate this and other technical challenges and highlight open problems in R_t estimation.

The effective reproductive number R_t is a key epidemic parameter used to assess whether an epidemic is growing, shrinking, or holding steady. R_t estimates can be used as a near real-time indicator of epidemic growth or to assess the effectiveness of interventions. But due to delays between infection and case observation, estimating R_t in near real time, and correctly inferring the timing of changes in R_t, is challenging. Here, we provide an overview of challenges and best practices for accurate and timely R_t estimation.

The effective reproductive number, denoted as R_e or R_t, is the expected number of new infections caused by an infectious individual in a population where some individuals may no longer be susceptible. Estimates of R_t are used to assess how changes in policy, population immunity, and other factors have affected transmission at specific points in time [

We consider 2 potential forms of bias in R_t estimates: systematic over- or underestimation and temporal inaccuracy. Misspecification of the generation interval is a large potential source of over- or underestimation, and we find that R_t estimates are most prone to this kind of bias when the true value is substantially greater or less than 1. This situation might arise at the beginning of the Coronavirus Disease 2019 (COVID-19) pandemic (when R_t is relatively high) or after particularly effective interventions (when it might be low). Over- or underestimation would have particularly strong practical consequences near the control threshold of R_t = 1, but the biases we observe are smallest in absolute terms in this range.

Another challenge is that depending on the methods used, R_t estimates may be leading or lagging indicators of the true value [R_t estimation is particularly concerning when trying to infer how changes in behavior have affected transmission [R_t falls below 1, is a focus of this Perspective. We find that it has several possible causes and can be difficult to avoid.

This Perspective focuses on the 3 main empirical methods to estimate R_t [R_t estimates obtained in this way should be assessed on a case-by-case basis, given sensitivity to model structure and data availability.

We use synthetic data to compare the accuracy of 3 common empirical methods for estimating R_t, first under ideal conditions, in the absence of parametric uncertainty, and with all infections observed at the moment they occur. This idealized analysis is intended to illustrate the inputs needed to estimate R_t accurately, to highlight the intrinsic differences between the methods, and to examine specific causes of bias and temporal inaccuracy one by one. However, we emphasize that our idealized analyses overestimate the potential accuracy of R_t estimates obtained from real-world data, even if best practices are followed. The results show that the method of Cori and colleagues [R_t. For retrospective analysis, the methods of Cori and colleagues or of Wallinga and Teunis may be appropriate, depending on the aims.

Later, we add realism and address practical considerations for working with imperfect data. These analyses emphasize potential errors introduced by uncertainty in the intrinsic generation interval and imperfect case observation, the need to adjust for delays in case observation and right truncation, and the need to choose an appropriate smoothing window given the sample size. Finally, we emphasize that most off-the-shelf tools leave it up to the user to account for these 5 sources of uncertainty when calculating confidence intervals. Failure to propagate uncertainty in R_t estimates can lead to overinterpretation of the results and could falsely imply that confidence or credible intervals have crossed the critical threshold.

We used synthetic data to compare 3 common R_t estimation methods. Synthetic data were generated from a deterministic or stochastic SEIR model in which the transmission rate changes abruptly. Results were similar whether data were generated using a deterministic or stochastic model. For simplicity, we show deterministic outputs throughout the document, except in the section on smoothing windows, where stochasticity is a conceptual focus.

In our model, all infections are locally transmitted, but all 3 of the methods we test can incorporate cases arising from importations or zoonotic spillover [R_t are likely to be inaccurate if a large proportion of cases involve transmission outside the population. This situation could arise when transmission is low (e.g., at the beginning or end of an epidemic) or when R_t is defined for a population that is connected to others via migration.

A synthetic time series of new infections (observed at the R_t estimation methods of Wallinga and Teunis, Cori and colleagues, and Bettencourt and Ribeiro [

In the synthetic data, R_0 was set to 2.0 initially, then to 0.8 and 1.15, to simulate the adoption and later the partial lifting of public health interventions. To mimic estimation in real time, we truncated the time series at
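As a rough illustration of how such synthetic data can be produced, the following is a minimal sketch of a deterministic SEIR model with a piecewise-constant R_0. The R_0 values (2.0, 0.8, 1.15) come from the text; the change days, population size, seed infections, and the 4-day latent and infectious periods are assumptions chosen only for illustration, and `simulate_seir` is a hypothetical helper rather than the authors' code.

```python
def simulate_seir(r0_schedule, n_days, pop=1_000_000, e0=0.0, i0=10.0,
                  latent_days=4.0, infectious_days=4.0, steps_per_day=10):
    """Deterministic SEIR with piecewise-constant R_0 (simple Euler steps).

    r0_schedule is a list of (start_day, r0) pairs sorted by start_day,
    with the first start_day equal to 0. Returns the daily number of new
    infections (S -> E transitions) and the final (S, E, I, R) state.
    """
    sigma = 1.0 / latent_days       # rate of leaving E
    gamma = 1.0 / infectious_days   # rate of leaving I
    dt = 1.0 / steps_per_day
    s, e, i, r = pop - e0 - i0, e0, i0, 0.0
    daily_infections = []
    for day in range(n_days):
        # R_0 in force today: the last schedule entry whose start day has passed
        r0 = [value for start, value in r0_schedule if start <= day][-1]
        beta = r0 * gamma
        new_today = 0.0
        for _ in range(steps_per_day):
            infections = beta * s * i / pop * dt   # S -> E
            onsets = sigma * e * dt                # E -> I
            recoveries = gamma * i * dt            # I -> R
            s -= infections
            e += infections - onsets
            i += onsets - recoveries
            r += recoveries
            new_today += infections
        daily_infections.append(new_today)
    return daily_infections, (s, e, i, r)


# Change days 60 and 90 are hypothetical; the R_0 values are those quoted above.
schedule = [(0, 2.0), (60, 0.8), (90, 1.15)]
incidence, final_state = simulate_seir(schedule, n_days=150)
```

The returned `incidence` list plays the role of the idealized input used in the comparisons: infections counted on the day they occur, with no observation delay.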

The effective reproductive number at time _{t} estimates, given an age-structured contact matrix [

For each definition of R_t, arrows show the times at which infectors (upwards) and their infectees (downwards) appear in the data. Curves show the generation interval distribution (A, B), or serial interval distribution (C). (A) The instantaneous reproductive number quantifies the number of new infections incident at a single point in time (t_i, blue arrow), relative to the number of infections in the previous generation (green arrows) and their current infectiousness (green curve). The methods of Cori et al. and of Bettencourt and Ribeiro estimate the instantaneous reproductive number. This figure illustrates the Cori method. (B-C) The case reproductive number is defined as the average number of new infections that an individual who becomes infected on day t_i (green arrows in B) or symptomatic on day t_s (yellow arrows in C) will eventually go on to cause (blue downward arrows in B and C). The first definition applies when estimating the case reproductive number using inferred times of infection, and the second applies when using data on times of symptom onset. The method of Wallinga and Teunis estimates the case reproductive number.

More formally, the instantaneous reproductive number is defined as the expected number of secondary infections occurring at time

Solid and dashed black lines show the instantaneous and case reproductive numbers, respectively, calculated from synthetic data. Colored lines show estimates and confidence or credible intervals. To mimic an epidemic progressing in real time, the time series of infections or symptom onset events up to R_t is falling or rising produces similar results as in R_t when the true value is substantially higher than 1. The method is also biased as transmission shifts. (B) The Cori method accurately measures the instantaneous reproductive number. (C) The Wallinga and Teunis method estimates the cohort reproductive number, which incorporates future changes in transmission. Thus, the method produces R_t estimates that lead the instantaneous effective reproductive number and becomes unreliable for real-time estimation at the end of the observed time series without adjustment for right truncation [

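The method of Cori and colleagues estimates R_t as
R_t = I_t / \sum_{s=1}^{t} I_{t-s} w_s,
where I_t is the number of infections incident on day t and w_s is the generation interval, or the probability that s days separate the moment of infection in an index case and in a daughter case. The denominator combines the past incidence (I_{t-s}) and current infectiousness (w_s) of individuals who became infected s days in the past.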

The only parametric assumption required by this method is the form of the generation interval. The standard assumption is that w_s follows a discretized gamma distribution [R_t, even tracking abrupt changes (
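The point estimate behind the method of Cori and colleagues can be sketched in a few lines. This is a minimal illustration of the ratio of new infections to total infectiousness, not the EpiEstim implementation: it omits the Bayesian posterior and the smoothing window, and the generation interval `w` is a hypothetical 3-day distribution rather than a discretized gamma.

```python
def cori_point_estimate(incidence, gen_interval):
    """R_t = I_t / sum_{s>=1} I_{t-s} * w_s, with gen_interval[s - 1] = w_s.

    Returns None on days where no earlier infections contribute
    infectiousness (for example, day 0).
    """
    estimates = []
    for t in range(len(incidence)):
        total_infectiousness = sum(
            incidence[t - s] * gen_interval[s - 1]
            for s in range(1, min(t, len(gen_interval)) + 1)
        )
        if total_infectiousness > 0:
            estimates.append(incidence[t] / total_infectiousness)
        else:
            estimates.append(None)
    return estimates


# Toy data generated from the same renewal relationship (all values hypothetical).
w = [0.2, 0.5, 0.3]                  # w_1..w_3, sums to 1
true_r = [1.5] * 10 + [0.7] * 10     # abrupt drop on day 10
infections = [10.0]                  # seed infections on day 0
for t in range(1, len(true_r)):
    force = sum(infections[t - s] * w[s - 1]
                for s in range(1, min(t, len(w)) + 1))
    infections.append(true_r[t] * force)

estimates = cori_point_estimate(infections, w)
```

Because the toy series is noise-free and uses the same generation interval for simulation and estimation, the estimator recovers the abrupt change on the day it happens; real data will not behave this cleanly.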

Bettencourt and Ribeiro [R_t and the exponential growth rate of the epidemic, where

Under the assumption that R_t evolves through time as a Gaussian process, R_t estimation [R_t estimates, especially when R_t is substantially higher than 1 (
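The link between R_t and the exponential growth rate can be illustrated with a short sketch. It assumes the SIR-type approximation I_{t+1} ≈ I_t exp(γ(R_t − 1)), where 1/γ is the mean infectious period, and inverts it for R_t. This is only the deterministic core of the idea, not the Bayesian procedure of Bettencourt and Ribeiro; the function name and the 8-day infectious period are assumptions made for illustration.

```python
import math


def growth_rate_to_r(incidence, mean_infectious_days):
    """Invert I_{t+1} = I_t * exp(gamma * (R_t - 1)) for R_t on each day.

    Assumes SIR-type dynamics (exponentially distributed generation
    interval) and strictly positive incidence.
    """
    gamma = 1.0 / mean_infectious_days
    return [
        1.0 + math.log(incidence[t + 1] / incidence[t]) / gamma
        for t in range(len(incidence) - 1)
    ]


# Incidence growing at 5% per day (hypothetical), with an 8-day infectious period.
series = [100.0 * math.exp(0.05 * t) for t in range(15)]
r_values = growth_rate_to_r(series, mean_infectious_days=8.0)
```

With these inputs each value equals 1 + 0.05 × 8 = 1.4. If the data actually arise from a process with a different generation interval shape, such as an SEIR model, the same growth rate corresponds to a different R_t, which is one way the structural assumption can introduce bias.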

In its current form, we do not recommend using the method of Bettencourt and Ribeiro, given that unrealistic structural assumptions lead to bias. However, a generalized version capable of accommodating more realistic generation intervals, which implicitly involves different assumptions about the underlying epidemic process [R_t across consecutive time steps, it returns smoother estimates than the method of Cori and colleagues, which is advantageous if unmodeled reporting effects, rather than bursts in transmission, are the dominant cause of variability in daily observations.

Finally, the case or cohort reproductive number is the expected number of secondary infections that an individual who becomes infected at time

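The method of Wallinga and Teunis [ estimates the case reproductive number from the relative likelihood, given the generation interval, that case i (infected at time t_i) was infected by case j (infected at time t_j). Summing these likelihoods over all cases i gives the individual reproductive number R_j, and the case reproductive number at time t is the average of R_j for all individuals infected at time t. As above, the resulting R_t estimates do not account for uncertainty from partial observation.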
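A minimal sketch of the Wallinga and Teunis calculation on daily counts is given below. It assumes the standard formulation in which the probability that an infection on day i was caused by someone infected on day j is proportional to the generation interval weight w_{i-j}; the function name and the toy inputs are hypothetical, and no confidence intervals or right-truncation adjustment are included.

```python
def wallinga_teunis(incidence, gen_interval):
    """Case reproductive number for each infection day (gen_interval[s - 1] = w_s).

    For an infector cohort on day j, sums over later days i the expected
    number of infections on day i attributed to one individual from day j.
    """
    n = len(incidence)
    max_lag = len(gen_interval)
    # Total infectiousness acting on each infectee day i from all earlier days
    pressure = [
        sum(incidence[i - s] * gen_interval[s - 1]
            for s in range(1, min(i, max_lag) + 1))
        for i in range(n)
    ]
    case_r = []
    for j in range(n):
        expected_offspring = 0.0
        for s in range(1, max_lag + 1):
            i = j + s
            if i < n and pressure[i] > 0:
                expected_offspring += incidence[i] * gen_interval[s - 1] / pressure[i]
        case_r.append(expected_offspring)
    return case_r


# Flat epidemic (hypothetical): 5 infections per day, so the true value is 1.
flat = [5.0] * 20
w = [0.2, 0.5, 0.3]
case_r = wallinga_teunis(flat, w)
```

In this toy series the estimate is 1 in the middle of the time series but falls to 0.2 on the second-to-last day and 0 on the last day, because the infections those cohorts will cause have not yet been observed.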

Practically speaking, there are several important differences between the case reproductive number (estimated by Wallinga and Teunis) and the instantaneous reproductive number (estimated by Cori and colleagues or Bettencourt and Ribeiro). First, the case reproductive number is shifted forward in time relative to the instantaneous reproductive number. It produces leading estimates of changes in the instantaneous reproductive number (

Next is the issue of real-time observation. Estimators of the instantaneous reproductive number were partly developed for near real-time estimation and only use data from before time R_t is rapidly falling (R_t estimates to the end of a truncated time series [R_t at the end of the time series, even in the absence of reporting delays. Mathematically, this underestimation occurs because calculating the case reproductive number involves a weighted sum across transmission events observed after time R_t early in the time series.

Overall, for real-time analyses aiming to quantify the reproductive number at a particular moment in time or to infer the impact of changes in policy, behavior, or other extrinsic factors on transmission, the instantaneous reproductive number will provide more temporally accurate estimates and is most appropriate. The case reproductive number of Wallinga and Teunis considers the reproductive number of specific individuals and therefore is more appropriate for analyses aiming to incorporate individual-level covariates such as age [R_t.

The Cori method most accurately estimates the instantaneous reproductive number in real time. It uses only past data and minimal parametric assumptions.

The method of Wallinga and Teunis estimates a slightly different quantity, the case or cohort reproductive number. The case reproductive number is conceptually less appropriate for real-time estimation but may be useful in retrospective analyses, especially those involving individual-level covariates.

In its current form, the method of Bettencourt and Ribeiro [R_t estimates, but generalized versions of the method could be accurate and computationally efficient.

When estimating R_t from observed data, misspecification of the generation interval is a large potential source of bias. Regardless of the method used, R_t estimates are sensitive not only to the mean generation time but also to the variance and form of the generation interval distribution [

The renewal equation is a cornerstone of demographic theory and forms the mathematical backbone of the R_t estimators described above [

Originally developed in the context of population biology, the renewal equation is usually expressed as _{0}_{t}.
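For orientation, a discrete-time form of the renewal equation that is consistent with the estimators discussed in this document can be written as follows. The notation (I_t for infections incident on day t, w_s for the generation interval) is assumed here for illustration and may differ from the notation of the original display.

```latex
% Discrete-time renewal equation: today's infections are produced by
% past infections, weighted by their current infectiousness.
I_t = R_t \sum_{s=1}^{t} I_{t-s}\, w_s ,
\qquad \sum_{s \ge 1} w_s = 1 .

% Rearranging gives the ratio estimated by the instantaneous
% reproductive number:
R_t = \frac{I_t}{\sum_{s=1}^{t} I_{t-s}\, w_s} .
```

Written this way, it is clear why the generation interval matters: w_s determines how past incidence is weighted in the denominator, so any error in w_s propagates directly into R_t.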

The difficulty is that the “intrinsic” generation interval of the renewal equation, which is the interval needed for accurate R_t estimation, is conceptually and quantitatively different from the generation intervals observed in practice [R_t [

The serial interval, defined as the time between symptom onset in an infector–infectee pair, is more easily observed than the generation interval and often used in its place. Although the serial and generation intervals are often conflated, failure to understand the differences between these related quantities can bias R_t estimates [

If the mean of the generation interval is set too high, R_t values will typically be further from 1 than the true value: too high when R_t > 1 and too low when R_t < 1. If the mean is set too low, R_t values will typically be closer to 1 than the true value. These biases are relatively small when R_t is near the critical threshold of 1, but they increase as R_t takes substantially higher or lower values (R_t values to bias may be compounded by limited data and highly uncertain generation interval estimates.
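The direction of this bias can be checked with a small calculation. The sketch below assumes a period of steady exponential growth or decline at rate r, in which case the renewal equation implies R = 1 / Σ_s w_s exp(−r s). The generation interval distributions and growth rates are hypothetical values chosen for illustration.

```python
import math


def r_from_growth_rate(growth_rate, gen_interval):
    """R implied by a steady exponential growth rate (gen_interval[s - 1] = w_s)."""
    return 1.0 / sum(
        weight * math.exp(-growth_rate * s)
        for s, weight in enumerate(gen_interval, start=1)
    )


# Hypothetical generation intervals: same shape, means of 7 and 9 days.
true_w = [0.0] * 5 + [0.25, 0.5, 0.25]   # mass on days 6, 7, 8
too_long = [0.0, 0.0] + true_w           # shifted 2 days later

growing_true = r_from_growth_rate(0.05, true_w)
growing_biased = r_from_growth_rate(0.05, too_long)
shrinking_true = r_from_growth_rate(-0.05, true_w)
shrinking_biased = r_from_growth_rate(-0.05, too_long)
```

With the mean set too high, the implied value moves away from 1 in both regimes (`growing_biased > growing_true > 1` and `shrinking_biased < shrinking_true < 1`), and the gap widens as the growth rate moves further from zero.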

The intrinsic generation interval is required to correctly define the relationship between R_t and incident infections.

The intrinsic generation interval is rarely observable, and care must be taken to estimate it from proxies such as the serial interval.

Estimating R_t requires data on the daily number of new infections (i.e., transmission events). Due to lags in the development of detectable viral loads, symptom onset, seeking care, and reporting, these numbers are not readily available. All observations reflect transmission events from some time in the past. In other words, if d is the delay from infection to observation, then observations at time t inform R_{t−d}, not R_t (R_t estimates thus requires assumptions about lags from infection to observation. If the distribution of delays can be estimated, then R_t can be estimated in 2 steps: first by inferring the incidence time series from observations and then by inputting the inferred time series into an R_t estimation method. Alternatively, the unobserved time series could be inferred simultaneously with R_t or treated as a latent state. Such methods are now available in a development version of the R package EpiNow2 [

Observations after time

Simple but mathematically incorrect methods for the inference of unobserved times of infection have been applied to COVID-19: convolution and temporal shifts. The errors introduced by these methods may be tolerable if delays to observation are relatively short and not highly variable and if R_t is not rapidly changing. But when dealing with longer or more variable observation delays, or when aiming to infer the timing of changes in R_t accurately, these methods may not be sufficient.

One method infers each individual’s time of infection by subtracting a sample from the delay distribution from each observation time. This is mathematically equivalent to convolving the observation time series with the reversed (backward) delay distribution, but convolution does not accurately infer the underlying time series of infections from observations [R_t: peaks, valleys, and changes in slope of the latent time series of infection events. Convolution and other approaches that blur or oversmooth can therefore prevent or delay detection of changes in R_t and can impede accurate inference of the timing of these changes (
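The blurring effect of backward convolution can be seen in a small worked example. The sketch below follows 1,000 infections that all occur on one day, spreads them forward by a hypothetical delay distribution to obtain observations, and then applies backward convolution, which is the expected result of subtracting an independent delay sample from each observation time. The delay distribution and helper names are assumptions made for illustration.

```python
def convolve_forward(series, delay_pmf):
    """Spread each day's count forward in time by the delay distribution."""
    out = [0.0] * (len(series) + len(delay_pmf) - 1)
    for t, count in enumerate(series):
        for d, p in enumerate(delay_pmf):
            out[t + d] += count * p
    return out


def convolve_backward(series, delay_pmf):
    """Expected result of subtracting a sampled delay from each observation.

    Index k of the output corresponds to day k - (len(delay_pmf) - 1).
    """
    return convolve_forward(series, delay_pmf[::-1])


delay = [0.0, 0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical delay pmf, mean 3 days
infections = [0.0] * 21
infections[10] = 1000.0                  # all infections occur on day 10

observed = convolve_forward(infections, delay)     # peak of 400 on day 13
back_calculated = convolve_backward(observed, delay)
offset = len(delay) - 1
inferred_day_10 = back_calculated[10 + offset]     # 260, versus a true 1000
```

Shifting the observed curve back by the mean delay would place a peak of 400 on day 10, whereas backward convolution places only 260 there: the delay distribution has been applied twice rather than undone, so the inferred curve is wider and flatter than both the truth and the observations.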

Infections back calculated from (A) observed cases or (B) observed deaths either by shifting the observed curve back in time by the mean observation delay (shift), by subtracting a random sample from the delay distribution from each individual time of observation (convolve), or by deconvolution (deconvolve). Only the deconvolved time series is adjusted for right truncation. Deconvolution most accurately recovers peaks or valleys in the true infection curve. Shifting is less accurate, and convolution is least accurate. Errors from back-calculation increase with the variance of the delay distribution (B vs. A). (C) Posterior mean and credible interval of R_t estimates from the Cori et al. method. Inaccuracies in the inferred incidence curves affect R_t estimates, especially when R_t is changing (here, R_t was estimated using shifted values from panels A and B). Finally, we note that shifting the observed curves back in time without adjustment for right truncation leads to a gap between the last date in the inferred time series of infection and the last date in the observed data, as shown by the dashed lines and horizontal arrows in panels A–C.

The second simple-but-incorrect method to adjust for lags is to shift either the raw inputs (the observed time series) or outputs of R_t estimation on the time axis. R_t estimates obtained by applying the methods of Cori and colleagues or Bettencourt and Ribeiro to unadjusted data will lag the true instantaneous R_t by roughly the mean delay from infection to observation. Unadjusted estimates from Wallinga and Teunis are less lagged, because the lead intrinsic to this estimator, relative to the instantaneous reproductive number, partially offsets lags to observation (R_t.

Unlike backward convolution, temporal shifting does not further blur the observed time series. Thus, if the mean delay is known accurately, this method is preferable to subtracting samples from the delay distribution (R_t. Shifting inputs or R_t estimates by a fixed amount also fails to account for realistic uncertainty in the true mean delay, which will not be known exactly and might change over time.

More reliable methods to reconstruct the incidence time series are now under development. Given a known delay distribution, a potential solution is to infer the unlagged signal using maximum likelihood deconvolution. This method was applied to AIDS cases, which feature long delays from infection to observation [
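One way to implement maximum likelihood deconvolution is an expectation-maximization iteration of the Richardson-Lucy type, sketched below. The choice of this particular algorithm is an assumption, since the text does not specify one, and the sketch omits the smoothing and stopping rules that practical implementations add. Dividing by `reported_by_now` inflates recent days to account for infections not yet observed; the delay distribution and the toy infection curve are hypothetical.

```python
def richardson_lucy(observed, delay_pmf, n_iter=50):
    """EM-style deconvolution of daily observations into daily infections.

    observed[i] is the count observed on day i; delay_pmf[d] is the
    probability of a d-day delay from infection to observation.
    """
    n = len(observed)
    max_delay = len(delay_pmf) - 1
    # Probability that an infection on day j is observed by the last day
    reported_by_now = [sum(delay_pmf[: n - j]) for j in range(n)]
    estimate = [sum(observed) / n] * n   # flat initial guess
    for _ in range(n_iter):
        # Observations expected under the current infection estimate
        expected = [
            sum(estimate[i - d] * delay_pmf[d]
                for d in range(min(i, max_delay) + 1))
            for i in range(n)
        ]
        updated = []
        for j in range(n):
            if reported_by_now[j] == 0:
                updated.append(estimate[j])   # no information yet for this day
                continue
            ratio_sum = sum(
                delay_pmf[d] * observed[j + d] / expected[j + d]
                for d in range(min(n - 1 - j, max_delay) + 1)
                if expected[j + d] > 0
            )
            updated.append(estimate[j] * ratio_sum / reported_by_now[j])
        estimate = updated
    return estimate, reported_by_now


delay = [0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical delay pmf
true_infections = [0.0] * 10 + [100.0, 200.0, 400.0, 200.0, 100.0] + [50.0] * 10
n_days = len(true_infections)
# Observations available by the last day (later observations are truncated)
observed = [
    sum(true_infections[i - d] * delay[d] for d in range(min(i, 4) + 1))
    for i in range(n_days)
]
recovered, reported_by_now = richardson_lucy(observed, delay)
```

Each iteration keeps the estimate non-negative and preserves the number of observed events once the probability of having been observed is taken into account. How closely `recovered` matches `true_infections` depends on the number of iterations and on noise in real data, so the output should be inspected rather than assumed accurate.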

A potential alternative to deconvolution is to use R_t estimation models that include forward delays to observation in the inference process or that treat the time series of infections as latent states. Such methods are in development within the R package EpiNow2 [R_t is the seamless integration of various sources of uncertainty, e.g., in R_t and reporting. By comparison, the 2-step approach of first transforming the observed time series and then calculating R_t requires users to propagate uncertainty from the back-calculation step into the R_t estimation step. A final advantage of models that include forward delays to observation is that they could facilitate inference from multiple populations or data streams simultaneously [

Deconvolution or R_t estimation methods that include a forward observation process are particularly useful when delays to observation are relatively long and variable and in analyses that require accurate inference of the timing and speed of changes in R_t. If delays to observation are relatively short, or if R_t is not substantially changing, then deconvolution may not be necessary. For example, when working with synthetic case data in which the mean delays to observation are short and known accurately, the underlying infection curve (R_t values (R_t and to relate those changes to policies, behaviors, or other extrinsic epidemic drivers at specific points in time. For example, simply shifting a time series of observed deaths by the mean delay does not accurately recover the underlying curves of infections or R_t (

Another advantage of working with observations nearer the time of infection, such as times of symptom onset among newly symptomatic individuals, is that they provide more information about recent transmission events and therefore allow R_t to be estimated in closer to real time (

Further investigation is needed to determine the best methods for inferring infections from observations if the underlying delay distribution is uncertain. If the delay distribution is severely misspecified, all 3 approaches (deconvolution, shifting by the mean delay, or convolution) will incorrectly infer the timing of changes in incidence. In this case, methods such as deconvolution or shifting by the mean delay might more accurately estimate the magnitude of changes in R_t but at the cost of spurious precision in the inferred timing of those changes. Ideally, the delay distribution could be inferred jointly with the underlying times of infection or estimated as the sum of the incubation period distribution and the distribution of delays from symptom onset to observation (e.g., from line list data).

Estimating the instantaneous reproductive number requires data on the number of new infections (i.e., transmission events) over time. These inputs must be inferred from observations using assumptions about delays between infection and observation.

Inferring the unlagged time series of infections using deconvolution, or within an R_t estimation model that includes forward delays, can improve accuracy.

A less accurate but simpler approach is to shift the observed time series by the mean delay to observation. If the delay to observation is not highly variable and if the mean delay is known exactly, the error of this approach may be tolerable. A key disadvantage is that shifting by a fixed amount of time does not account for uncertainty or individual variation in delay times.

Sampling from the delay distribution to impute individual times of infection from times of observation accounts for uncertainty but blurs peaks and valleys in the underlying incidence curve, which, in turn, compromises the ability to rapidly detect changes in R_t.

Near real-time estimation requires not only inferring times of infection from the observed data but also adjusting for missing observations of recent infections. The absence of recent infections is known as “right truncation.” Without adjustment for right truncation, the number of recent infections will appear artificially low because they have not yet been reported [R_t estimation.

R_t estimation (

The simplest approach is to drop estimates on the last few dates or to flag them as unreliable [

In short, accurate near real-time R_t estimation requires both inferring the infection time series from recent observations and adjusting for right truncation. Errors in either step could amplify errors in the other. Joint inference approaches for near real-time R_t estimation, which simultaneously infer times of infection and adjust for right truncation, are now in development [

Due to reporting delays, infections at the end of a growing time series will be undercounted. To avoid systematic underestimation of R_t on the most recent dates, adjust for right truncation using 1 of many available methods or truncate the time series to the last date with complete reporting.
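One simple adjustment, shown here as a sketch, is to inflate each recent count by the probability that an event on that day would already have been reported, and to drop days for which that probability is too small to give a stable estimate. This is only one of the available approaches; the delay distribution, the 20% cutoff, and the function name are assumptions made for illustration.

```python
def adjust_right_truncation(counts, delay_pmf, min_fraction=0.2):
    """Inflate recent counts by the fraction expected to be reported so far.

    counts[day] is the number of events that occurred on `day` and have been
    reported by the last day in the series; delay_pmf[d] is the probability
    of a d-day reporting delay. Days with too little reporting return None.
    """
    n = len(counts)
    adjusted = []
    for day, count in enumerate(counts):
        elapsed = n - 1 - day
        fraction_reported = sum(delay_pmf[: elapsed + 1])
        if fraction_reported >= min_fraction:
            adjusted.append(count / fraction_reported)
        else:
            adjusted.append(None)
    return adjusted


delay = [0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical reporting delay pmf
n_days = 8
# A true 100 events per day, of which only a fraction has been reported so far
reported_so_far = [
    100.0 * sum(delay[: (n_days - 1 - day) + 1]) for day in range(n_days)
]
adjusted = adjust_right_truncation(reported_so_far, delay)
```

In this example the adjusted series returns to 100 per day on every date except the last, which is dropped because only 10% of its events are expected to have been reported. The inflated counts carry extra uncertainty that this sketch does not quantify.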

The effect of incomplete case observation on R_t estimation depends on the observation process. If the fraction of infections observed is constant over time, R_t point estimates will remain accurate and unbiased despite incomplete observation [R_t estimates.
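The reason a constant observation fraction leaves point estimates unchanged can be seen from the ratio form of the instantaneous reproductive number. In the short derivation below, \rho (the fraction of infections observed) and C_t (observed counts) are symbols introduced here for illustration.

```latex
% Observed counts are a constant fraction of true infections:
C_t = \rho\, I_t , \qquad 0 < \rho \le 1 .

% Applying the estimator to observed counts, \rho cancels:
\hat{R}_t
  = \frac{C_t}{\sum_{s=1}^{t} C_{t-s}\, w_s}
  = \frac{\rho\, I_t}{\rho \sum_{s=1}^{t} I_{t-s}\, w_s}
  = \frac{I_t}{\sum_{s=1}^{t} I_{t-s}\, w_s}
  = R_t .
```

If \rho instead changes over time, it no longer cancels, and the change in ascertainment is absorbed into the estimate.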

Sampling biases will also bias R_t estimates [R_t estimates. For example, several R_t estimation dashboards currently adjust for testing effort [R_t estimates as potentially biased in the few weeks following known changes in data collection or reporting. At a minimum, practitioners and policy makers should understand how the data underlying R_t estimates were generated and whether they were collected under a standardized testing protocol.

R_t point estimates will remain accurate given the imperfect observation of cases if the fraction of cases observed is time-independent and representative of a defined population. But even in this best-case scenario, confidence or credible intervals will not accurately measure uncertainty from imperfect observation.

Changes over time in the type or fraction of infections observed can bias R_t estimates. Structured surveillance with fixed testing protocols can reduce or eliminate this problem.

Because R_t estimators incorrectly assume all infections are observed, day-of-week reporting effects and stochasticity in the number of observations per day can cause spurious variability in R_t estimates, especially if the number of observations per day is low [R_t estimates, but the size of the smoothing window can affect both the temporal and quantitative accuracies of estimates. Larger windows effectively increase the sample size by drawing information from multiple time points but blur what may be biologically meaningful changes in R_t. Some smoothing approaches can also cause R_t estimates to lead or lag the true value (

Estimates were obtained using synthetic data drawn from the R_t calculated from synthetic data.

Lags can be particularly severe when using the conventions suggested by Cori and colleagues, in which R_t is reported on the last date in a given window, rather than on the middle date. Although this convention returns R_t estimates to the last date in the time series, which is convenient for real-time estimation, R_t estimates reported at the end of a window are based entirely on data from the past and therefore lag the instantaneous R_t (R_t at the midpoint of the smoothing window (R_t estimation on the last R_t estimation requires large enough daily counts to permit a small window (e.g., a few days).
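The windowed estimate and the two reporting conventions can be sketched as follows. The sketch assumes the gamma-Poisson form of the windowed Cori estimator, in which R_t is treated as constant within the window and given a gamma prior; the prior parameters, the 7-day window, the toy data, and the function name are assumptions made for illustration rather than a reproduction of EpiEstim.

```python
def windowed_cori(incidence, gen_interval, window=7,
                  prior_shape=1.0, prior_scale=5.0):
    """Posterior mean of R over a trailing window (gen_interval[s - 1] = w_s).

    Each result records the window's last day ("end") and middle day ("mid"),
    so the same estimate can be reported under either convention.
    """
    n = len(incidence)
    pressure = [
        sum(incidence[t - s] * gen_interval[s - 1]
            for s in range(1, min(t, len(gen_interval)) + 1))
        for t in range(n)
    ]
    results = []
    for end in range(window, n):   # windows start on day 1 or later
        days = range(end - window + 1, end + 1)
        shape = prior_shape + sum(incidence[t] for t in days)
        rate = 1.0 / prior_scale + sum(pressure[t] for t in days)
        results.append({"end": end, "mid": end - window // 2,
                        "mean": shape / rate})
    return results


# Noise-free toy series with constant R = 1.2 (hypothetical values).
w = [0.2, 0.5, 0.3]
series = [1000.0]
for t in range(1, 30):
    force = sum(series[t - s] * w[s - 1] for s in range(1, min(t, len(w)) + 1))
    series.append(1.2 * force)

window_estimates = windowed_cori(series, w)
```

With a 7-day window, the estimate labeled with day `end` summarizes transmission centered on day `end - 3`. Reporting it at `mid` removes that lag but means no estimate is available for the last 3 days of the series.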

Although the sliding window increases statistical power to infer R_t, it does not by itself accurately calculate confidence intervals. Thus, underfitting and overfitting are possible. The risk of overfitting in the Cori method is determined by the length of the time window that is chosen. In other words, there is a trade-off in the window length between picking up noise with very short windows and over-smoothing with very long ones. To avoid this, one can choose the window size based on short-term predictive accuracy, for example, using leave-future-out validation to minimize the 1-step-ahead log score [

If R_t appears to vary abruptly due to underreporting, a wide smoothing window can help resolve R_t. However, wider windows can also lead to lagged or inaccurate R_t estimates.

If a wide smoothing window is needed, report R_t for

To avoid overfitting, choose a smoothing window based on short-term predictive accuracy [

We tested the accuracy of several methods for R_t estimation in near real time and recommend the methods of Cori and colleagues [

Most epidemiological data are not ideal, and statistical adjustments are needed to obtain accurate and timely R_t estimates. First, considerable preprocessing is needed to infer the underlying time series of infections (i.e., transmission events) from delayed observations and to adjust for right truncation. Best practices for this inference are still under investigation, especially if the delay distribution is uncertain. The smoothing window must also be chosen carefully, potentially adaptively, and daily counts must be sufficiently high for changes in R_t to be resolved on short timescales. To avoid biases in R_t estimates, the generation interval distribution must be estimated and specified accurately. Finally, to avoid false precision in R_t, uncertainty arising from delays to observation, from adjustment for right truncation, and from imperfect observation must be propagated. The functions provided in the EpiEstim package quantify uncertainty arising from the R_t estimation model but currently not uncertainty arising from imperfect observation or delays.

Work is ongoing to determine how best to infer infections from observations and to account for all relevant forms of uncertainty when estimating R_t. Some useful extensions of the methods provided in EpiEstim have already been implemented in the R package EpiNow2 [

But even the most powerful inferential methods, extant and proposed, will fail to estimate R_t accurately if changes in sampling are not known and accounted for. If testing shifts from more to less infected subpopulations or if test availability shifts over time, the resulting changes in case numbers will be ascribed to changes in R_t. Thus, structured surveillance also belongs at the foundation of accurate R_t estimation. This is an urgent problem for near real-time estimation of R_t for COVID-19, as case counts in many regions derive from clinical testing outside any formal surveillance program. Deaths, which are more reliably sampled, are lagged by 2 to 3 weeks and still subject to biases in underreporting. The establishment of sentinel populations (e.g., outpatient visits with recent symptom onset) for R_t estimation could thus help rapidly identify the effectiveness of different interventions and recent trends in transmission.

All code for analysis and figure generation is available at

(A–C) Alternate version of R_t first hits its minimum value after falling abruptly (time 67, yellow point) or 8 days after the changepoint (time 75). (D–F) The time series ends on the day R_t stops rising (time 97, yellow point) or 8 days later (time 105). Estimates of the instantaneous reproductive number (A, B, D, E) remain accurate to the end of the time series, and estimates do not change as new observations become available in the 8 days following the changepoint. As in the main text, estimates of the unadjusted case reproductive number (C, F) depend on data from not-yet-observed time points. These estimates become more accurate as new observations are added to the end of the time series (orange vs. blue). Methods to infer the number of not-yet-observed infections can help make estimates of the case reproductive number more accurate in real time [


Both were estimated using a 7-day smoothing window on a synthetic time series of new infections, observed without delay. The estimates of Cori et al. and Wallinga and Teunis are similar in shape when smoothed, but the estimate of Wallinga and Teunis (the case reproductive number) leads that of Cori et al. (the instantaneous reproductive number) by roughly 8 days, or the mean generation interval. Solid colored lines and confidence regions show the posterior mean and 95% credible interval (Cori et al.) or maximum likelihood estimate and 95% confidence interval (Wallinga and Teunis). Dotted and dashed lines show the exact instantaneous reproductive number and case reproductive number, respectively.


(A) Consider 1,000 individuals all infected at time 100 (vertical line shows the mean). (B) Now consider the times at which these individuals are observed. Logically, t_{observation} = t_{infected} +


We are grateful to Michael Höhle for helpful comments. This work was completed in part with resources provided by the University of Chicago’s Research Computing Center.