The authors have declared that no competing interests exist.

Conceived and designed the experiments: MT ML MLW CV YY HW FC. Analyzed the data: MT ML FC. Wrote the paper: MT ML MLW CV YY HW FC.

We present how Extreme Value Theory (EVT) can be used in public health to predict future extreme events.

We applied EVT to weekly rates of Pneumonia and Influenza (P&I) deaths over 1979–2011. We further explored the daily number of emergency department visits in a network of 37 hospitals over 2004–2014. Maxima of grouped consecutive observations were fitted to a generalized extreme value distribution. The distribution was used to estimate the probability of extreme values in specified time periods.

An annual P&I death rate of 12 per 100,000 (the highest maximum observed) is expected to be exceeded once over the next 30 years, and in any given year there is a 3% risk that the P&I death rate will exceed this value. Over the past 10 years, the largest observed increase in the daily number of visits between the same weekday of two consecutive weeks was 1133. We estimated the probability of exceeding a daily increase of 1000 visits in any given month at 0.37%.

The EVT method can be applied to various topics in epidemiology thus contributing to public health planning for extreme events.

A central question for resource planning in public health is to predict the likelihood that exceptional or extreme events will occur in the not too distant future [

The main goal of EVT is to assess, from a series of observations, the probability of events that are more extreme than those previously recorded. For example, 40% of the Netherlands is below sea level and has to be protected against the sea by dikes. Using EVT, the height of the dikes can be calculated from storm data collected over around 100 years, so that the risk of flooding is less than once every 10,000 years [

Data used in this paper were counts of deaths (per age group and per month) or counts of emergency visits (per day) or counts of population (per age group).

All data were received by the authors in de-identified form. These data were strictly anonymous and did not require approval from an ethics committee.

Based on EVT [

The Fréchet class (

The Gumbel class (

The Weibull class (

A classical method for modelling the extremes of a stationary time series is the method of block maxima, in which consecutive observations are grouped into non-overlapping blocks of length n, generating a series of m block maxima, M_{n,1},…, M_{n,m}, say, to which the GEV distribution can be fitted for some large value of n.
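The grouping step of the block maxima method can be illustrated with a minimal sketch. The function name `block_maxima` and the toy values are hypothetical and are not taken from the study; an incomplete final block is simply dropped here, which is one possible convention among others.

```python
from typing import List, Sequence


def block_maxima(series: Sequence[float], block_length: int) -> List[float]:
    """Return the maximum of each consecutive, non-overlapping block."""
    if block_length <= 0:
        raise ValueError("block_length must be positive")
    # Assumption: an incomplete final block is discarded.
    n_blocks = len(series) // block_length
    return [
        max(series[i * block_length:(i + 1) * block_length])
        for i in range(n_blocks)
    ]


# Toy series (made-up values): three complete blocks of length 4, one leftover.
toy = [3, 7, 2, 5, 9, 1, 4, 4, 6, 8, 0, 2, 10]
maxima = block_maxima(toy, block_length=4)
print(maxima)
```

The resulting list plays the role of M_{n,1},…, M_{n,m}, the sample to which a GEV distribution would then be fitted.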

Once a GEV distribution is fitted to the block maxima, the quantile z_{p} of the GEV distribution that is exceeded with probability p is called the return level associated with the return period 1/p: to a reasonable degree of accuracy, z_{p} is expected to be exceeded on average once every 1/p years. More precisely, z_{p} is exceeded by the annual maximum in any particular year with probability p. The return level z_{p} can be expressed in terms of the GEV parameters:
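For reference, the textbook expression of the return level is given below. It assumes the usual GEV parameterisation, with location μ, scale σ > 0 and shape ξ, and writes z_p for the level exceeded by a block maximum with probability p; these symbols are an assumption made here and may differ from the notation of the original article.

```latex
\[
z_p =
\begin{cases}
\mu - \dfrac{\sigma}{\xi}\left[\,1 - \{-\log(1-p)\}^{-\xi}\right], & \xi \neq 0,\\[2ex]
\mu - \sigma \log\{-\log(1-p)\}, & \xi = 0.
\end{cases}
\]
```

The expression follows from solving G(z_p) = 1 − p for z_p, where G(z) = exp{−[1 + ξ(z − μ)/σ]^{−1/ξ}} is the GEV distribution function.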

We used the

Pneumonia and Influenza (P&I) mortality data provide a specific indicator of influenza mortality [

We defined the cumulative rates of P&I mortality (cPI) as the sum of weekly P&I mortality over eight consecutive weeks using a moving time window, through the entire time series (

cPI rates correspond to black symbols and cPIM, the annual maxima, to the red symbols.
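The construction of cPI and of its seasonal maxima can be sketched as follows. This is only an illustration: the function names, the toy weekly rates and the rule used to assign each 8-week window to a season label are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def rolling_sum(values: Sequence[float], window: int = 8) -> List[float]:
    """Sum of `window` consecutive values, sliding one step at a time."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    current = sum(values[:window])
    sums = [current]
    for i in range(window, len(values)):
        # Slide the window: add the entering value, drop the leaving one.
        current += values[i] - values[i - window]
        sums.append(current)
    return sums


def maxima_by_group(values: Sequence[float],
                    groups: Sequence[str]) -> Dict[str, float]:
    """Largest value observed within each group label (e.g. a flu season)."""
    result: Dict[str, float] = {}
    for value, group in zip(values, groups):
        if group not in result or value > result[group]:
            result[group] = value
    return result


# Toy weekly rates (made-up values) and hypothetical season labels,
# one label per 8-week window.
weekly = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cpi = rolling_sum(weekly, window=8)
cpim = maxima_by_group(cpi, ["A", "A", "B"])
print(cpi, cpim)
```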

In our application, M_{n,1},…, M_{n,m} stand for the maxima of

Assuming that cPIM were distributed according to a GEV, we estimated the GEV parameters and their 95% Confidence Intervals (95%CI) by the maximum-likelihood method: the location parameter (

a-Empirical (bars) and fitted (curve) distributions for the annual maxima of cumulative P&I mortality. b-Quantile-Quantile (QQ) plots for the annual maxima of cumulative P&I mortality. c- Return plots for the annual maxima of cumulative P&I mortality.
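Maximum-likelihood estimation of a GEV distribution amounts to minimising the negative log-likelihood of the block maxima. The sketch below writes out that objective under the usual parameterisation (location `mu`, scale `sigma`, shape `xi`); the function name and the toy data are hypothetical, and this is not the software used in the study.

```python
import math
from typing import Sequence


def gev_neg_log_likelihood(data: Sequence[float], mu: float,
                           sigma: float, xi: float) -> float:
    """Negative log-likelihood of i.i.d. GEV(mu, sigma, xi) observations."""
    if sigma <= 0:
        return math.inf
    total = len(data) * math.log(sigma)
    for x in data:
        s = (x - mu) / sigma
        if abs(xi) < 1e-9:
            # Gumbel limit (xi -> 0).
            total += s + math.exp(-s)
        else:
            t = 1.0 + xi * s
            if t <= 0:
                # Observation outside the support for these parameters.
                return math.inf
            total += (1.0 + 1.0 / xi) * math.log(t) + t ** (-1.0 / xi)
    return total


# Toy maxima (made-up values), evaluated at an arbitrary parameter guess.
toy_maxima = [1.0, 2.0, 3.5]
print(gev_neg_log_likelihood(toy_maxima, mu=2.0, sigma=1.0, xi=0.0))
```

In practice this function would be passed to a numerical optimiser, and confidence intervals derived from the curvature of the likelihood at its optimum. Readers who use `scipy.stats.genextreme` instead should note that its shape parameter `c` has the opposite sign to ξ.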

Return level plots were then calculated for return periods up to 50 years (

Daily numbers of emergency department visits were obtained for the period from 1 July 2004 to 30 October 2014 from the cyber-urgence network of “Assistance Publique/Hôpitaux de Paris” [

We identified iEVM as the monthly maximal increase of the number of emergency department visits between the same weekdays from two consecutive weeks (n = 124). The empirical distribution of iEVM had a mean of 375 (minimum 82; maximum 1133) visits. The estimated location parameter was 305, 95%CI (282; 327), the scale parameter was 116, 95%CI (100; 133), and the shape parameter was 0.02, 95%CI (-0.09; 0.14). The fit was good (KS test P-value = 0.37) except for the observed highest maximum (
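The definition of iEVM can be sketched in two steps: a lag-7 difference of the daily counts, then the maximum of these differences within each calendar month. The function names and the daily counts below are hypothetical; only the start date matches the study period.

```python
from datetime import date, timedelta
from typing import Dict, List, Sequence, Tuple


def weekly_increase(visits: Sequence[int], lag: int = 7) -> List[int]:
    """Difference between each day's count and the count `lag` days earlier."""
    return [visits[i] - visits[i - lag] for i in range(lag, len(visits))]


def monthly_maxima(days: Sequence[date],
                   values: Sequence[float]) -> Dict[Tuple[int, int], float]:
    """Largest value per (year, month)."""
    result: Dict[Tuple[int, int], float] = {}
    for d, v in zip(days, values):
        key = (d.year, d.month)
        if key not in result or v > result[key]:
            result[key] = v
    return result


# Two weeks of made-up daily counts starting on 1 July 2004.
start = date(2004, 7, 1)
visits = [100, 110, 120, 130, 140, 150, 160,
          105, 130, 115, 170, 135, 150, 200]
iev = weekly_increase(visits)
# The first difference is available from the 8th day onwards.
iev_days = [start + timedelta(days=7 + i) for i in range(len(iev))]
ievm = monthly_maxima(iev_days, iev)
print(iev, ievm)
```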

Using simple illustrative examples, we showed the applicability of EVT to epidemiologic data. A GEV distribution was fitted to block maxima and was used to calculate estimates of return levels and of risks of exceeding a defined threshold value over given time periods.
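The two quantities mentioned above, return levels and risks of exceeding a threshold, can be computed directly from fitted GEV parameters. The sketch below assumes the usual parameterisation (location `mu`, scale `sigma`, shape `xi`) and plugs in the rounded point estimates reported for the monthly maxima of emergency department visits (305, 116 and 0.02); the function names are hypothetical.

```python
import math


def gev_exceedance_probability(x: float, mu: float,
                               sigma: float, xi: float) -> float:
    """P(block maximum > x) under a GEV(mu, sigma, xi) distribution."""
    if abs(xi) < 1e-9:
        # Gumbel limit (xi -> 0).
        return -math.expm1(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0:
        # x is outside the support: below it if xi > 0, above it if xi < 0.
        return 1.0 if xi > 0 else 0.0
    return -math.expm1(-t ** (-1.0 / xi))


def gev_return_level(p: float, mu: float, sigma: float, xi: float) -> float:
    """Level exceeded by a block maximum with probability p (0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Rounded point estimates quoted in the text for monthly maxima (iEVM).
MU, SIGMA, XI = 305.0, 116.0, 0.02

# Blocks are months, so a 10-year return period corresponds to p = 1/120.
level_10y = gev_return_level(1.0 / 120.0, MU, SIGMA, XI)
prob_1000 = gev_exceedance_probability(1000.0, MU, SIGMA, XI)
print(round(level_10y), round(100 * prob_1000, 2))
```

With these rounded inputs the sketch gives a 10-year level close to 890 visits and a monthly probability of about 0.35% of exceeding an increase of 1000 visits. These values are in line with the 893 visits and 0.37% reported in the text, the small differences presumably reflecting the rounding of the published estimates.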

In this work, we assumed that the underlying time series were stationary. This was likely the case for the two applications presented: means and standard deviations calculated over moving windows of different lengths did not vary over the study periods, and the autocorrelation coefficients of both time series decreased rapidly towards zero (results not shown). Moreover, the model fits were good except for one outlying weekly increase in emergency department visits in summer 2014.
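The informal stationarity checks described above (moving-window means and standard deviations, and sample autocorrelations) can be sketched as follows. The function names and the toy series are hypothetical, and the choice of window lengths and lags is left to the analyst.

```python
import statistics
from typing import List, Sequence, Tuple


def rolling_mean_sd(series: Sequence[float],
                    window: int) -> List[Tuple[float, float]]:
    """Mean and sample standard deviation over each moving window."""
    if window < 2 or window > len(series):
        raise ValueError("window must be between 2 and len(series)")
    return [
        (statistics.mean(series[i:i + window]),
         statistics.stdev(series[i:i + window]))
        for i in range(len(series) - window + 1)
    ]


def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation coefficient at the given lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = statistics.mean(series)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant")
    num = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return num / denom


# Toy alternating series (made-up values).
toy = [1, -1, 1, -1, 1, -1, 1, -1]
windows = rolling_mean_sd(toy, window=4)
acf = [autocorrelation(toy, k) for k in range(3)]
print(windows, acf)
```

For a series compatible with stationarity, the moving-window summaries stay roughly constant and the autocorrelations shrink towards zero as the lag grows; the toy series above is deliberately not of that kind, since its autocorrelations alternate in sign.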

Methods for dealing with non-stationary distributions of maxima have been suggested in EVT. For other applications, it might be useful to consider a cyclical GEV model, that is a GEV model with time-varying location and scale parameters [

While such refinements might improve the accuracy of extreme value estimates, they are beyond the scope of this study as the choice of the specific approach would depend on the intended use of the forecasts.

Return level estimates should be helpful in planning resource needs, much like the statistical rationale for building dikes in the Netherlands. In our illustrative application to emergency department visits, EVT can be used to estimate the surge capacity that an institution needs. For example, one could recommend sizing complementary health care resources (beds, staff) on a value that might be exceeded once in the next ten years, in our case an increase of 893 visits to the emergency rooms. Taking the example of seasonal influenza epidemics, if one assumes that stockpiles of antivirals, vaccines or face masks should be amassed, they can easily be sized using estimates from an EVT analysis based on an annual risk of exceeding an

Nevertheless, a limitation of the method for public health planning is that it cannot be used to predict extreme events that differ in nature from those observed, because no distribution of related maxima will have been observed and, consequently, fitted. This means, for example, that EVT will not help to anticipate the impact of a nuclear disaster on emergency visits or to predict the mortality burden of an avian H5N1 influenza pandemic. For these types of extreme events, other methods such as risk analysis or modeling should be used. However, when data are available, we believe that extreme value theory offers a statistical rationale for public health planning of extreme events, and could be applied to a wide range of topics in epidemiology.

iEV correspond to black symbols and iEVM, the monthly maxima, to the red symbols.

(PDF)

a-Empirical (bars) and fitted (curve) distributions for the monthly maxima of iEV. b-Quantile-Quantile (QQ) plots for the monthly maxima of iEV. c- Return plots for the monthly maxima of iEV.

(PDF)

Contains Flu season, date and 8-week sum of the age-standardized P&I death rates.

(CSV)

Contains date and number of emergency visits.

(CSV)

The authors thank Dr Dominique Brun-Ney from the Direction de l’Organisation Médicale et des relations avec les Universités (DOMU) and Pr Dominique Pateron from the Pole Urgence et Aval, Groupe Hospitalier Universitaire de l’Est Parisien, Assistance Publique Hôpitaux de Paris (APHP), Paris, France, for providing the emergency department data.