Conceived and designed the experiments: MTB SSC RJT. Analyzed the data: MTB SSC JM. Contributed reagents/materials/analysis tools: MTB SSC JM CKP RJT. Wrote the paper: MTB SSC JM CKP RJT.

Dr. Thomas has consulted for Total Sleep Holdings; has a patent for CO2 adjunctive therapy for complex sleep apnea, ECG-based method to assess sleep stability and phenotype sleep apnea; and has financial interests in SomRx. Dr. Thomas, Dr. Peng and Mr. Mietus are part co-inventors of the sleep spectrogram method (licensed by the BIDMC to Embla), and share patent rights and royalties. Mr. Mietus has financial interests in DynaDx Corp. Dr. Peng has financial interests in DynaDx Corp. This does not alter our adherence to all the PLoS ONE policies on sharing data and materials. Dr Bianchi has indicated no financial conflicts of interest.

Enhanced characterization of sleep architecture, compared with routine polysomnographic metrics such as stage percentages and sleep efficiency, may improve the predictive phenotyping of fragmented sleep. One approach involves using stage transition analysis to characterize sleep continuity.

We analyzed hypnograms from Sleep Heart Health Study (SHHS) participants using the following stage designations: wake after sleep onset (WASO), non-rapid eye movement (NREM) sleep, and REM sleep. We show that individual patient hypnograms contain an insufficient number of bouts to adequately describe the transition kinetics, necessitating pooling of data. We compared a control group of individuals free of medications, obstructive sleep apnea (OSA), medical co-morbidities, or sleepiness (n = 374) with groups with mild (n = 496) or severe OSA (n = 338). WASO, REM sleep, and NREM sleep bout durations exhibited multi-exponential temporal dynamics. The presence of OSA accelerated the “decay” rate of NREM and REM sleep bouts, resulting in instability manifesting as shorter bouts and an increased number of stage transitions. For WASO bouts, previously attributed to a power law process, a multi-exponential decay described the data well. Simulations demonstrated that a multi-exponential process can mimic a power law distribution.

OSA alters sleep architecture dynamics by decreasing the temporal stability of NREM and REM sleep bouts. Multi-exponential fitting is superior to routine mono-exponential fitting, and may thus provide improved predictive metrics of sleep continuity. However, because a single night of sleep contains insufficient transitions to characterize these dynamics, extended monitoring of sleep, probably at home, would be necessary for individualized clinical application.

Numerous endogenous and exogenous factors influence whether sleep or wake is achieved, how long a given state is maintained, and the reasons sleep architecture may become fragmented

Although the ESS score was correlated with the severity of obstructive sleep apnea (OSA) in the large Sleep Heart Health Study (SHHS) database, the absolute changes were small and even the most severe OSA group had scores within the normal range (<10)

Survival analysis applied to sleep bout durations, in which lifetime refers to the length of time spent in a given stage, indicates decreased stability of sleep in OSA patients in proportion to severity of disease

Given the increasing realization that the percentage of time spent in a given stage may be less important than the distribution dynamics of the stages across the night, several groups have considered sleep architecture in terms of the statistical distributions of sleep stage durations. In fact, measurements of sleep in rodents, cats, and humans suggested that the durations of sleep bouts follow mono-exponential kinetics, with species-specific time constants governing the time spent in a given bout of sleep

Our results demonstrate that multi-exponential fitting is superior to routine mono-exponential fitting, but also show that a single night of sleep contains insufficient transitions to characterize these dynamics. Wake bout distributions may be fitted by either a multi-exponential or a power law model. OSA alters the dynamics by accelerating the decay of REM and NREM bout durations, reflecting sleep fragmentation despite unchanged summary metrics of state percentages.

The summary statistics obtained from the PSG data for the three patient groups are presented in

Variable | Control | Mild OSA | Severe OSA | ANOVA F_{(2, 1207)}, p | Tukey |

N | 376 | 496 | 338 | ||

Age | 68.2±6.3 | 63.8±10.3 | 63.7±10.5 | 30.43, <0.001 | 1>2,3 |

Sex (% males) | 35.6 | 60 | 70.7 | ||

Race (% Caucasian) | 85.4 | 73.8 | 74.3 | ||

BMI (kg/m^{2}) | 26.3±4 | 30.1±5.4 | 32.4±5.8 | 128.5, <0.001 | 3>2,1; 2>1 |

Anti-HTN use (%) | 0 | 53.6 | 54.4 | ||

Systolic BP (mm Hg) | 125.2±17.2 | 129.1±19.2 | 131.7±19.4 | 10.7, <0.001 | 2>1; 3>1; 3>2 |

Diastolic BP (mm Hg) | 70.6±10.2 | 74.2±10.6 | 78.2±13.1 | 40.4, <0.001 | 3>1,2; 3>2 |

Diabetes (%) | 0 | 9.5 | 5.9 | ||

Angina (%) | 0 | 9.5 | 9.5 | ||

Myocardial infarction (%) | 0 | 10.9 | 7.1 | ||

CHF (%) | 0 | 2.6 | 1.8 | ||

ESS | 4.9±2.2 | 13.1±2.8 | 9.8±4.9 | 610.6, <0.001 | 2>1,3; 3>2 |

Sleep efficiency (%) | 81.2±11.1 | 81.7±10.2 | 79.4±11.2 | 2.38, 0.09 | N.A. |

Sleep latency (min) | 22.7±23.9 | 23.1±22.5 | 22.9±21.4 | 0.02, 0.9 | N.A. |

Total sleep time (min) | 360±64.8 | 326.5±61.9 | 344.2±61.7 | 53.9, <0.001 | 1>2,3; 3>2 |

N1 (%) | 5.8±4.2 | 5.8±4.3 | 6.4±5.1 | 2.5, 0.08 | N.A. |

N2 (%) | 54.9±11.9 | 59.3±10.7 | 63.1±11.9 | 44.7, <0.001 | 3>1,2; 2>1 |

N3 (%) | 19±12.3 | 15.5±11.4 | 13.2±11.2 | 22.2, <0.001 | 1>2,3; 2>3 |

REM (%) | 20.3±5.8 | 19.4±6 | 17.3±6.4 | 23.2, <0.001 | 1>3; 2>3 |

REM latency (min) | 85.2±53.7 | 86±56.4 | 102.6±69.3 | 8.81, <0.001 | 3>1,2 |

Arousal index – total | 17.3±8.6 | 19±8.9 | 36.4±16.4 | 302.3, <0.001 | 3>1,2 |

Arousal index – NREM only | 18.2±9.3 | 19.7±9.6 | 38±17.3 | 284.2, <0.001 | 3>1,2 |

Arousal index – REM only | 13.6±8.7 | 15.7±9.7 | 27.3±16.8 | 135.1, <0.001 | 3>1,2; 2>1 |

RDI | 23.6±14 | 36.3±13.5 | 71.9±17.2 | 1020.2, <0.001 | 3>1,2; 2>1 |

AHI | 1.8±1.4 | 8.6±2.5 | 47.4±16.4 | 2795.3, <0.001 | 3>1,2; 2>1 |

We examined the sleep architecture dynamics by focusing on the distribution of bout durations of WASO, REM, and NREM sleep. This was accomplished by generating frequency histograms for each stage, a common technique that allows visualization of data distributions, as well as fitting with functions such as exponential and power law models. These plots are obtained by collecting the relative number of events (y-axis) occurring in each “bin” (x-axis), defined here in single epoch increments of 30 seconds each. The r^{2} values and the residuals (obtained by subtracting the fitted line from the data) are shown in each case. Despite the high r^{2} values (0.93–0.99), visual inspection of the WASO, overall NREM sleep and NREM sub-stage plots shows that the best single exponential function is not adequate: it captures the rapid decay phase (consisting of brief events, which were more commonly observed) but entirely misses the longer duration events (which occur less frequently). Therefore, despite their common use, r^{2} values as a measure of goodness of fit should be used with caution for such distributions. We show the traditional four NREM sub-stages to determine possible differences in dynamics that could explain the multi-exponential pattern seen in the global NREM metric, although clinically it has been determined that stages 3 and 4 should be considered a single state (called N3). Therefore, for the analysis shown in

The relative frequency of bouts in the control group is plotted against the duration of bouts (in bins of 30-second increments on the x-axis) for WASO (A), NREM (B), REM sleep (C) and sub-stages of NREM sleep (D) bouts. In each panel, the best fit single-exponential function (red) is overlaid, and the residuals (difference between data and fit) are plotted beneath each histogram.

The residuals illustrate why r^{2} values can appear statistically favorable (close to 1.0) despite the fit being visually sub-optimal. Although it appears that the mono-exponential curve matches the fast decay but ignores the slow decay, in fact the residuals demonstrate the opposite: systematic deviations between the fitted line and actual data were most prominent during the fast decay (short duration bins). The absolute y-axis magnitude of the long duration “tail” is small, and thus the residual between the fit and the data is small for the majority of bins, even though the fit completely misses this portion of the bout distribution. Therefore, r^{2} values remain high, and overall residuals low, despite the sub-optimal single exponential fit. Assessing the normality of the residuals does not clarify this fitting issue, as the distribution of residuals fails normality testing even for a single-exponential fit of 100,000 simulated bout durations drawn from a single exponential distribution (not shown).
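The behaviour described above can be reproduced with a minimal numerical sketch. It is an illustration only, not the fitting routine used in the study: it builds a noise-free three-exponential curve from the control NREM time constants and % contributions reported in the fit table, fits a single exponential by a simple grid search, and reports r^{2}. The bin range (1 to 120 epochs), the grid of candidate time constants, and the function name `fit_single_exponential` are assumptions made for the example.

```python
import numpy as np

# Noise-free three-exponential curve from the tabulated control NREM parameters
taus = np.array([1.7, 7.8, 44.1])          # time constants, in 30-second epochs
weights = np.array([0.773, 0.182, 0.044])  # relative amplitudes (% contributions / 100)
x = np.arange(1, 121)                      # bin index in epochs (assumed range)
y = (weights[:, None] * np.exp(-x / taus[:, None])).sum(axis=0)
y /= y.max()                               # normalize to the tallest (shortest-duration) bin

def fit_single_exponential(x, y, tau_grid):
    """Least-squares fit of A*exp(-x/tau): grid over tau, closed-form A at each tau."""
    best = None
    for tau in tau_grid:
        basis = np.exp(-x / tau)
        amp = basis @ y / (basis @ basis)  # optimal amplitude for this tau
        sse = float(np.sum((y - amp * basis) ** 2))
        if best is None or sse < best[0]:
            best = (sse, tau, amp)
    return best

sse, tau, amp = fit_single_exponential(x, y, np.linspace(0.5, 50, 2000))
r2 = 1 - sse / float(np.sum((y - y.mean()) ** 2))
fit = amp * np.exp(-x / tau)

print(f"best single tau = {tau:.2f} epochs, r^2 = {r2:.3f}")
print(f"fit/data ratio at 60 epochs = {fit[59] / y[59]:.2e}")
```

In this sketch r^{2} stays above 0.9 even though the fitted curve falls far below the data at long durations, which is the pattern the residual analysis is meant to expose.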

Next we compared the single-exponential fits against more complex fits that included the sum of up to 4 exponential functions. For each stage's bout distribution, the fits were compared pair-wise (1-vs-2, 2-vs-3, and 3-vs-4) using a sum-of-squares F-test, as well as Akaike's Information Criteria (AIC) (see

Parameter | Control | Mild OSA | Severe OSA |

WASO

Tau-Fast | 0.60 (0.59–0.61) | 0.53 (0.53–0.54) | 0.53 (0.52–0.53) |

% Fast | 94.5% | 93.2% | 97.9% |

Tau-Medium | 3.1 (2.9–3.2) | 2.2 (2.1–2.2) | 3.7 (3.6–3.8) |

% Medium | 5.4% | 6.5% | 2.1% |

Tau-Slow | 16.1 (14.1–18.7) | 14.6 (13.8–15.5) | n/a |

% Slow | 0.3% | 0.3% | n/a |

NREM

Tau-Fast | 1.7 (1.6–1.8) | 0.9 (0.8–0.9) | 1.0 (1.0–1.1) |

% Fast | 77.3% | 75.3% | 82.8% |

Tau-Medium | 7.8 (6.9–9.0) | 5.2 (4.9–5.5) | 4.8 (4.6–5.1) |

% Medium | 18.2% | 21.6% | 15.2% |

Tau-Slow | 44.1 (40.0–49.1) | 37.6 (35.2–40.3) | 32.8 (30.6–35.4) |

% Slow | 4.4% | 3.1% | 2.0% |

REM

Tau-Fast | 3.8 (2.8–5.8) | 2.6 (2.2–3.0) | 1.9 (1.8–2.0) |

% Fast | 40.5% | 57.5% | 81.9% |

Tau-Slow | 19.0 (17.4–20.9) | 16.8 (15.8–18.0) | 16.3 (15.3–17.6) |

% Slow | 59.5% | 42.5% | 18.1% |

Tau-Fast | 1.3 (1.1–1.4) |

% Fast | 93.8% |

Tau-Slow | 3.6 (2.1–11.5) |

% Slow | 6.2% |

Tau-Fast | 1.1 (1.1–1.2) |

% Fast | 90.7% |

Tau-Slow | 10.8 (10.3–11.4) |

% Slow | 9.3% |

Tau-Fast | 0.69 (0.67–0.71) |

% Fast | 94.0% |

Tau-Slow | 4.3 (4.0–4.6) |

% Slow | 6.0% |

Tau | 0.79 (0.74–0.84) |
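The pair-wise comparison of nested exponential fits (sum-of-squares F-test and AIC) can be sketched as follows. This is a hedged illustration under several assumptions: the data are synthetic (a two-exponential curve built from the tabulated control REM parameters plus small Gaussian noise), the fit uses a coarse grid search over time constants with a linear solve for the amplitudes rather than the Prism routine used in the study, the constant term C is omitted, amplitudes are not constrained to be positive, and the AIC is the plain least-squares form (whether a small-sample correction was applied in the study is not stated). All function names are hypothetical.

```python
import itertools
import numpy as np

def fit_sum_of_exponentials(x, y, n_terms, tau_grid):
    """Least-squares fit of sum_i A_i*exp(-x/tau_i): grid over taus, linear solve for A_i."""
    best_sse, best_taus, best_amps = np.inf, None, None
    for taus in itertools.combinations(tau_grid, n_terms):
        basis = np.exp(-x[:, None] / np.array(taus)[None, :])
        amps, *_ = np.linalg.lstsq(basis, y, rcond=None)
        sse = float(np.sum((y - basis @ amps) ** 2))
        if sse < best_sse:
            best_sse, best_taus, best_amps = sse, taus, amps
    return best_sse, best_taus, best_amps

def f_statistic(sse_simple, sse_complex, n, p_simple, p_complex):
    """Extra-sum-of-squares F statistic for two nested models."""
    numerator = (sse_simple - sse_complex) / (p_complex - p_simple)
    return numerator / (sse_complex / (n - p_complex))

def aic(sse, n, n_params):
    """Least-squares AIC; the error variance counts as one extra parameter."""
    return n * np.log(sse / n) + 2 * (n_params + 1)

# Synthetic histogram: two exponentials (control REM parameters) plus small noise
rng = np.random.default_rng(0)
x = np.arange(1, 101, dtype=float)
y = 0.405 * np.exp(-x / 3.8) + 0.595 * np.exp(-x / 19.0) + rng.normal(0, 0.005, x.size)
tau_grid = np.geomspace(0.5, 60, 30)       # assumed candidate time constants (epochs)

n = x.size
results = {k: fit_sum_of_exponentials(x, y, k, tau_grid) for k in (1, 2, 3)}
aics = {k: aic(results[k][0], n, 2 * k) for k in (1, 2, 3)}   # 2 parameters per term
for k in (1, 2):
    F = f_statistic(results[k][0], results[k + 1][0], n, 2 * k, 2 * (k + 1))
    print(f"{k}-vs-{k + 1} exponentials: F = {F:.1f}, "
          f"AIC {aics[k]:.1f} -> {aics[k + 1]:.1f}")
```

A p-value for each F statistic would come from the F distribution with (p_complex − p_simple, n − p_complex) degrees of freedom; that step is left out here to keep the sketch dependent on NumPy only.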

We also fit randomly selected subsets of the control group to address the following questions: 1) Does the appearance of multiple exponential functions imply heterogeneity within the control group; and, 2) how different are the resulting fit parameters when the sample size is decreased by more than a factor of 10? The frequency histograms of four randomly selected groups of 30 control individuals are shown in

The relative frequency of bouts from four groups of n = 30 randomly chosen individuals selected from the control dataset. Each row represents a different group. The relative frequency of bouts is plotted against the duration of bouts (30-second epoch bins) for WASO (column A), NREM (column B) and REM sleep (column C) bouts. The best single exponential fit is overlaid in red.

When only a single night of stage data was considered, the frequency histograms were clearly under-sampled: fitting showed variability between individual patients (

The relative frequency histograms of WASO (A1) NREM (A2) and REM sleep (A3) bout durations are shown for a single, randomly selected patient from the control group for comparison with histograms from larger samples (

The optimal number of exponential functions was 3 for NREM, and 2 for REM bout durations, regardless of apnea presence or severity. The optimal exponential fits are shown for NREM and REM bouts across all three cohorts in

Frequency histograms are shown for WASO (A), NREM (B), and REM sleep (C) bouts. Control distributions (black) are compared with those of mild OSA (green) and severe OSA (red). To illustrate visually the goodness of fit, the NREM (row D) and REM (row E) sleep histograms are shown separately, along with the time constants (tau) and % contribution of each exponential function. For NREM sleep, the optimal number of exponentials was three, while for REM sleep, the optimal number was two, regardless of OSA severity. Note the improved residual value patterns, compared to those of the mono-exponential fits from

Given the common practice of fitting single exponential functions to sleep-wake bout duration data, we calculated how the weighted decay time constants (weighted according to their Y_{o} values) from our multi-exponential fits would compare to this technique of fixed single exponential fitting. The weighted time constants of the REM sleep multi-exponential decays were 12.8, 8.6, and 4.5 epochs for control, mild OSA, and severe OSA, respectively (single exponential fits: 14.6, 10.8, 4.8 epochs, respectively). The weighted NREM sleep multi-exponential decays were 4.7, 2.9 and 2.4 epochs, respectively (single exponential fits: 4.2, 3.5, and 2.2 epochs). These weighted decay time constants are similar to the values obtained for the best single exponential fit to the data. OSA clearly affects the best fit mono-exponential time constant (whether by forcing a mono-exponential fit or by calculating a weighted average of a multi-exponential fit). Although mono-exponential fitting can distinguish sleep architecture in the control cohort versus mild OSA and severe OSA, our intention here is not to suggest exponential fitting as a diagnostic tool for detecting OSA, but rather to illustrate the complex dynamics underlying sleep fragmentation, using OSA as a prime example of such pathological architecture.
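The weighted time constants quoted above can be checked directly from the tabulated fit parameters. The sketch below takes the % contributions as weights and normalizes by their sum; the inputs are the tabulated time constants and % contributions for REM (two components) and NREM (three components with a slow component in all three groups). The dictionary and function names are hypothetical.

```python
# (percent contribution, tau in epochs) for each exponential component, from the fit table
rem_fits = {
    "control": [(40.5, 3.8), (59.5, 19.0)],
    "mild OSA": [(57.5, 2.6), (42.5, 16.8)],
    "severe OSA": [(81.9, 1.9), (18.1, 16.3)],
}
nrem_fits = {
    "control": [(77.3, 1.7), (18.2, 7.8), (4.4, 44.1)],
    "mild OSA": [(75.3, 0.9), (21.6, 5.2), (3.1, 37.6)],
    "severe OSA": [(82.8, 1.0), (15.2, 4.8), (2.0, 32.8)],
}

def weighted_tau(components):
    """Average of the time constants, weighted by each component's % contribution."""
    total = sum(pct for pct, _ in components)
    return sum(pct * tau for pct, tau in components) / total

rem_weighted = {group: weighted_tau(c) for group, c in rem_fits.items()}
nrem_weighted = {group: weighted_tau(c) for group, c in nrem_fits.items()}
for group in rem_fits:
    print(f"{group}: REM {rem_weighted[group]:.2f}, NREM {nrem_weighted[group]:.2f} epochs")
```

With these inputs the REM values and the control NREM value agree with the reported figures to within rounding. The mild and severe OSA NREM values come out slightly different (about 3.0 and 2.2 versus the reported 2.9 and 2.4), possibly because the tabulated parameters are themselves rounded.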

Several groups have reported that wake bout durations are best described by a power law

The WASO frequency histogram from the control group is shown in log-log display (A), with the fitted power law overlaid in red. A 30-patient subset of WASO is shown in panel B for comparison. Various size samples drawn from three simulated exponential distributions (with time constants of 1, 5, and 25 epochs, chosen to produce relative contributions in exponential fitting of ∼95% fast, 4% intermediate, and ∼1% slow) are shown in log-log plot (C) and linear plots (D) for comparison of exponential and power law fitting.

Parameters were chosen to imitate the actual time constants and relative proportions seen in the fitting of the WASO distributions in the control population. This simulation procedure was repeated using three sample sizes that differed by a factor of 10, the largest of which was similar to the total pooled sample size for WASO in the control group. This dataset visually resembled a power law distribution, appearing linear on the log-log plot shown in
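A simulation of this kind can be sketched as below. It is not the code used in the study. It assumes that the stated 95%/4%/1% figures refer to the relative amplitudes of the fitted components, so each component's share of bouts is taken as proportional to amplitude × time constant; rounding durations up to whole epochs changes those amplitudes slightly. The sample sizes, the random seed, and the function names are also assumptions. How closely the points follow a straight line on log-log axes depends on the sample size and on which bins are included.

```python
import numpy as np

taus = np.array([1.0, 5.0, 25.0])      # time constants in epochs (from the text)
amps = np.array([0.95, 0.04, 0.01])    # target relative amplitudes of the components
# A component's share of bouts is proportional to amplitude * tau (area under its curve)
shares = amps * taus / np.sum(amps * taus)

def simulate_bouts(n, rng):
    """Draw n bout durations (whole epochs, minimum 1) from the three-exponential mixture."""
    component = rng.choice(len(taus), size=n, p=shares)
    return np.ceil(rng.exponential(taus[component])).astype(int)

def loglog_line(durations):
    """Straight-line fit of log10(relative frequency) vs log10(duration), non-empty bins only."""
    counts = np.bincount(durations)[1:]            # drop index 0; durations start at 1
    bins = np.arange(1, counts.size + 1)
    keep = counts > 0
    lx = np.log10(bins[keep])
    ly = np.log10(counts[keep] / counts.sum())
    slope, intercept = np.polyfit(lx, ly, 1)
    resid = ly - (slope * lx + intercept)
    r2 = 1 - np.sum(resid ** 2) / np.sum((ly - ly.mean()) ** 2)
    return slope, r2

rng = np.random.default_rng(1)
for n in (1_000, 10_000, 100_000):     # assumed sample sizes, each 10x the previous
    slope, r2 = loglog_line(simulate_bouts(n, rng))
    print(f"n = {n:>7}: log-log slope = {slope:.2f}, r^2 = {r2:.3f}")
```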

This study complements and extends previous work on the sleep-wake dynamics in several respects. First, sleep-wake state transition probabilities are more complex than previously recognized. The temporal stability of NREM and REM sleep clearly requires more than a single-exponential function to describe the bout distributions

Although it is common for sleep stages to be presented as the average duration of time spent in wake, REM, or NREM sleep stages, metrics such as mean and median may not be informative if the distributions are not Gaussian, particularly if they are highly nonlinear such as exponential or power law distributions. From a “biomarker” standpoint, the pattern and timing of stage transitions may provide more clinical insight than summary stage metrics into fundamental questions about what it means to have “refreshing” sleep, although this speculation remains to be tested. REM sleep and SWS have been implicated in different types of learning and memory

Whether different types of fragmentation occur in different pathological states, or with different clinical symptoms, remains to be explored. Although most of the published bout duration analysis has focused on the presence or absence of OSA, recent data suggest that sleep stage stability may be associated with daytime symptoms in populations with syndromes of fatigue or pain

Our results also emphasize the requirement for sampling far more than one night of sleep to adequately quantify bout duration distributions. Cost and inconvenience prohibit more than one or two nights of sleep in the laboratory setting for individual patients. Whether improvements in home monitoring can offer an alternative, which would allow longitudinal assessments of sleep architecture for individual patients, remains to be explored. Although the within-subject variability is likely less than between-subject variability, the small number of transitions per night suggests the importance of extended monitoring, likely in the home setting.
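The sampling problem can be made concrete with a small calculation. The sketch below uses the tabulated control NREM fit parameters to compute the probability that a bout falls in each duration bin, then counts how many bins are expected to contain at least one bout for a single night versus a pooled group. The figure of 30 NREM bouts per night is a hypothetical value chosen only for illustration, as is the 300-epoch bin range; the group size of 374 is the control group size quoted in the text.

```python
import numpy as np

taus = np.array([1.7, 7.8, 44.1])          # control NREM time constants (epochs)
amps = np.array([0.773, 0.182, 0.044])     # relative amplitudes from the same fit
k = np.arange(1, 301)                      # bout duration bins in epochs (assumed range)
p = (amps[:, None] * np.exp(-k / taus[:, None])).sum(axis=0)
p /= p.sum()                               # probability that a bout falls in each bin

bouts_per_night = 30                       # hypothetical count for one subject-night
n_subjects = 374                           # control group size quoted in the text

populated = {}
for label, n_bouts in [("single night", bouts_per_night),
                       ("pooled group", bouts_per_night * n_subjects)]:
    expected = n_bouts * p                 # expected number of bouts per bin
    populated[label] = int(np.sum(expected >= 1))
    print(f"{label}: {n_bouts} bouts, "
          f"{populated[label]} bins expect at least one bout")
```

Under these assumptions only the first handful of bins are reliably populated in a single night, so the slow component of the distribution is essentially unsampled until many nights (or many subjects) are pooled.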

Statistical analysis of sleep stage percentages typically assumes a Gaussian distribution, but some studies report mono-exponential distribution of sleep bout durations

Our results identify an important statistical limitation in the commonly employed r^{2} value, which reports excellent (>0.9) values despite largely missing the long tails of the distributions. Moreover, analysis of residuals between the fitted curve and the actual data, often used to test the goodness of fit, also under-weights the poor fitting evident in the long tail events. This occurs because of the relatively low probability of long events, yielding small residual values even for forced mono-exponentials that miss long events (see

Finally, although there is ongoing interest in fitting wake bouts to a power law distribution, the distinction between a power law and a multi-exponential distribution is not always straightforward. Indeed, simulations showed that a known multi-exponential process can visually and statistically resemble a power law. This has mathematical implications for sleep transition modeling. For example, if all sleep-wake bout durations are considered to be exponential or multi-exponential, then all transitions of the hypnogram may be simulated using a Markov chain model. Although there are several limitations, the appeal of Markov models is that stage transitions are considered probabilistic, and certain transitions may be stabilized or destabilized by different neural circuits or neuromodulators
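The Markov chain idea can be sketched in a few lines. The transition probabilities below are hypothetical and are not fitted to SHHS data. In a chain with one state per stage, as here, the dwell time in each stage is geometric (the discrete analogue of a mono-exponential), with mean 1/(1 − p_stay); reproducing multi-exponential bout durations would require more than one hidden sub-state per stage, which this sketch omits.

```python
import numpy as np

states = ["WASO", "NREM", "REM"]
# Hypothetical per-epoch transition probabilities (each row sums to 1)
P = np.array([
    [0.60, 0.38, 0.02],   # from WASO
    [0.04, 0.94, 0.02],   # from NREM
    [0.03, 0.04, 0.93],   # from REM
])
cum = np.cumsum(P, axis=1)
cum[:, -1] = 1.0          # guard against floating-point round-off in the last column

def simulate_hypnogram(n_epochs, rng, start=1):
    """Simulate a first-order Markov chain over the three stages, one step per epoch."""
    u = rng.random(n_epochs)
    seq = np.empty(n_epochs, dtype=int)
    seq[0] = start
    for t in range(1, n_epochs):
        seq[t] = np.searchsorted(cum[seq[t - 1]], u[t], side="right")
    return seq

def mean_bout_length(seq, state):
    """Mean run length (epochs) of one state, excluding the final unfinished run."""
    starts = np.concatenate(([0], np.flatnonzero(np.diff(seq)) + 1))
    lengths = np.diff(np.concatenate((starts, [seq.size])))
    mask = seq[starts] == state
    mask[-1] = False      # the last run is cut off by the end of the simulation
    return lengths[mask].mean()

rng = np.random.default_rng(0)
seq = simulate_hypnogram(50_000, rng)
for s, name in enumerate(states):
    print(f"{name}: simulated mean bout = {mean_bout_length(seq, s):.2f} epochs, "
          f"expected 1/(1 - p_stay) = {1 / (1 - P[s, s]):.2f}")
```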

The transition between sleep and wake (and between REM and NREM sleep) has been compared to a “switch” consisting of reciprocal inhibition between neurons whose firing favors one or another state

From the standpoint of future sleep architecture “fingerprinting”, there is potential for use of Markov chain models

Clinical correlations between daytime complaints and polysomnographic metrics of disease severity are not always straightforward, due in part to inter-subject variability, largely subjective complaints, variable intrinsic tolerance to sleep disruption, and the short duration and non-natural setting of routine clinical monitoring. Even OSA-mediated fragmentation can be missed in routine clinical metrics (such as stage percentages). The key concept is that seemingly complex and variable manifestations of “fragmentation” may in fact possess objective and identifiable underlying statistical structure, which may offer opportunity for improved correlation with clinically relevant endpoints.

Three groups of patients were selected from the Sleep Heart Health Study, a large database of home-based polysomnography (PSG) in over 6000 patients. Group characteristics were compared using χ^{2} tests or ANOVA, as appropriate. Because of the large difference in the male:female proportion across groups, we re-analyzed the exponential fitting of the control cohort by sex (Supplemental Material,

All bouts of a given stage from each subject group were combined for statistical analysis. Frequency histograms of bout durations were generated with Prism (GraphPad Software, La Jolla, CA, USA). Bin width was one epoch. The relative frequency of bouts in each bin was calculated, and the resulting histograms were normalized to the maximal relative frequency (which was always in the shortest bin) before fitting routines were undertaken. All stage distributions, in each clinical group, failed three tests of normality (D'Agostino-Pearson, Shapiro-Wilk, and Kolmogorov-Smirnov). Each curve was fitted first with a standard exponential decay function: Y = Y_{o} * e^{−kX} + C, where Y_{o} is the amplitude of the decay, k is the rate constant (the reciprocal of the time constant, tau), and C is a constant. Multi-exponential fits used the sum Σ[Y_{oi} * e^{−k_{i}X}], with the relative contribution of each component given by its Y_{oi} value.
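The histogram construction described above can be sketched as follows. This is an illustration, not the Prism workflow used in the study: it run-length encodes a per-epoch stage sequence into bouts, pools the bouts across subjects, and normalizes the one-epoch-bin histogram to its tallest bin. The stage codes, the two toy hypnogram strings, and the function names are hypothetical, and the sketch does not handle details such as excluding wake before sleep onset.

```python
from collections import Counter
from itertools import groupby

def bout_durations(hypnogram):
    """Run-length encode a per-epoch stage sequence into (stage, n_epochs) bouts."""
    return [(stage, sum(1 for _ in run)) for stage, run in groupby(hypnogram)]

def normalized_histogram(durations):
    """Relative frequency per one-epoch bin, scaled so the tallest bin equals 1."""
    counts = Counter(durations)
    total = sum(counts.values())
    relative = {d: c / total for d, c in counts.items()}
    peak = max(relative.values())
    return {d: relative[d] / peak for d in sorted(relative)}

# Toy hypnograms: W = WASO, N = NREM, R = REM; one character per 30-second epoch
subjects = ["NNNWNNNNRRNW", "NWNNRRRNNNNW"]
pooled = [bout for h in subjects for bout in bout_durations(h)]
nrem = [n for stage, n in pooled if stage == "N"]
hist = normalized_histogram(nrem)

print("NREM bout durations (epochs):", nrem)
print("normalized histogram:", hist)
```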

Goodness of fit was compared between the best single-exponential function and the sum of ^{B}, where

The authors thank Dan Chuang and Mark Kramer for valuable programming assistance, and Dr Elizabeth Klerman, Dr Andrew Phillips, and Scott McKinney for valuable comments.