The authors have declared that no competing interests exist.

Several studies have shown that total depressive symptom scores in the general population approximate an exponential pattern, except for the lower end of the distribution. The Center for Epidemiologic Studies Depression Scale (CES-D) consists of 20 items, each of which may take on four scores: “rarely,” “some,” “occasionally,” and “most of the time.” Recently, we reported that the item responses for 16 negative affect items commonly exhibit exponential patterns, except for the level of “rarely,” leading us to hypothesize that the item responses at the level of “rarely” may be related to the non-exponential pattern typical of the lower end of the distribution. To verify this hypothesis, we investigated how the item responses contribute to the distribution of the sum of the item scores.

Data collected from 21,040 subjects who had completed the CES-D questionnaire as part of a Japanese national survey were analyzed. To assess the item responses of negative affect items, we used a parameter

The sum of the item scores approximated an exponential pattern regardless of the combination of items, whereas, at the lower end of the distributions, there was a clear divergence between the actual data and the predicted exponential pattern. At the lower end of the distributions, the sum of the item scores with high values of

The distributional pattern of the sum of the item scores could be predicted from the item responses of such items.

Depression is a common mental health disorder, with an estimated 350 million people of all ages affected around the globe [

Several recent studies based on large sample sizes have shown that total depressive symptom scores in the general population follow an exponential pattern, except for the lower end of the distribution. In a data analysis on nearly 10,000 respondents to the British National Household Psychiatric Morbidity Survey, Melzer

Although several recent studies based on large sample sizes have shown that total depressive symptom scores in the general population follow an exponential pattern, Melzer

The CES-D allows an individual to self-rate the frequency of a variety of depressive symptoms (sadness, fatigue, etc.) on a scale consisting of four possible responses: “rarely (less than 1 day),” “some (1 to 2 days),” “occasionally (3 to 4 days),” and “most of the time (5 to 7 days)” (Radloff, 1977). Recently, we have shown that responses to each of the 16 individual items related to negative affect symptoms on the CES-D tend to exhibit exponential patterns for responses from “some” to “most” in the general population, while this pattern is not observed for “rarely” responses [

In the present study, we investigated the distribution of the sum of depressive symptom item scores in various combinations, using data from a large, cross-sectional national survey of the Japanese general population [

The goal of the present study was to determine whether the item responses in the range from “rarely” to “some” contribute to the non-exponential pattern of total scores at the lower end of the distribution and to examine whether the sum of negative item scores approximates an exponential pattern, except for the lower end of the distribution.

The present study used data from the Active Survey of Health and Welfare (ASHW) conducted by the Japanese Ministry of Health, Labor, and Welfare in 2000 [

The questionnaire was returned by 32,729 respondents. The response rate was not published by the Ministry of Health, Labor, and Welfare. However, the response rates for similar surveys conducted 3 and 4 years before were 87.1% and 89.6%, respectively [

The Japanese Ministry of Health, Labor, and Welfare examined our research program and allowed us to perform a secondary analysis on the anonymized data from the ASHW, in compliance with the Japanese Statistics Act. The present study was approved in 2014 by the ethics committee of the Panasonic Health Center (approval number 2014–1). The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

We excluded 1,394 respondents owing to the suspect validity of their responses (i.e., those who answered “rarely” or “most” for all items, regardless of the nature of the item). A total of 9,588 respondents were also excluded from the sample owing to missing information on one or more key study variables (i.e., depressive symptoms, age, sex). The final sample consisted of 21,040 respondents between 12 and 98 years of age (ages 12–19: n = 2,457 [male: n = 1,269]; ages 20–29: n = 3,748 [male: n = 1,788]; ages 30–39: n = 3,761 [male: n = 1,783]; ages 40–49: n = 3,629 [male: n = 1,788]; ages 50–59: n = 3,569 [male: n = 1,800]; ages 60–69: n = 2,253 [male: n = 1,155]; ages 70–79: n = 1,161 [male: n = 517]; ages 80–89: n = 412 [male: n = 108]; ages 90–98: n = 50 [male: n = 15]).
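The two exclusion rules described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function names, the coding of responses as 0 (“rarely”) to 3 (“most”), and the use of `None` for a missing value are all assumptions made for the example.

```python
# Hypothetical coding (an assumption, not from the survey files):
# 0 = "rarely", 1 = "some", 2 = "occasionally", 3 = "most"; None = missing.

def is_suspect(responses):
    """True when every item received the same extreme raw response.

    Because the CES-D mixes negative and positive affect items, answering
    "rarely" (0) or "most" (3) to all items is treated as suspect.
    """
    values = set(responses)
    return values == {0} or values == {3}


def has_missing(responses, age, sex):
    """True when any key study variable is missing."""
    return age is None or sex is None or any(v is None for v in responses)


def keep_respondent(responses, age, sex):
    """Apply both exclusion rules to one respondent."""
    if has_missing(responses, age, sex):
        return False
    return not is_suspect(responses)
```

Checking for missing values before the validity rule avoids comparing `None` against the response codes.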

Depressive symptoms were assessed using the Japanese version of the CES-D [

In our previous work, we showed that the 16 negative items related to depressive mood, somatic symptoms, and interpersonal relations follow a common mathematical model, while the four items related to positive affect do not, suggesting that the items/symptoms associated with positive affect are not manifest variables of the unidimensional latent trait [

To assess the item response in the range from “rarely” to “some,” the parameter

The distributions of the sum of negative affect items in various combinations were analyzed using log-normal scales. The fitting curve for an exponential model was estimated using the least squares method. The distributional patterns of the sum of 8 negative items, 4 negative items, 2 negative items, and 16 negative items were compared among the different combinations. JMP Version 11 for Windows (SAS Institute, Inc., Cary, NC, USA) was used to calculate the descriptive statistics and the frequency distributions.
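The analysis itself was run in JMP. The following is only a minimal Python sketch of one common way to fit an exponential curve by least squares, namely a straight-line fit to the logarithm of the frequencies; it is not necessarily the exact procedure used in the study. The function name `fit_exponential` and the synthetic counts are illustrative assumptions, and the coefficient of determination here is computed on the log scale.

```python
import numpy as np


def fit_exponential(scores, counts):
    """Fit counts ~ A * exp(b * score) by least squares on log(counts).

    Returns (A, b, r_squared), where r_squared is computed on the log scale.
    Scores with a zero count are dropped because log(0) is undefined.
    """
    scores = np.asarray(scores, dtype=float)
    counts = np.asarray(counts, dtype=float)
    keep = counts > 0
    x = scores[keep]
    log_y = np.log(counts[keep])

    # A first-degree polynomial fit returns [slope, intercept].
    slope, intercept = np.polyfit(x, log_y, 1)

    residuals = log_y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((log_y - log_y.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return float(np.exp(intercept)), float(slope), float(r_squared)


# Synthetic example (not survey data): an exact exponential over 0-48 points.
example_scores = np.arange(0, 49)
example_counts = 3628.0 * np.exp(-0.14 * example_scores)
A, b, r2 = fit_exponential(example_scores, example_counts)
```

Restricting `scores` to a sub-range, such as 1–24 points, corresponds to fitting only the part of the distribution that appears linear on the log scale.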

The item responses for the 16 negative affect items and the calculated parameter

| No. | Item | Rarely, n (%) | Some, n (%) | Occasionally, n (%) | Most, n (%) | Parameter r | Rate of “some” to “occasionally” | Rank order of r |
|---|---|---|---|---|---|---|---|---|
| | | 10824 (51.4) | 7492 (35.6) | 2118 (10.1) | 606 (2.9) | 1.45 | 3.54 | 15 |
| | | 14974 (71.2) | 4322 (20.5) | 1412 (6.7) | 332 (1.6) | 3.47 | 3.06 | 8 |
| | | 15063 (71.6) | 4129 (19.6) | 1256 (6.0) | 592 (2.8) | 3.65 | 3.29 | 6 |
| | | 10869 (51.7) | 6821 (32.4) | 2522 (12.0) | 828 (3.9) | 1.59 | 2.71 | 14 |
| | | 11384 (54.1) | 6216 (29.5) | 2265 (10.8) | 1175 (5.6) | 1.83 | 2.74 | 12 |
| | | 9433 (44.8) | 7988 (38.0) | 2378 (11.3) | 1241 (5.9) | 1.18 | 3.36 | 16 |
| | | 11276 (53.6) | 6345 (30.2) | 2444 (11.6) | 975 (4.6) | 1.78 | 2.60 | 13 |
| | | 16907 (80.4) | 2892 (13.7) | 849 (4.0) | 392 (1.9) | 5.85 | 3.41 | 3 |
| | | 13234 (62.9) | 4988 (23.7) | 1920 (9.1) | 898 (4.3) | 2.65 | 2.60 | 11 |
| | | 13781 (65.5) | 4919 (23.4) | 1650 (7.8) | 690 (3.3) | 2.80 | 2.98 | 10 |
| | | 16276 (77.4) | 3110 (14.8) | 1076 (5.1) | 578 (2.7) | 5.23 | 2.89 | 5 |
| | | 17043 (81.0) | 2913 (13.8) | 748 (3.6) | 336 (1.6) | 5.85 | 3.89 | 2 |
| | | 19259 (91.5) | 1283 (6.1) | 351 (1.7) | 147 (0.7) | 15.01 | 3.66 | 1 |
| | | 15362 (73.0) | 4277 (20.3) | 982 (4.7) | 419 (2.0) | 3.59 | 4.36 | 7 |
| | | 17235 (81.9) | 2980 (14.2) | 567 (2.7) | 258 (1.2) | 5.78 | 5.26 | 4 |
| | | 14933 (71.0) | 4404 (20.9) | 1083 (5.1) | 620 (2.9) | 3.39 | 4.07 | 9 |
| | Average | 14241 (67.7) | 4692 (22.3) | 1476 (7.0) | 630 (3.0) | 4.07 | 3.40 | |
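The tabulated values of r agree, to within about 0.01, with the number of “rarely” responses divided by the number of “some” responses, and the tabulated rate agrees with “some” divided by “occasionally.” The sketch below reproduces both quantities under that assumption for three rows of the table; the function name `response_ratios` and the row labels are illustrative only.

```python
def response_ratios(rarely, some, occasionally, most):
    """Return (r, rate) for one item.

    Assumed definitions, inferred from the tabulated values:
      r    = count of "rarely" / count of "some"
      rate = count of "some"   / count of "occasionally"
    The "most" count is accepted for completeness but not used.
    """
    r = rarely / some
    rate = some / occasionally
    return r, rate


# Response counts copied from three rows of the table above.
rows = {
    "rank 1 (highest r)": (19259, 1283, 351, 147),
    "rank 15": (10824, 7492, 2118, 606),
    "rank 16 (lowest r)": (9433, 7988, 2378, 1241),
}

ratios = {label: response_ratios(*counts) for label, counts in rows.items()}

# Sorting by r in descending order recovers the rank order of these rows.
ranked = sorted(ratios, key=lambda label: ratios[label][0], reverse=True)
```

A high value of r means that most respondents chose “rarely” for that item, so the item contributes little to scores at the lower end of the distribution.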

As presented in

The item responses of the 16 negative affect items are presented on both normal (A) and log-normal (B) scales. (A) The item responses for the 16 negative affect items showed a common pattern that differed on either side of a boundary between “rarely” and “occasionally.” (B) The lines for the 16 items crossed each other between “rarely” and “occasionally,” whereas the same lines exhibited a right-skewed pattern between “occasionally” and “most.” On a log-normal scale, the item responses for the 16 items showed a linear pattern between “occasionally” and “most.”

According to the rank order of parameter

The distributions of the sum of 8 item scores for the three groups are shown in

(A) High

Using a log-normal scale, all three groups showed linear and parallel patterns from 0–8 points to 24 points, suggesting that the sum of 8 item scores for the three groups followed an exponential pattern, with similar rate parameter (

High

Fitted curves under an exponential model were calculated for the high r group from 1–24 points (y = 4241e^{-0.29x}, R^{2} = 0.99), the middle r group from 0–24 points (y = 6456e^{-0.26x}, R^{2} = 0.99), and the low r group from 8–24 points (y = 1575e^{-0.26x}, R^{2} = 0.99). Consistent with the log-normal scale findings, exponential curve fitting showed a markedly high coefficient of determination (R^{2} = 0.99) with similar rate parameters (-0.26 to -0.29).

To confirm the reproducibility of the findings observed for the sum of 8 items, we examined the distributions of the sum of 4 item scores. According to the parameter

The distributions of the sum of 4 item scores for the four groups are shown in

(A) High

Using a log-normal scale (

High

Fitted curves under an exponential model were calculated for the high r group from 1–12 points (y = 5063e^{-0.52x}, R^{2} = 0.99), the middle high r group from 0–12 points (y = 13037e^{-0.49x}, R^{2} = 0.99), the middle low r group from 0–12 points (y = 11341e^{-0.41x}, R^{2} = 0.99), and the low r group from 4–12 points (y = 4166.9e^{-0.45x}, R^{2} = 0.99). Exponential curve fitting showed a markedly high coefficient of determination in all four groups (R^{2} = 0.99) with similar rate parameters (-0.41 to -0.52).

Finally, we examined the distributions of the sum of 2 item scores. According to the parameter

The distributions of the sum of 2 item scores for the four groups are shown in

(A) High

Using a log-normal scale (

High

Fitted curves under an exponential model were calculated for the high r group from 1–6 points (y = 7263e^{-0.93x}, R^{2} = 0.99), the middle high r group from 1–6 points (y = 8594e^{-0.72x}, R^{2} = 0.99), the middle low r group from 1–6 points (y = 11041e^{-0.62x}, R^{2} = 0.99), and the low r group from 2–6 points (y = 8866e^{-0.69x}, R^{2} = 0.97). Consistent with the log-normal scale findings, exponential curve fitting showed a high coefficient of determination in all four groups (0.97–0.99). However, the rate parameters for the sum of 2 items (-0.62 to -0.93) varied more across groups than those for the sums of 4 items and 8 items.

Finally, we examined the distributions of the total scores of 16 items. The average of parameter

(A) The distribution of the total scores of 16 items is right-skewed. (B) Using a log-normal scale, the total scores of 16 items showed a linear pattern from 0 to 48 points.

A fitted curve under an exponential model was calculated for the total scores of 16 items (y = 3628e^{-0.14x}, R^{2} = 0.99). Consistent with the log-normal scale findings, exponential curve fitting showed a markedly high coefficient of determination (R^{2} = 0.99).
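As a worked illustration of why an exponential pattern appears as a straight line on a logarithmic scale, the fitted curve for the 16-item total can be rewritten by taking logarithms. The use of the natural logarithm is an assumption made for this example; any other base only rescales the slope.

```latex
\begin{aligned}
y &= 3628\,e^{-0.14x} \\
\ln y &= \ln 3628 - 0.14x \approx 8.20 - 0.14x \\
\frac{y(x+1)}{y(x)} &= e^{-0.14} \approx 0.87
\end{aligned}
```

Under this fit, each additional point on the total score corresponds to a frequency of roughly 87% of the frequency at the previous score, which is the constant-ratio property that produces a linear pattern on the log scale.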

The aim of the present study was to determine whether the item responses in the range from “rarely” to “some” contribute to the non-exponential pattern of total scores at the lower end of the distribution. The main findings of this study are as follows: (1) regardless of the choice of the items, the sum of negative item scores approximates an exponential pattern, except for the lower end of the distribution; (2) at the lower end of the distribution, the distributional pattern of the sum of the item scores varies depending on the parameter

Our findings indicate that the sum of negative item scores in various conditions approximates an exponential pattern, except for the lower end of the distribution. The reason why the sum of negative affect item scores approximates exponential patterns irrespective of their combination could be explained by a theory suggesting that negative affect items follow a unidimensional latent trait [

Although the results of our study support the hypothesis that the latent traits of negative affect items follow an exponential distribution, the mechanism responsible for the exponential distribution of the latent traits is not clear. In general, an exponential distribution is observed where individual variability and total stability are organized together [

Analyzing the data of the second British National Survey of Psychiatric Morbidity, Bebbington

The rate parameters of the exponential models for the sums of 2, 4, 8, and 16 negative affect items were -0.62 to -0.93, -0.41 to -0.52, -0.26 to -0.29, and -0.14, respectively. The estimated parameters were similar across groups with the same number of items, and they increased toward zero (that is, decreased in magnitude) as the number of summed items increased. These results suggest that the rate parameter of the exponential model of summed scores is associated with the number of items. Further mathematical explanation is necessary to elucidate the mechanism behind this variation in the rate parameter.
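One simple way to inspect this association is to multiply the magnitude of each reported rate parameter by the number of summed items. The sketch below does so using the midpoint of each reported range; the midpoint choice and the names `reported_rates` and `scaled_rate` are assumptions made for illustration, and the calculation is exploratory rather than part of the original analysis.

```python
# Reported ranges of rate parameters, keyed by the number of summed items.
reported_rates = {
    2: (-0.93, -0.62),
    4: (-0.52, -0.41),
    8: (-0.29, -0.26),
    16: (-0.14, -0.14),
}


def scaled_rate(n_items, rate_range):
    """Number of items times the magnitude of the midpoint rate.

    If the rate were exactly inversely proportional to the number of
    items, this product would be the same for every row.
    """
    midpoint = sum(rate_range) / 2
    return n_items * abs(midpoint)


products = {n: round(scaled_rate(n, rng), 2) for n, rng in reported_rates.items()}
```

For the reported values the product lies between about 1.5 and 2.2, so the rate magnitude shrinks roughly, but not exactly, in proportion to the number of items.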

Our findings indicate that the distributional patterns of the sum of negative affect items varied depending on the parameter

The conditions that enable such findings can be speculated on. Whereas the sum of negative affect items in any combination approximates exponential patterns, with the same parameter, the number of subjects that corresponds to the range of the exponential pattern is different depending on the combinations of the items. The combinations of negative affect items with high values of

Analyzing the data of the British National Household Psychiatric Morbidity Survey, Melzer

The present study has some limitations. First, although we evaluated whether the sum of the item scores approximates an exponential distribution on a log-normal scale, we did not perform an analysis based on other mathematical models. In general, the most important part of model evaluation is testing whether the model fits empirical data better than other models. However, to the best of our knowledge, no other mathematical models for the sum of item scores have been reported so far. Thus, using graphical analysis and curve fitting, we performed the analysis limited to an exponential model. Future studies can evaluate the comparative fit of other models to our empirical data as reported in

Conversely, there is a methodological advantage in the present investigation. The sample was representative of the Japanese general population, which reduced selection bias. In addition, the large sample size (

This file includes raw data for distributions of the total depressive symptom scores in Figs

(XLSX)

The authors would like to thank the Active Survey of Health and Welfare project for providing the data for this study.