JACS, AB and JS are authors of a study included in this review, but were not involved in the eligibility assessment or data extraction of these studies. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials. All other authors declare no competing interests.
Conceived and designed the experiments: MJP JPTH JS. Performed the experiments: MJP GC. Analyzed the data: MJP JPTH. Wrote the paper: MJP JPTH GC JACS AH JS.
To synthesise evidence on the average bias and heterogeneity associated with reported methodological features of randomized trials.
Systematic review of meta-epidemiological studies.
We retrieved eligible studies included in a recent AHRQ-EPC review on this topic (latest search September 2012), and searched Ovid MEDLINE and Ovid EMBASE for studies indexed from Jan 2012-May 2015. Data were extracted by one author and verified by another. We combined estimates of average bias (e.g. ratio of odds ratios (ROR) or difference in standardised mean differences (dSMD)) in meta-analyses using the random-effects model. Analyses were stratified by type of outcome (“mortality” versus “other objective” versus “subjective”). Direction of effect was standardised so that ROR < 1 and dSMD < 0 denote a larger intervention effect estimate in trials with an inadequate or unclear (versus adequate) characteristic.
We included 24 studies. The available evidence suggests that intervention effect estimates may be exaggerated in trials with inadequate/unclear (versus adequate) sequence generation (ROR 0.93, 95% CI 0.86 to 0.99; 7 studies) and allocation concealment (ROR 0.90, 95% CI 0.84 to 0.97; 7 studies). For these characteristics, the average bias appeared to be larger in trials of subjective outcomes compared with other objective outcomes. Also, intervention effects for subjective outcomes appear to be exaggerated in trials with lack of/unclear blinding of participants (versus blinding) (dSMD -0.37, 95% CI -0.77 to 0.04; 2 studies), lack of/unclear blinding of outcome assessors (ROR 0.64, 95% CI 0.43 to 0.96; 1 study) and lack of/unclear double blinding (ROR 0.77, 95% CI 0.61 to 0.93; 1 study). The influence of other characteristics (e.g. unblinded trial personnel, attrition) is unclear.
Certain characteristics of randomized trials may exaggerate intervention effect estimates. The average bias appears to be greatest in trials of subjective outcomes. More research on several characteristics, particularly attrition and selective reporting, is needed.
Randomized clinical trials (RCTs) are considered to produce the most credible estimates of the effects of interventions [
Empirical evidence can inform which methodological features of RCTs should be considered when appraising RCTs. Many studies have investigated the influence of reported study design characteristics on intervention effect estimates following the landmark study by Schulz et al. [
The aim of this systematic review was to synthesise the results of meta-epidemiological studies that have investigated the average bias and heterogeneity associated with reported methodological features of RCTs.
All methods were pre-specified in a study protocol, which is available in
We included meta-epidemiological studies investigating the association between reported methodological characteristics and intervention effect estimates in RCTs. We considered only meta-epidemiological studies adopting a matched design that ensured that comparisons between trials with different methodological features were only made within the same clinical scenario. Matching is most often done at the meta-analysis level, when a collection of meta-analyses is assembled and the individual trials within each meta-analysis are classified into those with or without a particular methodological characteristic (such as adequate versus inadequate allocation concealment) [
We excluded single systematic reviews and meta-analyses of RCTs that present a subgroup or sensitivity analysis based on a particular source of bias, since the influence of reported study characteristics on intervention effect estimates tends to be estimated imprecisely within individual meta-analyses. We also excluded studies that assembled a collection of RCTs (e.g. all child health related RCTs published in 2012), and used meta-regression to examine the relationship between a source of bias and trial effect estimates. Such studies do not control for the different interventions examined and outcomes measured across the trials, and so are at high risk of bias due to confounding. Finally, we excluded meta-epidemiological studies comparing randomized with non-randomized studies.
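As an illustration of the matched comparison described above, the following minimal sketch estimates a ratio of odds ratios within a single meta-analysis. It assumes that each trial supplies a log odds ratio and its standard error, that outcomes are coded so an odds ratio below 1 favours the intervention, and that trials are pooled within each subgroup by fixed-effect inverse-variance weighting. The function names and the pooling choice are illustrative assumptions, not the models used by the included studies.

```python
import math

def pool_fixed(log_ors, ses):
    """Inverse-variance fixed-effect pooled log odds ratio and its SE."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total
    return pooled, math.sqrt(1.0 / total)

def ratio_of_odds_ratios(trials, z=1.96):
    """trials: list of (log_or, se, adequate) tuples from ONE meta-analysis.

    Returns (ROR, lower, upper). ROR < 1 means a larger (more beneficial)
    effect estimate in trials with the inadequate/unclear characteristic.
    """
    inadequate = [(y, s) for y, s, adequate in trials if not adequate]
    adequate = [(y, s) for y, s, is_adequate in trials if is_adequate]
    if not inadequate or not adequate:
        # Uninformative: needs trials both with and without the characteristic.
        raise ValueError("meta-analysis is uninformative for this characteristic")
    y_inad, se_inad = pool_fixed(*zip(*inadequate))
    y_adeq, se_adeq = pool_fixed(*zip(*adequate))
    log_ror = y_inad - y_adeq
    se = math.sqrt(se_inad ** 2 + se_adeq ** 2)
    return (math.exp(log_ror),
            math.exp(log_ror - z * se),
            math.exp(log_ror + z * se))

# Hypothetical trials: (log OR, SE, adequately concealed?)
example = [(math.log(0.5), 0.2, False), (math.log(0.8), 0.2, True)]
print(ratio_of_odds_ratios(example))
```

Because the comparison is made inside one meta-analysis, differences in intervention and outcome between clinical scenarios cannot confound the estimate; the per-meta-analysis values would then be combined across the collection.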
We only included meta-epidemiological studies investigating methodological features that can lead to the biases under the conceptual framework that underlies the Cochrane risk of bias tool for RCTs (see
Letters A-E denote the sources of bias listed in
| Type of bias | Possible methodological features that can lead to bias |
|---|---|
| A. Bias arising from the randomisation process | Inadequate generation of a random sequence |
| | Inadequate allocation concealment |
| | Imbalance in baseline characteristics |
| | No adjustment for confounding in the analysis |
| B. Bias due to deviations from intended interventions | Non-blinded participants |
| | Non-blinded clinician/treatment provider |
| | Unbalanced delivery of additional interventions or co-interventions |
| | Participants switching interventions within the trial and being analysed in a group different from the one to which they were randomized |
| C. Bias due to missing/incomplete outcome | Missing/incomplete outcome data (dropouts, losses to follow-up, or post-randomisation exclusions) |
| D. Bias in measurement of outcomes | Non-blinded outcome assessor |
| | Non-blinded data analyst |
| | Use of faulty measurement instruments (with low validity and reliability) |
| E. Bias in selection of the reported result | Selective reporting of a subset of outcome domains, or of a subset of outcome measures or analyses for a particular outcome domain. |
Our primary interest was in the association between each methodological characteristic and:
the magnitude of the intervention effect estimate (average bias);
variation in average bias across meta-analyses (to determine whether average bias estimates are relatively similar or not across meta-analyses addressing different clinical questions); and
the extent of between-trial heterogeneity associated with each characteristic (to determine, for example, whether effect estimates from inadequately concealed trials are more likely to be heterogeneous than estimates from adequately concealed trials).
We were also interested in the above estimates stratified by type of outcome (e.g. “mortality” versus “other objective” versus “subjective”) and type of intervention (e.g. “pharmacological” versus “non-pharmacological”), however defined by the study authors. We could not include estimates stratified by type of comparator (e.g. placebo versus no treatment) since such estimates were not reported in any of the included studies. We included meta-epidemiological studies which presented at least one of the estimates of interest.
We retrieved all meta-epidemiological studies included in the AHRQ report, which searched for studies published up to September 2012 [
One reviewer (MJP) screened all titles and abstracts retrieved from the searches. Two reviewers (MJP and GC) independently screened all full text articles retrieved. Any disagreements regarding study eligibility were resolved via discussion
One reviewer (MJP) extracted all of the data using a form developed in Microsoft Excel. A second reviewer (GC) verified the accuracy of all average bias and heterogeneity effect estimates and confidence limits extracted. Data extraction items are presented in
The following data were extracted:
study characteristics, including the methodological characteristics investigated, how the characteristic was assessed (i.e. number of authors involved in assessment, inter-rater reliability of assessment), definitions of adequate/inadequate characteristics, number of included meta-analyses, number of RCTs included in the meta-analyses, sampling frame (e.g. “random sample of all Cochrane reviews with continuous outcomes that included at least 3 RCTs”), areas of health care addressed, and range of years of publication of the meta-analyses;
types of outcomes, interventions and comparators examined in the meta-analyses (which were categorised using the classification systems described by
effect estimates and measures of precision (e.g. ratio of odds ratios (ROR) and 95% confidence interval (95% CI));
any confounding variables assessed by the study authors (e.g. sample size, other methodological characteristics);
any methods used to deal with potential overlap of RCTs across the meta-analyses.
Characteristics of included meta-epidemiological studies were summarised using frequencies and percentages for binary variables and medians and interquartile ranges (IQRs) for continuous variables.
We analysed the association between a methodological characteristic and the magnitude of an intervention effect estimate (average bias) using the ratio of odds ratios (ROR), ratio of hazard ratios (RHR), or difference in standardised mean differences (dSMD) effect measure, whichever was reported by the study investigators. We analysed the association between a methodological characteristic and between-trial heterogeneity, and the variation in average bias, using the standard deviation of underlying effects (tau) or I^{2}. We only analysed associations for each characteristic independently (i.e. we did not consider average bias in trials with
We combined estimates of average bias in a meta-analysis using the random-effects model. We used DerSimonian and Laird’s method of moments estimator to estimate the between-study variance [
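A minimal sketch of this pooling step is given below. It assumes that each meta-epidemiological study contributes a ratio estimate (e.g. ROR) with a 95% confidence interval, that the standard error is recovered from the interval on the log scale, and that the method of moments estimate of the between-study variance is truncated at zero. The function names and example inputs are hypothetical.

```python
import math

def log_and_se_from_ci(ratio, lower, upper, z=1.96):
    """Convert a ratio measure and its 95% CI to (log estimate, SE)."""
    return math.log(ratio), (math.log(upper) - math.log(lower)) / (2 * z)

def dersimonian_laird(estimates, std_errors):
    """Random-effects pooling with the DerSimonian and Laird estimator.

    estimates are on an additive scale (log ROR, log RHR, or dSMD).
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se_pooled, "tau2": tau2, "q": q, "i2": i2}

# Hypothetical RORs with 95% CIs from three studies.
inputs = [(0.90, 0.80, 1.01), (0.95, 0.85, 1.06), (0.85, 0.70, 1.03)]
logs, ses = zip(*(log_and_se_from_ci(*row) for row in inputs))
result = dersimonian_laird(logs, ses)
print(math.exp(result["pooled"]), result["tau2"], result["i2"])
```

For ratio measures the pooled value is exponentiated back to the ROR scale; a dSMD would be pooled directly without the log transformation.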
Two studies combined data from individual meta-epidemiological studies [
Some meta-epidemiological studies presented multiple comparisons and analyses for the same outcome. We used the following decision rules to select effect estimates to present in forest plots:
comparisons selected in the following order: (1) inadequate/unclear versus adequate (or “high/unclear” versus “low” risk of bias); (2) inadequate versus adequate; (3) inadequate versus adequate/unclear.
adjusted effect estimate selected ahead of unadjusted effect estimate.
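The decision rules above can be sketched as a simple ordering over candidate estimates. The sketch assumes that the comparison rule takes precedence and that adjustment breaks ties; the source does not state which rule dominates. The dictionary keys and labels are hypothetical.

```python
# Lower number = preferred comparison, following the order given in the text.
COMPARISON_PRIORITY = {
    "inadequate/unclear vs adequate": 1,
    "inadequate vs adequate": 2,
    "inadequate vs adequate/unclear": 3,
}

def select_estimate(candidates):
    """Pick one estimate per outcome from a list of candidate dicts.

    Each candidate needs a 'comparison' label and an 'adjusted' flag.
    Assumption: comparison type is checked first, adjustment second.
    """
    return min(
        candidates,
        key=lambda c: (COMPARISON_PRIORITY[c["comparison"]], not c["adjusted"]),
    )

# Hypothetical candidates reported by one study for the same outcome.
candidates = [
    {"comparison": "inadequate vs adequate", "adjusted": True, "ror": 0.88},
    {"comparison": "inadequate/unclear vs adequate", "adjusted": False, "ror": 0.93},
    {"comparison": "inadequate/unclear vs adequate", "adjusted": True, "ror": 0.91},
]
print(select_estimate(candidates))
```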
A total of 3081 records were identified in the searches. We retrieved 118 full text articles after screening 2910 unique titles/abstracts. Twenty-four meta-epidemiological studies summarised in 28 reports met the inclusion criteria (
The included meta-epidemiological studies were published between 1995 and 2015 (
Characteristics | Studies (%, n = 24) |
---|---|
Assembled a collection of meta-analyses, and compared (within each meta-analysis) the effect estimate in trials with versus without a characteristic | 20 (83) |
Assembled a collection of trials, and compared (within each trial) the effect estimate for the same outcome with versus without a characteristic | 3 (13) |
Other^{a} | 1 (4) |
Sequence generation | 14 (58) |
Allocation concealment | 17 (71) |
Baseline imbalance | 3 (13) |
Adjusting for confounders in analysis | 1 (4) |
Block randomisation in unblinded trials | 1 (4) |
Blinding of participants | 6 (25) |
Blinding of personnel | 3 (13) |
Participants switching intervention groups within the trial | 1 (4) |
Attrition | 10 (42) |
Blinding of outcome assessor | 7 (29) |
Blinding of data analyst | 1 (4) |
Double blinding | 11 (46) |
Selective reporting | 3 (13) |
Two reviewers independently assessed all trials | 18 (75) |
Reliance on assessments by authors of included meta-analyses | 4 (17) |
One reviewer assessed all trials, with verification by another | 1 (4) |
Only one author assessed all trials | 1 (4) |
Average bias | 24 (100) |
Extent of between-trial heterogeneity^{b} | 1 (5) |
Variation in average bias | 11 (46) |
Median (IQR) meta-analyses | 26 (16–46) |
Median (IQR) trials | 229 (116–380) |
Range for meta-analyses | 1983–2014 |
Range for trials | 1955–2011 |
Varied | 16 (67) |
Child/neonatal health only | 2 (8) |
Osteoarthritis only | 2 (8) |
Mental health only | 1 (4) |
Oral medicine only | 1 (4) |
Pregnancy and childbirth only | 1 (4) |
Critical care medicine only | 1 (4) |
Varied (pharmacologic or non-pharmacologic) | 21 (88) |
Pharmacologic only | 1 (4) |
Non-pharmacologic only | 2 (8) |
Varied (mortality, other objective or subjective) | 18 (75) |
Mortality only | 1 (4) |
Subjective only | 5 (21) |
Binary | 16 (67) |
Continuous | 7 (29) |
Time-to-event | 1 (4) |
Meta-meta-analytic approach [ | 17 (71) |
Logistic regression | 4 (17) |
Multivariable, multilevel model [ | 3 (13) |
Bayesian hierarchical bias model | 2 (8) |
Bayesian network meta-regression model | 1 (4) |
No modelling | 1 (4) |
Dependent trials excluded | 12 (50) |
Dependent trials included, but analysis adjusted to account for this | 6 (25) |
Unclear (dependent trials possibly included) | 5 (21) |
Dependent trials included, with no adjustment for this | 1 (4) |
All values given as n (%) except where indicated.
^{a} Assembled a collection of trials, and compared (within each trial) the effect estimate in sub-studies with versus without a characteristic. Specifically, investigators included parallel group four-armed clinical trials that randomized patients to a blinded sub-study (experimental vs control) and an otherwise identical nonblind sub-study (experimental vs control). Investigators also included three-armed trials with experimental and no-treatment groups and a placebo group portrayed to patients as another experimental group, so that patients were not informed about the possibility of a placebo intervention. This permitted the experimental group to be included both in a nonblind sub-study (experimental vs no treatment control) and a blind sub-study (experimental vs placebo control)
^{b} Denominator is 20 as between-trial heterogeneity is not applicable in four meta-epidemiological studies
^{c} Percentages do not sum to 100 as some meta-epidemiological studies used more than one approach
Estimates of average bias were available for 13 methodological characteristics, of which nine were assessed in more than one meta-epidemiological study (see forest plots in figures below; single study estimates for other characteristics are summarised in the text). Heterogeneity estimates were reported for only six characteristics (
Study design characteristic | Average bias (95% CI) | Increase in between-trial heterogeneity | Variation in average bias (95% CI) |
---|---|---|---|
Armijo-Olivo 2015: All outcomes | dSMD -0.02 (-0.15, 0.12) | NR | tau 0.10 |
BRANDO (Savović 2012): All outcomes | ROR 0.90 (0.82, 0.99) | tau 0.06 (0.01, 0.20) | tau 0.05 (0.01, 0.15) |
BRANDO (Savović 2012): Mortality | ROR 0.86 (0.69, 1.06) | tau 0.08 (0.01, 0.31) | tau 0.06 (0.01, 0.28) |
BRANDO (Savović 2012): Other objective | ROR 1.00 (0.84, 1.20) | tau 0.07 (0.01, 0.30) | tau 0.07 (0.01, 0.27) |
BRANDO (Savović 2012): Subjective | ROR 0.88 (0.76, 1.00) | tau 0.05 (0.01, 0.21) | tau 0.06 (0.01, 0.24) |
Papageorgiou 2015: All outcomes | dSMD -0.01 (-0.26, 0.25) | NR | tau 0.46 |
Armijo-Olivo 2015: All outcomes | dSMD -0.12 (-0.30, 0.06) | NR | tau 0.21 |
BRANDO (Savović 2012): All outcomes | ROR 0.89 (0.81, 0.99) | tau 0.06 (0.01, 0.19) | tau 0.05 (0.01, 0.18) |
BRANDO (Savović 2012): Mortality | ROR 1.03 (0.82, 1.31) | tau 0.07 (0.01, 0.30) | tau 0.07 (0.01, 0.33) |
BRANDO (Savović 2012): Other objective | ROR 0.92 (0.76, 1.12) | tau 0.06 (0.01, 0.24) | tau 0.06 (0.01, 0.29) |
BRANDO (Savović 2012): Subjective | ROR 0.82 (0.70, 0.94) | tau 0.08 (0.01, 0.27) | tau 0.07 (0.01, 0.30) |
Herbison 2011: All outcomes | ROR 0.91 (0.83, 0.99) | NR | tau 0.19 |
Nuesch 2009a: Subjective outcomes | dSMD -0.15 (-0.31, 0.02) | NR | tau 0.24 |
Hrobjartsson 2014b: Subjective | dSMD -0.56 (-0.71, -0.41) | NA | I^{2} 60% |
Nuesch 2009a: Subjective | dSMD -0.15 (-0.39, 0.09) | NR | tau 0.26 |
Hrobjartsson 2012: Subjective | ROR 0.64 (0.43, 0.96) | NA | I^{2} 45% |
Hrobjartsson 2013: Subjective | dSMD -0.23 (-0.40, -0.06) | NA | I^{2} 46% |
Hrobjartsson 2014a: Subjective (standard trials) | RHR 0.73 (0.57, 0.93) | NA | I^{2} 24% |
Hrobjartsson 2014a: Subjective (atypical trials) | RHR 1.33 (0.98, 1.82) | NA | I^{2} 0% |
BRANDO (Savović 2012): All outcomes | ROR 0.86 (0.73, 0.98) | tau 0.20 (0.02, 0.39) | tau 0.17 (0.03, 0.32) |
BRANDO (Savović 2012): Mortality | ROR 1.07 (0.78, 1.48) | tau 0.09 (0.01, 0.44) | tau 0.08 (0.01, 0.42) |
BRANDO (Savović 2012): Other objective | ROR 0.91 (0.64, 1.33) | tau 0.10 (0.01, 0.50) | tau 0.20 (0.02, 0.85) |
BRANDO (Savović 2012): Subjective | ROR 0.77 (0.61, 0.93) | tau 0.24 (0.02, 0.45) | tau 0.20 (0.04, 0.39) |
Abraha 2015: All outcomes | ROR 0.80 (0.69, 0.94) | NR | tau 0.28 |
Abraha 2015: Objective | ROR 0.80 (0.60, 1.06) | NR | tau 0.42 |
Abraha 2015: Subjective | ROR 0.84 (0.70, 1.01) | NR | tau 0.33 |
BRANDO (Savović 2012): All outcomes | ROR 1.07 (0.92, 1.25) | tau 0.07 (0.01, 0.24) | tau 0.06 (0.01, 0.24) |
BRANDO (Savović 2012): Mortality | ROR 1.07 (0.80, 1.42) | tau 0.10 (0.01, 0.32) | tau 0.09 (0.01, 0.75) |
BRANDO (Savović 2012): Other objective | ROR 1.35 (0.63, 2.94) | tau 0.13 (0.01, 1.05) | tau 0.13 (0.01, 1.15) |
BRANDO (Savović 2012): Subjective | ROR 1.03 (0.79, 1.36) | tau 0.07 (0.01, 0.38) | tau 0.07 (0.01, 0.35) |
* tau is on the log scale for RORs, but not for dSMDs
CI = confidence interval; dSMD = difference in standardised mean differences; NA = not applicable; NR = not reported; RHR = ratio of hazard ratios; ROR = ratio of odds ratios. dSMD < 0 and ROR and RHR < 1 = larger effect in trials with inadequate characteristic (or at high/unclear risk of bias)
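As a worked reading of this direction convention, a ratio measure below 1 can be expressed as a percentage exaggeration of the intervention effect estimate. The relation below is a sketch consistent with how this review reports results, not a formula quoted from the included studies:

$$
\text{exaggeration (\%)} = (1 - \mathrm{ROR}) \times 100
$$

For example, ROR 0.77 corresponds to $(1 - 0.77) \times 100 = 23\%$, and ROR 0.93 to $7\%$. A dSMD is read directly on the standardised mean difference scale, so dSMD -0.37 means the effect estimate is 0.37 standard deviations more beneficial in trials with the inadequate or unclear characteristic.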
Based on a meta-analysis of seven meta-epidemiological studies [
The boxed section displays the average bias estimates, where available, from the seven meta-epidemiological studies contributing to the BRANDO 2012^{a} study (however only the BRANDO 2012^{a} ROR was included in our meta-analysis). The BRANDO 2012^{a} ROR is based on a multivariable analysis with adjustment for allocation concealment and double blinding [the corresponding univariable ROR (95% CrI) is 0.89 (0.82, 0.96)]. The BRANDO 2012^{b} ROR is based on a multivariable analysis with adjustment for allocation concealment and double blinding [the corresponding univariable ROR (95% CrI) is 0.89 (0.75, 1.05)]. The Unverzagt 2013^{c} ROR is based on a multivariable analysis with adjustment for allocation concealment, double blinding, attrition, selective outcome reporting, early stopping, pre-intervention, competing interests, baseline imbalance, switching interventions, sufficient follow-up, and single- versus multi-centre status [the corresponding univariable ROR (95% CI) is 0.98 (0.80, 1.21)]. The BRANDO 2012^{d} ROR is based on a multivariable analysis with adjustment for allocation concealment and double blinding [the corresponding univariable ROR (95% CrI) is 0.99 (0.84, 1.16)]. The BRANDO 2012^{e} ROR is based on a multivariable analysis with adjustment for allocation concealment and double blinding [the corresponding univariable ROR (95% CrI) is 0.83 (0.74, 0.94)].
Our meta-analysis of seven meta-epidemiological studies [
The boxed section displays the average bias estimates, where available, from the seven meta-epidemiological studies contributing to the BRANDO 2012^{a} study (however only the BRANDO 2012^{a} ROR was included in our meta-analysis). The BRANDO 2012^{a} ROR is based on a multivariable analysis with adjustment for sequence generation and double blinding [the corresponding univariable ROR (95% CrI) is 0.93 (0.87, 0.99)]. The BRANDO 2012^{b} ROR is based on a multivariable analysis with adjustment for sequence generation and double blinding [the corresponding univariable ROR (95% CrI) is 0.98 (0.88, 1.10)]. The BRANDO 2012^{c} ROR is based on a multivariable analysis with adjustment for sequence generation and double blinding [the corresponding univariable ROR (95% CrI) is 0.97 (0.85, 1.10)]. The BRANDO 2012^{d} ROR is based on a multivariable analysis with adjustment for sequence generation and double blinding [the corresponding univariable ROR (95% CrI) is 0.85 (0.75, 0.95)].
The influence of other sources of bias arising from the randomisation process was less clear. There was little evidence that the presence (versus absence) of baseline imbalance inflates intervention effects (ROR 1.03, 95% CI 0.89 to 1.19; I^{2} 0%; 2 meta-epidemiological studies [
The Unverzagt 2013^{a} ROR is based on a multivariable analysis with adjustment for sequence generation, allocation concealment, double blinding, attrition, selective outcome reporting, early stopping, pre-intervention, competing interests, switching interventions, sufficient follow-up, and single- versus multi-centre status [the corresponding univariable ROR (95% CI) is 0.92 (0.80, 1.06)].
Based on a meta-analysis of three meta-epidemiological studies [
Intervention effect estimates for binary outcomes were not exaggerated in trials with lack of/unclear blinding of personnel (versus blinding of personnel) (ROR 1.00, 95% CI 0.86 to 1.16; I^{2} 0%; 2 meta-epidemiological studies [
Bias due to participants switching interventions within the trial and being analysed in a group different from the one to which they were randomized was examined in one small meta-epidemiological study of 12 meta-analyses in critical care medicine [
We did not combine estimates of average bias due to attrition because the definition of attrition varied across the meta-epidemiological studies (see
The boxed section displays the average bias estimates, where available, from the four meta-epidemiological studies contributing to the BRANDO 2012 study. The Abraha 2015^{a} ROR is based on a multivariable analysis with adjustment for use of placebo comparison, sample size, type of centre, items of risk of bias, post-randomisation exclusions, funding, and publication bias [the corresponding univariable ROR (95% CI) is 0.83 (0.71, 0.97)]. The Unverzagt 2013^{b} ROR is based on a multivariable analysis with adjustment for sequence generation, allocation concealment, double blinding, selective outcome reporting, early stopping, pre-intervention, competing interests, baseline imbalance, switching interventions, sufficient follow-up, and single- versus multi-centre status [the corresponding univariable ROR (95% CI) is 1.19 (0.98, 1.45)]. The Nuesch 2009b^{c} dSMD is based on a multivariable analysis with adjustment for allocation concealment [the corresponding multivariable dSMD (95% CI) with adjustment for blinding of participants is -0.15 (-0.30, 0.00), and the corresponding univariable dSMD (95% CI) is -0.13 (-0.29, 0.04)].
The influence of lack of/unclear blinding of outcome assessors (versus blinding) was negligible in a meta-analysis of four meta-epidemiological studies [
RHR = Ratio of hazard ratios. Hróbjartsson 2014a^{a} “standard trials” comprise those comparing experimental interventions with standard control interventions, such as placebo, no-treatment, usual care or active control. Hróbjartsson 2014a^{b} “atypical trials” comprise those comparing an oral experimental administration of a drug with the intravenous control administration of the same drug for cytomegalovirus retinitis.
Lack of/unclear double blinding (versus double blinding, where both participants and personnel/assessors are blinded) was associated with a 23% exaggeration of intervention effect estimates in trials with subjective outcomes (ROR 0.77, 95% CI 0.61 to 0.93; 1 meta-epidemiological study [
The boxed section displays the average bias estimates, where available, from the seven meta-epidemiological studies contributing to the BRANDO 2012^{a} study (however only the BRANDO 2012^{a} ROR was included in our meta-analysis). The BRANDO 2012^{a} ROR is based on a multivariable analysis with adjustment for sequence generation and allocation concealment [the corresponding univariable ROR (95% CrI) is 0.87 (0.79, 0.96)]. The BRANDO 2012^{b} ROR is based on a multivariable analysis with adjustment for sequence generation and allocation concealment [the corresponding univariable ROR (95% CrI) is 0.92 (0.80, 1.04)]. The Unverzagt 2013^{c} ROR is based on a multivariable analysis with adjustment for sequence generation, allocation concealment, attrition, selective outcome reporting, early stopping, pre-intervention, competing interests, baseline imbalance, switching interventions, sufficient follow-up, and single- versus multi-centre status [the corresponding univariable ROR (95% CI) is 0.84 (0.69, 1.02)]. The BRANDO 2012^{d} ROR is based on a multivariable analysis with adjustment for sequence generation and allocation concealment [the corresponding univariable ROR (95% CrI) is 0.93 (0.74, 1.18)]. The BRANDO 2012^{e} ROR is based on a multivariable analysis with adjustment for sequence generation and allocation concealment [the corresponding univariable ROR (95% CrI) is 0.78 (0.65, 0.92)].
In one meta-epidemiological study, blinding of data analysts was recorded, but average bias could not be quantified because the number of informative meta-analyses (i.e. those including trials with and without the characteristic) was too low [
Based on a meta-analysis of two small meta-epidemiological studies [
The Unverzagt 2013^{a} ROR is based on a multivariable analysis with adjustment for sequence generation, allocation concealment, double blinding, attrition, early stopping, pre-intervention, competing interests, baseline imbalance, switching interventions, sufficient follow-up, and single- versus multi-centre status [the corresponding univariable ROR (95% CI) is 0.73 (0.54, 0.98)].
This review of 24 meta-epidemiological studies suggests that on average, intervention effect estimates are exaggerated in trials with inadequate/unclear (versus adequate) sequence generation and allocation concealment. For these characteristics, the average bias appears to be larger in trials of subjective outcomes compared with other objective outcomes. For subjective outcomes, intervention effect estimates appear to be exaggerated in trials with lack of/unclear blinding of participants (versus blinding of participants), lack of/unclear blinding of outcome assessors (versus blinding of outcome assessors) and lack of/unclear double blinding (versus double blinding, where both participants and personnel/assessors are blinded). The average bias due to attrition varied depending on how it was defined. The influence of other characteristics (baseline imbalance, no adjustment for confounders, use of block randomisation in unblinded trials, unblinded personnel, and analysing participants in a group different from the one to which they were randomized) is uncertain, because they have been examined in only a few small meta-epidemiological studies. Some characteristics have not been investigated in any meta-epidemiological study (unblinded data analysts, use of faulty measurement instruments, bias in selection of the reported results). Only one meta-epidemiological study measured the between-trial heterogeneity associated with characteristics [
Our review builds on previous reviews [
Our review has some limitations. We only considered methodological characteristics implied by the conceptual framework underlying the current Cochrane risk of bias tool for randomized trials, because it is unclear whether other characteristics investigated in meta-epidemiological studies (e.g. single-versus multi-centre status, early stopping) represent a specific bias, small-study effects, or spurious findings [
There are also important limitations of the included meta-epidemiological studies. Many meta-epidemiological studies examined a small number of meta-analyses, and so may have had insufficient power to reliably estimate associations [
We encourage decision makers and systematic reviewers who rely on the results of randomized trials to routinely consider the risk of biases associated with the methods used. Our review suggests that particular caution is needed when interpreting the results of trials in which sequence generation, allocation concealment and blinding are not reported, and when outcome measures are subjectively assessed. This evidence is currently being taken into consideration in our work on a revision of the Cochrane risk of bias tool for randomized trials, which will include a new structure and clearer guidance that we anticipate will lead to more robust assessments.
Novel approaches are needed to examine the influence of attrition and selective reporting. Most previous meta-epidemiological studies of the influence of attrition have dichotomised trials based on some arbitrary amount of missing data (e.g. >20%). It would be more useful to know whether average bias varies according to different amounts of and reasons for missing data. Further, in previous meta-epidemiological studies of selective reporting, the authors only examined whether omission or addition of
In conclusion, empirical evidence suggests that the following characteristics of randomized trials are associated with exaggerated intervention effect estimates: inadequate/unclear (versus adequate) sequence generation and allocation concealment, and no/unclear blinding of participants, blinding of outcome assessors and double blinding. The average bias appears to be greatest in trials of subjective outcomes. More research on the influence of attrition and biased reporting of results is needed. The development of novel methodological approaches for the empirical investigation of study design biases would also be valuable.