Quality of studies
The quality of a treatment study depends on a range of factors. Here, study quality is evaluated using the score on a tool called PEDro. The maximum score is 10, and higher scores indicate higher study quality.
The fixed-effect model assumes that one true effect size underlies all studies in the analysis and that all differences between the studies' Hedges’ g effect sizes are due to sampling error.
For example, if a drug company has run eight studies to assess the effect of a drug and all studies have recruited patients in the same way, used the same researchers, dose, etc., then all studies are expected to have an identical effect size (i.e., as though this were one large study conducted with a series of cohorts). In this case, a fixed-effect model would be appropriate.
The random-effects model assumes that the true effect size differs across studies and that the studies in the analysis are a random sample of all possible studies that meet the inclusion criteria for the review.
In practice, meta-analyses include data from a series of studies that have been conducted by different researchers, have recruited different patients from different populations, have used different doses, etc. In this case, the random-effects model is more easily justified than the fixed-effect model.
In addition, the results of a random-effects model generalise to a range of populations, whereas those from the fixed-effect model do not. Therefore, this report interprets the random-effects model results.
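The difference between the two models can be sketched with inverse-variance weighting. This is a minimal illustration, not the software used for this report: the function name `pool`, the three effect sizes, their variances, and the value of `tau2` are all hypothetical.

```python
import math

def pool(effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate and its standard error.

    tau2 = 0 gives the fixed-effect model. tau2 > 0 adds the
    between-study variance to each study's sampling variance,
    which is how the random-effects model weights studies.
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

# Hypothetical Hedges' g values and sampling variances for three studies.
g = [0.10, 0.50, 0.90]
v = [0.01, 0.04, 0.09]

fixed_est, fixed_se = pool(g, v)                # fixed-effect model
random_est, random_se = pool(g, v, tau2=0.05)   # tau2 assumed, not estimated
```

With a positive between-study variance the weights become more equal across studies, so large studies dominate less and the pooled standard error is larger than under the fixed-effect model.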
Hedges’ g was used as the standardised mean difference (SMD) effect size. Hedges’ g is a variation of Cohen’s d that corrects for bias due to small sample sizes.
The magnitude of Hedges’ g may be interpreted using Cohen’s convention as small (0.2), medium (0.5), or large (0.8).
Since higher scores denote better outcomes for all measures, a positive effect size indicates that the result favours the experimental group(s), while a negative effect size indicates that it favours the control group(s).
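As an illustration of the small-sample correction and the sign convention described above, the sketch below computes Hedges’ g from group summary statistics. It assumes the approximate correction factor J = 1 − 3/(4df − 1) and the variance formula given in Borenstein et al.; the function name and the input numbers are hypothetical, and the report's own software may differ in detail.

```python
import math

def hedges_g(mean_exp, mean_ctl, sd_exp, sd_ctl, n_exp, n_ctl):
    """Return (g, variance of g) for two independent groups."""
    df = n_exp + n_ctl - 2
    # Pooled within-group standard deviation.
    sd_pooled = math.sqrt(
        ((n_exp - 1) * sd_exp ** 2 + (n_ctl - 1) * sd_ctl ** 2) / df
    )
    d = (mean_exp - mean_ctl) / sd_pooled          # Cohen's d
    var_d = (n_exp + n_ctl) / (n_exp * n_ctl) + d ** 2 / (2 * (n_exp + n_ctl))
    j = 1 - 3 / (4 * df - 1)                       # small-sample correction
    return j * d, j ** 2 * var_d

# Hypothetical study: experimental group scores higher than control.
g, var_g = hedges_g(mean_exp=25, mean_ctl=20, sd_exp=10, sd_ctl=10,
                    n_exp=20, n_ctl=20)
```

Here Cohen’s d is 0.5 and the correction shrinks it slightly, so g is a little below 0.5. The value is positive because the experimental mean is the higher one.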
Cohen’s U3 is a measure of non-overlap and is defined as “the percentage of the A population which the upper half of the cases of the B population exceeds”.
Hedges’ g was converted to the percentage of overlap between the control and experimental groups.
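Both conversions can be sketched from the standard normal distribution function Φ. The sketch assumes normally distributed scores with equal variance in the two groups, under which U3 = Φ(g) and the overlapping coefficient is 2Φ(−|g|/2); the function names are illustrative only.

```python
from statistics import NormalDist

_phi = NormalDist().cdf  # standard normal cumulative distribution function

def cohens_u3(g):
    """Proportion of the control group below the experimental group mean."""
    return _phi(g)

def overlap(g):
    """Overlapping coefficient of two equal-variance normal distributions."""
    return 2 * _phi(-abs(g) / 2)

u3 = cohens_u3(0.5)   # roughly 0.69 for a medium effect
ovl = overlap(0.5)    # roughly 0.80 for a medium effect
```

Multiplying either value by 100 gives the percentage form used in the text.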
Probability of superiority
Also known as the common language effect size (CLES), the probability of superiority gives the probability that a person picked at random from the treatment group will have a higher score than a person picked at random from the control group.
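Under the assumption of normally distributed scores with equal variance in both groups, the probability of superiority can be obtained from the standardised mean difference as Φ(g/√2). The sketch below applies that conversion; the function name is hypothetical, and the conversion actually used for this report is not stated in the text.

```python
import math
from statistics import NormalDist

def probability_of_superiority(g):
    """P(random treatment score > random control score), normal model."""
    return NormalDist().cdf(g / math.sqrt(2))

cles = probability_of_superiority(0.5)  # roughly 0.64 for a medium effect
```

An effect size of zero gives 0.5, meaning a randomly chosen treated person is no more likely than chance to outscore a randomly chosen control.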
Restricted maximum likelihood estimator
For the random-effects model, the restricted maximum likelihood (REML) estimator was used to estimate the amount of heterogeneity (denoted as τ²). According to Langan et al., the REML estimator has the most reasonable properties for meta-analysis of continuous data and should be used instead of the DerSimonian and Laird (DL) estimator. However, irrespective of which τ² estimator is used, meta-analyses comprising fewer than 10 studies will yield imprecise results (i.e., wider confidence intervals), particularly when study sample sizes are small (N < 40). As such, results from meta-analyses containing fewer than 10 studies should be interpreted with caution.
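To make the REML estimator concrete, the sketch below iterates the REML fixed-point equation for τ², truncating negative values at zero. It is a simplified illustration with hypothetical function names and data; established meta-analysis software uses more robust optimisation and should be preferred in practice.

```python
def reml_update(effects, variances, tau2):
    """One step of the REML fixed-point equation for tau^2."""
    w = [1.0 / (v + tau2) for v in variances]
    sum_w = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    num = sum(wi ** 2 * ((yi - mu) ** 2 - vi)
              for wi, yi, vi in zip(w, effects, variances))
    den = sum(wi ** 2 for wi in w)
    # The 1 / sum_w term is what distinguishes REML from plain ML.
    return max(0.0, num / den + 1.0 / sum_w)

def reml_tau2(effects, variances, tol=1e-10, max_iter=500):
    """Iterate from tau^2 = 0 until successive values agree within tol."""
    tau2 = 0.0
    for _ in range(max_iter):
        new_tau2 = reml_update(effects, variances, tau2)
        if abs(new_tau2 - tau2) < tol:
            return new_tau2
        tau2 = new_tau2
    return tau2

# Hypothetical Hedges' g values and sampling variances for four studies.
g = [0.10, 0.50, 0.90, 0.30]
v = [0.01, 0.04, 0.09, 0.02]
tau2_hat = reml_tau2(g, v)
```

With only four studies, as here, the estimate is very imprecise, which is the reason for the caution about meta-analyses with fewer than 10 studies.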
Heterogeneity in effect sizes across studies was tested using the Q-statistic (with p < 0.10 indicating significant heterogeneity) and its magnitude was quantified using the I² statistic.
I² is an index that describes the proportion of total variation in study effect size estimates that is due to heterogeneity. A value of 0% indicates no observed heterogeneity, and larger values show increasing heterogeneity, with 25% considered low, 50% moderate, and 75% high heterogeneity.
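The two statistics can be sketched as follows, assuming the Q-based definition I² = max(0, (Q − df)/Q) × 100. Some software instead derives I² from the τ² estimate, so values may differ slightly from those in this report. The function name and data are hypothetical, and the p-value for Q (from a chi-square distribution with k − 1 degrees of freedom) is omitted because the Python standard library has no chi-square distribution.

```python
def q_and_i2(effects, variances):
    """Return (Q, degrees of freedom, I^2 in percent)."""
    w = [1.0 / v for v in variances]              # fixed-effect weights
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2

# Hypothetical Hedges' g values and sampling variances for four studies.
g = [0.10, 0.50, 0.90, 0.30]
v = [0.01, 0.04, 0.09, 0.02]
q, df, i2 = q_and_i2(g, v)
```

For these made-up values Q is about 8.6 on 3 degrees of freedom and I² is about 65%, which falls between the moderate and high benchmarks.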
The prediction interval quantifies the extent of heterogeneity in the distribution of effect sizes. It estimates the range within which 95% of the true effect sizes are expected to fall.
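A commonly used form of the 95% prediction interval, as described by Borenstein et al., is shown below. It is given here as an assumption, since the text does not state the exact formula used. The symbols are the pooled random-effects estimate (μ̂), its standard error, the between-study variance estimate (τ̂²), and the number of studies (k).

```latex
\hat{\mu} \;\pm\; t_{k-2,\,0.975}\,\sqrt{\hat{\tau}^{2} + \mathrm{SE}(\hat{\mu})^{2}}
```

Because the interval includes τ̂², it is wider than the confidence interval for the pooled estimate whenever heterogeneity is present, and it requires at least three studies for the t quantile to be defined.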
Where a study reported data only for independent subgroups, a combined effect size across the subgroups was calculated. This was done by conducting a fixed-effect meta-analysis on the subgroups for that study.
Multiple control and/or experimental groups
For studies with more than one control and/or experimental group, pairwise comparisons were conducted between the control and experimental group(s) and/or between experimental groups. The corresponding standard errors were adjusted using the method of Rucker et al. The effect sizes and adjusted standard errors were then combined using a fixed-effect meta-analysis.
Multiple outcomes within a study
Where a study reported data on several related (but distinct) outcomes for the same participants, a summary effect size was computed by combining the data from all the related outcomes. A correlation coefficient between the outcomes must be specified for these calculations. Since this will vary between outcome domains, the analyses default to a correlation of r = 0.50. This assumes that 25% of the total variation in one outcome is explained by its relationship with another outcome (r² = 0.25).
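The role of the assumed correlation can be illustrated with the composite-effect approach described by Borenstein et al., in which the summary effect is the mean of the outcome effect sizes and its variance accounts for their correlation. This is a sketch under that assumption; the function name and the numbers are hypothetical.

```python
import math

def combine_outcomes(effects, variances, r=0.50):
    """Mean effect across correlated outcomes and the variance of that mean."""
    m = len(effects)
    mean_effect = sum(effects) / m
    total = sum(variances)
    # Add a covariance term r * sqrt(Vi * Vj) for every ordered pair i != j.
    for i in range(m):
        for j in range(m):
            if i != j:
                total += r * math.sqrt(variances[i] * variances[j])
    return mean_effect, total / m ** 2

# Two hypothetical outcomes measured on the same participants.
summary_g, summary_var = combine_outcomes([0.3, 0.5], [0.04, 0.04])
```

With r = 0.50 the variance of the summary effect is 0.03. It would be 0.02 if the outcomes were independent (r = 0) and 0.04 if they were perfectly correlated (r = 1), so a higher assumed correlation yields a less precise summary effect.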
- Borenstein M, Hedges LV, Higgins JPT, Rothstein HR: Introduction to meta-analysis. John Wiley & Sons, Ltd; Chichester, UK: 2009.
- Cohen J. Statistical power analysis for the behavioural sciences. 2nd ed. Erlbaum; Hillsdale, NJ: 1988.
- Reiser B, Faraggi D: Confidence intervals for the overlapping coefficient: the normal equal variance case. Journal of the Royal Statistical Society. 1999, 48:413-418. doi: 10.1111/1467-9884.00199.
- Ruscio J: A probability-based measure of effect size: robustness to base rates and other factors. Psychological Methods. 2008, 13:19-30. doi: 10.1037/1082-989X.13.1.19.
- Langan D, Higgins JPT, Jackson D, Bowden J, Veroniki AA, Kontopantelis E, … Simmonds M: A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Research Synthesis Methods. 2019, 10:83-98. doi: 10.1002/jrsm.1316.
- Rucker G, Cates CJ, Schwarzer G: Methods for including information from multi-arm trials in pairwise meta-analysis. Research Synthesis Methods. 2017, 8:392-403. doi: 10.1002/jrsm.1259.