Learn how to interpret your search findings using a range of resources available on the CogTale website
Three scales are used to evaluate risk of bias and quality of evidence in the studies uploaded to CogTale.
1) PEDro Scale
The PEDro scale is a free, online resource developed by the Centre for Evidence-Based Physiotherapy. The PEDro scale evaluates a study’s methodological quality, allowing for the identification of study results which are valid and useful.
The scale consists of 11 items. The first item (“eligibility criteria were specified”) evaluates external validity (i.e. how ‘generalizable’ the findings of the study are to the wider population). Items 2-9 assess the study’s internal validity (i.e. the degree to which you can be confident that the observed results are caused by the intervention rather than by other factors). Items 10-11 determine whether the study results are interpretable (i.e. whether sufficient statistical information has been provided). A point is awarded for each satisfied criterion. The total score is the sum of the scores for criteria 2-11; criterion 1 is excluded, so the methodological quality of the study is rated out of 10.
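The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `pedro_total` and the example ratings are hypothetical and are not part of the PEDro materials.

```python
def pedro_total(items):
    """Return the PEDro total (0-10) from 11 yes/no item ratings.

    `items` is a list of 11 booleans in item order; items[0] is
    criterion 1 ("eligibility criteria were specified").
    """
    if len(items) != 11:
        raise ValueError("expected ratings for all 11 PEDro items")
    # Criterion 1 relates to external validity and is not counted,
    # so only criteria 2-11 contribute to the total.
    return sum(1 for satisfied in items[1:] if satisfied)


# Hypothetical ratings for a single study (not real data).
example = [True, True, True, False, True, False, False, True, True, True, True]
print(pedro_total(example))
```

Because criterion 1 is skipped, a study that satisfies every item still scores 10 rather than 11.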
The PEDro scale was based on the Delphi list developed by Verhagen and colleagues (for more information on the Delphi list please review: Verhagen et al., 1998). The reliability and validity of the PEDro scale have also been established by several studies (see: de Morton, 2009; Maher et al., 2003, as examples), and in some cases the scale has been shown to be more comprehensive than other measures of methodological quality (see: Bhogal et al., 2005).
To download the PEDro Scale click here (PDF).
2) JADAD Scale
The JADAD scale is a commonly used tool to assess the methodological quality of controlled trials. The scale consists of 7 items that assess three key methodological features of controlled trials: (1) randomisation (i.e. when study participants are assigned to a treatment or control group by chance), (2) blinding (i.e. preventing the prior expectations of participants and researchers from influencing the reporting of results), and (3) withdrawals and dropouts (i.e. participants who fail to complete a study).
Researchers respond to questions using a Yes/No format (e.g. “was the study described as randomised?”). For items 1-5, one point (+1) is awarded for each satisfied criterion. Items 6 and 7 attract a negative score (-1). The total score of the JADAD scale is determined by summing the scores of criteria 1-7, so the methodological quality of the study is rated out of 5 (where 5 is the best score a study can achieve).
The JADAD has been shown to be an easy-to-use tool, with established reliability and external validity (see: Olivo et al., 2008 for an example of this).
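As a rough sketch of the arithmetic described above, the following Python function adds one point for each "Yes" on items 1-5 and subtracts one point for each "Yes" on items 6 and 7. The function name `jadad_total` and the example answers are hypothetical, and reading a "Yes" on items 6-7 as a deduction is an assumption made for illustration.

```python
def jadad_total(items):
    """Return a JADAD total from 7 yes/no answers given in item order."""
    if len(items) != 7:
        raise ValueError("expected answers for all 7 JADAD items")
    # Items 1-5: +1 for each satisfied criterion.
    points = sum(1 for answer in items[:5] if answer)
    # Items 6-7: -1 each (assumed here to apply when the answer is "Yes").
    deductions = sum(1 for answer in items[5:] if answer)
    return points - deductions


# Hypothetical answers for a single trial (not real data).
answers = [True, True, True, False, False, True, False]
print(jadad_total(answers))
```

A trial that satisfies items 1-5 and attracts no deductions reaches the maximum of 5. This sketch does not stop the total from falling below zero, which a fuller implementation may want to handle.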
3) Cochrane Risk of Bias
The Cochrane Risk of Bias (ROB) tool provides a framework for evaluating potential sources of bias in the design, conduct, analysis, and reporting of randomised controlled trials. The ROB tool evaluates the methodological quality of trials based on six bias domains:
- Selection bias: was participant allocation random, and concealed?
- Performance bias: were participants and study personnel blinded to which intervention a participant received?
- Detection bias: were outcome assessors blinded to which intervention a participant received?
- Attrition bias: is there missing data, and how was this data treated?
- Reporting bias: is there evidence of selective reporting?
- Other sources of bias
Researchers formulate domain-level “judgements” about the risk of bias (i.e. “low risk”, “high risk”, or “unclear risk”) using evidence from the trial paper, trial protocol, and other sources. These domain-level judgements then provide the basis for an overall assessment of the risk of bias for the study being evaluated.
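To make the step from domain-level judgements to an overall assessment concrete, here is a minimal Python sketch. The text above does not say how the domains are combined, so the rule used here ("high" if any domain is high, otherwise "unclear" if any domain is unclear, otherwise "low") is an assumption for illustration only, as are the names `DOMAINS` and `overall_risk`.

```python
# The six bias domains listed above, used as dictionary keys.
DOMAINS = ("selection", "performance", "detection",
           "attrition", "reporting", "other")
ALLOWED = {"low", "high", "unclear"}


def overall_risk(judgements):
    """Combine six domain-level judgements into one overall rating.

    Assumed rule (not taken from the ROB tool itself): the overall
    rating is the worst judgement found in any domain.
    """
    values = []
    for domain in DOMAINS:
        if domain not in judgements:
            raise ValueError(f"missing judgement for domain: {domain}")
        if judgements[domain] not in ALLOWED:
            raise ValueError(f"unknown judgement: {judgements[domain]}")
        values.append(judgements[domain])
    if "high" in values:
        return "high"
    if "unclear" in values:
        return "unclear"
    return "low"


# Hypothetical judgements for a single trial (not real data).
trial = {"selection": "low", "performance": "unclear", "detection": "low",
         "attrition": "low", "reporting": "low", "other": "low"}
print(overall_risk(trial))
```

In practice, reviewers weigh the evidence for each domain rather than applying a mechanical rule, so this sketch should be read as a simplification.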
The above image from a previous version of the Cochrane Handbook for Systematic Reviews of Interventions (2011) is an example of a 'Risk of bias summary' figure that is populated following assessment of risk of bias using the Cochrane ROB tool. The green '+' symbols indicate a 'low risk' of bias, while the yellow '?' and red '-' symbols indicate an 'unclear risk' and 'high risk' of bias respectively.
For more detailed information on the methodology of the Cochrane ROB tool, please review: Higgins et al., 2011.
References and Further Reading
Bhogal, S. K., Teasell, R. W., Foley, N. C., & Speechley, M. R. (2005). The PEDro scale provides a more comprehensive measure of methodological quality than the Jadad scale in stroke rehabilitation literature. Journal of Clinical Epidemiology, 58(7), 668-673.
de Morton, N. A. (2009). The PEDro scale is a valid measure of the methodological quality of clinical trials: a demographic study. Australian Journal of Physiotherapy, 55(2), 129-133.
Higgins, J. P., Altman, D. G., Gøtzsche, P. C., Jüni, P., Moher, D., Oxman, A. D., ... & Sterne, J. A. (2011). The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ, 343, d5928.
Maher, C. G., Sherrington, C., Herbert, R. D., Moseley, A. M., & Elkins, M. (2003). Reliability of the PEDro scale for rating quality of randomized controlled trials. Physical Therapy, 83(8), 713-721.
Olivo, S. A., Macedo, L. G., Gadotti, I. C., Fuentes, J., Stanton, T., & Magee, D. J. (2008). Scales to assess the quality of randomized controlled trials: a systematic review. Physical Therapy, 88(2), 156-175.
Verhagen, A. P., de Vet, H. C., de Bie, R. A., Kessels, A. G., Boers, M., Bouter, L. M., & Knipschild, P. G. (1998). The Delphi list: a criteria list for quality assessment of randomized clinical trials for conducting systematic reviews developed by Delphi consensus. Journal of Clinical Epidemiology, 51(12), 1235-1241.
Meta-Analyses Explained
What is a meta-analysis?
A meta-analysis is a statistical technique. It involves combining the results of studies that are sufficiently similar, in order to reach a single overall conclusion. For example, a researcher may conduct a meta-analysis which examines the effectiveness of a particular cognitive intervention program in individuals with a diagnosis of Mild Cognitive Impairment (MCI). This would involve pooling the results of randomised controlled trials that assess the intervention program of interest in participants with MCI.
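One common way of pooling results is inverse-variance weighting under a fixed-effect model, in which more precise studies (those with smaller variances) count for more. The text above does not name a specific pooling method, so the sketch below is only one possible illustration; the function name `pool_fixed_effect` and the three studies are hypothetical.

```python
import math


def pool_fixed_effect(effects, variances):
    """Pool study effect sizes using inverse-variance weights.

    Returns the pooled effect and its standard error.
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need one variance per effect size")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Hypothetical effect sizes and variances from three trials (not real data).
effects = [0.30, 0.50, 0.40]
variances = [0.04, 0.02, 0.05]
pooled, se = pool_fixed_effect(effects, variances)
print(round(pooled, 3), round(se, 3))
```

The pooled standard error is smaller than that of any single study, which reflects the gain in precision from combining studies. When the studies differ substantially from one another, a random-effects model is often used instead.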
Why conduct a meta-analysis?
There are several advantages to conducting meta-analyses. Firstly, when compared to the analysis of a single study, the results of a meta-analysis are statistically stronger. This is due to the increased sample size (i.e. number of participants) and greater variability in the sample (e.g. in age, gender, etc.) that typically result from combining studies. Meta-analyses also allow for more accurate estimates of the magnitude of the effect(s) of the treatment being investigated. This information can be helpful in determining how useful an intervention or treatment may be in the wider population.
What is an effect size?
Put simply, the effect size measures the size of an effect. For example, if one study group has had a cognitive treatment and the other group has no treatment, then the effect size would measure the effectiveness of the cognitive treatment. In other words, the effect size tells us how much more effective the cognitive treatment was compared to no treatment.
Why do we need effect sizes?
Most studies use a measure of statistical significance (i.e. an indication that the observed differences are unlikely to be due to chance alone) to support their findings. However, statistical significance is limited in several ways. Firstly, it does not provide any information about the magnitude of the difference between the two treatments/measures/groups being assessed in the study (i.e. how much more effective the treatment was compared to no treatment, as described in the above section). In addition, with a large enough sample, studies will often produce statistically significant results even when the intervention or treatment has only small effects. Small effects, even if significant, may often have little clinical utility. Lastly, statistical significance cannot be compared across studies, which limits our ability to compare the results of different treatments (for example) across different studies.
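The point about sample size can be illustrated with a small Python sketch. It holds a very small standardised difference between two groups fixed (0.05, a hypothetical value) and shows how the two-sided p-value changes as the number of participants per group grows. The function name `two_sided_p` is hypothetical, and the calculation assumes two equal-sized groups and a normal approximation.

```python
import math


def two_sided_p(d, n_per_group):
    """Approximate two-sided p-value for a standardised mean difference d.

    Assumes two groups of equal size and a normal approximation, so the
    test statistic is z = d * sqrt(n_per_group / 2).
    """
    z = d * math.sqrt(n_per_group / 2.0)
    # For a standard normal variable Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# The same small effect, evaluated at increasing sample sizes.
for n in (50, 500, 5000, 50000):
    print(n, two_sided_p(0.05, n))
```

The effect size is identical in every row, yet the p-value shrinks as the groups get larger. This is why a significant result on its own says little about how large or clinically useful an effect is.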
Common Effect Sizes
- Cohen’s d – measures the effect size between two groups; is commonly used in meta-analysis.
- Hedges’ g – similar to Cohen’s d, but preferred when sample sizes are very small (e.g. <20 people).
- Odds ratio (OR) – reflects the odds of a desired outcome in the intervention group relative to the odds of a similar outcome in the control group.
- Relative risk (RR) – reflects the probability of an event occurring (e.g. developing a disease) in an exposed group, compared to this same event occurring in a non-exposed group.
- Pearson’s r – measures the strength and direction of a correlation between two variables.
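Four of the measures listed above can be calculated from summary data with short formulas. The Python sketch below uses standard textbook definitions; the function names and all example numbers are hypothetical, and the Hedges' g correction shown is a commonly used approximation rather than the exact formula.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    return (mean1 - mean2) / pooled_sd


def hedges_g(d, n1, n2):
    """Cohen's d with an approximate small-sample correction."""
    return d * (1 - 3 / (4 * (n1 + n2) - 9))


def odds_ratio(events_a, non_events_a, events_b, non_events_b):
    """Odds of the outcome in group A relative to the odds in group B."""
    return (events_a / non_events_a) / (events_b / non_events_b)


def relative_risk(events_a, non_events_a, events_b, non_events_b):
    """Probability of the outcome in group A relative to group B."""
    risk_a = events_a / (events_a + non_events_a)
    risk_b = events_b / (events_b + non_events_b)
    return risk_a / risk_b


# Hypothetical summary data (not taken from any study).
d = cohens_d(25, 5, 10, 20, 5, 10)   # treatment vs control, 10 people each
g = hedges_g(d, 10, 10)
print(d, g)
print(odds_ratio(20, 80, 10, 90), relative_risk(20, 80, 10, 90))
```

With only 10 people per group, the correction pulls g slightly below d. The last line also shows that the odds ratio and relative risk differ even though they are computed from the same counts.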
How do I interpret an effect size?
While there are no set definitions for interpreting effect sizes, some values have been offered cautiously as a guideline or “rule of thumb”. For example, Cohen (1988) proposed that a d of .2 indicates a small effect, a d of .5 indicates a medium effect, and a d of .8 indicates a large effect. However, when interpreting an effect size it is important to refer to prior studies to see where your findings fit into the wider literature, and to consider the methodological quality of the study and the clinical significance of the findings (i.e. has the intervention resulted in a meaningful change in the participants’ lives?).
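As a simple illustration of the rule of thumb above, the following Python function attaches Cohen's labels to a value of d. Treating .2, .5 and .8 as the lower bounds of each band, and labelling anything smaller as "below small", are assumptions made for this sketch; the function name `interpret_d` is hypothetical.

```python
def interpret_d(d):
    """Label a Cohen's d value using Cohen's (1988) benchmarks.

    The benchmarks are treated as lower bounds of each band, which is
    an assumption; they are guidelines, not fixed cut-offs.
    """
    size = abs(d)  # the sign shows direction, not magnitude
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


for value in (0.1, 0.35, -0.6, 1.2):
    print(value, interpret_d(value))
```

A label produced this way is only a starting point and should be read alongside prior studies, study quality, and clinical significance, as noted above.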
References and Further Reading
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. 2nd edn.
Durlak, J. (2009). How to Select, Calculate, and Interpret Effect Sizes, Journal of Pediatric Psychology, 34(9), 917–928, https://doi.org/10.1093/jpepsy/jsp004
Effect Size. Retrieved from: https://researchrundowns.com/quantitative-methods/effect-size/
Hedge’s g: Definition, Formula (2017, October 31). Retrieved from: http://www.statisticshowto.com/hedges-g/
Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. http://doi.org/10.3389/fpsyg.2013.00863
Sullivan, G. M., & Feinn, R. (2012). Using Effect Size—or Why the P Value Is Not Enough. Journal of Graduate Medical Education, 4(3), 279–282. http://doi.org/10.4300/JGME-D-12-00156.1