Put simply, an effect size quantifies the magnitude of an effect. For example, if one study group receives a cognitive treatment and the other group receives no treatment, the effect size tells us how much more effective the cognitive treatment was than no treatment.
Most studies use a measure of statistical significance (i.e. the observed differences are unlikely to be due to chance alone) to support their findings. However, statistical significance is limited in several ways. Firstly, it provides no information about the magnitude of the difference between the treatments, measures, or groups being assessed in the study (i.e. how much more effective the treatment was compared to no treatment, as described above). In addition, with a large enough sample, a study will often produce statistically significant results even when the intervention or treatment has only a small effect. Small effects, even if significant, may have little clinical utility. Lastly, statistical significance cannot be directly compared across studies, which limits our ability to compare the results of, for example, different treatments across different studies.
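The link between sample size and significance can be illustrated with a short sketch. It is not taken from any of the cited sources: the function name `two_sided_p_from_d` is hypothetical, and it assumes two equally sized groups and a simple z-test approximation in which the standardized difference is treated as known.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_from_d(d, n_per_group):
    """Approximate two-sided p-value for a two-group mean comparison.

    Assumption: equal group sizes and a z-test, so the test statistic
    is z = d * sqrt(n_per_group / 2).
    """
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same small standardized difference, evaluated at growing sample sizes.
small_effect = 0.1
for n in (50, 500, 5000):
    p = two_sided_p_from_d(small_effect, n)
    print(f"n per group = {n:>4}: d = {small_effect}, p = {p:.4f}")
```

Under these assumptions, the effect size stays at 0.1 throughout, while the p-value shrinks as the groups grow and falls below the conventional .05 threshold only for the largest sample.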
- Cohen’s d – measures the standardized difference between the means of two groups; commonly used in meta-analysis.
- Hedges’ g – similar to Cohen’s d, but preferred when sample sizes are very small (e.g. <20 people).
- Odds ratio (OR) – reflects the odds of a desired outcome in the intervention group relative to the odds of the same outcome in the control group.
- Relative risk (RR) – reflects the probability of an event occurring (e.g. developing a disease) in an exposed group relative to the probability of the same event occurring in a non-exposed group.
- Pearson’s r – measures the strength and direction of a correlation between two variables.
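As a rough illustration of how the measures listed above can be computed, here is a minimal Python sketch using only the standard library. The function names, argument names, and sample data are assumptions made for this example. Cohen’s d is computed with the pooled standard deviation, and the small-sample adjustment shown is a commonly used approximation; other variants exist.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def hedges_g(group_a, group_b):
    """Cohen's d scaled by an approximate small-sample correction factor."""
    df = len(group_a) + len(group_b) - 2
    return cohens_d(group_a, group_b) * (1 - 3 / (4 * df - 1))


def odds_ratio(events_tx, n_tx, events_ctrl, n_ctrl):
    """Odds of the outcome in the intervention group over odds in control."""
    odds_tx = events_tx / (n_tx - events_tx)
    odds_ctrl = events_ctrl / (n_ctrl - events_ctrl)
    return odds_tx / odds_ctrl


def relative_risk(events_exp, n_exp, events_unexp, n_unexp):
    """Probability of the event if exposed over probability if not exposed."""
    return (events_exp / n_exp) / (events_unexp / n_unexp)


def pearson_r(x, y):
    """Strength and direction of the linear relationship between x and y."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Made-up scores and counts, for illustration only.
treated = [12, 14, 15, 17, 18]
control = [10, 11, 13, 14, 15]
print(f"Cohen's d: {cohens_d(treated, control):.2f}")
print(f"Hedges' g: {hedges_g(treated, control):.2f}")
print(f"OR: {odds_ratio(30, 100, 20, 100):.2f}")
print(f"RR: {relative_risk(30, 100, 20, 100):.2f}")
print(f"Pearson's r: {pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]):.2f}")
```

With groups this small, the corrected value is noticeably lower than Cohen’s d, which is the reason the correction is preferred for very small samples.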
While there are no set definitions for interpreting effect sizes, some values have been cautiously offered as a guideline or “rule of thumb”. For example, Cohen (1988) proposed that a d of .2 indicates a small effect, a d of .5 a medium effect, and a d of .8 a large effect. However, when interpreting an effect size it is important to refer to prior studies to see where your findings fit into the wider literature, and also to consider the methodological quality of the study and the clinical significance of the findings (i.e. has the intervention resulted in a meaningful change in the participants’ lives?).
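The rule of thumb can be written as a small lookup, shown here as a hedged sketch. The function name `label_cohens_d` is hypothetical, and treating the three benchmarks as lower bounds of ranges (so that, say, .65 counts as “medium”) is an assumption of this example rather than something Cohen’s guideline states.

```python
def label_cohens_d(d):
    """Label the absolute value of d using Cohen's (1988) benchmarks.

    Assumption: each benchmark (.2, .5, .8) is treated as the lower
    bound of its category; the sign of d is ignored.
    """
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below the 'small' benchmark"


for d in (0.1, 0.2, 0.65, -0.9):
    print(f"d = {d:+.2f}: {label_cohens_d(d)}")
```

A label like this is only a starting point; it does not replace comparison with prior studies or a judgement about clinical significance.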
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.).
Durlak, J. (2009). How to Select, Calculate, and Interpret Effect Sizes. Journal of Pediatric Psychology, 34(9), 917–928. https://doi.org/10.1093/jpepsy/jsp004
Effect Size. Retrieved from: https://researchrundowns.com/quantitative-methods/effect-size/
Hedge’s g: Definition, Formula (2017, October 31). Retrieved from: http://www.statisticshowto.com/hedges-g/
Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. http://doi.org/10.3389/fpsyg.2013.00863
Sullivan, G. M., & Feinn, R. (2012). Using Effect Size—or Why the P Value Is Not Enough. Journal of Graduate Medical Education, 4(3), 279–282. http://doi.org/10.4300/JGME-D-12-00156.1