Systematic Process for Investigating and Describing Evidence-Based Research (SPIDER)
Study designs | This tool can be used for several study designs. The tool was developed for etiological systematic reviews in occupational therapy and other health-related disciplines. |
Number of items | 15 quality indicators classified into 9 quality themes (sampling and participation, statistical analysis, outliers/missing data, diagnostics, model fit, author limitations, validity, reliability, rationale). |
Rating | 1 (evidence existed), 0 (absence of evidence). |
Validity | This tool was developed from various theoretical frameworks, including methodologies used in other systematic reviews, measurement theory, critical appraisal, evidence-based practice, qualitative research, and systematic review guidelines. SPIDER went through several revisions, including reviews by two external researchers, evaluation of items by five doctoral-level students, formal content validity testing, and a critical written appraisal by two internationally renowned measurement experts. The following types of validity were tested: 1) a priori validity, 2) face validity, 3) content validity using a content validity matrix and a content validity index, 4) construct validity, and 5) criterion validity (Classen et al., 2008). |
Reliability | Intrarater and interrater reliability were tested. Agreement between raters was 73% (Classen et al., 2008). |
Other information | N/A |
Main references | Classen, S., Winter, S., Awadzi, K. D., Garvan, C. W., Lopez, E. D., & Sundaram, S. (2008). Psychometric testing of SPIDER: Data capture tool for systematic literature reviews. American Journal of Occupational Therapy, 62(3), 335-348. |
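The binary SPIDER rating and the agreement figure reported above can be illustrated with a short Python sketch. The scores below are invented for illustration and are not data from Classen et al. (2008), and the function name `percent_agreement` is hypothetical. The sketch assumes two raters score the same 15 quality indicators as 1 (evidence existed) or 0 (absence of evidence), and that agreement is the share of indicators scored identically; the published study may have computed agreement differently.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items (0-100) on which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of items")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)


# Hypothetical scores for one article on the 15 SPIDER quality indicators
# (1 = evidence existed, 0 = absence of evidence).
rater_a = [1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1]
rater_b = [1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0]

# Number of indicators each rater judged to be met.
print(sum(rater_a), sum(rater_b))

# The raters match on 11 of the 15 indicators.
print(round(percent_agreement(rater_a, rater_b), 1))  # 73.3
```

Percent agreement is easy to read but does not correct for agreement expected by chance, which is why some of the tools below also report kappa or ICC values.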
Crowe Critical Appraisal Tool (CCAT)
Study designs | This tool can be used for several study designs. |
Number of items | 22 items in 8 categories (preliminaries, introduction, design, sampling, data collection, ethical matters, results, discussion). Each item can have several item descriptors. A total of 54 item descriptors were developed. |
Rating | present, absent, not applicable |
Validity | Tool developed from a review of 44 critical appraisal tools (Crowe & Sheppard, 2011b). The CCAT was analysed against five alternative tools (PEDro, Cho and Bero, SCED, Reis for qualitative designs, AMSTAR). A random sample of 60 papers was rated, 10 from each of six research designs. The CCAT had significant weak to moderate positive correlations (Kendall’s τ 0.33–0.55) with the alternative tools, except in one category (Preamble). There were significant moderate to strong positive correlations in the quasi-experimental (τ 0.70–1.00), descriptive-exploratory-observational (τ 0.72–1.00), qualitative (τ 0.74–0.81), and systematic review (τ 0.62–0.82) designs and, to a lesser extent, in the true experimental design (τ 0.68–0.70). There were no significant correlations in the single system research designs (Crowe & Sheppard, 2011a). |
Reliability | Five raters appraised 24 papers. The ICC for the total score was 0.83, and 0.74 for absolute agreement (Crowe, Sheppard, & Campbell, 2012). Ten raters were randomized into two groups, an informal appraisal group and a CCAT group. The ICC for absolute agreement was 0.76 for the informal appraisal group and 0.88 for the CCAT group (Crowe, Sheppard, & Campbell, 2011). |
Other information | https://conchra.com.au/2015/12/08/crowe-critical-appraisal-tool-v1-4/ |
Main references | • Crowe, M., & Sheppard, L. (2011a). A general critical appraisal tool: an evaluation of construct validity. International Journal of Nursing Studies, 48(12), 1505-1516. • Crowe, M., & Sheppard, L. (2011b). A review of critical appraisal tools show they lack rigor: Alternative tool structure is proposed. Journal of Clinical Epidemiology, 64(1), 79-89. • Crowe, M., Sheppard, L., & Campbell, A. (2011). Comparison of the effects of using the Crowe Critical Appraisal Tool versus informal appraisal in assessing health research: a randomised trial. International Journal of Evidence-Based Healthcare, 9(4), 444-449. • Crowe, M., Sheppard, L., & Campbell, A. (2012). Reliability analysis for a proposed critical appraisal tool demonstrated value for diverse research designs. Journal of Clinical Epidemiology, 65(4), 375-383. |
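The CCAT reliability results report an ICC for absolute agreement alongside another ICC figure. As a minimal sketch of why an absolute-agreement ICC can be lower than a consistency ICC on the same ratings, the Python below computes both single-rater, two-way forms (often labelled ICC(3,1) and ICC(2,1)). Which ICC model Crowe et al. used is not stated here, so the choice of these forms is an assumption, and the score matrix and the function name `icc_two_way` are illustrative only, not CCAT data.

```python
def icc_two_way(scores):
    """Return (consistency, absolute_agreement) single-rater ICCs.

    scores: list of rows, one row per paper and one column per rater.
    """
    n = len(scores)       # number of papers
    k = len(scores[0])    # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (papers x raters, no replication).
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Consistency ignores systematic differences between raters.
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    # Absolute agreement penalises them through the rater mean square.
    absolute = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return consistency, absolute


# Illustrative scores: 6 papers (rows) rated by 4 raters (columns).
# The second rater scores systematically lower than the others.
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

consistency, absolute = icc_two_way(scores)
print(round(consistency, 2), round(absolute, 2))  # 0.71 0.29
```

In this example the raters rank the papers similarly, so consistency is fairly high, but they disagree on the level of the scores, so absolute agreement is much lower.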
COSMIN Risk of Bias Checklist (COnsensus-based Standards for the selection of health status Measurement INstruments Risk of Bias checklist)
Study designs | Tool developed to assess the methodological quality of articles on health measurement instruments. |
Number of items | The tool contains 12 boxes: internal consistency, reliability, measurement error, content validity, structural validity, hypothesis testing, cross-cultural validity, criterion validity, responsiveness, interpretability, generalizability, IRT (item response theory). The number of items varies in each box (ranging from 5 to 18). |
Rating | very good, adequate, doubtful, inadequate, N/A |
Validity | Tool developed from a review of systematic reviews of measurement properties and from an international 4-round Delphi study with 43 experts (Mokkink et al., 2010b). |
Reliability | A total of 75 articles were each appraised by two to six raters (88 raters participated). The percentage agreement was appropriate (68% was above 80% agreement), but the kappa coefficients were low (61% were below 0.40 and 6% were above 0.75) (Mokkink et al., 2010a). |
Other information | Three tools are available: COSMIN Study Design checklist; COSMIN Risk of Bias checklist; COSMIN Reporting checklist. The COSMIN Risk of Bias checklist developed in 2018 is an update of the COSMIN checklist developed in 2010. User guide available at https://www.cosmin.nl/ |
Main references | • Mokkink, L. B., Boers, M., van der Vleuten, C., Bouter, L., Alonso, J., Patrick, D. L., et al. (2020). COSMIN Risk of Bias tool to assess the quality of studies on reliability or measurement error of outcome measurement instruments: a Delphi study. BMC Medical Research Methodology, 20(1), 1-13. • Mokkink, L., De Vet, H., Prinsen, C., Patrick, D., Alonso, J., Bouter, L., & Terwee, C. (2018). COSMIN Risk of Bias checklist for systematic reviews of patient-reported outcome measures. Quality of Life Research, 27(5), 1171-1179. • Mokkink, L. B., Terwee, C. B., Gibbons, E., Stratford, P. W., Alonso, J., Patrick, D. L., et al. (2010a). Inter-rater agreement and reliability of the COSMIN (COnsensus-based Standards for the selection of health status Measurement Instruments) checklist. BMC Medical Research Methodology, 10(82), 1-11. • Mokkink, L. B., Terwee, C. B., Patrick, D. L., Alonso, J., Stratford, P. W., Knol, D. L., et al. (2010b). The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: An international Delphi study. Quality of Life Research, 19(4), 539-549. • Mokkink, L. B., Terwee, C. B., Knol, D. L., Stratford, P. W., Alonso, J., Patrick, D. L., et al. (2010c). The COSMIN checklist for evaluating the methodological quality of studies on measurement properties: a clarification of its content. BMC Medical Research Methodology, 10(22), 1-8. • Mokkink, L. B., Terwee, C. B., Patrick, D. L., Alonso, J., Stratford, P. W., Knol, D. L., et al. (2010d). The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. Journal of Clinical Epidemiology, 63(7), 737-745. • Mokkink, L. B., Terwee, C. B., Stratford, P. W., Alonso, J., Patrick, D. L., Riphagen, I., et al. (2009). Evaluation of the methodological quality of systematic reviews of health status measurement instruments. Quality of Life Research, 18(3), 313-333. • Mokkink, L. B., Terwee, C. B., Knol, D. L., Stratford, P. W., Alonso, J., Patrick, D. L., et al. (2006). Protocol of the COSMIN study: COnsensus-based Standards for the selection of health Measurement INstruments. BMC Medical Research Methodology, 6(2), 1-7. |
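The COSMIN reliability results pair high percentage agreement with low kappa coefficients. A minimal Python sketch of how that can happen is given below, assuming Cohen's kappa for two raters on a yes/no item. This is a simplification: the COSMIN study involved two to six raters per article, and the exact kappa variant it used is not stated here. The ratings and the function name `cohen_kappa` are hypothetical.

```python
def cohen_kappa(rater_a, rater_b):
    """Return (observed_agreement, kappa) for two raters' categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same, non-empty set of items")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Proportion of items on which the raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return observed, (observed - expected) / (1 - expected)


# Hypothetical ratings of one checklist item across 20 articles.
# Both raters answer "yes" almost every time, so chance agreement is high.
rater_a = ["yes"] * 19 + ["no"]
rater_b = ["yes"] * 18 + ["no", "yes"]

observed, kappa = cohen_kappa(rater_a, rater_b)
print(round(observed, 2), round(kappa, 2))  # 0.9 -0.05
```

Here the raters agree on 90% of articles, yet kappa is close to zero because nearly all of that agreement would be expected by chance when one category dominates.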
Meta-tool for Quality Appraisal for Public Health Evidence (MetaQAT)
Study designs | The MetaQAT is a meta-tool for appraising public health evidence. It is described as a quality assessment process rather than a critical appraisal tool. A companion tool was assembled from existing critical appraisal tools to provide study design-specific guidance on validity appraisal. |
Number of items | 9 items in 4 domains: relevancy, reliability, validity, and applicability. Validity can be appraised using signalling questions or existing appraisal tools. |
Rating | The scale is optional (yes, no, unclear, N/A) |
Validity | Face validity was assessed by consulting with senior scientists experienced in critical appraisal. Content validity was established during the development process, when the content of relevant tools was compared and mapped according to a standard source. This process ensured that the MetaQAT framework covers all aspects of critical appraisal addressed by existing tools. In addition, the framework was compared with the Heller framework for appraisal in public health. Since no gold standard tool exists, criterion validity was assessed by expert assessment of study quality (Rosella et al., 2016). |
Reliability | Good agreement between two reviewers for bias assessment was found in a systematic review (kappa = 0.79) (Savage et al., 2016). |
Other information | https://www.publichealthontario.ca/en/health-topics/public-health-practice/library-services/metaqat |
Main references | • Rosella, L., Bowman, C., Pach, B., Morgan, S., Fitzpatrick, T., & Goel, V. (2016). The development and validation of a meta-tool for quality appraisal of public health evidence: Meta Quality Appraisal Tool (MetaQAT). Public Health, 136, 57-65. • Savage, R. D., Rosella, L. C., Brown, K. A., Khan, K., & Crowcroft, N. S. (2016). Underreporting of hepatitis A in non-endemic countries: A systematic review and meta-analysis. BMC Infectious Diseases, 16(281), 1-12. |