Systems for grading the quality of evidence and the strength of recommendations II: pilot study of a new system.

TitleSystems for grading the quality of evidence and the strength of recommendations II: pilot study of a new system.
Publication TypeJournal Article
Year of Publication2005
AuthorsAtkins D, Briss PA, Eccles M, Flottorp S, Guyatt GH, Harbour RT, Hill S, Jaeschke R, Liberati A, Magrini N, Mason J, O'Connell D, Oxman AD, Phillips B, Sch├╝nemann H, Edejer TT-T, Vist GE, Williams JW
Corporate AuthorsGRADE Working Group
JournalBMC health services research
Date Published2005 Mar 23
KeywordsComprehension; Consensus; Evidence-Based Medicine; Humans; Judgment; Pilot Projects; Practice Guidelines as Topic; Quality Assurance, Health Care; Risk Assessment

BACKGROUND: Systems that are used by different organisations to grade the quality of evidence and the strength of recommendations vary. They have different strengths and weaknesses. The GRADE Working Group has developed an approach that addresses key shortcomings in these systems. The aim of this study was to pilot test and further develop the GRADE approach to grading evidence and recommendations.

METHODS: A GRADE evidence profile consists of two tables: a quality assessment and a summary of findings. Twelve evidence profiles were used in this pilot study. Each evidence profile was made based on information available in a systematic review. Seventeen people were given instructions and independently graded the level of evidence and strength of recommendation for each of the 12 evidence profiles. For each example judgements were collected, summarised and discussed in the group with the aim of improving the proposed grading system. Kappas were calculated as a measure of chance-corrected agreement for the quality of evidence for each outcome for each of the twelve evidence profiles. The seventeen judges were also asked about the ease of understanding and the sensibility of the approach. All of the judgements were recorded and disagreements discussed.

RESULTS: There was a varied amount of agreement on the quality of evidence for the outcomes relating to each of the twelve questions (kappa coefficients for agreement beyond chance ranged from 0 to 0.82). However, there was fair agreement about the relative importance of each outcome. There was poor agreement about the balance of benefits and harms and recommendations. Most of the disagreements were easily resolved through discussion. In general we found the GRADE approach to be clear, understandable and sensible. Some modifications were made in the approach and it was agreed that more information was needed in the evidence profiles.

CONCLUSION: Judgements about evidence and recommendations are complex. Some subjectivity, especially regarding recommendations, is unavoidable. We believe our system for guiding these complex judgements appropriately balances the need for simplicity with the need for full and transparent consideration of all important issues.

Alternate JournalBMC Health Serv Res