The Guideline Assessment Project is a tripartite project that aims to develop an evidence-based and consensus-informed AGREE II Extension for Surgical Guidelines.


GAP I aimed to assess the quality of clinical practice guidelines in Surgery and to identify factors associated with quality.

We performed a scoping review of PubMed and Google Scholar and a formal search of MEDLINE through PubMed to identify guidelines published by major national and surgical organizations with an international scope. Guidelines published from January 2008 to August 2017 were eligible for inclusion.

We selected 35 scientific organizations in General Surgery, covering upper gastrointestinal, bariatric, colorectal, hernia and abdominal wall, endoscopic and minimally invasive, and acute and emergency surgery.

Ten surgical organizations produced 67 guidelines over the period 2008 to 2017. Two reviewers independently assessed guideline quality against the AGREE II criteria, with high inter-rater agreement (median kappa, 0.799; interquartile range, 0.738–0.879).
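
For readers unfamiliar with the kappa statistic, the following is a minimal sketch of how agreement between two raters can be quantified. It assumes unweighted Cohen's kappa; the summary above does not state which kappa variant was used, and the ratings in the usage lines are hypothetical values on the 1 to 7 AGREE II scale, not study data. The function name `cohen_kappa` is likewise illustrative.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must score the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: share of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability that both raters pick the same category
    # if each scored independently according to their own score frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category throughout; agreement is total.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical scores from two reviewers (NOT data from the study).
reviewer_1 = [5, 4, 6, 3, 5, 4]
reviewer_2 = [5, 4, 5, 3, 5, 4]
print(f"kappa = {cohen_kappa(reviewer_1, reviewer_2):.3f}")
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so a median of 0.799 sits toward the upper end of that range.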

The median overall score across all 67 guidelines was 4 out of a maximum of 7 (IQR 3 – 5), and 27 (40%) guidelines were not considered suitable for use based on their quality as assessed with the AGREE II instrument.

In exploratory analyses, we found that:

  • Guidelines produced by a scientific organization with an output of at least 9 guidelines over the study period (approximately 1 guideline per year) were associated with higher odds of being recommended for use (OR 3.79, 95% CI 1.01 – 12.66), but there was no evidence of association with an overall score >4 (OR 3.12, 95% CI 0.76 – 12.16).

  • The presence of a guidelines committee was associated with approximately 4-fold odds of the guideline being considered appropriate for use (OR 4.15, 95% CI 1.47 – 11.77) and 3.8-fold odds of achieving a score >4 (OR 3.84, 95% CI 1.38 – 10.70).

  • Guidelines produced by a guidelines committee had 29-fold odds of using the GRADE methodology or a modification of it (OR 29.04, 95% CI 7.11 – 118.67).

  • There was no statistically significant association between inter-society collaboration and the likelihood of a guideline being recommended for use (OR 0.28, 95% CI 0.07 – 1.13); however, guidelines produced through such collaboration had 78% lower odds of achieving a score >4 (OR 0.22, 95% CI 0.06 – 0.82).

  • Using the GRADE methodology or a modification was associated with 8-fold odds (OR 8.17, 95% CI 2.54 – 26.29) for the guideline to be recommended for use and with 4-fold odds of achieving a score >4 (OR 4.13, 95% CI 1.49 – 11.49).

  • An association between a guideline resulting from a consensus development project and either an overall score >4 (OR 0.52, 95% CI 0.18 – 1.52) or a recommendation for use (OR 0.76, 95% CI 0.26 – 2.19) could not be demonstrated.

  • There was no statistically significant association between inter-society collaboration and use of the GRADE methodology (OR 0.45, 95% CI 0.16 – 1.24, p=0.062). Consensus-based guidelines had approximately 6-fold lower odds of using the GRADE methodology (OR 0.16, 95% CI 0.04 – 0.62).
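
The wording used in the findings above ("78% lower odds", "6-fold lower odds", "no evidence of association") follows directly from each odds ratio and its confidence interval. The sketch below shows that arithmetic for a few of the reported estimates. The class and method names are illustrative assumptions, not part of the study's analysis; only the numeric values are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OddsRatio:
    """An odds ratio with its 95% confidence interval (names are illustrative)."""
    label: str
    estimate: float
    ci_low: float
    ci_high: float

    def excludes_null(self) -> bool:
        # An OR of 1 means no association. If the whole interval lies above
        # or below 1, the association is significant at the 5% level.
        return self.ci_low > 1.0 or self.ci_high < 1.0

    def percent_change_in_odds(self) -> float:
        # OR 0.22 -> (0.22 - 1) * 100 = -78, i.e. 78% lower odds.
        return (self.estimate - 1.0) * 100.0

    def fold_lower(self) -> float:
        # For OR < 1, the reciprocal expresses "n-fold lower odds".
        return 1.0 / self.estimate


# Values copied from the findings listed above.
collab_score = OddsRatio("Collaboration, score >4", 0.22, 0.06, 0.82)
collab_recommended = OddsRatio("Collaboration, recommended", 0.28, 0.07, 1.13)
consensus_grade = OddsRatio("Consensus-based, uses GRADE", 0.16, 0.04, 0.62)
grade_recommended = OddsRatio("GRADE, recommended", 8.17, 2.54, 26.29)

for result in (collab_score, collab_recommended, consensus_grade, grade_recommended):
    verdict = "CI excludes 1" if result.excludes_null() else "CI includes 1"
    print(f"{result.label}: OR {result.estimate} ({verdict})")

print(f"{collab_score.percent_change_in_odds():.0f}% change in odds")
print(f"{consensus_grade.fold_lower():.2f}-fold lower odds")
```

An interval that includes 1, as for collaboration and recommendation for use (0.07 – 1.13), is why that finding is reported as not statistically significant even though the point estimate is well below 1.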

The full publication can be found here

