Result card
Authors: Katrine Frønsdal, Stefan Sauerland and Ingvil Sæterdal
Internal reviewers: Pseudo164 Pseudo164, Pseudo179 Pseudo179, Pseudo71 Pseudo71, Pseudo98 Pseudo98
Evidence from the basic literature search (done for the whole project) was used to assess this element. Methods for reporting data are as described in Domain Methodology.
As no tools are currently available for assessing the quality of reliability and agreement studies, no grading to indicate strength of evidence has been performed for these outcomes.
One SR was included to assess variation in the interpretation of AAA screening, in terms of intra-observer repeatability and inter-observer reproducibility of infra-renal aortic diameter measurements using ultrasound (Beales 2011). This SR was determined to be of medium quality (Appendix EFF-2, Section 2).
Bland-Altman plots, a method based on plotting the differences between paired measurements against their means, were used to assess these outcomes in eight of the nine included studies (Bland & Altman 1986), whereas one study used a multilevel regression approach, i.e. generalised estimating equations (GEE), to extract components of variation, separating intra-observer variation from inter-observer variation (GEE 2012). Using the GEE method reduced the number of assumptions required for this analysis, which allowed variations to be reported in terms of standard deviations, with appropriate definitions of measurement reliability derived from those standard deviations.
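As an illustration of the Bland-Altman approach referred to above, the following minimal sketch computes the mean difference (bias) and the 95% limits of agreement for paired readings. The function name `bland_altman`, the factor 1.96 and the diameters are assumptions chosen for illustration; the diameters are hypothetical and are not data from any of the included studies.

```python
from statistics import mean, stdev


def bland_altman(a, b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Illustrative helper, not taken from the SR: the limits of agreement
    are computed as bias +/- z * SD of the paired differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


# Hypothetical AP diameters in mm for two observers (not study data)
observer_1 = [31.0, 42.5, 48.0, 55.5, 28.0, 36.5]
observer_2 = [30.0, 44.0, 47.0, 57.0, 29.0, 35.0]

bias, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias {bias:.2f} mm, limits of agreement {lower:.2f} to {upper:.2f} mm")
```

In a Bland-Altman plot, each difference would be plotted against the mean of the corresponding pair, with horizontal lines drawn at the bias and at the two limits.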
There were wide variations between the nine included studies in terms of numbers of measurements (from 10 to 112), participant demographics (age and gender) and types of ultrasound machine (all different). Various calliper endpoints were used for aortic diameter measurement, i.e. diameter measured between aortic inner layers (ITI), between aortic inner and outer layers (ITO), or between aortic outer layers (OTO), in both anteroposterior (AP) and transverse (TS) planes. Measurements were made on both aneurysmal and normal aortas. In all studies, observers were blinded to the results of the other observers, but they had different backgrounds in terms of discipline, grade or level of experience and training.
Intra-observer repeatability
The SR by Beales et al. was the only SR from the basic literature search that assessed intra-observer repeatability. Intra-observer repeatability was assessed using Bland-Altman plots by calculating repeatability coefficients in seven studies and using the GEE method in one study (Bland & Altman 1986; GEE 2012). Data for this outcome were not available for one of the nine included studies.
The maximum intra-observer mean difference ranged from 0.03 to 4.8 mm for AP, and from 0.2 to 1.9 mm for TS. Beales et al. reported intra-observer repeatability coefficients for diameter ranging from 1.6 to 7.5 mm for AP and from 2.8 to 15.4 mm for TS. The National Health Service Abdominal Aortic Aneurysm Screening Programme (NAAASP 2009) suggested that 5 mm is an acceptable level of observer variation between aortic diameter ultrasound measurements. The SR authors suggested that aortic measurements by the same practitioner may vary significantly, but did not provide any statistical support for this statement, and the diameters measured (ITI, ITO or OTO) varied between studies. In addition, the numbers of observers were small in eight of the nine included studies. It was therefore difficult to draw a definitive conclusion from the review, but it indicated overall acceptable intra-observer repeatability.
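To make the comparison with the 5 mm level concrete, the sketch below computes a repeatability coefficient from duplicate readings by one observer. It assumes one common formulation, in which the coefficient is 1.96 times the standard deviation of the differences between repeat readings, taking the mean difference as zero; the SR does not state which formula each included study applied. The function name and the readings are hypothetical.

```python
from math import sqrt

ACCEPTABLE_MM = 5.0  # acceptable level of observer variation cited from NAAASP 2009


def repeatability_coefficient(first, second, z=1.96):
    """z times the SD of differences between repeat readings.

    Assumption for this sketch: the mean difference between repeat
    readings is taken as zero, so the SD is computed about zero.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty series")
    diffs = [x - y for x, y in zip(first, second)]
    sd_about_zero = sqrt(sum(d * d for d in diffs) / len(diffs))
    return z * sd_about_zero


# Hypothetical repeat AP readings in mm by a single observer (not study data)
reading_1 = [30.5, 41.0, 47.5, 56.0, 33.0]
reading_2 = [31.0, 40.0, 48.5, 55.0, 33.5]

rc = repeatability_coefficient(reading_1, reading_2)
verdict = "below" if rc < ACCEPTABLE_MM else "at or above"
print(f"repeatability coefficient {rc:.2f} mm ({verdict} the 5 mm level)")
```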
In the studies included by Beales et al., the number of observers ranged from 1 to 4, except for one study which had 24 observers (Hartshorne 2011). However, Hartshorne et al. assessed exclusively static images of aortas of different sizes, whereas the other eight studies involved real-time examinations in which the images needed for aortic diameter measurement were acquired. This study was nevertheless highlighted by the SR authors as being the only one that had used the GEE method. In this study, the mean intra-observer AP repeatability coefficients varied from 1.6 to 2.0 mm, with individual repeatability coefficients ranging from 0.8 to 6.1 mm (TS measurements were not performed in this study); these values are mainly below the acceptable level of variability of 5 mm (NAAASP 2009).
Intra-observer variability for ITI and OTO aorta diameter measurements
Hartshorne et al. was the only study that assessed possible differences in intra-observer variability according to different calliper endpoints of aortic diameter measurements (i.e. diameter measurements of ITI walls versus OTO walls), as well as according to differences in observers’ background disciplines and experience (screening technicians versus vascular sonographers). In this study, 13 screening technicians and 11 vascular sonographers examined 60 static aortic images (not live examinations). Among the sonographers, six had more than 10 years’ experience and only one had less than 1 year of experience, whereas only two screeners had more than 10 years’ experience and five had less than 1 year. While all 13 screeners routinely used ITI, five sonographers used OTO and six used both ITI and OTO in their routine practice. When 15 images were each measured twice in random order by all 24 observers, there was no significant difference between the mean repeatability of ITI, 1.6 mm (range 0.8-5.2 mm), and that of OTO, 2.0 mm (range 0.5-6.1 mm). For ITI, there was no significant difference between the mean repeatability of screeners, 1.7 mm (range 0.8-5.2 mm), and that of sonographers, 1.4 mm (range 0.9-2.4 mm; P=0.27). For OTO, on the other hand, the mean repeatability was significantly better for sonographers, at 1.4 mm (range 0.6-2.6 mm), than for screeners, at 2.5 mm (range 1.1-6.1 mm; P=0.037). It was, however, not possible to ascertain from these data the effect of the sonographers’ longer experience, since screeners, as opposed to sonographers, did not use OTO in their routine practice.
Inter-observer reproducibility
The SR by Beales et al. was the only SR from the basic literature search that assessed inter-observer reproducibility. Inter-observer reproducibility was assessed using Bland-Altman plots in eight studies and by the GEE method in one study (Bland & Altman 1986; GEE 2012).
For AP, the limits of agreement (reproducibility coefficients) for the diameter measurements ranged from -1.9 to +1.9 mm at the narrowest to -10.4 to +10.5 mm at the widest (all nine studies), whereas for TS, the largest limits of agreement were -5.6 to +5.2 mm (only two studies assessed TS diameters). According to Beales et al., five of the nine included studies had acceptable inter-observer reproducibility. For the study that involved 24 observers and used the GEE method (Hartshorne 2011), as opposed to the 1-4 observers in the eight others, the mean reproducibility coefficients were 3.0 mm (95% CI 2.4-3.6 mm) for ITI and 4.2 mm (95% CI 3.5-4.9 mm) for OTO, both of which were below the acceptable level of variability of 5 mm (NAAASP 2009). Although the authors of the SR did not draw any conclusions about inter-observer reproducibility, the results indicate overall acceptable inter-observer reproducibility regardless of whether diameters are measured as ITI, ITO or OTO.
Inter-observer variability for ITI and OTO aorta diameter measurements
Hartshorne et al. was the only study that assessed possible differences in inter-observer variability according to different calliper endpoints of aortic diameter measurements (i.e. diameter measurements of inner-to-inner walls [ITI] versus outer-to-outer walls [OTO]), as well as according to differences in observers’ background disciplines and experience (screening technicians versus vascular sonographers) (Hartshorne 2011). In this study, in which 13 screening technicians and 11 vascular sonographers examined 60 images, the mean reproducibility coefficient for ITI was significantly better than that for OTO when measuring AP (TS was not measured in this study). The mean reproducibility coefficient was 3.0 mm (95% CI 2.4-3.6 mm) for ITI and 4.2 mm (95% CI 3.5-4.9 mm) for OTO (P<0.05), but both remained acceptable according to NAAASP, i.e. less than 5 mm (NAAASP 2009). Hartshorne and collaborators performed a corresponding analysis excluding observers with less than 1 year’s experience. In this group of 8 screening technicians and 10 sonographers, the mean reproducibility coefficients were 3.2 mm (95% CI 2.6-3.8 mm) for ITI and 3.8 mm (95% CI 3.1-4.5 mm) for OTO. It was not possible, however, to rule out an effect of background discipline, because the screening technicians, as opposed to the sonographers, did not use OTO in their routine practice.
Impact of ITI and OTO on the threshold for surveillance and referral for treatment
Hartshorne et al. grouped the 60 images into four size categories to assess the impact of ITI versus OTO on the threshold for surveillance and referral for treatment. The results presented in the table below (Table 1 - EFF33) indicated that the ITI method would detect fewer aneurysms than the OTO method. For example, 60 measurements (4%) classified as <30 mm using ITI fell into the 30-45 mm category using OTO.
Table 1 - EFF33: Size categories using ITI versus size categories using OTO, based on 1440 measurements

| Size category using ITI | OTO <30 mm | OTO 30-45 mm | OTO 45-55 mm | OTO >55 mm |
|---|---|---|---|---|
| <30 mm | 348 (24%) | 60 (4%) | 0 (0%) | 0 (0%) |
| 30-45 mm | 0 (0%) | 262 (18%) | 124 (9%) | 0 (0%) |
| 45-55 mm | 0 (0%) | 1 (0.1%) | 418 (29%) | 138 (10%) |
| >55 mm | 0 (0%) | 0 (0%) | 1 (0.1%) | 88 (6%) |

<30 mm is considered normal and requires no further surveillance
30-45 mm is considered a small aneurysm requiring yearly assessments
45-55 mm is considered a medium aneurysm requiring 3-monthly assessments
>55 mm is considered a large aneurysm requiring immediate surgery
(Adapted from Hartshorne et al. 2011)
However, as indicated earlier, this study did not assess live images, and about half of the observers were screening technicians who had less experience than the vascular sonographers and who used only ITI in their routine practice. These factors meant that no definite conclusion about the thresholds for surveillance and referral for treatment could be drawn from these data.
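The cross-tabulation in Table 1 - EFF33 can be summarised by counting how many of the 1440 measurements fall into the same size category under both methods, and how many are placed in a higher or lower category by OTO than by ITI. The sketch below does this using the counts from the table; the variable and function names are illustrative only, and the summary is an arithmetic restatement of the table, not an additional analysis by Hartshorne et al.

```python
CATEGORIES = ["<30 mm", "30-45 mm", "45-55 mm", ">55 mm"]

# Rows: size category using ITI; columns: size category using OTO.
# Counts are those reported in Table 1 - EFF33.
COUNTS = [
    [348, 60, 0, 0],
    [0, 262, 124, 0],
    [0, 1, 418, 138],
    [0, 0, 1, 88],
]


def summarise(counts):
    """Return (total, same category, OTO higher, OTO lower)."""
    n = len(counts)
    total = sum(sum(row) for row in counts)
    same = sum(counts[i][i] for i in range(n))
    higher = sum(counts[i][j] for i in range(n) for j in range(n) if j > i)
    lower = total - same - higher
    return total, same, higher, lower


total, same, higher, lower = summarise(COUNTS)
for label, count in [
    ("same category", same),
    ("OTO category higher than ITI", higher),
    ("OTO category lower than ITI", lower),
]:
    print(f"{label}: {count} of {total} ({100 * count / total:.1f}%)")
```

Measurements above the diagonal are those for which OTO would place the aorta in a larger size category than ITI, which is the pattern behind the observation that ITI would detect fewer aneurysms.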