Measurement of accuracy

The accuracy of screening is measured by the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of a ‘positive test’ with respect to a ‘positive outcome’ (Figure 5.5). The following factors must be considered when assessing measurements of accuracy.

  • The positive and non-positive test must be clearly defined for any given circumstance, as must the positive and non-positive outcome.
  • In population-based screening, a positive test and a positive outcome are rare events: specificity and NPV will therefore be high, and sensitivity and PPV may be the more relevant measures (see first example in Figure 5.5 below).
  • PPV and sensitivity may be compromised by falling prevalence of disease. [Link to Chapter 7 – Uses of HPV tests and the effect of vaccination]
  • The nature of ‘negative’ tests and outcomes must be taken into consideration: in the second example below they may represent genuine HPV+ low-grade abnormalities investigated at colposcopy.
  • A test of high sensitivity and low specificity (such as high-risk HPV positivity or ASC-US+ cytology) may be compensated for by a second test of higher specificity (such as colposcopic biopsy).
  • There is a trade-off between sensitivity, which depends on the false negative rate, and specificity, which depends on the false positive rate.
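
The four measures listed above can be sketched from a 2 x 2 table of test result against outcome. The example below is a minimal illustration using the standard definitions; the class name `TwoByTwo` and the counts are hypothetical and are not taken from Figure 5.5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating test result against outcome."""
    tp: int  # positive test, positive outcome (true positive)
    fp: int  # positive test, non-positive outcome (false positive)
    fn: int  # non-positive test, positive outcome (false negative)
    tn: int  # non-positive test, non-positive outcome (true negative)

    def sensitivity(self) -> float:
        # Proportion of positive outcomes that had a positive test.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-positive outcomes that had a non-positive test.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of positive tests that had a positive outcome.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of non-positive tests that had a non-positive outcome.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only: 10,000 women, 100 with the outcome.
table = TwoByTwo(tp=90, fp=950, fn=10, tn=8950)

print(f"sensitivity = {table.sensitivity():.3f}")
print(f"specificity = {table.specificity():.3f}")
print(f"PPV         = {table.ppv():.3f}")
print(f"NPV         = {table.npv():.3f}")
```

With these assumed counts the outcome is rare (1%), so NPV and specificity are both high while PPV is low even though sensitivity is 90%, which is the pattern described in the second bullet point.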

Accuracy may be calculated for many circumstances, such as accuracy of cytology in women referred for colposcopy (see second example in Figure 5.5 below), accuracy of screening with respect to final report [Link to Chapter 14 - Quality control] and accuracy of conventional versus liquid-based cytology or HPV tests.

Surrogate estimates of sensitivity

In view of the difficulty in measuring sensitivity in the absence of knowledge of the true prevalence of CIN2+ in the population, surrogate estimates may be used, such as comparative reporting rates for HSIL and comparative detection rates for CIN2+ or rates of negative tests preceding CIN2+ [Links to Chapter 8 - Collecting cellular samples from the cervix; Chapter 14 - Quality control].

Short guides to measuring accuracy of screening tests

The concise review by Lalkhen & McCluskey (2008) is recommended as a short guide to measurement of accuracy of screening tests. Chapter 3, Statistics, in RM DeMay, The Art and Science of Cytopathology (exfoliative cytology), is also strongly recommended.

Figure 5.5 (a-c). Theoretical examples of accuracy of tests with the same FN and FP rates in settings with low and high prevalence of CIN2+

(a) Formulas for measuring accuracy
(b) Accuracy with a low prevalence outcome
(c) Accuracy with a high prevalence outcome
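
The effect of prevalence described in the caption can be reproduced numerically. The sketch below holds the false negative (FN) and false positive (FP) rates fixed and varies only the prevalence; it assumes the FN rate is FN/(TP+FN) and the FP rate is FP/(FP+TN). The function name `predictive_values`, the rates and the prevalences are illustrative assumptions, not the values used in Figure 5.5.

```python
def predictive_values(prevalence: float, fn_rate: float, fp_rate: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence.

    Assumes fn_rate = FN / (TP + FN) and fp_rate = FP / (FP + TN),
    i.e. 1 - sensitivity and 1 - specificity respectively.
    """
    # Expected proportion of the tested population in each cell of the 2 x 2 table.
    tp = prevalence * (1.0 - fn_rate)
    fn = prevalence * fn_rate
    fp = (1.0 - prevalence) * fp_rate
    tn = (1.0 - prevalence) * (1.0 - fp_rate)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical settings: the same test (10% FN rate, 10% FP rate) used in a
# low-prevalence screening population and a high-prevalence referred population.
settings = {"low prevalence (1%)": 0.01, "high prevalence (50%)": 0.50}

for label, prevalence in settings.items():
    ppv, npv = predictive_values(prevalence, fn_rate=0.10, fp_rate=0.10)
    print(f"{label}: PPV={ppv:.3f}, NPV={npv:.3f}")
# low prevalence (1%): PPV=0.083, NPV=0.999
# high prevalence (50%): PPV=0.900, NPV=0.900
```

Under these assumptions the same test gives a PPV of about 8% at 1% prevalence and 90% at 50% prevalence, while NPV moves in the opposite direction.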

 

Learning points from Chapter 5

  1. The principles of Wilson and Jungner (1968) for a screening programme should be examined critically with respect to the programme's success in preventing ‘an important health problem’ as well as its ‘snags’.
  2. Improved knowledge about ‘The natural history of the condition’ has raised problems (management of reversible vs. progressive disease) and provided alternative solutions (vaccination and HPV testing).
  3. In well-screened populations, high-risk HPV and low-grade and high-grade lesions tend to be detected most frequently in women under 30 years of age.
  4. Distinction between HSIL and LSIL is the key to accurate screening.
  5. ASC-US and LSIL are consistently reported more frequently than HSIL.
  6. Excision of high-grade CIN carries a risk of premature rupture of membranes in pregnancy: the risk is related to the depth of biopsy.
  7. Rigorous quality control is essential to avoid over-diagnosis as well as under-diagnosis.
  8. Measurement of accuracy combines sensitivity, specificity, positive predictive value and negative predictive value.
  9. There is a trade-off between sensitivity (reflecting false negatives) and specificity (reflecting false positives).
  10. In a population-based screening test of a rare condition, NPV and specificity may be less relevant than sensitivity and PPV as measures of accuracy.
  11. Declining prevalence of disease may compromise PPV and sensitivity.

 
