Review Article
The evaluation of diagnostic tests: evidence on technical and diagnostic accuracy, impact on patient outcome and cost-effectiveness is needed

https://doi.org/10.1016/j.jclinepi.2007.03.015

Abstract

Objective

Before introducing a test into clinical practice, its characteristics and added value should be assessed. Diagnostic accuracy studies alone are not sufficient; the test's impact on patient outcome ought to be assessed as well. To this end, we propose a stepwise evaluation of diagnostic tests.

Study Design and Setting

Theoretical–conceptual approach.

Results

First, the test's technical accuracy refers to the ability to produce usable information under standardized conditions. In a second step, the place of the new test in the clinical pathway is determined. Thirdly, the test's diagnostic accuracy is assessed, depending on its intended goal. The fourth step assesses the test's impact on the patient outcome. Depending on the place of the test in the clinical pathway, existing evidence can be used, or new evidence will be needed. At the final step, a cost-effectiveness analysis assesses the test's financial and societal consequences.

Conclusion

The evaluation of diagnostic tests should consider the test's technical accuracy, its place in the clinical pathway, its diagnostic accuracy, and its impact on patient outcome.

Introduction

Diagnostic tests are used for various purposes: to increase certainty about the presence or absence of a disease; to monitor clinical course; to support clinical management; or to assess prognosis for clinical follow-up and informing the patient [1].

Consequently, diagnostic tests have a potential clinical benefit by influencing management, patient outcome, and patient well-being. Tests that lack this potential are obsolete. Moreover, tests that are not sufficiently reliable may cause harm by inducing inappropriate treatment decisions, unnecessary concern or, conversely, unjustified reassurance. The use of diagnostic tests is therefore never neutral and should be considered with proper care.

In the last decade, the scientific base for diagnostic accuracy studies has grown rapidly. Several studies have offered empirical evidence on the influence of bias and variation [2], [3]. Others have conceptualized the design and architecture of diagnostic accuracy studies [4], [5], [6], [7], [8], [9]. On the other hand, the actual practice of performing and reporting diagnostic accuracy studies remains poor [10], [11]. This has resulted in the Standards for Reporting of Diagnostic Accuracy (STARD) initiative [12], which is endorsed by most journals. In addition, evidence on methods of literature search [14], quality appraisal [15], [16], and synthesis [18] for systematic reviews of diagnostic accuracy studies is increasing [13], [17].

Health technology assessment agencies are faced with an increasing demand for the evaluation of diagnostic tests, often after the test has already been introduced in clinical practice. Assessment of diagnostic technologies differs from the evaluation of medical therapeutics in many respects. Most importantly, because diagnostic test results are intermediate outcomes, they influence, but do not directly determine, health outcomes in patients.

The foundation for diagnostic test evaluation was laid by Ledley and Lusted in 1959 [19]. Many authors subsequently adopted a hierarchy of diagnostic efficacy with six levels [20], [21], [22], which is still being used [23], [24] (Table 1). However, some levels report information that is already available from a previous level, as is the case with the likelihood ratios and the impact on diagnostic thinking. Other levels report information used as a proxy for the impact on patient outcome, as is the case with the impact on therapeutic management. Moreover, the model is hierarchical: the evaluation is stopped when evidence at one level is found to be unsatisfactory. However, a test may still be able to influence patient outcome positively despite a lack of evidence on, for example, its impact on patient management.
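The redundancy between hierarchy levels can be made concrete: the likelihood ratios reported at a higher level are fully determined by the sensitivity and specificity already reported at the accuracy level. A minimal sketch in Python (the function name is ours, for illustration only):

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Positive and negative likelihood ratios from sensitivity and specificity.

    LR+ = sens / (1 - spec); LR- = (1 - sens) / spec.
    Both are algebraic rearrangements of the same 2x2 accuracy data,
    so they add no information beyond sensitivity and specificity.
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# Example: a test with 90% sensitivity and 80% specificity
lr_pos, lr_neg = likelihood_ratios(0.90, 0.80)
print(round(lr_pos, 2), round(lr_neg, 3))  # 4.5 0.125
```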

Another rating scheme was published by Sackett, who identified the four most relevant questions to be asked about a diagnostic test, thereby implicitly ranking evidence (Table 1). He also sets a threshold for the use of a test in clinical practice: there should be evidence answering at least the phase III question [8].

Other authors have stressed the importance of identifying the range of possible uses of the test [25], as this determines what test characteristics the test should have and whether existing evidence on the effect on patient outcome could be used.

Cost-effectiveness is considered only in the hierarchical model of Ledley and Lusted.

In this paper, we propose a comprehensive framework for the evaluation of diagnostic tests. Our model builds on the models proposed in earlier papers and places the emphasis on the various types of information required, including societal issues. Depending on the place of the new test in the clinical pathway, existing or new evidence on the impact on patient outcome is necessary.


Stepwise evaluation

We propose a stepwise evaluation, rather than a hierarchical one. Every step ought to be taken to assess the value of the diagnostic test. The results of previous steps determine the need for evidence in the following steps.

Every test evaluation should start with an assessment of the test's capabilities under laboratory conditions. Secondly, the test's place in the clinical pathway should be determined. Thirdly, evidence on the diagnostic accuracy of the test is synthesized according to its intended goal.
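As a concrete illustration of the third step, the core accuracy measures can all be derived from a single 2×2 table of test results against the reference standard. A generic sketch (the function and variable names are ours, not part of the framework):

```python
def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard accuracy measures from a 2x2 table
    (index test result vs. reference standard)."""
    return {
        "sensitivity": tp / (tp + fn),  # P(test positive | disease present)
        "specificity": tn / (tn + fp),  # P(test negative | disease absent)
        "ppv": tp / (tp + fp),          # predictive value of a positive result
        "npv": tn / (tn + fn),          # predictive value of a negative result
    }

# Example: 90 true positives, 20 false positives,
# 10 false negatives, 80 true negatives
print(diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80))
```

Note that the predictive values, unlike sensitivity and specificity, depend on disease prevalence in the studied population, which is one reason accuracy estimates may not travel between settings.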

Discussion

Health technology assessment goes beyond diagnostic accuracy. It is not sufficient to know a test's sensitivity and specificity for it to be introduced into clinical practice. The model we propose is intended to guide researchers in evaluating a diagnostic test's usefulness in clinical practice. By evaluating various aspects of a test in a stepwise manner, its value is assessed more completely, and clinicians are informed in more detail about what to expect from the new test.


Addendum—practice example

To demonstrate the stepwise evaluation presented in this paper, the value of positron emission tomography (PET) for the staging of patients with non-small cell lung cancer (NSCLC) was evaluated.

References (37)

  • A.W. Rutjes et al., Case-control and two-gate designs in diagnostic accuracy studies, Clin Chem (2005)
  • P.M. Bossuyt et al., Comparative accuracy: assessing new tests against existing diagnostic pathways, BMJ (2006)
  • L.M. Irwig et al., Designing studies to ensure that estimates of test accuracy will travel
  • K.G. Moons et al., Test research versus diagnostic research, Clin Chem (2004)
  • D.L. Sackett et al., The architecture of diagnostic research, BMJ (2002)
  • N. Smidt et al., Quality of reporting of diagnostic accuracy studies, Radiology (2005)
  • M.C. Reid et al., Use of methodological standards in diagnostic test research. Getting better but still not good, JAMA (1995)
  • P.M. Bossuyt et al., Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative, BMJ (2003)