Review Article

The evaluation of diagnostic tests: evidence on technical and diagnostic accuracy, impact on patient outcome and cost-effectiveness is needed
Introduction
Diagnostic tests are used for various purposes: to increase certainty about the presence or absence of a disease; to monitor clinical course; to support clinical management; or to assess prognosis for clinical follow-up and informing the patient [1].
Consequently, diagnostic tests have a potential clinical benefit by influencing management, patient outcome, and patient well-being. Tests that do not have this potential are obsolete. Moreover, tests that are not sufficiently reliable may cause harm by inducing inappropriate treatment decisions, unnecessary concern or, conversely, unjustified reassurance. The use of diagnostic tests is therefore never neutral and should be considered with proper care.
In the last decade, the scientific base for diagnostic accuracy studies has grown rapidly. Several studies have offered empirical evidence on the influence of bias and variation [2], [3]. Others have conceptualized the design and architecture of diagnostic accuracy studies [4], [5], [6], [7], [8], [9]. On the other hand, the actual practice of performing and reporting diagnostic accuracy studies remains poor [10], [11]. This has resulted in the Standards for Reporting of Diagnostic Accuracy (STARD) initiative [12], which is endorsed by most journals. In addition, evidence on methods of literature search [14], quality appraisal [15], [16], and synthesis [18] for systematic reviews of diagnostic accuracy studies is increasing [13], [17].
Health technology assessment agencies are faced with an increasing demand for the evaluation of diagnostic tests, often after the test has already been introduced in clinical practice. Assessment of diagnostic technologies differs from the evaluation of medical therapeutics in many respects. Most importantly, because diagnostic test results are intermediate outcomes, they influence, but do not directly determine, health outcomes in patients.
The foundation for diagnostic test evaluation was laid by Ledley and Lusted in 1959 [19]. Many authors subsequently adopted a hierarchy of diagnostic efficacy with six levels [20], [21], [22], which is still being used [23], [24] (Table 1). However, some levels report information that is already available from a previous level, as is the case with the likelihood ratios and the impact on diagnostic thinking. Other levels report information used as a proxy for the impact on patient outcome, as is the case with the impact on therapeutic management. In addition, the model is hierarchical, meaning that the evaluation stops when evidence on one level is found to be unsatisfactory. However, tests may still be able to positively influence patient outcome despite a lack of evidence on, for example, the impact on patient management.
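To make the accuracy measures at these levels concrete, a minimal sketch follows. It computes sensitivity, specificity, and likelihood ratios from a 2×2 contingency table, and updates a pretest probability with a likelihood ratio via Bayes' rule on the odds scale (the "impact on diagnostic thinking"). All counts and probabilities are hypothetical, not taken from the paper.

```python
# Hypothetical 2x2 table of a diagnostic test against a reference standard
# (counts are illustrative only).

def accuracy_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, and positive/negative likelihood ratios."""
    sens = tp / (tp + fn)        # probability of a positive test in the diseased
    spec = tn / (tn + fp)        # probability of a negative test in the non-diseased
    lr_pos = sens / (1 - spec)   # likelihood ratio of a positive result
    lr_neg = (1 - sens) / spec   # likelihood ratio of a negative result
    return sens, spec, lr_pos, lr_neg

def post_test_probability(pretest_prob, lr):
    """Update a pretest probability with a likelihood ratio (odds form of Bayes' rule)."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    post_odds = pretest_odds * lr
    return post_odds / (1 + post_odds)

sens, spec, lr_pos, lr_neg = accuracy_measures(tp=90, fp=30, fn=10, tn=70)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} LR+={lr_pos:.2f} LR-={lr_neg:.2f}")
print(f"post-test probability after a positive result: "
      f"{post_test_probability(0.20, lr_pos):.2f}")
```

With these counts, a positive result raises a 20% pretest probability to about 43%, illustrating why likelihood ratios and the impact on diagnostic thinking convey overlapping information: the latter is derivable from the former.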
Another rating scheme has been published by Sackett, who identified the four most relevant questions to be asked about a diagnostic test, thereby implicitly ranking evidence (Table 1). He also places a threshold on the use of a test in clinical practice, namely evidence answering at least the phase III question [8].
Other authors have stressed the importance of identifying the range of possible uses of the test [25], as this determines what test characteristics the test should have and whether existing evidence on the effect on patient outcome could be used.
Cost-effectiveness is considered only in the hierarchical model of Ledley and Lusted.
In this paper, we propose a comprehensive framework for diagnostic test evaluation. Our model builds on the models proposed in earlier papers and places the emphasis on the various types of information, including societal issues. Depending on the place of the new test in the clinical pathway, existing or new evidence on the impact on patient outcome is required.
Stepwise evaluation
We propose a stepwise evaluation, rather than a hierarchical one. Every step ought to be taken to assess the value of the diagnostic test. The results of previous steps determine the need for evidence in the following steps.
Every test evaluation should start with an assessment of the test's capabilities under laboratory conditions. Secondly, the test's place in the clinical pathway should be determined. Thirdly, evidence on the diagnostic accuracy of the test is synthesized according to its
Discussion
Health technology assessment goes beyond diagnostic accuracy. It is not sufficient to know a test's sensitivity and specificity for it to be introduced into clinical practice. The model we propose is intended to guide researchers when evaluating a diagnostic test for its usefulness in clinical practice. By evaluating various aspects of a test in a stepwise manner, its value is assessed more completely, and clinicians are informed in more detail about what to expect from the new test.
Using
Addendum—practice example
To demonstrate the stepwise evaluation presented in this paper, the value of positron emission tomography (PET) for the staging of patients with non-small cell lung cancer (NSCLC) was evaluated.
References (37)
- et al. Assessment of the accuracy of diagnostic tests: the cross-sectional study. J Clin Epidemiol (2003)
- et al. Use of methodological search filters to identify diagnostic accuracy studies can lead to the omission of relevant studies. J Clin Epidemiol (2006)
- et al. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol (2005)
- et al. Randomised comparisons of medical tests: sometimes invalid, not always efficient. Lancet (2000)
- et al. Change in differential diagnosis and patient management with the use of portable ultrasound in a remote setting. Wilderness Environ Med (2005)
- et al. Distraction from randomization in diagnostic research. Ann Epidemiol (2006)
- et al. Meta-analysis of positron emission tomographic and computed tomographic imaging in detecting mediastinal lymph node metastases in nonsmall cell lung cancer. Ann Thorac Surg (2005)
- et al. Evaluation of diagnostic procedures. BMJ (2002)
- et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA (1999)
- et al. Evidence of bias and variation in diagnostic accuracy studies. CMAJ (2006)
- Case-control and two-gate designs in diagnostic accuracy studies. Clin Chem
- Comparative accuracy: assessing new tests against existing diagnostic pathways. BMJ
- Designing studies to ensure that estimates of test accuracy will travel
- Test research versus diagnostic research. Clin Chem
- The architecture of diagnostic research. BMJ
- Quality of reporting of diagnostic accuracy studies. Radiology
- Use of methodological standards in diagnostic test research. Getting better but still not good. JAMA
- Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. BMJ