In the absence of distant metastasis, treatment options for non-small cell lung cancer depend on how much the disease has spread to the different lymph nodes within the chest, that is, the stage of the disease. If the cancer has not spread beyond the nearest (N1) lymph nodes, surgery is often the treatment of choice. Other treatment options for these patients include treatment with either radiotherapy, chemotherapy, or both. Planning the optimal treatment is therefore critically dependent on accurate staging of the disease. PET-CT scanning is a non-invasive method of establishing the spread of NSCLC within the chest and elsewhere in the body, which is increasingly available and used by lung cancer multi-disciplinary teams. Although the non-invasive nature of PET-CT constitutes one of the major advantages of the test, PET-CT may be suboptimal in detecting malignancy in normal-sized lymph nodes and in ruling out malignancy in patients with coexisting inflammatory or infectious diseases. We examined the accuracy of PET-CT scanning in establishing the spread of cancer in patients with suspected or confirmed NSCLC that is potentially suitable for surgical treatment with curative intent.
We included 45 studies, and based on the criteria for a positive PET-CT scan, we performed two main analyses. In the 18 studies (2823 participants) in the Activity > background group, PET-CT was found to accurately identify 77.4% (95% CI 65.3 to 86.1) of the participants with NSCLC spread beyond the N1 nodes and 90.1% (95% CI 85.3 to 93.5) of the participants without spread beyond the N1 nodes. In the 12 studies (1656 participants) in the SUVmax of ≥ 2.5 group, PET-CT accurately identified 81.3% (95% CI 70.2 to 88.9) of the participants with spread beyond the N1 nodes and 79.4% (95% CI 70 to 86.5) of the participants without spread beyond the N1 nodes. However, the results varied a lot between the studies in each analysis, and the quality and size of the studies themselves, country of study origin, percentage of participants with adenocarcinoma, FDG dose, and type of PET-CT scanner influenced the results. We believe that the results of this review show that the accuracy of PET-CT is insufficient to allow management based on PET-CT alone.
This review has shown that the accuracy of PET-CT is insufficient to allow management based on PET-CT alone. The findings therefore support National Institute for Health and Care (formerly 'Clinical') Excellence (NICE) guidance on this topic, where PET-CT is used to guide clinicians in the next step: either a biopsy or, where the scan is negative and the nodes are small, proceeding directly to surgery. The apparent difference between the two main makes of PET-CT scanner is important and may influence the treatment decision in some circumstances. The differences in PET-CT accuracy estimates between scanner makes, NSCLC subtypes, FDG dose, and country of study origin, along with the general variability of results, suggest that all large centres should actively monitor their accuracy. This is so that they can make reliable decisions based on their own results and identify the populations in which PET-CT is of most use or potentially little value.
A major determinant of treatment offered to patients with non-small cell lung cancer (NSCLC) is their intrathoracic (mediastinal) nodal status. If the disease has not spread to the ipsilateral mediastinal nodes, subcarinal (N2) nodes, or both, and the patient is otherwise considered fit for surgery, resection is often the treatment of choice. Planning the optimal treatment is therefore critically dependent on accurate staging of the disease. PET-CT (positron emission tomography–computed tomography) is a non-invasive staging method of the mediastinum, which is increasingly available and used by lung cancer multidisciplinary teams. Although the non-invasive nature of PET-CT constitutes one of its major advantages, PET-CT may be suboptimal in detecting malignancy in normal-sized lymph nodes and in ruling out malignancy in patients with coexisting inflammatory or infectious diseases.
To determine the diagnostic accuracy of integrated PET-CT for mediastinal staging of patients with suspected or confirmed NSCLC that is potentially suitable for treatment with curative intent.
We searched the following databases up to 30 April 2013: The Cochrane Library, MEDLINE via OvidSP (from 1946), Embase via OvidSP (from 1974), PreMEDLINE via OvidSP, OpenGrey, ProQuest Dissertations & Theses, and the trials register www.clinicaltrials.gov. There were no language or publication status restrictions on the search. We also contacted researchers in the field, checked reference lists, and conducted citation searches (with an end-date of 9 July 2013) of relevant studies.
Prospective or retrospective cross-sectional studies that assessed the diagnostic accuracy of integrated PET-CT for diagnosing N2 disease in patients with suspected resectable NSCLC. The studies must have used pathology as the reference standard and reported participants as the unit of analysis.
Two authors independently extracted data pertaining to the study characteristics and the number of true and false positives and true and false negatives for the index test, and they independently assessed the quality of the included studies using QUADAS-2. We calculated sensitivity and specificity with 95% confidence intervals (CI) for each study and performed two main analyses based on the criteria for test positivity employed: Activity > background or SUVmax ≥ 2.5 (SUVmax = maximum standardised uptake value), where we fitted a summary receiver operating characteristic (ROC) curve using a hierarchical summary ROC (HSROC) model for each subset of studies. We identified the average operating point on the SROC curve and computed the average sensitivities and specificities. We checked for heterogeneity and examined the robustness of the meta-analyses through sensitivity analyses.
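As an illustration of the per-study calculation described above, the following is a minimal Python sketch that derives sensitivity and specificity with 95% confidence intervals from one study's 2×2 counts. The review does not state which interval method was used; the Wilson score interval here is an assumption, as are the function names and the example counts, which are hypothetical and not taken from any included study. The hierarchical summary ROC model is not reproduced.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def study_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity with 95% CIs for one study.

    tp/fn: participants with N2/N3 disease on pathology (test positive/negative)
    tn/fp: participants without N2/N3 disease (test negative/positive)
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, *wilson_ci(tp, tp + fn)),
        "specificity": (spec, *wilson_ci(tn, tn + fp)),
    }


# Hypothetical counts for a single study, for illustration only.
result = study_accuracy(tp=40, fp=10, fn=10, tn=90)
for name, (estimate, low, high) in result.items():
    print(f"{name}: {estimate:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Participants are the unit of analysis here, in line with the selection criteria; a per-node analysis would need a different table.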
We included 45 studies, and based on the criteria for PET-CT positivity, we categorised the included studies into three groups: Activity > background (18 studies, N = 2823, prevalence of N2 and N3 nodes = 679/2823), SUVmax ≥ 2.5 (12 studies, N = 1656, prevalence of N2 and N3 nodes = 465/1656), and Other/mixed (15 studies, N = 1616, prevalence of N2 and N3 nodes = 400/1616). None of the studies reported any adverse events. Under-reporting generally hampered the quality assessment of the studies, and in 30/45 studies, the applicability of the study populations was of high or unclear concern.
The summary sensitivity and specificity estimates for the Activity > background PET-CT positivity criterion were 77.4% (95% CI 65.3 to 86.1) and 90.1% (95% CI 85.3 to 93.5), respectively, but the accuracy estimates of these studies in ROC space showed a wide prediction region. This indicated high between-study heterogeneity and a relatively large 95% confidence region around the summary value of sensitivity and specificity, denoting a lack of precision. Sensitivity analyses suggested that the overall estimate of sensitivity was especially susceptible to selection bias, reference standard bias, and the lack of a clear definition of test positivity, and to a lesser extent to index test bias and commercial funding bias, with lower combined estimates of sensitivity observed for all the low 'Risk of bias' studies compared with the full analysis.
The summary sensitivity and specificity estimates for the SUVmax ≥ 2.5 PET-CT positivity criterion were 81.3% (95% CI 70.2 to 88.9) and 79.4% (95% CI 70 to 86.5), respectively. In this group, the accuracy estimates of these studies in ROC space also showed a very wide prediction region. This indicated very high between-study heterogeneity, and there was a relatively large 95% confidence region around the summary value of sensitivity and specificity, denoting a clear lack of precision. Sensitivity analyses suggested that both overall accuracy estimates were marginally sensitive to flow and timing bias and commercial funding bias, both of which led to slightly lower estimates of sensitivity and specificity.
Heterogeneity analyses showed that the accuracy estimates were significantly influenced by country of study origin, percentage of participants with adenocarcinoma, (¹⁸F)-2-fluoro-deoxy-D-glucose (FDG) dose, type of PET-CT scanner, and study size, but not by study design, consecutive recruitment, attenuation correction, year of publication, or tuberculosis incidence rate per 100,000 population.