Comparison of lung cancer screening trials, showing rates of identification of early stage disease and effects on mortality.
In the landmark American National Lung Screening Trial (NLST), low-dose CT (LDCT) screening produced a relative mortality reduction of 20%. These results have not been replicated in any of the European studies, although these are of limited statistical power. Besides doubt about the general applicability of the NLST findings, if LDCT screening is to be successfully implemented, a number of developments are still required, including better characterisation of entry criteria and refinement of screening and nodule management protocols. The high incidence of false-positive findings increases costs and morbidity. Even when histologically malignant tumours are identified, frequently these would not have manifested as disease, i.e. they are “overdiagnosed”. These patients are liable to receive unnecessary treatment. LDCT screening is relatively expensive in comparison with other cancer screening modalities. Whilst cost-effectiveness can be improved by integration with smoking cessation programmes, how this would be done in practice remains unclear. Furthermore, individuals at high risk of lung cancer are virtually by definition risk prone, raising concerns about how attractive participation in a screening programme would be, especially given the very small reported absolute risk reduction in the NLST.
- lung cancer
- low-dose computed tomography
- early diagnosis
Lung cancer is the commonest cause of cancer death in both men and women across the developed world, due to a combination of its high incidence and relatively short average survival after diagnosis. Most lung cancer patients present symptomatically and most already have incurable disease at presentation. These considerations have resulted in attempts to improve outcomes through screening.
Screening, though, is a challenging strategy for harm reduction. Screening programmes have been introduced for many cancers, often as a result of political pressures rather than sound evidence. Indeed, the harms of many screening programmes have only become evident well after widespread implementation, and their utility has often become more rather than less controversial with time. Self-evidently, for screening to be effective, earlier disease identification needs to lead to improved treatment outcomes. This calls into question what we know about the natural history of early stage, asymptomatic tumours, which turns out to be surprisingly little. It is now becoming increasingly clear, though, that whilst some tumours will progress and ultimately cause premature death, others may never cause harm. This leads to a bias known as overdiagnosis. The other way in which overdiagnosis can occur is when a competing cause of death prevents clinical manifestations of a tumour that would otherwise have proved lethal. Screening is particularly liable to overdiagnosis, because more aggressive tumours have short volume-doubling times and progress rapidly. Thus, the interval between the onset of a radiological abnormality and the emergence of symptoms is relatively short, and the opportunity for presymptomatic detection in a screening programme is small. Conversely, tumours that grow slowly will have a long phase when they exist without symptoms and are particularly liable to be identified through screening.
There is now convincing evidence that overdiagnosis does occur as an inherent harm in most, and probably all, cancer screening programmes, including mammography. It is always harmful because it is unknown which of the tumours identified are the ones overdiagnosed, meaning that some patients will have treatment for a disease that would never have materialised. The resulting costs include financial (unnecessary investigations/treatments), physical (side effects and complications of treatments) and psychological and are all serious, irrespective of any benefits derived by those with “real” disease.
The other important bias of screening programmes is lead-time bias, which occurs when a disease is diagnosed earlier than it would have been without screening. Even if the natural history of the case is not improved, the patient appears to survive longer than otherwise they would have but dies at the same date. Because of this it is vital that for proper evaluation of a screening programme the endpoint taken should be mortality difference in comparison with a control group.
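Lead-time bias can be made concrete with a small worked example. The sketch below is illustrative only: the ages are hypothetical values chosen for the arithmetic, not data from any trial, and the function name is my own.

```python
# Illustrative sketch of lead-time bias.
# All ages are hypothetical and are not taken from any screening trial.

def survival_from_diagnosis(age_at_diagnosis: float, age_at_death: float) -> float:
    """Years survived after diagnosis."""
    return age_at_death - age_at_diagnosis

AGE_AT_DEATH = 70.0           # assumed unchanged by screening in this scenario
SYMPTOMATIC_DIAGNOSIS = 68.0  # diagnosis when symptoms appear (hypothetical)
SCREEN_DIAGNOSIS = 65.0       # the same tumour found earlier by screening (hypothetical)

unscreened = survival_from_diagnosis(SYMPTOMATIC_DIAGNOSIS, AGE_AT_DEATH)
screened = survival_from_diagnosis(SCREEN_DIAGNOSIS, AGE_AT_DEATH)
lead_time = SYMPTOMATIC_DIAGNOSIS - SCREEN_DIAGNOSIS

print(f"Survival after diagnosis, unscreened: {unscreened:.0f} years")
print(f"Survival after diagnosis, screened:   {screened:.0f} years")
print(f"Apparent gain (all lead time):        {lead_time:.0f} years")
print(f"Age at death in both cases:           {AGE_AT_DEATH:.0f}")
```

The apparent survival gain equals the lead time exactly, whilst the age at death is identical in both cases, which is why survival from diagnosis is an unsuitable endpoint for a screening trial.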
2. Lung cancer screening trials
Several large studies were conducted in the 1960s and 1970s using plain chest radiography, with or without sputum cytology [4–10]. These studies all had methodological weaknesses, including limited power and, rather remarkably, in several even the control group had three-year radiographs. No study provided evidence that mortality was reduced. Recently, the PLCO study has reported and has finally and convincingly shown an absence of any mortality benefit of plain chest radiography screening, compared to no screening.
The advent of low-dose CT (LDCT) scanning provided a new modality applicable to lung cancer screening. Initial studies indicated that LDCT was able to identify early lung cancers with a high rate of resectability. The landmark National Lung Screening Trial (NLST) was a large and adequately powered, randomised, controlled trial of screening with LDCT against plain chest radiography, undertaken in the USA. The headline result was a reduction in lung cancer mortality, by the apparently impressive figure of 20%. Does this mean that CT screening should immediately be implemented for lung cancer, or that lung cancer mortality could be reduced by anything like this amount? I believe that the answer is no to both.
2.1. The National Lung Screening Trial (NLST)
Eligible participants were identified as current or former smokers between the ages of 55 and 74 years with no personal history of malignancy and at least a 30-pack-year smoking history. Former smokers had to have quit less than 15 years prior to participating in the study. Volunteers who had previously received a diagnosis of lung cancer or who had undergone a chest CT scan within 18 months before enrolment, and those who had haemoptysis or unexplained weight loss of more than 6.8 kg in the preceding year, were excluded. An impressive total of 53,454 volunteers were enrolled and randomly assigned to receive three annual screening examinations by either LDCT (26,722) or chest radiography (26,732). They were then followed for an additional 5 years after the screening examinations were completed. No further screening was undertaken in subjects diagnosed with lung cancer. In both groups, the screening test was deemed positive if it showed a non-calcified pulmonary nodule 4 mm or larger in diameter or if there were other findings suspicious for cancer. When comparing the two groups after the completion of the three annual screenings, 24% of participants screened with LDCT had positive screening results compared with 6.9% of individuals who received chest radiography, confirming the superiority of LDCT in identifying nodules over plain chest radiography. More than 75% of the participants with a positive screening result underwent further diagnostic evaluation, either with additional imaging or with invasive/surgical procedures. In this report, a “false positive” was defined as any case requiring further evaluation, including just repeat CT imaging. Using this definition, more than 90% of the positive findings were false positives, both in the controls and in the LDCT group.
Of the 26,722 volunteers screened with LDCT, 1060 participants were found to have lung cancer in comparison to 941 of the group screened with chest radiography. Overall, when comparing the two groups, the detection rate of lung cancer was 13% greater in the group screened with LDCT than with chest radiography. More early stage (IA and IB) cancers were diagnosed with CT. Most importantly in the evaluation of a screening study, there was evidence of reduced mortality: a total of 356 lung cancer deaths occurred in the LDCT vs. 443 deaths in the plain radiography group, over a median of 6.5 years of follow-up.
2.2. European studies
An important issue is to what extent the findings of the NLST might hold for a non-US population. There are important differences in epidemiology in Europe, including a difference in distribution of histological subtypes and a much lower frequency of non-calcified pulmonary nodules arising from fungal pathogens. In the UK, squamous cancers represent about 40% of cancers and adenocarcinomas 18%, whilst in the USA, squamous cancers only represent 27% with adenocarcinoma being the most prevalent type at 31%. As squamous cancers tend to arise in proximal airways, they are less amenable to identification as a lung nodule on CT, unlike adenocarcinomas, which more often present as intrapulmonary nodules.
The DANTE and DLCST studies each compared five annual rounds of LDCT screening to usual care and were both considerably smaller than the NLST. The DANTE study randomised 2811 and the DLCST randomised 4104 healthy men and women who were heavy current or former smokers to LDCT screening or no screening. After medians of 34 and 58 months of follow-up, respectively, not even a trend towards reduced mortality was found: (DANTE, relative risk [RR], 0.97; 95% CI, 0.71–1.32;
A comparison of the larger European LDCT screening trials and the NLST with the trials of plain chest X-ray screening [4, 12] is shown in Table 1, particularly in respect of their effects on mortality. It should be emphasised that the only trial producing a statistically significant result was the NLST.
| Trial | Modality | Recruits | No. of lung cancers detected (%) | No. at stage 1 (%) | No. of deaths from lung cancer (%) | Relative mortality reduction from screening | Absolute mortality reduction from screening |
|---|---|---|---|---|---|---|---|
| NLST (14) | Chest X-ray | 26,035 | 941 (3.61) | 131 (0.5) | 443 (1.70) | +20% | +0.33% |
| | LDCT | 26,309 | 1060 (4.03) | 400 (1.52) | 356 (1.35) | | |
| DANTE (17) | Clinical | 1186 | 73 (6.07) | 16 (1.35) | 55 (4.64) | −0.65% | −0.03% |
| | LDCT | 1264 | 104 (8.23) | 47 (3.72) | 59 (4.67) | | |
| DLCST (16) | Usual | 2052 | 24 (1.17) | 5 (0.24) | 11 (0.54) | −35% | −0.19% |
| | LDCT | 2052 | 70 (3.41) | 47 (2.29) | 15 (0.73) | | |
| PLCO (12) | Usual | 77,456 | 1620 (2.09) | 462 (0.6) | 1230 (1.59) | +1.26% | +0.02% |
| | Chest X-ray | 77,445 | 1696 (2.19) | 374 (0.48) | 1213 (1.57) | | |
| MLP (4) | Usual | 4593 | 160 (3.48) | 31 (0.67) | 115 (2.50) | −6% | −0.14% |
| | Chest X-ray | 4618 | 206 (4.46) | 68 (1.47) | 122 (2.64) | | |

Negative figures denote mortality increase.
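The relative and absolute mortality reductions in Table 1 follow directly from the recruit and death counts in each arm. The sketch below shows one way of reproducing them, assuming that both are simple differences in crude lung cancer mortality between arms; the class and function names are my own. Because it works from raw counts rather than rounded percentages, its output may differ slightly from the tabulated figures.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """One arm of a trial, using the counts given in Table 1."""
    recruits: int
    lung_cancer_deaths: int

    @property
    def mortality(self) -> float:
        """Crude lung cancer mortality as a proportion of recruits."""
        return self.lung_cancer_deaths / self.recruits


def reductions(control: Arm, screened: Arm) -> tuple[float, float]:
    """Return (relative, absolute) mortality reduction.

    Positive values mean fewer deaths with screening;
    negative values mean a mortality increase.
    """
    absolute = control.mortality - screened.mortality
    relative = absolute / control.mortality
    return relative, absolute


# (control arm, screened arm) for each trial, counts taken from Table 1
TRIALS = {
    "NLST": (Arm(26_035, 443), Arm(26_309, 356)),
    "DANTE": (Arm(1_186, 55), Arm(1_264, 59)),
    "DLCST": (Arm(2_052, 11), Arm(2_052, 15)),
    "PLCO": (Arm(77_456, 1_230), Arm(77_445, 1_213)),
    "MLP": (Arm(4_593, 115), Arm(4_618, 122)),
}

for name, (control, screened) in TRIALS.items():
    relative, absolute = reductions(control, screened)
    print(f"{name}: relative {relative:+.1%}, absolute {absolute:+.2%}")
```

Laying the two measures side by side makes the central point plain: a relative reduction of around a fifth in the NLST corresponds to an absolute difference of only a fraction of a percentage point.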
2.3. Downsides of LDCT screening
Any benefits of CT screening need to be weighed against the harms. Besides the relatively small risk of cancers caused directly by the radiation exposure from the CT scans, CT screening suffers from all of the general limitations of screening. These include:
- The costs, both psychological and financial, arising particularly through the high rate of false-positive diagnoses or indeterminate findings
- Uncertainty regarding how such a programme would work in practice, including its potential to reach/capture a reasonable proportion of incident cases
In the NLST, the substantial excess of cancers diagnosed in the CT group (1060 vs. 941) implies that overdiagnosis did occur. Patz and colleagues’ analysis of the NLST study suggested that up to 18% of the cancers identified in the NLST may have been indolent and likely to have been overdiagnosed and indicated that for each cancer death avoided, 1.38 cases may have been overdiagnosed. Moreover, the figures for overdiagnosis may have been even worse if the control arm had received no chest radiographs, which can also be assumed to have resulted in some overdiagnosis. The risk of overdiagnosis, as might be expected, depends on histological subtype and is most striking in patients with a diagnosis of bronchoalveolar cell carcinoma (now called minimally invasive adenocarcinoma), in whom the risk of overdiagnosis was estimated to be 85% after 7 years of follow-up or 49% with lifetime follow-up. These data also raise the question as to the necessity and type of therapy required if a diagnosis of minimally invasive adenocarcinoma is established.
Because the major risk factor for lung cancer is the smoking of tobacco, in order to qualify for entrance into a screening programme, individuals need to have smoked significantly, and many will be current cigarette smokers. Consequently, the target population of a lung cancer screening programme may be expected to have a relative disregard for its own health and a tendency to accept risk, potentially predisposing to poor acceptability of, and adherence to, screening. This is likely to be much more evident in real life in comparison to a highly motivated volunteer population. It has been shown that smokers in the USA are significantly more likely than never smokers to be male, non-white and less educated; to report poor health status; and to be less likely to be able to identify a usual source of healthcare. This study also indicated that current smokers were less likely than never smokers to believe that early detection would result in a good chance of survival and expressed relative reluctance to consider computed tomography screening for lung cancer. Interestingly, only half of these smokers stated that they would opt for surgery for a cancer diagnosed as a result of screening, further calling into question the value of early diagnosis in this group.
Importantly, even if all subjects meeting the NLST criteria were to accept and adhere to screening, it has been estimated that only 27% of incident lung cancer patients would be included. This implies that the potential to limit lung cancer mortality could only at the very best be 20% of 27% of cases, i.e. a 5.4% mortality reduction overall.
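The arithmetic behind this ceiling is simply the NLST relative reduction applied to the share of incident cases that would meet the entry criteria. A minimal sketch, assuming perfect uptake and adherence and using variable names of my own:

```python
# Best-case population-level effect, assuming every eligible person is screened.
relative_mortality_reduction = 0.20  # NLST headline relative reduction
share_of_cases_eligible = 0.27       # incident cases meeting the NLST criteria

population_level_reduction = relative_mortality_reduction * share_of_cases_eligible
print(f"Best-case overall mortality reduction: {population_level_reduction:.1%}")
```

Any shortfall in uptake or adherence would scale this figure down further.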
It can also be argued that a screening programme represents a collusion with self-harming behaviour, particularly in relation to current smokers, and throws into focus the interrelationship between smoking cessation and lung cancer screening, particularly as CT screening is only designed to mitigate one of the many potential harms of smoking. This is all the more relevant because recruitment into a lung cancer screening programme does not appear to increase the likelihood of smoking cessation [23–25] and may even reduce it. Offering smoking cessation, which is one of the most cost-effective of all health interventions, within a screening programme has been shown to improve the cost-effectiveness of the screening by 20–45% [26, 27]. Is this “cheating”? It is only when expensive LDCT screening is combined with highly cost-effective smoking cessation that cost-utility ratios become comparable with those of other accepted cancer screening programmes!
No long-term psychological harm was found in the NELSON trial. In those with negative results, anxiety and distress fell from baseline; following an abnormal result, anxiety and distress were transient and tended to have returned to baseline by the next screening round. However, the harms (psychological, physical and financial) suffered by those with overdiagnosed tumours have not been quantified and are likely to be substantial.
In the NLST, a major complication occurred in almost five of every 10,000 persons screened due to investigation of a benign lung nodule. Whilst this is a low proportion, these were “normal” subjects in whom any harm is seriously to be regretted.
Based on the NLST’s own size cut-offs, the average nodule detection rate per round of screening was very high at 20%. In most LDCT screening studies, more than 90% of nodules prove to be benign. Whilst there is a tendency towards lower nodule detection rates in repeat screening rounds, this appears only to be due to the discounting of nodules that had been present in the prior round, so screening, if embarked upon, should probably continue. Furthermore, from the evidence of the fourth round of screening in the NELSON study, lengthening the duration of intervals between rounds beyond 2 years does not appear to be an effective strategy.
In most screening study protocols, a detected nodule triggers further imaging, but the approaches have been inconsistent between studies, and it has been suggested that follow-up imaging rates may have been underestimated. The frequency of further CT imaging among screened individuals has ranged from 1% in the study by Veronesi et al. to 44.6% in the study by Sobue et al. The frequency of further positron emission tomography (PET) imaging among screened individuals has exhibited less variation, ranging from 2.5% in the study by Bastarrika et al. to 5.5% in the NLST.
The frequency of invasive evaluation of detected nodules, although generally low, has shown marked variation in reported studies. In the NLST, in the patients not found to have lung cancer, 1.2% underwent an invasive procedure such as needle biopsy or bronchoscopy, and 0.7% had a thoracoscopy, mediastinoscopy or thoracotomy. In the NELSON study, these numbers were very similar at 1.2% and 0.6%, respectively.
A workshop undertaken by the International Association for the Study of Lung Cancer (IASLC) identified a number of areas where improvements are needed in relation to future implementation of LDCT screening, indicating that, whilst LDCT screening may have potential value, the science around the process remains preliminary. The specific areas requiring clarification were identified as optimisation of the identification of high-risk individuals, development of radiological guidelines, development of guidelines for the clinical workup of indeterminate nodules, development of guidelines for pathology reporting, definition of criteria for surgical and therapeutic interventions of suspicious nodules identified through lung cancer CT screening programmes and development of recommendations for the integration of smoking cessation practices into future national lung cancer CT screening programmes.
The incremental cost-effectiveness ratio (ICER) is defined as ICER = (C1 − C2)/(E1 − E2), where C1 and E1 are the cost and effect in the intervention group and C2 and E2 are the cost and effect in the control group. Costs are usually described in monetary units, whilst benefits/effect in health status is measured in terms of quality-adjusted life years (QALYs).
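The definition translates directly into a few lines of code. The sketch below is a minimal illustration; the function name and the per-person costs and QALYs are hypothetical values chosen to make the arithmetic easy to follow, not figures from any trial.

```python
def icer(cost_intervention: float, cost_control: float,
         effect_intervention: float, effect_control: float) -> float:
    """Incremental cost-effectiveness ratio: (C1 - C2) / (E1 - E2)."""
    delta_effect = effect_intervention - effect_control
    if delta_effect == 0:
        # With no difference in effect the ratio has no meaning.
        raise ValueError("ICER is undefined when both arms have the same effect")
    return (cost_intervention - cost_control) / delta_effect


# Hypothetical per-person figures: costs in dollars, effects in QALYs.
example = icer(cost_intervention=3_500.0, cost_control=1_000.0,
               effect_intervention=10.5, effect_control=10.25)
print(f"ICER: ${example:,.0f} per QALY gained")
```

The explicit check on the denominator reflects a practical difficulty: when a trial shows no difference in effect between arms, the ratio is undefined, and when the difference is very small, the ratio becomes highly unstable.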
Using such methodology, lung cancer LDCT screening was found to be considerably more expensive than other US screening programmes with an ICER of between $126,000 and $169,000/LY. In comparison, colorectal cancer screening has an ICER of $13,000 to $32,000/LY. When a basic smoking cessation intervention is included, which, as outlined above, subsidises an expensive intervention with a cheap one, but which also adds to the total costs, the ICER falls to $23,185 per QALY gained, falling further to $16,198 per QALY gained with a more intensive regimen. Given the huge benefits of smoking cessation for a wide range of diseases, the case for offering smoking cessation to all anyway is strong. Setting up an LDCT screening programme first and adding on smoking cessation to that seems to be putting the cart before the horse: if employed at all for smokers, it should be an add-on to a smoking cessation intervention.
In the UKLS, the cost-effectiveness analyses used data from life tables and modelled data on quality-adjusted life years (QALYs) from the NLST, the validity of which is unclear. Given that no effect on mortality could be shown, the reliability of the cost-utility analysis is highly questionable, although the cost was estimated at only £8466 (approximately $11,000 at today’s exchange rate) per QALY gained. This figure is substantially less than that quoted in the NLST and is well within the threshold deemed acceptable by the National Institute for Health and Care Excellence.
2.5. Applicability of trial findings to routine clinical practice
Most of the NLST sites were designated National Cancer Institute centres, and more than 80% were large, multidisciplinary academic centres with more than 400 beds. It seems unlikely that the results obtained by less specialised centres will be directly comparable. Furthermore, screening trials are likely to attract, if not healthy volunteers, at least a group who may be more likely than the average to adhere to the screening protocol. Indeed, adherence to the three screening rounds in the NLST was 90%, which is highly unlikely to be achievable in routine practice.
Great difficulty was experienced in recruiting participants to UKLS, given that a system comparable to that which may be used in a real-life screening programme was employed: of the 247,354 questionnaires sent out, response rates to the initial questionnaire were low, with an initial positive response rate of only 15% in current smokers and about 40% in never or former smokers. Even then, a high attrition rate occurred with potential participants being lost at every stage of the recruitment process. Finally, only 4061 subjects (46.5% of all high-risk positive responders) consented and were recruited into the trial.
Another positive from the CT screening studies is that they have provided evidence underpinning the rational approach to the investigation of solitary pulmonary nodules, including the very helpful algorithm developed by the BTS. This includes appreciation that minimally invasive adenocarcinomas may be benign in behaviour and may allow a less aggressive approach to management in comorbid or frail patients.
The American NLST, in finding a 20% relative reduction in mortality from screening with LDCT, in comparison to plain chest radiography, which itself is ineffective, suggests that screening may be one strategy for improving lung cancer outcomes in a well-motivated American population. This benefit was achieved at a cost comparable to those of other established screening programmes but only when smoking cessation was included. However, the absolute reduction in mortality achieved is small: 87 avoided deaths in 26,722 screened participants, representing a 0.33% lower risk of dying from lung cancer for each individual participant. Besides the harms resulting from the unnecessary treatment of overdiagnosed cases, 24% of participants were found to have a nodule over three rounds, leading to further diagnostic workup. Even in these large academic institutions, a major complication occurred in five of every 10,000 cases with a benign nodule. Many of the cancers diagnosed are small, minimally invasive adenocarcinomas, and these contribute significantly to overdiagnosed cases. Overall, for every cancer death avoided, 1.38 cases may have been overdiagnosed.
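These headline figures can be turned into a rough number needed to screen and an approximate count of overdiagnosed cases. The sketch below uses only the numbers quoted in this section; the derived values are my own back-of-envelope calculations, assume that the reciprocal of the absolute risk reduction is an adequate approximation, and are not results reported by the trial.

```python
# Inputs quoted in the text
avoided_deaths = 87
screened_participants = 26_722
overdiagnosed_per_death_avoided = 1.38

# Derived quantities (back-of-envelope, not reported trial results)
absolute_risk_reduction = avoided_deaths / screened_participants
number_needed_to_screen = 1 / absolute_risk_reduction
overdiagnosed_cases = avoided_deaths * overdiagnosed_per_death_avoided

print(f"Absolute risk reduction: {absolute_risk_reduction:.2%}")
print(f"Approximate number needed to screen per death avoided: "
      f"{number_needed_to_screen:.0f}")
print(f"Approximate overdiagnosed cases: {overdiagnosed_cases:.0f}")
```

On these assumptions, a few hundred people need to undergo three rounds of screening to avoid a single lung cancer death, and the number of overdiagnosed cases exceeds the number of deaths avoided.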
Disappointingly, none of the European studies (DANTE, DLCST and NELSON) have found evidence of any significant, or indeed even a trend towards, reduction in mortality, possibly reflecting different epidemiology of cancer subtypes. The use of volumetric techniques employed in the NELSON study seems attractive, but the efficacy of this approach remains completely unproven.
Concentration on harm reduction through screening potentially deflects attention from the need to improve diagnosis and treatment for the majority of cases falling outwith current screening eligibility criteria for smoking history and age. It is known that patients often tolerate lung cancer symptoms for long periods before presenting with them. General practitioners also find identifying lung cancer cases challenging, and patients will often attend several times before the diagnosis is considered and a chest radiograph performed. An early study revealed that educating the public and primary healthcare teams on the importance of cough as a lung cancer symptom resulted in a large increase in chest radiographs being performed and suggested earlier diagnosis. This led on to the national “Be Clear on Cancer” campaign in the UK, the results of which were positive, leading to state funding of repeat programmes. Further work is taking place to look at the effects of lowering thresholds for the obtaining of chest radiographs for chest symptoms in primary care. Facilitating earlier diagnosis of symptomatic disease should also minimise overdiagnosis.
I believe that screening in lung cancer is potentially able to improve lung cancer mortality, but our understanding of how to apply this in real populations, including those outside the USA, is in its infancy. Cost-effectiveness is much improved when screening is combined with smoking cessation, but this is in effect subsidising an otherwise unaffordable screening programme by combining it with another highly cost-effective intervention that could be provided anyway.
Besides the expense of screening, small but definite harms resulting from radiation exposure, investigations of benign lesions and the more significant difficulty of finding inconsequential disease (overdiagnosis) also reduce its attractiveness. Until we have better understanding of these issues, I believe we should be concentrating more effort on the earlier diagnosis of symptomatic disease, at least in Europe.
Whatever we think of the weaknesses of the current attempts to reduce mortality through screening, the NLST does point to the potential for improving lung cancer outcomes through expeditious diagnosis. For now, at least in Europe, this must be based on improved identification of symptomatic disease. Initiatives to improve detection of early stage disease will rely on improving public and primary care awareness of lung cancer symptoms and reducing the impediments to diagnosis following recognition that lung cancer is a possible diagnosis.