Estimating the proportion cured of cancer: Some practical advice for users
1. G Model
CANEP-624; No. of Pages 7
Cancer Epidemiology xxx (2013) xxx–xxx
Contents lists available at ScienceDirect
Cancer Epidemiology
The International Journal of Cancer Epidemiology, Detection, and Prevention
journal homepage: www.cancerepidemiology.net
X.Q. Yu a,b,*, R. De Angelis c, T.M.L. Andersson d, P.C. Lambert d,e, D.L. O’Connell a,b,f,g, P.W. Dickman d
a Cancer Council New South Wales, Sydney, Australia
b Sydney School of Public Health, Sydney, Australia
c National Centre of Epidemiology, Italian National Institute of Health, Rome, Italy
d Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
e University of Leicester, Department of Health Sciences, Leicester, UK
f School of Medicine and Public Health, University of Newcastle, Newcastle, Australia
g School of Public Health and Community Medicine, University of New South Wales, Sydney, Australia
ARTICLE INFO
Article history:
Accepted 24 August 2013
Available online xxx
ABSTRACT
Background: Cure models can provide improved possibilities for inference if used appropriately, but
there is potential for misleading results if care is not taken. In this study, we compared five commonly
used approaches for modelling cure in a relative survival framework and provide some practical advice
on the use of these approaches.
Patients and methods: Data for colon, female breast, and ovarian cancers were used to illustrate these
approaches. The proportion cured was estimated for each of these three cancers within each of three age
groups. We then graphically assessed the assumption of cure and the model fit, by comparing the
predicted relative survival from the cure models to empirical life table estimates.
Results: Where both cure and distributional assumptions are appropriate (e.g., for colon or ovarian
cancer patients aged <75 years), all five approaches led to similar estimates of the proportion cured. The
estimates varied slightly when cure was a reasonable assumption but the distributional assumption was
not (e.g., for colon cancer patients aged ≥75 years). Greater variability in the estimates was observed when the
cure assumption was not supported by the data (breast cancer).
Conclusions: If the data suggest cure is not a reasonable assumption then we advise against fitting cure
models. In the scenarios where cure was reasonable, we found that flexible parametric cure models
performed at least as well as, or better than, the other modelling approaches. We recommend that,
regardless of the model used, the underlying assumptions for cure and model fit should always be
graphically assessed.
Crown Copyright © 2013 Published by Elsevier Ltd. All rights reserved.
Keywords:
Statistical cure
Cure models
Relative survival
Population-based
1. Introduction
Advances in the diagnosis and treatment of cancer have meant
that an increasing number of cancer patients are now cured of their
cancer. For those cancers where cure occurs, the cumulative
survival curves level off at the point of cure. The definition of ‘cure’,
as used in this context, is that ‘cured’ patients have a mortality rate
equal to that of the subjects of the same age and sex in the general
population, and differs conceptually from “clinical cure” for
individual patients.
Traditional approaches to survival analysis assume that a single
survival distribution can be used to describe the survival of all
individuals with a given set of covariates. Cure models in a relative
survival framework (the most commonly used approach for
population-based data) [1], on the other hand, assume that the
patients can be partitioned into two groups, those who are cured
and those who are not, with separate survival distributions for
each. When applying cure models, one typically reports an
estimate of the proportion cured along with a summary measure
(e.g., mean or median survival time) of the uncured. The proportion
cured, in particular, is felt to be more directly relevant to patients
and clinicians and easier to interpret than the measures reported
from traditional approaches. An additional advantage of cure
models is that they potentially provide greater possibilities for
studying the mechanisms underlying temporal trends. Traditional
approaches can, for example, demonstrate that survival is improving
over time but cure models can elucidate whether this is because we
are curing more patients or because we are prolonging the survival
time of those patients who will nevertheless succumb to the disease
or, more usually, some combination of both.
* Corresponding author at: Cancer Research Division, Cancer Council New South
Wales, PO Box 572, Kings Cross, NSW 1340, Australia. Tel.: +61 2 93341851;
fax: +61 2 8302 3550.
E-mail addresses: xueqiny@nswcc.org.au, xue.yu@sydney.edu.au (X.Q. Yu).
1877-7821/$ – see front matter. Crown Copyright © 2013 Published by Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.canep.2013.08.014
Please cite this article in press as: Yu XQ, et al. Estimating the proportion cured of cancer: Some practical advice for users. Cancer
Epidemiology (2013), http://dx.doi.org/10.1016/j.canep.2013.08.014
Due to these advantages, several statistical models for fitting
survival data with a cure proportion in a relative survival framework have been developed over the last two decades [2–6]. Cure
models can also be fitted in a cause-specific survival framework,
but such models are not considered here. These cure models have
been applied to data from various populations to investigate either
the improvement in survival over time, or geographical variation in
cancer survival [2–11]. Using these models and interpreting the
results they produce, however, must be undertaken carefully and
with an awareness of the potential limitations and inaccuracies
resulting from the assumptions on which the models rely. It is
therefore important to comprehensively compare these models
and assess their merits for use in different circumstances, so that
appropriate recommendations for their use may be made. Existing
studies on these issues are rather limited, and we
hope that this study will begin to address some of these questions.
First, we compare five approaches [2–6] for cure modelling in a
relative survival framework by applying them to population-based survival data for three cancers with different cure
profiles. We then discuss the relative merits of the approaches and
provide practical advice for users.
2. Methods
All five approaches [2–6] are variations on two types of cure
model: the mixture cure model and the non-mixture cure model.
Both models assume that a proportion of patients will be cured by
defining an asymptote for the relative survival, but the models are
parameterised in different ways. Three of the selected approaches
are mixture cure models [2–4] and two of them are non-mixture
cure models [5,6]. A brief summary of these approaches is provided
in Table 1 and the key features of these approaches are described
in Appendix 1 (Model specifications).
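To make the two parameterisations concrete, the following minimal Python sketch writes out the standard textbook forms of the relative survival function for a mixture and a non-mixture cure model with a Weibull distribution. It is an illustration only and does not reproduce any of the five software implementations compared here. The function names and the parameter values (pi, lam, gamma) are hypothetical and are not estimates from this study.

```python
import math


def weibull_survival(t, lam, gamma):
    """Weibull survival function S(t) = exp(-lam * t**gamma)."""
    return math.exp(-lam * t ** gamma)


def mixture_relative_survival(t, pi, lam, gamma):
    """Mixture cure model: R(t) = pi + (1 - pi) * S_u(t).

    pi is the proportion cured and S_u(t) is the Weibull survival
    function of the uncured patients.
    """
    return pi + (1.0 - pi) * weibull_survival(t, lam, gamma)


def non_mixture_relative_survival(t, pi, lam, gamma):
    """Non-mixture cure model: R(t) = pi ** F_z(t), F_z(t) = 1 - S_z(t).

    F_z(t) is a Weibull distribution function, so R(t) falls from 1
    towards the asymptote pi as t increases.
    """
    return pi ** (1.0 - weibull_survival(t, lam, gamma))


def median_survival_uncured(lam, gamma):
    """Median survival time of the uncured in the mixture model."""
    return (math.log(2.0) / lam) ** (1.0 / gamma)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    pi, lam, gamma = 0.5, 0.6, 0.9
    for t in (0, 1, 5, 15):
        print(t,
              round(mixture_relative_survival(t, pi, lam, gamma), 3),
              round(non_mixture_relative_survival(t, pi, lam, gamma), 3))
    print("median (uncured):", round(median_survival_uncured(lam, gamma), 2))
```

Both functions start at 1 at diagnosis and approach the same asymptote pi, which is why either parameterisation can be used to report a proportion cured.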
The application of these approaches was illustrated using SEER9 data [12] from 1981 to 1988, with follow-up through to 2007
(with only the first 15 years of follow-up data being used in the
estimations), for patients diagnosed with cancers of the colon,
female breast and ovary. We chose these three types
of cancer because they are typical examples that illustrate
several scenarios regarding two key assumptions of cure models: that a
proportion of cancer patients are cured, and that the distribution of
the survival times of the uncured cases can be described by the
chosen parametric distribution. Survival data were extracted from
the SEER database using the SEER*Stat software in two formats
with identical selection criteria: grouped relative survival data and
individual survival data. The grouped data (used in the CANSURV
software [13] and Verdecchia’s approach [3]) included the
following variables: number of patients alive at the start of the
follow-up interval (annual intervals, years 1–15), number of patients who died,
number of patients lost to follow-up, and the observed, expected,
and relative survival, with the standard error of the relative survival.
Individual records were used in the other three approaches [4–6]. As
survival differs greatly by age at diagnosis, we estimated the
proportion cured separately by age group (<60, 60–74, and 75–84
years).
To examine the adequacy of the cure models, we visually
compared the predicted relative survival from the cure models
with empirical life table estimates derived using the Ederer II
method [14].
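As an illustration of how an empirical life table estimate of this kind can be obtained from grouped data, the sketch below computes cumulative relative survival as the running product of interval-specific relative survival ratios. It assumes that the interval-specific expected survival has already been derived from population life tables (for the Ederer II method, among the patients still at risk at the start of each interval). The function name, the dictionary keys and the numbers are hypothetical, and this is not the code used by SEER*Stat.

```python
def life_table_relative_survival(intervals):
    """Cumulative relative survival from grouped follow-up data.

    Each element of `intervals` is a dict with the keys:
      alive    - patients alive at the start of the interval
      died     - deaths during the interval
      lost     - patients lost to follow-up during the interval
      expected - interval-specific expected survival (assumed given)
    """
    cumulative = 1.0
    estimates = []
    for row in intervals:
        # Actuarial adjustment: those lost are assumed to be at risk
        # for half of the interval on average.
        effective_at_risk = row["alive"] - row["lost"] / 2.0
        observed = 1.0 - row["died"] / effective_at_risk
        # Interval-specific relative survival, accumulated as a product.
        cumulative *= observed / row["expected"]
        estimates.append(cumulative)
    return estimates


if __name__ == "__main__":
    # Hypothetical annual intervals, not SEER data.
    example = [
        {"alive": 1000, "died": 200, "lost": 0, "expected": 0.98},
        {"alive": 800, "died": 80, "lost": 20, "expected": 0.97},
    ]
    print(life_table_relative_survival(example))
```

The resulting series can then be plotted against the relative survival predicted by a cure model at the same time points.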
Table 1
Comparison of five approaches for estimating the proportion cured of cancer.
Approach               Type of model  Structure of input data   Parameter estimation      Software used  Assumed survival distribution
Yu et al. [2]          Mixture        Grouped survival data     Maximum likelihood        CANSURV        Weibull
Verdecchia et al. [3]  Mixture        Grouped survival data     Non-linear least squares  SAS            Weibull
De Angelis et al. [4]  Mixture        Individual survival data  Maximum likelihood        Stata          Weibull a
Lambert et al. [5]     Non-mixture    Individual survival data  Maximum likelihood        Stata          Weibull a
Andersson et al. [6]   Non-mixture    Individual survival data  Maximum likelihood        Stata          Splines b
a We use the Weibull distribution but other distributions are available in the Stata implementation by Lambert (The Stata Journal 2007).
b The baseline cumulative hazard is estimated using restricted cubic splines so the survival distribution is a parametric distribution that is a function of the spline parameters.
3. Results
A total of 148,963 cases were included (colon cancer: 52,203;
breast cancer: 84,595; ovarian cancer: 12,165). For colon and
ovarian cancer in the two younger age groups (<60 and 60–74
years) all approaches produced similar estimates of the proportion
cured (Table 2). For the oldest age group (75–84 years) there was
considerable variation, with results ranging from 50.0% to 56.5% for
colon cancer and from 15.8% to 21.1% for ovarian cancer. As
mentioned previously, the results for the approaches with grouped
data were based on commonly used annual follow-up intervals. For
colon and ovarian cancers, to examine the sensitivity to the width
of the follow-up interval, we also present estimates in Table 2
using monthly intervals. Breast cancer was chosen to show that it
is not sensible to fit cure models for this cancer. As expected, all five
modelling approaches produced an estimated proportion cured for
breast cancer patients in the two younger age groups (Table 2). For
the youngest age group (<60 years) all approaches resulted in
quite similar estimates (ranging from 61.8% to 64.1%), while
estimates for the middle age group (60–74 years) were less
consistent. For the oldest age group (75–84 years), however, all
methods based on the Weibull model indicated no cure or
negligible proportions cured (values ranging from 0 to 4.7%), while
the flexible parametric approach gave an estimate of 65%.
The predicted relative survival estimates, stratified by age
group, for each of the three cancers from the five approaches were
plotted against the life table estimates to evaluate model fit (Fig. 1).
For breast cancer (Fig. 1B), the graphical assessment indicated that
there was no evidence of statistical cure for any age group because
none of the survival curves levelled off within 15 years of follow-up,
while cure appears to be a reasonable assumption for colon and
ovarian cancer (Fig. 1A and C).
We also applied these approaches to data for localised colon
cancer to evaluate the models in a relatively high survival situation
(Table 3).
Table 2
Estimated proportion cured (%) and (95% confidence intervals) from different cure model approaches by cancer type and age group.
Approach                                       <60 years           60–74 years         75–84 years
Colon cancer
Yu et al. [2] (yearly interval data)           52.9 (51.7–54.0)    52.9 (51.8–53.9)    50.8 (48.5–53.0)
Yu et al. [2] (monthly interval data)          52.8 (51.7–53.9)    52.8 (51.8–53.8)    53.1 (51.5–54.7)
Verdecchia et al. [3] (yearly interval data)   52.8 (52.3–53.3)    52.5 (52.2–52.9)    50.9 (49.8–52.0)
Verdecchia et al. [3] (monthly interval data)  53.1 (53.0–53.3)    52.7 (52.6–52.9)    51.3 (51.0–51.6)
De Angelis et al. [4]                          52.2 (51.1–53.3)    54.0 (53.1–54.9)    56.5 (55.3–57.8)
Lambert et al. [5]                             51.9 (50.8–53.0)    53.5 (52.6–54.5)    56.0 (54.8–57.3)
Andersson et al. [6]                           52.4 (51.4–53.4)    52.9 (52.1–53.8)    50.0 (48.7–51.2)
Breast (female) cancer
Yu et al. [2]                                  62.4 (61.7–63.1)    58.9 (56.8–60.9)    0 a
Verdecchia et al. [3]                          63.8 (62.1–65.6)    59.2 (58.0–60.4)    0 a
De Angelis et al. [4]                          62.1 (61.5–62.8)    57.8 (55.7–59.9)    4.7 a
Lambert et al. [5]                             61.8 (61.1–62.5)    55.9 (53.3–58.5)    0.06 a
Andersson et al. [6]                           64.1 (63.6–64.7)    66.2 (65.5–66.9)    65.0 (63.3–66.7)
Ovarian cancer
Yu et al. [2] (yearly interval data)           46.7 (45.2–48.3)    23.1 (21.6–24.7)    17.3 (14.6–20.4)
Yu et al. [2] (monthly interval data)          Failed to converge  22.6 (21.2–24.2)    18.8 (16.4–21.6)
Verdecchia et al. [3] (yearly interval data)   47.5 (46.3–48.7)    22.3 (21.1–23.4)    16.3 (15.4–17.2)
Verdecchia et al. [3] (monthly interval data)  47.8 (47.5–48.0)    22.4 (22.2–22.7)    16.7 (16.4–17.0)
De Angelis et al. [4]                          47.8 (46.3–49.3)    23.2 (21.7–24.7)    20.2 (17.8–22.8)
Lambert et al. [5]                             48.0 (46.4–49.6)    23.2 (21.7–24.9)    21.1 (18.5–24.0)
Andersson et al. [6]                           48.7 (47.2–50.1)    22.5 (21.2–23.8)    15.8 (13.9–17.9)
a Two confidence intervals of (0–100) accompany these four estimates; the rows to which they belong cannot be determined from the available table layout.
Fig. 1. Comparing predicted relative survival from different modelling approaches with life table estimates – for (A) colon cancer, (B) breast cancer and (C) ovarian cancer.
Table 3
Estimated proportion cured (%) from different modelling approaches for localised colon cancer.
Approach               <60 years           60–74 years         75–84 years
Yu et al. [2]          Failed to converge  Failed to converge  Failed to converge
Verdecchia et al. [3]  87.1                82.2                Failed to converge
De Angelis et al. [4]  Failed to converge  Failed to converge  Failed to converge
Lambert et al. [5]     Failed to converge  Failed to converge  Failed to converge
Andersson et al. [6]   86.5                82.9                81.8
4. Discussion
Cure models have been increasingly used for modelling time-to-event data incorporating a proportion cured. We wish to
emphasise that their application and interpretation depend
on assumptions, and violation of these assumptions may lead to
biased estimates. While some statistical procedures are relatively
insensitive to underlying assumptions this is not the case with cure
models. The two key assumptions are that cure occurs and that the
distribution of survival times of the uncured cases can be described
by the chosen parametric distribution. If the first assumption is not
reasonable then we would advise against using cure models, and
even when cure is a reasonable assumption, careful assessment of
the distributional assumptions must be undertaken so that biased
estimates can be prevented. From our experience in applying cure
models we have found that cure is not always a reasonable
assumption, with breast cancer and prostate cancer being typical
examples. We have also found that even when there is evidence of
cure, the models do not converge, or only fit poorly, when survival
is either relatively good or relatively poor. The developers of cure
models often discuss the limitations of the models, and urge
caution in their use, but typically present illustrative examples
where cure models work well; colon cancer being a particular
favourite [2–6,9,10,15].
In this study, we explored the practicalities of applying cure
models in a broader context, using data for three different cancer
types that illustrate several scenarios regarding these two central
assumptions. First is the situation where cure is not a reasonable
assumption, illustrated using data for female breast cancer. Second
is the situation where both the cure and distributional assumptions are reasonable (in this study we used the Weibull distribution
for all approaches requiring a distributional assumption). Third is
the situation where cure is a reasonable assumption but the
distributional assumption is not.
Within the considered 15-year follow-up time there was no
evidence of statistical cure, for any age group, for women
diagnosed with breast cancer (Fig. 1B) but this does not preclude
the possibility that many of the women are medically cured. We
believe that cure models should not be used for such data, but we
nonetheless fitted cure models to illustrate this scenario. All
approaches produced an estimate of the proportion cured for the
two younger age groups, and one of the approaches also produced
a large positive estimate in the oldest age group (Table 2). We
would emphasise here that just producing an estimate does not
mean the approach is sensible.
It would be desirable to have a formal test for determining
whether population cure exists, to assist researchers deciding
whether it is appropriate to apply cure models to the population of
interest. There has previously been some work in this area [16,17],
but the application of these methods is rather limited, largely due
to a lack of software to implement the proposed approaches [18].
Thus, a simple and easy-to-implement test is needed. In the
absence of such a test, we suggest visually examining life table
estimates of relative survival by key prognostic factors to
determine if the survival curves tend to level off after a certain
period of follow-up; if not, a cure model should not be applied to
such data [19,20]. This raises the interesting question of how much
levelling is sufficient to allow the application of cure models, a
question that still requires much discussion.
For the second scenario, where there is clear graphical
evidence that the statistical cure assumption is appropriate,
our analyses showed that all five approaches, no matter what the
structure of the input file, the estimation methods or the
software used, produced very similar estimates for the proportion cured. This is the case for the two younger age groups for
colon and ovarian cancer (Table 2), and is supported by graphical
evidence that the survival curves from the different approaches
are in close agreement (Fig. 1A and C). Thus, when both the cure
and distributional assumptions are met, most cure model
approaches are likely to give reliable estimates of the proportion
cured.
The third scenario, where cure is reasonable but the distributional assumption is not, was illustrated using the oldest age group
for both colon and ovarian cancer. Here, the Weibull distribution
cannot capture the survival shapes appropriately since mortality is
quite high within the first year and then rapidly decreases. In this
situation, the estimates of the proportion cured from different
approaches varied from 50.0% to 56.5% for colon cancer, and from
15.8% to 21.1% for ovarian cancer (Table 2). The two approaches
that use a dataset comprising individual records and a Weibull
distribution [4,5] yielded very similar results and overestimated
the proportion cured, which confirms the previous finding that cure
models do not perform well when survival drops rapidly soon after
diagnosis [5,15].
However, the two Weibull models using grouped data [2,3]
gave similar predictions that were very close to the life table estimates
for colon cancer in the oldest age group. We suspect that the use
of grouped data with an annual follow-up interval effectively
averaged out the high excess mortality in the first few months
after diagnosis, resulting in lower estimates that
coincided with the estimate from the flexible parametric model. To test
this hypothesis, we repeated the initial analysis with monthly
follow-up intervals using CANSURV and found that the updated
estimate of the proportion cured moved towards that (56%)
obtained using the individual records, from 50.8% to 53.1%. This
suggested that when the assumption of a Weibull distribution is
not appropriate, the models using grouped data may be sensitive
to the choice of the width of the follow-up interval. Although the
magnitude of the change (from 50.8% to 53.1%) does not
constitute evidence against using Weibull models, we believe
that the flexible parametric cure model may be preferable in such
situations.
Beyond the two central assumptions for cure models, the
accuracy of estimates of the proportion cured depends on the size of
the study population and the length of patient follow-up, two
important prerequisites for the application of cure models. Thus,
the strengths of this study include the large sample, which
increased the probability of obtaining a stable estimate, and the
long follow-up, which allowed the survival curves to level off. For
both colon and ovarian cancers, 15 years of follow-up is
considered to be beyond the minimum threshold required [21].
For colon cancer, and to a lesser extent ovarian cancer, the cure
models used here are well justified, because there is strong
empirical evidence of the existence of a proportion cured. For colon
cancer, advances in diagnostic and surgical techniques along with
adjuvant chemotherapy and radiotherapy have led to impressive
improvements in outcomes: a substantial proportion of patients
with early disease stage [22] or regional disease [23] may be cured
of the disease. Cure is also possible for selected advanced colon
cancer patients through a multimodal approach of combining
surgical treatments and systemic therapies [24,25]. For ovarian
cancer the evidence was not as strong but the available data
[26,27] clearly pointed in the same direction. In addition, our
results are consistent with many population-based studies which
indicate that cure is possible for some patients with colon or
ovarian cancer [2,3,5,28].
However, cure models in general have several potential
limitations which need to be considered. First, cure models with
right-censored data suffer from an inherent identifiability problem. To try to minimise this problem we examined whether the
survival curve levelled off after a certain period of follow-up to
make sure most events have been observed, as advised by Yu [2]. In
the case of colon and ovarian cancers the survival curves appeared
to level off at around 10 years of follow-up.
Second, most cure models do not converge (e.g., localised
colon cancer), or only fit poorly, when survival is either relatively
good or relatively poor, as we showed earlier in this study. To
account for the latter problem, Lambert et al. [15] proposed a
mixture Weibull distribution approach which assumes that the
distribution of survival times for the uncured cases is a mixture
of two Weibull distributions and reported two advantages over a
single Weibull model [15]: a lower Akaike Information Criterion
(AIC) value, indicating better fit of the model; and closer
predicted survival estimates to the empirical survival estimates.
In situations like this, the flexible parametric model offers
greater flexibility in the shape of the survival
distribution than the mixture of Weibull distributions.
Third, there is currently a lack of diagnostic tools for all
approaches. Although AIC has been used to select a better model,
using such measures for model selection can be dangerous if
interest lies in estimation of the proportion cured [5]. The difficulty
in assessing cure models is that when the chosen distribution, e.g.,
Weibull, is not appropriate the estimation algorithm will favour
models that provide a better fit where there is most information,
typically early in the follow-up where more deaths occur, rather
than later in the follow-up where cure occurs. As such, the usual
tests of goodness-of-fit are not especially informative for cure
models since they may favour a model that fits well in the first year
following diagnosis but for which the proportion cured is
estimated poorly. Thus, we believe that the use of graphs for
assessing goodness of fit is extremely important, although
formal methods for model diagnostics remain a subject for future
methodological research [29].
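The point that a model may fit well early in follow-up yet poorly where cure is assessed can also be checked numerically alongside the graphs. The sketch below is a rough descriptive aid and not a formal diagnostic. It reports the largest absolute difference between predicted and empirical relative survival separately for early and late follow-up. The function name, the 5-year split and the example values are assumptions made only for illustration.

```python
def fit_discrepancy_by_period(times, predicted, empirical, split_year=5.0):
    """Largest absolute gap between model and life table estimates.

    Returns a pair (early, late): the maximum absolute difference at
    follow-up times up to `split_year` and after it. An empty period
    yields 0.0.
    """
    early = [abs(p - e) for t, p, e in zip(times, predicted, empirical)
             if t <= split_year]
    late = [abs(p - e) for t, p, e in zip(times, predicted, empirical)
            if t > split_year]
    return max(early, default=0.0), max(late, default=0.0)


if __name__ == "__main__":
    # Hypothetical values: close agreement early, divergence late.
    years = [1, 2, 10, 15]
    model = [0.80, 0.70, 0.55, 0.55]
    life_table = [0.80, 0.71, 0.50, 0.48]
    print(fit_discrepancy_by_period(years, model, life_table))
```

A small early gap combined with a large late gap would correspond to the situation described above, in which the likelihood is dominated by the period with most deaths while the estimated proportion cured is poorly supported.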
Both grouped relative survival data and data comprising
individual survival records can be used for estimating the
proportion cured. There are several benefits to using grouped
survival data in cure models. The tabulated data are readily
available in published reports [3], the approaches are implemented
with the use of readily available software (either SAS or CANSURV),
and the model is easy to run and takes less time to converge than
approaches that use individual data records. However, there are
some concerns regarding loss of information due to collapsing data
into groups such as requiring an annual follow-up interval, which
is not ideal for older patients with high excess mortality in the first
few months after diagnosis.
The models fitted to individual data have an advantage over
those fitted to grouped data in that one is not required to
categorise continuous covariates such as age. Modelling age as a
smooth function, e.g., using splines, is not only biologically more
plausible than modelling age as a step function, but it gives the
possibility to make predictions for individual ages rather than for
age groups. The flexible parametric cure model offers some
additional advantages including greater modelling flexibility with
respect to the shapes of the survival distributions, greater
sensitivity to small excess risk, and easy implementation, as
readily available software can be used. However, it is also
potentially sensitive to the choice of the number and location of
knots. In this study, we have fitted a model with seven knots with
default locations. But how sensitive are the results to different
numbers or locations of knots? We performed a sensitivity
analysis by fitting models with varying numbers and locations of
knots using the colon cancer data for the oldest age group for
which relatively larger variation was observed. As found in
previous studies [6,30,31], the estimated proportion cured was
insensitive to “sensible” choices of either the number of knots,
with a difference of only 0.2 percentage points (50.0% vs 50.2%), or the locations of the
knots, with a difference of 0.4 percentage points (49.9% vs 50.3%) (Fig. 2). This
further confirms that the flexible parametric models are generally
insensitive to the number and position of the knots “as long as
they are placed over the whole follow-up period and the last knot
is positioned at the last observed death time or possibly later”
[30]. A unique advantage of the flexible parametric cure model is that it allows modelling for older age
groups (Fig. 1) and cases with early cancer stage (Table 3), which
is not always possible using other approaches. More detailed
methods and results for this sensitivity analysis are described in
Appendix 2.
However, in the case where cure is not a reasonable
assumption, such as for breast cancer, the flexible parametric
cure model gives a worse fit to the data (Fig. 1B) than the other
models. This is because the flexible parametric cure model
assumes that statistical cure occurs at a finite point during
follow-up, whereas the Weibull models reach cure only as
time tends to infinity and can accommodate a cured proportion of zero, and hence provide a
better fit when cure does not occur. Specifically, the flexible
parametric cure model assumes cure occurs at the last knot, which
in our example was at 15 years, thus estimating that 65% were
cured. We do not feel this should be seen as a disadvantage of the
flexible parametric cure model, since we do not believe cure models
should be applied in situations where cure is not a reasonable
assumption.
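The contrast between the two kinds of model can be written out in a short worked form. The equations below are a sketch under stated assumptions: the Weibull case uses the standard mixture parameterisation, and the flexible parametric case assumes, as we understand the approach in [6], that the spline function for the log cumulative excess hazard is constrained to be constant beyond the last knot. The symbols \lambda, \gamma, s(\cdot) and k_K are notation introduced here for illustration.

```latex
% Mixture cure model with a Weibull distribution for the uncured
\begin{align}
R(t) &= \pi + (1-\pi)\exp(-\lambda t^{\gamma}), \qquad \lambda,\gamma > 0, \\
R(t) - \pi &= (1-\pi)\exp(-\lambda t^{\gamma}) > 0
  \quad \text{for all finite } t \text{ when } \pi < 1, \\
\lim_{t\to\infty} R(t) &= \pi .
\end{align}
% With \pi = 0 the model reduces to R(t) = \exp(-\lambda t^{\gamma}),
% so a curve that never levels off can still be fitted.

% Flexible parametric cure model with last knot at time k_K
\begin{align}
\ln\{-\ln R(t)\} &= s(\ln t), \qquad
  s(\ln t) = s(\ln k_K) \ \text{for } t \ge k_K, \\
R(t) &= R(k_K) = \pi \quad \text{for } t \ge k_K .
\end{align}
```

Under these assumptions, with the last knot at 15 years the estimated proportion cured equals the fitted relative survival at 15 years, whether or not the empirical curve has levelled off by then.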
In summary, in choosing an approach we feel practitioners
should take into account both theoretical and practical considerations. If visual inspection of cumulative relative survival curves
suggests cure is not a reasonable assumption we would, in general,
discourage the application of cure models. If cure is a reasonable
assumption and patient survival is neither extremely high nor low,
then there is little difference between the implementations and
one can choose whichever software is most convenient. If one is interested in
analysing SEER data or has imported data into SEER*Stat, for
example, then CANSURV becomes particularly attractive. One does
not have to work long with cure models, however, before
encountering scenarios where the distributional assumptions
are not appropriate and the models fail to converge. Therefore,
unless one has a particularly strong preference for another
software package, we recommend the implementation by Andersson et al. [6], since the Weibull distribution is not always sufficiently
flexible to provide a good fit when, for example, early mortality is
high, whereas the flexible parametric model does not suffer from this
problem.
Fig. 2. Sensitivity analysis varying numbers and locations of knots, SEER data for colon cancer (aged 75–84 years) diagnosed in 1981–1988. *Proportion cured overestimated
and this choice not recommended.
In conclusion, if the data suggest cure is not a reasonable
assumption then we advise against fitting cure models. In the
scenarios where cure was reasonable, we found that flexible
parametric cure models performed at least as well as, or better than,
the other modelling approaches. We recommend that, regardless
of the model used, the underlying assumptions for cure and model
fit should always be graphically assessed.
Conflict of interest statement
No conflict of interests identified.
Acknowledgements
We thank Mark Clements for his comments on the earlier
draft of this manuscript, Qingwei Luo for assisting with
producing the graphs and Clare Kahn for editorial assistance.
Xue Qin Yu is supported by an Australian NHMRC Training
Fellowship (550002) and he thanks the Sydney Medical School
for their support in the form of an International Travelling
Fellowship in 2012, which enabled him to collaborate with Paul
Dickman at the Karolinska Institute in Sweden. Part of this work
was carried out while Paul Lambert was granted study leave by
the University of Leicester.
Appendix. Supplementary files
Supplementary files associated with this article can be found, in
the online version, at http://dx.doi.org/10.1016/j.canep.2013.08.014.
References
[1] Dickman PW, Adami HO. Interpreting trends in cancer patient survival. J Intern
Med 2006;260:103–17.
[2] Yu B, Tiwari RC, Cronin KA, McDonald C, Feuer EJ. CANSURV: A Windows
program for population-based cancer survival analysis. Comput Methods
Programs Biomed 2005;80:195–203.
[3] Verdecchia A, De Angelis R, Capocaccia R, et al. The cure for colon cancer:
results from the EUROCARE study. Int J Cancer 1998;77:322–9.
[4] De Angelis R, Capocaccia R, Hakulinen T, Soderman B, Verdecchia A. Mixture
models for cancer survival analysis: application to population-based data with
covariates. Stat Med 1999;18:441–54.
[5] Lambert PC, Thompson JR, Weston CL, Dickman PW. Estimating and modeling
the cure fraction in population-based cancer survival analysis. Biostatistics
2007;8:576–94.
[6] Andersson TM, Dickman PW, Eloranta S, Lambert PC. Estimating
and modelling cure in population-based cancer studies within the framework of flexible parametric survival models. BMC Med Res Methodol
2011;11:96.
[7] Andersson TM, Lambert PC, Derolf AR, Kristinsson SY, Eloranta S, Landgren O.
Temporal trends in the proportion cured among adults diagnosed with acute
myeloid leukaemia in Sweden 1973–2001, a population-based study. Br J
Haematol 2010;148:918–24.
[8] Clements MS, Roder DM, Yu XQ, Egger S, O’Connell DL. Estimating prevalence
of distant metastatic breast cancer: a means of filling a data gap. Cancer Causes
Control 2012;23:1625–34.
[9] Eloranta S, Lambert PC, Cavalli-Bjorkman N, Andersson TM, Glimelius B,
Dickman PW. Does socioeconomic status influence the prospect of cure from
colon cancer – a population-based study in Sweden 1965–2000. Eur J Cancer
2010;46:2965–72.
[10] Lambert PC, Dickman PW, Osterlund P, Andersson T, Sankila R, Glimelius B.
Temporal trends in the proportion cured for cancer of the colon and rectum: a
population-based study using data from the Finnish Cancer Registry. Int J
Cancer 2007;121:2052–9.
[11] Woods LM, Rachet B, Lambert PC, Coleman MP. ‘Cure’ from breast cancer
among two populations of women followed for 23 years after diagnosis. Ann
Oncol 2009;20:1331–6.
[12] SEER. Surveillance Epidemiology and End Results (SEER) Program
Research Data (1973–2007) National Cancer Institute, DCCPS,
Surveillance Research Program Cancer Statistics Branch. National Cancer
Institute; 2010. Released April 2010, based on the November 2009
submission.
[13] Data Modeling Branch, ed. Cansurv. Statistical Methodology Applications
Branch. National Cancer Institute; 2005. Version 1.0.
[14] Ederer F, Heise H. Instructions to IBM 650 Programmers in Processing Survival
Computations. Methodological note No. 10. End Results Evaluation Section.
Bethesda: National Cancer Institute, 1959.
[15] Lambert PC, Dickman PW, Weston CL, Thompson JR. Estimating the cure
fraction in population-based cancer studies using finite mixture models. J
Roy Statist Soc 2010;59:35–55.
[16] Maller RA, Zhou S. Testing for the presence of immune or cured individuals in
censored survival data. Biometrics 1995;51:1197–205.
[17] Peng Y, Dear KB, Carriere KC. Testing for the presence of cured patients: a
simulation study. Stat Med 2001;20:1783–96.
[18] Othus M, Barlogie B, Leblanc ML, Crowley JJ. Cure models as a useful statistical
tool for analyzing survival. Clin Cancer Res 2012;18:3731–6.
[19] Othus M, Li Y, Tiwari R. Change point-cure models with application to
estimating the change-point effect of age of diagnosis among prostate cancer
patients. J Appl Stat 2012;39:901–11.
[20] Rondeau V, Schaffner E, Corbiere F, Gonzalez JR, Mathoulin-Pelissier S. Cure
frailty models for survival data: application to recurrences for breast cancer
and to hospital readmissions for colorectal cancer. Stat Methods Med Res
2013;22:243–60.
[21] Tai P, Yu E, Cserni G, et al. Minimum follow-up time required for the estimation
of statistical cure of cancer patients: verification using data from 42 cancer
sites in the SEER database. BMC Cancer 2005;5:48.
[22] Wichmann MW, Muller C, Hornung HM, Lau-Werner U, Schildberg FW. Results
of long-term follow-up after curative resection of Dukes A colorectal cancer.
World J Surg 2002;26:732–6.
[23] Wilkinson NW, Yothers G, Lopa S, Costantino JP, Petrelli NJ, Wolmark N.
Long-term survival results of surgery alone versus surgery plus 5-fluorouracil and leucovorin for stage II and stage III colon cancer: pooled analysis
of NSABP C-01 through C-05. A baseline from which to compare modern
adjuvant trials. Ann Surg Oncol 2010;17:959–66.
[24] Gallagher DJ, Kemeny N. Metastatic colorectal cancer: from improved survival
to potential cure. Oncology 2010;78:237–48.
[25] Tomlinson JS, Jarnagin WR, DeMatteo RP, et al. Actual 10-year survival after
resection of colorectal liver metastases defines cure. J Clin Oncol
2007;25:4575–80.
[26] Jelovac D, Armstrong DK. Recent progress in the diagnosis and treatment of
ovarian cancer. CA Cancer J Clin 2011;61:183–203.
[27] Swenerton KD, Santos JL, Gilks CB, et al. Histotype predicts the curative potential
of radiotherapy: the example of ovarian cancers. Ann Oncol 2011;22:341–7.
[28] Cvancarova M, Aagnes B, Fossa SD, Lambert PC, Moller B, Bray F. Proportion
cured models applied to 23 cancer sites in Norway. Int J Cancer 2013;132:
1700–10.
[29] Mallett S, Royston P, Waters R, Dutton S, Altman DG. Reporting performance of
prognostic models in cancer: a review. BMC Med 2010;8:21.
[30] Andersson TML, Lambert PC. Fitting and modeling cure in population-based
cancer studies within the framework of flexible parametric survival models.
Stata J 2012;12:623–38.
[31] Lambert PC, Royston P. Further development of flexible parametric models for
survival analysis. Stata J 2009;9:265–90.