Biotweeps conference:
RNA sequencing-based cell proliferation analysis across 19 cancers identifies a subset of proliferation-informative cancers with a common survival signature
3. [Figure, panels A-F. Recoverable labels: cancers in plotted order KIRP, LGG, KIRC, LIHC, PAAD, ACC, MESO, LUAD, BRCA, GBM, SARC, STAD, HNSC, LAML, ESCA, OV, BLCA, LUSC, CESC; axis "Proliferative Index (Counts/Million)"; tissues Breast, Lung, Pancreas, Esophagus, Stomach, Liver, Kidney (KIRC), Bladder, Cervix, Kidney (KIRP); legend Tumor, Adjacent Normal, Healthy (GTEx); breast cancer subtypes Basal-like, Her2-Enriched, Luminal A, Luminal B, Normal-like; rho=-0.65; axes PC1 (9.25%) and PC2 (6.72%).]
(A) Tumor proliferative index (PI) distributions across The Cancer Genome Atlas (TCGA) cancers.
(B) PI values in healthy Genotype-Tissue Expression (GTEx) samples (blue), TCGA tumor-adjacent normal tissue (red) and TCGA tumor tissue (green).
(C) Heatmap of principal component-tumor PI correlations across cancers.
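Panel (C) rests on Spearman correlations between principal component scores and the tumor PI. A minimal Python sketch of that statistic follows. The function names and the toy numbers are illustrative assumptions, and the code is not taken from the authors' analysis.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# Toy data (made up): one principal component score and one PI value per tumor.
pc_scores = [-40.0, -10.0, 5.0, 30.0, 60.0]
pi_values = [55.0, 48.0, 30.0, 22.0, 10.0]
print(spearman_rho(pc_scores, pi_values))
```

The sketch divides by zero if either input is constant, so a real pipeline would need to guard that case.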
[Panel C heatmap labels: rows "Principal Component Number" (1-25); columns BLCA, BRCA, CESC, ESCA, GBM, HNSC, KIRC, KIRP, LAML, LGG, LIHC, LUAD, LUSC, MESO, OV, PAAD, SARC, STAD; color scale "Spearman Correlation (rho)", 0.0-0.8.]
7. [Figure: enriched Gene Ontology terms for survival-associated genes, one panel per cancer (ACC, BLCA, BRCA, CESC, ESCA, GBM, HNSC, KIRC, KIRP, LAML, LGG, LIHC, LUAD, LUSC, MESO, OV, PAAD, SARC, STAD). Panels are grouped as "Proliferation Informative" and "Non-proliferation Informative"; legend: "Cell Proliferation/Division Associated Process".]
Gene ontology enrichment analysis on survival-associated genes in each cancer.
Proliferative-Informative Cancers: Adrenocortical Carcinoma, Kidney Renal Clear Cell Carcinoma, Kidney Renal Papillary Cell Carcinoma, Pancreatic Adenocarcinoma, Brain Lower Grade Glioma, Lung Adenocarcinoma, Mesothelioma
Non-Proliferative Informative Cancers: Bladder Urothelial Carcinoma, Breast Invasive Carcinoma, Cervical Squamous Cell & Endocervical Adenocarcinoma, Ovarian Serous Cystadenocarcinoma, Glioblastoma Multiforme, Head and Neck Squamous Cell Carcinoma, Acute Myeloid Leukemia, Sarcoma, Liver Hepatocellular Carcinoma, Lung Squamous Cell Carcinoma, Stomach Adenocarcinoma, Esophageal Carcinoma
9. Full Patient Cohort (n=6,312)
Dichotomize 18 shortest and 18 longest surviving patients for each cancer: Shortest Survivors (n=342), Longest Survivors (n=342)
Randomly split into training (70%) and testing (30%) cohorts: Training Cohort (n=479), Testing Cohort (n=205)
Feature selection in Training Cohort
Model evaluation in Testing Cohort
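The cohort construction above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the record layout (`cancer` and `days` keys), the function names, and the simple unstratified shuffle are all hypothetical, and the slide does not say how ties or censored patients were handled.

```python
import random

def select_extremes(patients, k=18):
    """Keep the k shortest and k longest survivors within each cancer.

    `patients` is a list of dicts with hypothetical keys 'cancer' and 'days'.
    Returns (patient, label) pairs: label 1 = shortest, 0 = longest survivors.
    """
    by_cancer = {}
    for p in patients:
        by_cancer.setdefault(p["cancer"], []).append(p)
    selected = []
    for group in by_cancer.values():
        ranked = sorted(group, key=lambda p: p["days"])
        selected += [(p, 1) for p in ranked[:k]]
        selected += [(p, 0) for p in ranked[-k:]]
    return selected

def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle, then split into training and testing cohorts (unstratified: an assumption)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Synthetic example: 19 cancers with 100 patients each (made-up survival times).
rng = random.Random(1)
patients = [{"cancer": f"cancer_{c}", "days": rng.randint(1, 5000)}
            for c in range(19) for _ in range(100)]
extremes = select_extremes(patients)
train, test = split_train_test(extremes)
print(len(extremes), len(train), len(test))
```

With 19 cancers and 36 patients kept per cancer, the sketch yields 684 patients split 479 and 205, which matches the cohort sizes on the slide. A cancer with fewer than 36 patients would put some patients in both groups, so a real pipeline would need to handle that case.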
[Figure, panels A-D. Recoverable labels: ROC curve axes "False positive rate" and "True positive rate" with legend All Cancers (AUC: 0.651), PICs (AUC: 0.856), Non PICs (AUC: 0.634); histogram axes "AUC" and "Frequency" with the "Observed PIC AUC" marked; scatter axes "Number of PICs in Permutation" and "AUC", rho=0.569.]
(A) Workflow for cross-cancer survival model generation.
(B) Receiver operating characteristic (ROC) curve for multivariate Cox regression with LASSO for variable selection on all 19 cancers (blue), PICs only (green) and non-PICs only (orange).
(C) Histogram showing the distribution of ROC area under the curve (AUC) values for survival models generated on 100 randomly sampled sets of cancers equivalent in number to the PICs.
(D) The ROC curve AUC values are positively correlated with the number of PICs included in the random sample sets.
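Panels (C) and (D) describe a resampling check: draw random sets of cancers the same size as the PIC set, build a survival model on each, and compare the resulting AUCs. The Python sketch below shows only the sampling step and a rank-based AUC computation. It assumes the seven PICs are ACC, KIRC, KIRP, LGG, LUAD, MESO and PAAD (which is how the grouping on slide 7 reads) and that each set is drawn without replacement. The model fitting step is omitted, and all function names are hypothetical.

```python
import random

CANCERS = ["ACC", "BLCA", "BRCA", "CESC", "ESCA", "GBM", "HNSC", "KIRC", "KIRP",
           "LAML", "LGG", "LIHC", "LUAD", "LUSC", "MESO", "OV", "PAAD", "SARC", "STAD"]
# Assumed PIC membership, read from the slide 7 grouping.
PICS = {"ACC", "KIRC", "KIRP", "LGG", "LUAD", "MESO", "PAAD"}

def sample_cancer_sets(n_sets=100, seed=0):
    """Draw n_sets random cancer sets, each the same size as the PIC set."""
    rng = random.Random(seed)
    return [rng.sample(CANCERS, len(PICS)) for _ in range(n_sets)]

def count_pics(cancer_set):
    """Number of PICs that landed in a random set."""
    return sum(1 for c in cancer_set if c in PICS)

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

cancer_sets = sample_cancer_sets()
pic_counts = [count_pics(s) for s in cancer_sets]
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 3 of 4 pairs ranked correctly: 0.75
```

In a full analysis each entry of `cancer_sets` would get its own model and test-set AUC, and those AUCs would then be compared against `pic_counts` and against the AUC observed for the PIC-only model.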
11. [Figure, panels A-D. Recoverable labels: scatter axes "Log10 Somatic Mutation Number" and "Proliferative Index", points colored by subtype (Basal-like, Her2-Enriched, Luminal A, Luminal B, Normal-like, Unknown); Q-Q plot axes "Expected P-value (-log10 scale)" and "Observed P-value (-log10 scale)" with labeled genes TP53, RB1, PIK3CA; "Proliferative Index" for RELN Mutant versus RELN Wild Type in All Tumors, Basal-Like, HER2-Enriched, Luminal A, Luminal B; survival plot axes "Days" and "Percent Survival" for "RELN Low Expresser or PAM" versus "RELN High Expresser", p=0.08.]
(A) Tumor proliferative index (PI) is correlated with TCGA breast cancer somatic mutation burden.
(B) Q-Q plot of p-values derived from gene mutation burden-PI associations.
(C) TCGA breast tumors containing non-synonymous mutations in RELN have higher PI compared to wild-type.
(D) Kaplan-Meier survival plot shows reduced expression or protein-altering mutations in RELN are markers of poor prognosis in patients with basal breast cancer.
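Panel (B) is a Q-Q plot of observed against expected p-values. As an illustration of how such coordinates are commonly built (an assumption about the general technique, not a description of the authors' script), the Python sketch below sorts the observed p-values and pairs the i-th smallest of n with the expected uniform quantile i/(n+1), both on a -log10 scale. The function name and the example p-values are made up.

```python
from math import log10

def qq_points(p_values):
    """Return (expected, observed) pairs on a -log10 scale, smallest p-value first."""
    n = len(p_values)
    points = []
    for i, p in enumerate(sorted(p_values), start=1):
        expected = i / (n + 1)  # expected quantile under a uniform null
        points.append((-log10(expected), -log10(p)))
    return points

# Made-up p-values for three hypothetical genes.
for expected, observed in qq_points([0.5, 0.01, 0.2]):
    print(round(expected, 3), round(observed, 3))
```

Points that rise above the diagonal have smaller p-values than a uniform null would predict. A p-value of exactly zero would raise an error in `log10`, so real inputs may need a floor.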