Stephen Friend CRUK-MD Anderson Cancer Workshop 2012-02-28

Open Source pre-competitive drug discovery

Moving beyond linear investigations
Both of the science and of how we work

Stephen Friend MD PhD

Sage Bionetworks (Non-Profit Organization)
Seattle/ Beijing/ Amsterdam
February 28, 2012

Partnering & Collaboration: So what has been possible?

•  2006, Moffitt Cancer Center-Merck: All patients (now >25,000) at a partnered Cancer Center provide consented expression on their pts for classifying sub-populations
•  2007, AZ and Merck (Mek/Akt): Combination Therapies, each at Ph I; joint development by 2 Pharma
•  2008, BMS & Merck: Sharing all the CT Onc Trial imaging files among 2 Pharma
•  2008, Belfer-Merck: Link Pharma with an “Institute for Applied Cancer Center”
•  2010, Asian Cancer Research Group (ACRG), Lilly, Merck, Pfizer: Share genomic data on 25,000 samples with clinical records, and Expression and Exomes, among three Pharma

So what is the problem?

Most approved therapies were assumed to be monotherapies for diseases representing homogeneous populations

Our existing disease models often assume pathway knowledge sufficient to infer correct therapies

What will it take to understand disease?

DNA, RNA, PROTEIN (dark matter)

MOVING BEYOND ALTERED COMPONENT LISTS

DIVERSE POWERFUL USE OF MODELS AND NETWORKS

List of Influential Papers in Network Modeling

  50 network papers
  http://sagebase.org/research/resources.php

Sage Mission
Sage Bionetworks is a non-profit organization with a vision to
create a commons where integrative bionetworks are evolved by
contributor scientists with a shared vision to accelerate the
elimination of human disease

Building Disease Maps Data Repository

Commons Pilots Discovery Platform
Sagebase.org

Sage Bionetworks Collaborators

  Pharma Partners
  Merck, Pfizer, Takeda, AstraZeneca, Amgen, Roche

  Foundations
  Kauffman, CHDI, Gates Foundation

  Government
  NIH, LSDF, NCI

  Academic
  Levy (Framingham)
  Rosengren (Lund)
  Krauss (CHORI)

  Federation
  Ideker, Califano, Nolan, Schadt

[Diagram labels: NEW MAPS, PLATFORM, NEW RULES GOVERN]

Why not share clinical/genomic data and model building within
teams in ways currently used by the software industry
(power of tracking workflows and versioning)?

Leveraging Existing Technologies

Addama

Taverna
tranSMART

sage bionetworks synapse project
Watch What I Do, Not What I Say

Reduce, Reuse, Recycle

Most of the People You Need to Work with Don’t Work with You

My Other Computer is Cloudera Amazon Google

Sage Metagenomics Project

Processed Data
(S3)

•  > 10k genomic and expression standardized datasets indexed in SCR
•  Error detection, normalization in mG
•  Access raw or processed data via download or API in downstream analysis
•  Building towards open, continuous community curation

Sage Metagenomics using Amazon Simple Workflow

Full case study at http://aws.amazon.com/swf/testimonials/swfsagebio/

Amazon SWF and Synapse

Amazon SWF
•  Maintains state of analysis
•  Tracks step execution
•  Logs workflow history
•  Dispatches work to Amazon or remote worker nodes
•  Efficiently matches job size to hardware
•  Provides error handling and recovery

Synapse
•  Hosts raw and processed data for further reuse in public or private projects
•  Provides visibility into intermediate results and algorithmic details
•  Allows programmatic access to data; integration with R
•  Provides standard terminologies for annotations
•  Search across data sets
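
The workflow responsibilities listed for Amazon SWF (maintaining state, tracking step execution, logging history, error handling and recovery) can be illustrated with a small in-memory sketch. This is not the Amazon SWF API: the WorkflowRun class, the step functions, and the retry policy are all hypothetical names chosen for illustration, and a real deployment would dispatch steps to remote workers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class WorkflowRun:
    """Hypothetical in-memory coordinator (an assumption, not Amazon SWF)."""
    steps: List[Tuple[str, Callable[[dict], dict]]]
    max_attempts: int = 2
    state: Dict[str, str] = field(default_factory=dict)   # step name -> status
    history: List[str] = field(default_factory=list)      # ordered event log

    def run(self, data: dict) -> dict:
        for name, _ in self.steps:
            self.state[name] = "pending"
        for name, func in self.steps:
            for attempt in range(1, self.max_attempts + 1):
                self.history.append(f"{name}: attempt {attempt} started")
                try:
                    data = func(data)
                except Exception as exc:
                    # Record the failure and retry if attempts remain.
                    self.history.append(f"{name}: attempt {attempt} failed ({exc})")
                    self.state[name] = "failed"
                else:
                    self.history.append(f"{name}: attempt {attempt} completed")
                    self.state[name] = "completed"
                    break
            if self.state[name] != "completed":
                break  # later steps stay "pending"
        return data


def normalize(data: dict) -> dict:
    """Center the raw values on their mean (toy stand-in for normalization)."""
    values = data["raw"]
    center = sum(values) / len(values)
    return {**data, "normalized": [v - center for v in values]}


_calls = {"summarize": 0}


def summarize(data: dict) -> dict:
    """Fails on the first call to simulate an unavailable worker."""
    _calls["summarize"] += 1
    if _calls["summarize"] == 1:
        raise RuntimeError("worker unavailable")
    return {**data, "total": sum(data["normalized"])}


run = WorkflowRun(steps=[("normalize", normalize), ("summarize", summarize)])
result = run.run({"raw": [1.0, 2.0, 3.0]})
```

After the run, `run.state` holds the status of each step and `run.history` holds the ordered log, including the failed first attempt of the second step.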

Synapse Roadmap

Release stages: Internal Alpha, Public Beta Testing, Synapse 1.0, Synapse 1.5, Future
Timeline: Q1-2012, Q2-2012, Q3-2012, Q4-2012, Q1-2013, Q2-2013, Q3-2013, Q4-2013

Synapse Platform Functionality
•  Data Repository
•  Projects and security
•  R integration
•  Analysis provenance
•  Publishing figures
•  Search
•  Controlled Vocabularies
•  Governance of restricted data
•  Wiki & collaboration tools
•  Integrated management of cloud resources
•  Workflow templates
•  Social networking
•  User-customized dashboards
•  R Studio integration
•  Curation tool integration

Data / Analysis Capabilities
•  TCGA
•  METABRIC breast cancer challenge
•  40+ manually curated clinical studies
•  8000+ GEO / Array Express datasets
•  Clinical, genomic, compound sensitivity
•  Bioconductor and custom R analysis
•  Predictive modeling workflows
•  Automated processing of common genomics platforms
•  TBD: Integrations with other visualization and analysis packages

INTEROPERABILITY
SYNAPSE, Genome Pattern, CYTOSCAPE, tranSMART, I2B2

Five Pilots involving Sage Bionetworks

•  CTCAP
•  The Federation
•  Portable Legal Consent
•  Sage Congress Project
•  Arch2POCM

Clinical Trial Comparator Arm
Partnership (CTCAP)
  Description: Collate, Annotate, Curate and Host Clinical Trial Data
with Genomic Information from the Comparator Arms of Industry and
Foundation Sponsored Clinical Trials: Building a Site for Sharing
Data and Models to evolve better Disease Maps.
  Public-Private Partnership of leading pharmaceutical companies,
clinical trial groups and researchers.
  Neutral Conveners: Sage Bionetworks and Genetic Alliance
[nonprofits].
  Initiative to share existing trial data (molecular and clinical) from
non-proprietary comparator and placebo arms to create a powerful
new tool for drug development.

Started Sept 2010

Sharing clinical/genomic data and analysis will
maximize clinical impact and enable discovery

•  Graphic of curated to QCed to models

How can we accelerate the pace of scientific discovery?
[Figure: timeline, 2008 to 2011]

Ways to move beyond
“traditional” collaborations?

Intra-lab vs Inter-lab
Communication

Colrain/ Industrial PPPs Academic
Unions

Sage Federation: model of biological age

[Figure: Predicted Age (liver expression) plotted against Chronological Age (years), with regions labeled Faster Aging and Slower Aging. The Age Differential is examined for Clinical Association (Gender, BMI, Disease), Genotype Association, and Gene/Pathway Expression.]
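
A minimal sketch of the age differential idea, under stated assumptions: the slide does not give the model, so this uses a single hypothetical expression score per sample, an ordinary least squares fit of chronological age on that score, and defines the differential as predicted age minus chronological age. The numbers are invented for illustration, and reading a positive differential as "faster aging" is an interpretation of the figure labels, not something the slide states.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def age_differential(predicted, chronological):
    """Predicted minus chronological age (assumed definition)."""
    return [p - c for p, c in zip(predicted, chronological)]


# Hypothetical data: one liver expression score per sample.
expression_score = [1, 2, 3, 4]
chronological_age = [30, 40, 60, 70]

slope, intercept = fit_line(expression_score, chronological_age)
predicted_age = [intercept + slope * s for s in expression_score]
differential = age_differential(predicted_age, chronological_age)

# Assumed reading: positive differential = looks older than actual age.
labels = ["faster" if d > 0 else "slower" for d in differential]
```

The differential values could then be tested for association with clinical variables or genotypes, which is the downstream use the figure indicates.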

Reproducible science == shareable science

Sweave: combines programmatic analysis with narrative

Dynamic generation of statistical reports
using literate data analysis

Friedrich Leisch. Sweave: Dynamic generation of statistical reports
using literate data analysis. In Wolfgang Härdle and Bernd Rönz, editors, Compstat 2002 –
Proceedings in Computational Statistics, pages 575-580.
Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9

For 11/12 compounds, the #1 predictive feature in an unbiased
analysis corresponds to the known stratifier of sensitivity
[Figure: per-compound panels of ranked predictive features. Feature labels with ranks as shown: CML lineage (#1, #2), EGFR mut (#1), ERBB2 expr (#1), BRAF mut (#1), HGF expr (#1), NRAS mut (#2), KRAS mut (#3), TP53 mut (#2), CDKN2A copy (#3), MDM2 expr (#1).]

Can the approach make new discoveries?

Presentation outline

1) Predicting drug response from cancer cell lines
2) Future approaches: network-based predictors and multi-task learning
3) Standardized workflows for data management, versioning and method comparison

Cancer cell line encyclopedia
  Molecular characterization (1,000 cell lines)
    Currently: mRNA, copy number, somatic mutations (36 cancer-related genes)
    In progress: targeted exon sequencing, epigenetics, microRNA, lncRNA, phospho-tyrosine kinase, metabolites
  Viability screens (500 cell lines, 24 compounds)
    Small molecule screen
  Predictive model

Network / pathway prior information (Vaske, et al.)

TCGA / ICGC: Transfer learning
  Molecular characterization (50 tumor types): genomics, transcriptomics, epigenetics
  Clinical data
  Predictive model

1)  Data management APIs to load standardized objects, e.g. R ExpressionSets (Matt Furia):

    ccleFeatureData <- getEntity(ccleFeatureDataId)
    ccleResponseData <- getEntity(ccleResponseDataId)
    tcgaFeatureData <- getEntity(tcgaFeatureDataId)
    tcgaResponseData <- getEntity(tcgaResponseDataId)

2)  Automated, standardized workflows for curation and QC of large-scale datasets (Brig Mecham).
    A. TCGA: Automated cloud-based processing.
    B. GEO / Array Express: Normalization workflows, curation of phenotype using standard ontologies.
    C. Additional studies with genetic and phenotypic data in Sage repository (e.g. CCLE and Sanger cell line datasets)

    Observed Data = Systematic Variation + Random Variation
    Normalization: Remove the influence of adjustment variables on data...

3)  Pluggable API to implement predictive modeling algorithms.
    A)  Support for all commonly used machine learning methods (for automated benchmarking against new methods)
    B)  Pluggable custom methods as R classes implementing customTrain() and customPredict() methods.
        A)  Can be arbitrarily complex (e.g. pathway and other priors)
        B)  Support for parallelization in for each loops.

4)  Statistical performance assessment across models.
    [Diagram: custom model 1, custom model 2, ..., custom model N]

5)  Output of candidate biomarkers and feature evaluation (e.g. GSEA, pathway analysis)

6)  Experimental follow-up on top predictions (TBD)
    E.g. for cell lines: medium throughput suppressor / enhancer screens of drug sensitivity for knockdown / overexpression of predicted biomarkers.
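
The pluggable-model idea (custom methods exposing a train and a predict step, then a performance comparison across models) can be sketched as follows. The slides describe R classes with customTrain() and customPredict(); this Python analogue is an illustration only, not the Synapse API. The class names, the toy data, and the choice of mean squared error as the performance measure are all assumptions.

```python
from statistics import mean


class MeanModel:
    """Baseline: always predicts the mean training response."""

    def custom_train(self, features, response):
        self.value = mean(response)
        return self

    def custom_predict(self, features):
        return [self.value for _ in features]


class SingleFeatureModel:
    """Least squares fit on one feature column (hypothetical custom method)."""

    def __init__(self, column):
        self.column = column

    def custom_train(self, features, response):
        x = [row[self.column] for row in features]
        mx, my = mean(x), mean(response)
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, response))
        self.slope = sxy / sxx
        self.intercept = my - self.slope * mx
        return self

    def custom_predict(self, features):
        return [self.intercept + self.slope * row[self.column] for row in features]


def mean_squared_error(predicted, observed):
    return mean((p - o) ** 2 for p, o in zip(predicted, observed))


def benchmark(models, train_x, train_y, test_x, test_y):
    """Train every model on the same data and score it on held-out data."""
    scores = {}
    for name, model in models.items():
        model.custom_train(train_x, train_y)
        scores[name] = mean_squared_error(model.custom_predict(test_x), test_y)
    return scores


# Toy data: the response depends only on the first feature column.
train_x = [[0, 5], [1, 3], [2, 8], [3, 1]]
train_y = [1, 3, 5, 7]
test_x = [[4, 2], [5, 9]]
test_y = [9, 11]

scores = benchmark(
    {"mean": MeanModel(), "single_feature": SingleFeatureModel(column=0)},
    train_x, train_y, test_x, test_y,
)
best = min(scores, key=scores.get)
```

Because every model exposes the same two methods, a new method can be added to the comparison without changing the benchmarking loop.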

Portable Legal Consent
(Activating Patients)
John Wilbanks

Sage Congress Project
April 20 2012

•  RealNames Parkinson’s Project
•  Revisiting Breast Cancer Prognosis
•  Fanconi’s Anemia
(Responders Competitions, IBM-DREAM)

THE QUICK WIN, FAST FAIL DRUG DEVELOPMENT PARADIGM

[Figure: two pipelines compared. TRADITIONAL (scarcity of drug discovery): Preclinical development, Phase I, Phase II, Phase III, Launch; test each scarce molecule thoroughly; cost rising from $ to $$$$. QUICK WIN, FAST FAIL (abundance of drug discovery): Preclinical development, POC, Confirmation and dose finding, Commercialization, Launch; higher p(TS); R&D ‘sweet spot’. Milestones marked on both: CS, FHD, FED, PD.]

•  Increase critical information content early to shift attrition to cheaper phase
•  Use savings from shifted attrition to re-invest in the R&D ‘sweet spot’

Source: Nature Publishing Group
March 1, 2012 Confidential | © 2012 Third Rock Ventures

Arch2POCM

Restructuring the Precompetitive Space for Drug Discovery

How to potentially De-Risk High-Risk Therapeutic Areas

Arch2POCM: Highlights
A PPP To De-Risk Novel Targets That The Pharmaceutical Industry Can
Then Use To Accelerate The Development of New and Effective Medicines
•  The Arch2POCM will be a charitable Public Private Partnership (PPP) that will file no patents and
whose scientific plan (including target selection) will be endorsed by its pharmaceutical, private
and public funders

•  Arch2POCM will de-risk novel targets by developing and using pairs of test compounds (two
different chemotypes) that interact with the selected targets: the compounds will be developed
through Phase IIb clinical trials to determine if the selected target plays a role in the biology of
human disease

•  Arch2POCM will work with and leverage patient groups and clinical CROs to enable patient
recruitment, and with regulators to design novel studies and to validate novel biomarkers

•  Arch2POCM will make its GMP test compounds available to academic groups and foundations so
they can use them to perform clinical studies and publish on a multitude of additional indications

•  Arch2POCM will release all reagents and data to the public at pre-defined stages in its drug
development process. To ensure scientific quality, data and reagents will be released once they
have been vetted by an independent scientific committee

•  Arch2POCM will publish all negative POCM data immediately in order to reduce the number of
ongoing redundant proprietary studies (in pharma, biotech and academia) on an invalidated
target and thereby
–  minimize unnecessary patient exposure
–  provide significant economic savings for the pharmaceutical industry

•  In the rare instance in which a molecule achieves positive POCM, Arch2POCM will ensure that
the compound has the ability to reach the market by arranging for exclusive access to the
proprietary IND database for the molecule

Arch2POCM: scale and scope
•  Proposed Goal: Initiate 2 programs. One for Oncology/Epigenetics/
Immunology. One for Neuroscience/Schizophrenia/Autism. Both
programs will have 6-8 drug discovery projects (targets) - ramped up
over a period of 2 years

–  It is envisioned that Arch2POCM’s funding partners will select targets
that are judged as slightly too risky to be pursued at the top of pharma’s
portfolio, but that have significant scientific potential that could benefit
from Arch2POCM’s crowdsourcing effort

•  These will be executed over a period of 5 years making a total of 16
drug discovery projects

–  Projected pipeline attrition by Year 5 (assuming 12 targets loaded in
early discovery)
•  30% will enter Phase 1
•  20% will deliver Ph 2 POCM data
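
As a worked check of the projected attrition, the percentages can be converted into expected project counts. This sketch assumes both percentages apply to the 12 targets loaded in early discovery, which is how the slide reads; the function name is hypothetical.

```python
def expected_projects(targets_loaded: int, percent: int) -> float:
    """Expected number of projects for a given percentage of loaded targets."""
    return targets_loaded * percent / 100


targets_loaded = 12
entering_phase_1 = expected_projects(targets_loaded, 30)
delivering_poc = expected_projects(targets_loaded, 20)

print(f"Entering Phase 1: about {entering_phase_1:.1f} of {targets_loaded}")
print(f"Delivering Ph 2 POCM data: about {delivering_poc:.1f} of {targets_loaded}")
```

Under that assumption, roughly 3 to 4 projects would enter Phase 1 and roughly 2 would deliver Ph 2 POCM data by Year 5.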

Arch2POCM: proposed funding strategy

–  Arch2POCM funding will come from a combination of public
funding from governments and private sector funding from
pharmaceutical and biotechnology companies and from private
philanthropists

–  By investing $1.6 M annually into one or both of Arch2POCM’s
selected disease areas, partnered pharmaceutical companies:
1.  obtain a vote on Arch2POCM target selection
2.  gain real time data access to Arch2POCM’s 12-16 drug discovery
projects
3.  have the strategic opportunity to expand their overall portfolio

Entry points for Arch2POCM programs:
Two compounds (different chemotypes) will be advanced per target

Pioneer targets
- genomic/genetic
- disease networks
- academic partners
- private partners
- SAGE, SGC

[Figure: pipeline stages Lead identification, Lead optimisation, Preclinical, Phase I, Phase II, with the corresponding assets Assay, in vitro probe, Lead, Clinical candidate, Phase I asset, Phase II asset.]

Stage-gate 1: Early Discovery and PCC Compounds (75%)
Stage-gate 2: Pharma’s re-purposed clinical assets (25%)

Pipeline flow for Arch2POCM
Five Year Objective: Initiate ≈ 8 drug discovery projects with 6 entering in Early Discovery, one entering in
pre-clinical and one entering in PH I

Months: 0-6, 7-12, 13-18, 19-24, 25-30, 31-36, 37-42, 43-48, 49-54, 55-60

Year #1 Arch2POCM Target Load
•  Early discovery (2), followed by Pre-clinical, Ph 1, Ph 2
•  Pre-clinical (1), followed by Ph 1, Ph 2

Year #2 Arch2POCM Target Load
•  Early discovery (4), followed by Pre-clinical, Ph 1
•  Ph 1 (1), followed by Ph 2

[Chart residue: an unplaced value of 1.3 also appears in the Year #1 row.]

PTRS* by stage
•  Early discovery: 45%
•  Pre-clinical: 70%
•  Ph I: 65%
•  Ph II: 10%

Arch2POCM Snapshot at Year 5
•  Targets Loaded: 8
•  Projected INDs filed: 3-4
•  Ph 1 or 2 Trials In Progress: 2
•  Projected Complete Ph 2 (POCM) Data Sets: 1

*PTRS = Probability of technical and regulatory success
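
One way to relate the stage PTRS values to the Year 5 snapshot is to compute the expected number of projects that reach Phase 1. This is a sketch under several assumptions that the slide does not state: each PTRS is read as the probability of completing that stage, a project reaching Phase 1 is counted as an IND filed, and timing within the five years is ignored. The target loads (6 in early discovery, 1 in pre-clinical, 1 in Ph I) come from the five year objective.

```python
# Stage success probabilities from the chart legend.
PTRS = {"early_discovery": 0.45, "pre_clinical": 0.70, "phase_1": 0.65, "phase_2": 0.10}

# Entry points from the five year objective.
TARGET_LOAD = {"early_discovery": 6, "pre_clinical": 1, "phase_1": 1}


def expected_reaching_phase_1(load: dict, ptrs: dict) -> float:
    """Expected projects reaching Phase 1 (assumed proxy for INDs filed)."""
    # Early discovery entrants must pass early discovery and pre-clinical.
    from_early = load["early_discovery"] * ptrs["early_discovery"] * ptrs["pre_clinical"]
    # Pre-clinical entrants must pass pre-clinical only.
    from_pre_clinical = load["pre_clinical"] * ptrs["pre_clinical"]
    # Assets entering at Phase 1 are already there.
    from_phase_1 = load["phase_1"]
    return from_early + from_pre_clinical + from_phase_1


expected_inds = expected_reaching_phase_1(TARGET_LOAD, PTRS)
print(f"Expected projects reaching Phase 1: {expected_inds:.2f}")
```

Under these assumptions the expected count is about 3.6, which falls within the 3-4 projected INDs shown in the snapshot; the slide itself does not say how that projection was derived.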

The case for epigenetics/chromatin biology

1.  There are epigenetic oncology drugs on the market (HDACs)

2.  A growing number of links to oncology, notably many genetic links (i.e.
fusion proteins, somatic mutations)

3.  A pioneer area: More than 400 targets amenable to small molecule
intervention - most of which only recently shown to be “druggable”, and
only a few of which are under active investigation

4.  Open access, early-stage science is developing quickly – significant
collaborative efforts (e.g. SGC, NIH) to generate proteins, structures,
assays and chemical starting points


The current epigenetics universe

Domain Family                 Typical substrate class*              Total Targets
Histone Lysine demethylase    Histone/Protein K/R(me)n/ (meCpG)     30
Bromodomain                   Histone/Protein K(ac)                 57
Tudor domain                  Histone Kme2/3 - Rme2s                59
Chromodomain                  Histone/Protein K(me)3                34
MBT repeat                    Histone K(me)3                        9
PHD finger                    Histone K(me)n                        97
Acetyltransferase             Histone/Protein K                     17
Methyltransferase             Histone/Protein K&R                   60
PARP/ADPRT                    Histone/Protein R&E                   17
MACRO                         Histone/Protein (p)-ADPribose         15
Histone deacetylases          Histone/Protein KAc                   11
Total                                                               395

A vertical label ROYAL runs alongside the rows starting at Tudor domain.

Now known to be amenable to small molecule inhibition

Why is Arch2POCM a “smart bet” for Pharma
investment?
Arch2POCM: an external epigenetic think tank from which Pharma can load the most likely to succeed targets as proprietary programs or leverage Arch2POCM results for its other internal efforts

•  A front row seat on the progression of 6-8 epigenetic targets means that:
–  Pharma can select the epigenetic targets that best complement their internal portfolio and for which there is the greatest interest
–  Pharma can structure Arch2POCM’s projects so that key objectives line up with internal go/no-go decisions
–  Pharma can use Arch2POCM data to trigger its internal level of investment on a particular target
–  Pharma can use Arch2POCM resources to enrich their internal epigenetics effort: active chemotypes, assays, pre-clinical models, biomarkers, genetic and phenotypic data for patient stratification, relationships to epigenetic experts

•  Pharma can use Arch2POCM’s lead compound chemotypes to:
–  inform their proprietary medicinal chemistry efforts on the target
–  identify chemical scaffolds that impact epigenetic pathways: a proprietary combination therapy opportunity

•  Toxicity screening of Arch2POCM compounds with FDA tools can be used to guide internal proprietary chemistry efforts in oncology, inflammation and beyond

•  Arch2POCM’s crowd of scientists and clinicians provides its Pharma partners with parallel shots on goal at the best context for Arch2POCM’s compounds/targets

How will Arch2POCM provide “line of sight” to new
medicines?

Arch2POCM will partner with scientists, clinicians and CROs that:

•  use “Omics” approaches to construct predictive models of disease networks
(genomic, proteomic, signaling and metabolic)

•  have strategies available to identify those disease network gene(s) which
when perturbed, impact the overall functioning of the network

•  already have epigenetic assays in place to identify chemotype structures
(from discovery and/or pharma’s re-purposed un-used clinical assets) that
impact the target and disease-correlated molecular phenotypes

•  already have biomarker tools available that can be tested for correlation to
Arch2POCM’s targets

•  already have access to patient data and/or patient groups to mine for
genetic and phenotypic signatures that may represent best responders for
Arch2POCM clinical trials

How will Arch2POCM provide “line of sight” to new
medicines?

•  Arch2POCM’s Ph II validation of high risk high opportunity targets
focuses Pharma’s NME efforts
•  Positive POCM data: De-risked validated targets for Pharma development
•  Negative POCM data: public release of this data minimizes the amount of time
and money that Pharma and the industry place on failed targets

•  Arch2POCM’s clinical candidate compounds provide Pharma with
multiple paths to new medicines
•  Arch2POCM compounds that achieve POCM can be advanced into Ph 3 by
Arch2POCM Members
•  The purchaser of Arch2POCM’s IND database obtains a significant time advantage
over competitors to generate Phase III data and proceed to market
•  NMEs that derive from Arch2POCM will launch with database exclusivity protections:
5-8 years to garner a return on investment

•  The crowd’s testing of Arch2POCM compounds may identify alternative/better
contexts for agonizing/antagonizing the disease biology target
•  indications
•  patient stratification
•  combination therapy options


Arch2POCM: current partnering status
•  Pharmaceutical Funding Partners
–  Three companies are considering a potential role as industry anchors for Arch2POCM
–  Two companies have demonstrated interest in Arch2POCM and their company leadership wants to
go to next step- awaiting face to face discussions to go over agreement

•  Public Funding Partners
–  Good progress is being made to obtain financial backing for Arch2POCM from public funders in a
number of countries (Canada, United Kingdom and Sweden) for both epigenetics and for CNS
–  Ontario Brain Institute, Canada has allocated $3M to the development of an autism clinical network that is
committed to work with Arch2POCM
•  Philanthropic Funding Partners: awaiting designation of anchor partners
•  In kind partners
–  GE Healthcare (imaging): lead diagnostics partner and willing to share its experimental oncology
biomarkers
–  Cancer Research UK: through some of its drug discovery and development resources considering
participating in Arch2POCM through “in kind efforts”
•  Academic partners
–  Institutions that have indicated willingness to let their scientists participate without patent filing:
UCSF, Massachusetts General Hospital, University of North Carolina, University of Toronto, Oxford
University, Karolinska Institute
–  Academic community of epigenetic experts/resources already identified

•  Regulatory partners: Because the objective of the Arch2POCM PPP is to probe and
elucidate disease biology as opposed to develop new proprietary products, FDA and
EMEA are ready to play an active role (toxicity screens, and legacy clinical trial data)
•  Patient group partners: leaders from Genetic Alliance, Inspire2Live and the Love Avon
Army of Women are actively engaged

STRATEGIC INFLECTION: FORCES AFFECTING A BUSINESS

[Figure: forces affecting a business. Labels: Society’s Needs, Customers, Academia, Businesses, Government, Suppliers, New Competitors, New Technologies]


Networking Disease Model Building
