Stephen Friend Haas School of Business 2012-03-05

The Future of Open Innovation: Development and Use of Therapies

End of the Era of Medical Guilds and Alchemy

Moving beyond the Medical Industrial Complex

Stephen Friend MD PhD
Sage Bionetworks (Non-Profit Organization)
Seattle/ Beijing/ Amsterdam

UC Berkeley Haas School of Business
Topics in Innovation
March 5, 2012

•  New ways of Building Models of Disease

•  What prevents us from building them?

•  What is Sage Bionetworks?

•  Review of Six Pilots

•  So what are the next steps?

What is the problem?

Most approved therapies were assumed to be monotherapies for diseases representing homogenous populations

Our existing disease models often assume pathway knowledge sufficient to infer correct therapies

The value of appropriate representations/ maps

“Data Intensive” Science- Fourth Scientific Paradigm

Equipment capable of generating
massive amounts of data

IT Interoperability

Open Information System

Host evolving computational models
in a “Compute Space”

WHY NOT USE “DATA INTENSIVE” SCIENCE TO BUILD BETTER DISEASE MAPS?

what will it take to understand disease?

DNA

RNA
PROTEIN
(dark matter)

MOVING BEYOND ALTERED COMPONENT LISTS

2002 Can one build a “causal” model?

Preliminary Probabilistic Models- Rosetta /Schadt

Networks facilitate direct
identification of genes that are
causal for disease
Evolutionarily tolerated weak spots

Gene symbol  Gene name                                   Variance of OFPM explained  Mouse model  Source
                                                         by gene expression*
Zfp90        Zinc finger protein 90                      68%                         tg           Constructed using BAC transgenics
Gas7         Growth arrest specific 7                    68%                         tg           Constructed using BAC transgenics
Gpx3         Glutathione peroxidase 3                    61%                         tg           Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12]
Lactb        Lactamase beta                              52%                         tg           Constructed using BAC transgenics
Me1          Malic enzyme 1                              52%                         ko           Naturally occurring KO
Gyk          Glycerol kinase                             46%                         ko           Provided by Dr. Katrina Dipple (UCLA) [13]
Lpl          Lipoprotein lipase                          46%                         ko           Provided by Dr. Ira Goldberg (Columbia University, NY) [11]
C3ar1        Complement component 3a receptor 1          46%                         ko           Purchased from Deltagen, CA
Tgfbr2       Transforming growth factor beta receptor 2  39%                         ko           Purchased from Deltagen, CA

Nat Genet (2005) 205:370

DIVERSE POWERFUL USE OF MODELS AND NETWORKS

List of Influential Papers in Network Modeling

  50 network papers
  http://sagebase.org/research/resources.php

“Data Intensive” Science- Fourth Scientific Paradigm
Score Card for Medical Sciences

Equipment capable of generating
massive amounts of data A-

IT Interoperability D

Open Information System D-

Host evolving computational models
in a “Compute Space” F

We still consider much clinical research as if we were
hunter-gatherers - not sharing.

TENURE

FEUDAL STATES

Clinical/genomic data
are accessible but minimally usable

Little incentive to annotate and curate
data for other scientists to use

Mathematical models of disease are not built to be
reproduced or versioned by others

Lack of standard forms for future rights and consents

Sage Mission
Sage Bionetworks is a non-profit organization with a vision to
create a commons where integrative bionetworks are evolved by
contributor scientists with a shared vision to accelerate the
elimination of human disease

Building Disease Maps Data Repository

Commons Pilots Discovery Platform
Sagebase.org

Sage Bionetworks Collaborators

  Pharma Partners
  Merck, Pfizer, Takeda, AstraZeneca, Amgen, Johnson & Johnson

  Foundations
  Kauffman, CHDI, Gates Foundation

  Government
  NIH, LSDF, NCI

  Academic
  Levy (Framingham)
  Rosengren (Lund)
  Krauss (CHORI)

  Federation
  Ideker, Califano, Nolan, Schadt

ALZHEIMER’S

What is this?
Bayesian networks enriched in inflammation genes correlated with disease severity in pre-frontal cortex of 250 Alzheimer’s patients.

What does it mean?
Inflammation in AD is an interactive multi-pathway system. More broadly, network structure organizes complex disease effects into coherent sub-systems and can prioritize key genes.

Are you joking?
Gene validation shows novel key drivers increase Abeta uptake and decrease neurite length through an ROS burst (highly relevant to AD pathology).

A multi-tissue immune-driven theory of weight loss

[Figure labels: Hypothalamus; Leptin signaling; Fatty acids; Macrophage/inflammation; Liver; Adipose; M1 macrophage; Phagocytosis-induced lipolysis]

PLATFORM
Sage Platform and Infrastructure Builders-
( Academic Biotech and Industry IT Partners...)

PILOTS= PROJECTS FOR COMMONS
Data Sharing Commons Pilots-
(Federation, CCSB, Inspire2Live....)

Why not share clinical/genomic data and model building in the
ways currently used by the software industry
(power of tracking workflows and versioning)?

Leveraging Existing Technologies

Addama

Taverna
tranSMART

sage bionetworks synapse project
Watch What I Do, Not What I Say

Most of the People You Need to Work with Don’t Work with You

My Other Computer is Cloudera Amazon Google

Sage Metagenomics Project

Processed Data
(S3)

•  > 10k genomic and expression standardized datasets indexed in SCR
•  Error detection, normalization in mG
•  Access raw or processed data via download or API in downstream analysis
•  Building towards open, continuous community curation

Sage Metagenomics using Amazon Simple Workflow

Full case study at http://aws.amazon.com/swf/testimonials/swfsagebio/

Synapse Roadmap

Timeline: Q1-2012 Q2-2012 Q3-2012 Q4-2012 Q1-2013 Q2-2013 Q3-2013 Q4-2013
Phases: Internal Alpha, Public Beta Testing, Synapse 1.0, Synapse 1.5, Future

Synapse Platform Functionality
•  Data Repository
•  Projects and security
•  R integration
•  Analysis provenance
•  Publishing figures
•  Search
•  Controlled Vocabularies
•  Governance of restricted data
•  Workflow templates
•  Social networking
•  Wiki & collaboration tools
•  Integrated management of cloud resources
•  User-customized dashboards
•  R Studio integration
•  Curation tool integration

Data / Analysis Capabilities
•  TCGA
•  METABRIC breast cancer challenge
•  40+ manually curated clinical studies
•  8000+ GEO / Array Express datasets
•  Clinical, genomic, compound sensitivity
•  Bioconductor and custom R analysis
•  Predictive modeling workflows
•  Automated processing of common genomics platforms
•  TBD: Integrations with other visualization and analysis packages

Six Pilots involving Sage Bionetworks

CTCAP
Arch2POCM
The Federation
Portable Legal Consent
Sage Congress Project
BRIDGE

Clinical Trial Comparator Arm
Partnership (CTCAP)
  Description: Collate, Annotate, Curate and Host Clinical Trial Data
with Genomic Information from the Comparator Arms of Industry and
Foundation Sponsored Clinical Trials: Building a Site for Sharing
Data and Models to evolve better Disease Maps.
  Public-Private Partnership of leading pharmaceutical companies,
clinical trial groups and researchers.
  Neutral Conveners: Sage Bionetworks and Genetic Alliance
[nonprofits].
  Initiative to share existing trial data (molecular and clinical) from
non-proprietary comparator and placebo arms to create powerful
new tool for drug development.

Started Sept 2010

Clinical/genomic data sharing and analysis will
maximize clinical impact and enable discovery

•  Graphic of curated to QCed to models

Arch2POCM

Restructuring the Precompetitive Space for Drug Discovery

How to potentially De-Risk High-Risk Therapeutic Areas

Arch2POCM: scale and scope
•  Proposed Goal: Initiate 2 programs. One for Oncology/Epigenetics/
Immunology. One for Neuroscience/Schizophrenia/Autism. Both
programs will have 8 drug discovery projects (targets) - ramped up
over a period of 2 years

–  It is envisioned that Arch2POCM’s funding partners will select targets
that are judged as slightly too risky to be pursued at the top of pharma’s
portfolio, but that have significant scientific potential that could benefit
from Arch2POCM’s crowdsourcing effort

•  These will be executed over a period of 5 years making a total of 16
drug discovery projects

–  Projected pipeline attrition by Year 5 (assuming 12 targets loaded in
early discovery)
•  30% will enter Phase 1
•  20% will deliver Ph 2 POCM data
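
As a rough check of the attrition figures above, the percentages can be turned into expected project counts. This is a minimal sketch in Python. It assumes both percentages apply to the 12 targets loaded in early discovery (the slide does not say whether the 20% is a share of all targets or of those entering Phase 1), and the variable names are illustrative only.

```python
# Assumption: both percentages are fractions of the 12 targets loaded
# in early discovery; the slide leaves this open.
TARGETS_LOADED = 12
FRACTION_ENTERING_PHASE_1 = 0.30
FRACTION_DELIVERING_PH2_POCM = 0.20

expected_phase_1 = TARGETS_LOADED * FRACTION_ENTERING_PHASE_1
expected_ph2_pocm = TARGETS_LOADED * FRACTION_DELIVERING_PH2_POCM

print(f"Expected to enter Phase 1: {expected_phase_1:.1f} targets")
print(f"Expected to deliver Ph 2 POCM data: {expected_ph2_pocm:.1f} targets")
```

Under that reading, roughly 3 to 4 targets enter Phase 1 and 2 to 3 deliver Ph 2 POCM data by Year 5.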

Arch2POCM: Highlights
A PPP To De-Risk Novel Targets That The Pharmaceutical Industry Can
Then Use To Accelerate The Development of New and Effective Medicines
•  The Arch2POCM will be a charitable Public Private Partnership (PPP) that will file no patents and
whose scientific plan (including target selection) will be endorsed by its pharmaceutical, private
and public funders
•  Arch2POCM will de-risk novel targets by developing and using pairs of test compounds (two
different chemotypes) that interact with the selected targets: the compounds will be developed
through Phase IIb clinical trials to determine if the selected target plays a role in the biology of
human disease

•  Arch2POCM will work with and leverage patient groups and clinical CROs to enable patient
recruitment, and with regulators to design novel studies and to validate novel biomarkers

•  Arch2POCM will make its GMP test compounds available to academic groups and foundations so
they can use them to perform clinical studies and publish on a multitude of additional indications

•  Arch2POCM will release all reagents and data to the public at pre-defined stages in its drug
development process. To ensure scientific quality, data and reagents will be released once they
have been vetted by an independent scientific committee

•  Arch2POCM will publish all negative POCM data immediately in order to reduce the number of
ongoing redundant proprietary studies (in pharma, biotech and academia) on an invalidated
target and thereby
–  minimize unnecessary patient exposure
–  provide significant economic savings for the pharmaceutical industry

•  In the rare instance in which a molecule achieves positive POCM, Arch2POCM will ensure that
the compound has the ability to reach the market by arranging for exclusive access to the
proprietary IND database for the molecule

Arch2POCM: proposed funding strategy
–  $160-200M over five years is projected as necessary to advance
up to 8 drug discovery projects within each of the two therapeutic
programs

–  Arch2POCM funding will come from a combination of public
funding from governments and private sector funding from
pharmaceutical and biotechnology companies and from private
philanthropists

–  By investing $1.6 M annually into one or both of Arch2POCM’s
selected disease areas, partnered pharmaceutical companies:
1.  obtain a vote on Arch2POCM target selection
2.  have the opportunity to donate existing compounds from their
abandoned clinical programs for re-purposing on Arch2POCM’s targets
3.  gain real time data access to Arch2POCM’s 16 drug discovery
projects
4.  have the strategic opportunity to expand their overall portfolio

Pipeline flow for Arch2POCM
Five Year Objective: Initiate ≈ 8 drug discovery projects with 6 entering in Early Discovery, one entering in
pre-clinical and one entering in PH I

Months → 0-6 7-12 13-18 19-24 25-30 31-36 37-42 43-48 49-54 55-60

Year #1 Arch2POCM Target Load
Early discovery (2) → Pre-clinical → Ph 1 → Ph 2
Pre-clinical (1) → Ph 1 → Ph 2

Year #2 Arch2POCM Target Load
Early discovery (4) → Pre-clinical → Ph 1
Ph 1 (1) → Ph 2

Early discovery (45% PTRS)
Pre-clinical (70% PTRS)
Ph I (65% PTRS)
Ph II (10% PTRS)
*PTRS = Probability of technical and regulatory success

Arch2POCM Snapshot at Year 5
Targets Loaded: 8
Projected INDs filed: 3-4
Ph 1 or 2 Trials In Progress: 2
Projected Complete Ph 2 (POCM) Data Sets: 1
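
One way to read the stage PTRS values together with the five-year objective is as an expected-flow calculation. The sketch below is an illustration only. It assumes each PTRS is the probability that a project advances out of that stage, that stages succeed independently, and that every project has time to progress within the five years. None of these assumptions is stated on the slide, and the function and variable names are hypothetical.

```python
# Stage order and per-stage PTRS as listed on the slide.
STAGES = ["Early discovery", "Pre-clinical", "Ph I", "Ph II"]
PTRS = {"Early discovery": 0.45, "Pre-clinical": 0.70, "Ph I": 0.65, "Ph II": 0.10}
# Projects entering directly at each stage (five-year objective on the slide).
ENTRIES = {"Early discovery": 6, "Pre-clinical": 1, "Ph I": 1, "Ph II": 0}


def expected_flow(stages, ptrs, entries):
    """Return the expected number of projects that reach each stage."""
    reached = {}
    carried = 0.0  # expected projects advancing from the previous stage
    for stage in stages:
        reached[stage] = carried + entries[stage]
        carried = reached[stage] * ptrs[stage]
    return reached


flow = expected_flow(STAGES, PTRS, ENTRIES)
for stage in STAGES:
    print(f"{stage}: {flow[stage]:.2f} projects expected to reach this stage")
```

Under these assumptions about 3.6 projects reach Ph I and about 2.3 reach Ph II, which is in the same range as the snapshot's 3-4 INDs filed and 2 trials in progress. The slide does not say how its own figures were derived.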

The case for epigenetics/chromatin biology

1.  There are epigenetic oncology drugs on the market (HDACs)

2.  A growing number of links to oncology, notably many genetic links (i.e.
fusion proteins, somatic mutations)

3.  A pioneer area: More than 400 targets amenable to small molecule
intervention - most of which only recently shown to be “druggable”, and
only a few of which are under active investigation

4.  Open access, early-stage science is developing quickly – significant
collaborative efforts (e.g. SGC, NIH) to generate proteins, structures,
assays and chemical starting points


Arch2POCM epigenetics program:
Assumptions for launch and completion of Year 1
•  Funding necessary to prosecute 8 epigenetic target-based projects
o  ≈$85M for five years with $15M available for Year 1
•  $1.6M from each of 3 pharma partners ($4.8M)
•  $5M from public funders and $5M from philanthropists
o  Year 1: load 3 targets with 2 in Early Discovery and 1 in pre-clinical stage of development
o  Year 2: load 5 targets with at least one late stage clinical asset from a pharma partner

•  Partners
–  In kind partners
o  GE Healthcare (imaging): open sharing of its experimental oncology biomarkers
o  CRUK: through some of its drug discovery and development resources participating in Arch2POCM
–  Potential academic partner sites
•  Institutions that have indicated willingness to let their scientists participate without patent filing: UCSF,
Massachusetts General Hospital, University of North Carolina, University of Toronto, Oxford University,
Karolinska Institute
•  Costs to fund Arch2POCM academic partners will be defrayed by crowd-sourcing: each funded
investigator will use their own network to amplify what they can do and publish on Arch2POCM targets
–  Patient groups will enable patient recruitment and reduce costs for clinical studies
–  FDA and EMEA team of regulators available
o  Oncology experts available
o  Can provide in vitro screening assays for toxicities and biomarker development to improve patient
selection
o  FDA to help build and host a compliant Arch2POCM data-sharing site

o  Infrastructure that needs to be in place to execute on time
o  Align vendors and CROs prior to initiation of Arch2POCM projects
o  IT and patient database management: harmonization of data-entry across participating clinical collaborators
in place well before start of first Arch2POCM trial
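
The Year 1 funding assumptions on this slide ($1.6M from each of 3 pharma partners, $5M from public funders, $5M from philanthropists, against $15M available for Year 1 and ≈$85M over five years) can be checked with simple arithmetic. This is a minimal Python sketch. The even split of the remaining budget over Years 2-5 is an assumption made here for illustration, not something the slide states.

```python
# Figures in millions of US dollars, taken from the slide's assumptions.
PHARMA_PARTNERS = 3
PER_PARTNER = 1.6
PUBLIC_FUNDERS = 5.0
PHILANTHROPISTS = 5.0
YEAR_1_TARGET = 15.0
FIVE_YEAR_TOTAL = 85.0

pharma_total = PHARMA_PARTNERS * PER_PARTNER
year_1_sources = pharma_total + PUBLIC_FUNDERS + PHILANTHROPISTS
year_1_gap = YEAR_1_TARGET - year_1_sources
# Assumption: the remaining budget is spread evenly over Years 2-5.
later_years_average = (FIVE_YEAR_TOTAL - YEAR_1_TARGET) / 4

print(f"Pharma contribution: ${pharma_total:.1f}M")
print(f"Year 1 sources: ${year_1_sources:.1f}M (gap to $15M: ${year_1_gap:.1f}M)")
print(f"Average needed per year in Years 2-5: ${later_years_average:.1f}M")
```

The listed sources sum to $14.8M, which matches the stated $15M for Year 1 after rounding.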

General benefits of Arch2POCM for drug
development
1.  Arch2POCM’s use of test compounds to de-risk previously unexplored
biology enables drug developers to initiate proprietary drug
development starting from an array of unbiased, clinically validated
targets

2.  Arch2POCM’s crowdsourced research and trials provide the
pharmaceutical industry with parallel shots on goal by aligning test
compounds to the most promising unmet medical needs

3.  The positive and negative clinical trial data that Arch2POCM and the
crowd produce and publish will increase clinical success rates (as one
can pick targets and indications more smartly) and will save the
pharmaceutical industry money by reducing redundant proprietary
efforts on failed targets

Why is Arch2POCM a “smart bet” for Pharma
investment?
Arch2POCM: an external epigenetic think tank from which Pharma can load the most likely to succeed targets as proprietary programs or leverage Arch2POCM results for its other internal efforts

•  A front row seat on the progression of 8 epigenetic targets means that:
–  Pharma can select the epigenetic targets that best complement their internal portfolio and for which there is the greatest interest
–  Pharma can structure Arch2POCM’s projects so that key objectives line up with internal go/no-go decisions
–  Pharma can use Arch2POCM data to trigger its internal level of investment on a particular target
–  Pharma can use Arch2POCM resources to enrich their internal epigenetics effort: active chemotypes, assays, pre-clinical models, biomarkers, genetic and phenotypic data for patient stratification, relationships to epigenetic experts

•  Pharma can use Arch2POCM’s lead compound chemotypes to:
–  inform their proprietary medicinal chemistry efforts on the target
–  identify chemical scaffolds that impact epigenetic pathways: a proprietary combination therapy opportunity

•  Toxicity screening of Arch2POCM compounds with FDA tools can be used to guide internal proprietary chemistry efforts in oncology, inflammation and beyond

•  Arch2POCM’s crowd of scientists and clinicians provides its Pharma partners with parallel shots on goal at the best context for Arch2POCM’s compounds/targets

How will Arch2POCM provide “line of sight” to new
medicines?

•  Arch2POCM’s Ph II validation of high risk high opportunity targets
focuses Pharma’s NME efforts
•  Positive POCM data: De-risked validated targets for Pharma development
•  Negative POCM data: public release of this data minimizes the amount of time
and money that Pharma and the industry place on failed targets

•  Arch2POCM’s clinical candidate compounds provide Pharma with
multiple paths to new medicines
•  Arch2POCM compounds that achieve POCM can be advanced into Ph 3 by
Arch2POCM Members
•  The purchaser of Arch2POCM’s IND database obtains a significant time advantage
over competitors to generate Phase III data and proceed to market
•  NMEs that derive from Arch2POCM will launch with database exclusivity protections:
5-8 years to garner a return on investment

•  The crowd’s testing of Arch2POCM compounds may identify alternative/better
contexts for agonizing/antagonizing the disease biology target
•  indications
•  patient stratification
•  combination therapy options


How can we accelerate the pace of scientific discovery?
2008
2009
2010
2011

Ways to move beyond
“traditional” collaborations?

Intra-lab vs Inter-lab
Communication

Colrain/ Industrial PPPs Academic
Unions

sage federation:
model of biological age

[Figure: Predicted Age (liver expression) plotted against Chronological Age (years), with regions labeled Faster Aging and Slower Aging. Age Differential is related to Clinical Association (Gender, BMI, Disease), Genotype Association, and Gene Pathway Expression.]
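
The age differential shown on this slide can be illustrated with a small calculation. The sketch below assumes the differential is predicted age minus chronological age and that a positive value corresponds to "Faster Aging". The slide does not define either, and the sample values and function names are hypothetical.

```python
def age_differential(predicted_age, chronological_age):
    """Predicted (expression-based) age minus chronological age, in years."""
    return predicted_age - chronological_age


def aging_label(differential):
    """Label a sample by the sign of its age differential (assumed convention)."""
    if differential > 0:
        return "Faster Aging"
    if differential < 0:
        return "Slower Aging"
    return "As expected"


# Hypothetical samples: (sample id, predicted age, chronological age).
samples = [("s1", 62.0, 55.0), ("s2", 40.0, 47.0), ("s3", 30.0, 30.0)]

labels = {}
for sample_id, predicted, chronological in samples:
    diff = age_differential(predicted, chronological)
    labels[sample_id] = aging_label(diff)
    print(sample_id, diff, labels[sample_id])
```

A differential computed this way is the quantity that could then be tested for association with clinical variables, genotype, or pathway expression.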

Reproducible science==shareable science

Sweave: combines programmatic analysis with narrative

Dynamic generation of statistical reports
using literate data analysis

Friedrich Leisch. Sweave: Dynamic generation of statistical reports
using literate data analysis. In Wolfgang Härdle and Bernd Rönz, editors, Compstat 2002 –
Proceedings in Computational Statistics, pages 575-580.
Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9

Federated Aging Project: Combining analysis + narrative

[Figure labels: Sage Lab; Califano Lab; Ideker Lab; R code + narrative = Sweave Vignette; PDF / HTML (plots + text + code snippets); Data objects; Shared Data Repository; JIRA: Source code repository & wiki; Submitted Paper]

1)  Data management APIs to load standardized objects, e.g. R ExpressionSets (Matt Furia):

    ccleFeatureData <- getEntity(ccleFeatureDataId)
    ccleResponseData <- getEntity(ccleResponseDataId)
    tcgaFeatureData <- getEntity(tcgaFeatureDataId)
    tcgaResponseData <- getEntity(tcgaResponseDataId)

2)  Automated, standardized workflows for curation and QC of large-scale datasets (Brig Mecham).
    A.  TCGA: Automated cloud-based processing.
    B.  GEO / Array Express: Normalization workflows, curation of phenotype using standard ontologies.
    C.  Additional studies with genetic and phenotypic data in Sage repository (e.g. CCLE and Sanger cell line datasets)

    Observed Data = Systematic Variation + Random Variation
    Normalization: Remove the influence of adjustment variables on data...

3)  Pluggable API to implement predictive modeling algorithms.
    A)  Support for all commonly used machine learning methods (for automated benchmarking against new methods)
    B)  Pluggable custom methods as R classes implementing customTrain() and customPredict() methods.
        A)  Can be arbitrarily complex (e.g. pathway and other priors)
        B)  Support for parallelization in foreach loops.

4)  Statistical performance assessment across models.
    [Figure labels: custom model 1, custom model 2, custom model N]

5)  Output of candidate biomarkers and feature evaluation (e.g. GSEA, pathway analysis)

6)  Experimental follow-up on top predictions (TBD)
    E.g. for cell lines: medium throughput suppressor / enhancer screens of drug sensitivity for knockdown / overexpression of predicted biomarkers.
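
The pluggable modeling idea in steps 3 and 4 (custom methods exposing a train and a predict method, then compared across models) can be sketched as follows. The slides describe R classes implementing customTrain() and customPredict(). This is only a Python analog for illustration, not the Synapse API. The class names, the toy data, and the choice of mean squared error as the performance measure are all assumptions.

```python
from abc import ABC, abstractmethod
from statistics import mean


class CustomModel(ABC):
    """Hypothetical plug-in interface mirroring customTrain()/customPredict()."""

    @abstractmethod
    def custom_train(self, features, responses):
        """Fit the model to rows of features and their responses."""

    @abstractmethod
    def custom_predict(self, features):
        """Return one predicted response per row of features."""


class MeanModel(CustomModel):
    """Baseline that always predicts the mean training response."""

    def custom_train(self, features, responses):
        self.mean_ = mean(responses)

    def custom_predict(self, features):
        return [self.mean_ for _ in features]


class OneFeatureLinearModel(CustomModel):
    """Least-squares line fitted on the first feature only."""

    def custom_train(self, features, responses):
        xs = [row[0] for row in features]
        x_bar, y_bar = mean(xs), mean(responses)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, responses))
        self.slope_ = sxy / sxx if sxx else 0.0
        self.intercept_ = y_bar - self.slope_ * x_bar

    def custom_predict(self, features):
        return [self.intercept_ + self.slope_ * row[0] for row in features]


def mean_squared_error(predicted, observed):
    return mean((p - o) ** 2 for p, o in zip(predicted, observed))


def benchmark(models, train, test):
    """Train every model on the same data and score it on held-out data."""
    scores = {}
    for name, model in models.items():
        model.custom_train(*train)
        scores[name] = mean_squared_error(model.custom_predict(test[0]), test[1])
    return scores


# Toy data (hypothetical): one feature, response is twice the feature.
train = ([[1.0], [2.0], [3.0], [4.0]], [2.0, 4.0, 6.0, 8.0])
test = ([[5.0], [6.0]], [10.0, 12.0])
scores = benchmark({"mean": MeanModel(), "linear": OneFeatureLinearModel()}, train, test)
print(scores)
```

Because every model goes through the same two methods, a new method can be added to the comparison without changing the benchmarking code.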

Portable Legal Consent
(Activating Patients)
John Wilbanks

Sage Congress Project
April 20 2012

RealNames Parkinson’s Project
Revisiting Breast Cancer Prognosis
Fanconi’s Anemia
(Responders Competitions- IBM-DREAM)

Networking Disease Model Building
