Sage Bionetworks is a non-profit organization that aims to build disease maps through open innovation and data sharing. The document discusses new ways of modeling disease using "data intensive" science and computational models. It describes several pilots Sage is conducting to further this work, including building shared clinical/genomic datasets through projects like CTCAP, and developing precompetitive drug discovery projects through Arch2POCM. The overall goal is to accelerate medical research through more open and collaborative approaches.
Stephen Friend Haas School of Business 2012-03-05
1. The Future of Open Innovation: Development and Use of Therapies
End of the Era of Medical Guilds and Alchemy
Moving beyond the Medical Industrial Complex
Stephen Friend MD PhD
Sage Bionetworks (Non-Profit Organization)
Seattle/ Beijing/ Amsterdam
UC Berkeley Haas School of Business
Topics in Innovation
March 5, 2012
2.
3. • New ways of Building Models of Disease
• What prevents us from building them?
• What is Sage Bionetworks?
• Review of Six Pilots
• So what are the next steps?
4. What is the problem?
Most approved therapies were assumed to be monotherapies for diseases representing homogeneous populations
Our existing disease models often assume pathway knowledge sufficient to infer correct therapies
9. “Data Intensive” Science- Fourth Scientific Paradigm
Equipment capable of generating
massive amounts of data
IT Interoperability
Open Information System
Host evolving computational models
in a “Compute Space”
10.
11. WHY NOT USE “DATA INTENSIVE” SCIENCE TO BUILD BETTER DISEASE MAPS?
12. what will it take to understand disease?
DNA
RNA
PROTEIN
(dark matter)
MOVING BEYOND ALTERED COMPONENT LISTS
14. Preliminary Probabilistic Models- Rosetta /Schadt
Networks facilitate direct
identification of genes that are
causal for disease
Evolutionarily tolerated weak spots
Gene symbol | Gene name | Variance of OFPM explained by gene expression* | Mouse model | Source
Zfp90 | Zinc finger protein 90 | 68% | tg | Constructed using BAC transgenics
Gas7 | Growth arrest specific 7 | 68% | tg | Constructed using BAC transgenics
Gpx3 | Glutathione peroxidase 3 | 61% | tg | Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12]
Lactb | Lactamase beta | 52% | tg | Constructed using BAC transgenics
Me1 | Malic enzyme 1 | 52% | ko | Naturally occurring KO
Gyk | Glycerol kinase | 46% | ko | Provided by Dr. Katrina Dipple (UCLA) [13]
Lpl | Lipoprotein lipase | 46% | ko | Provided by Dr. Ira Goldberg (Columbia University, NY) [11]
C3ar1 | Complement component 3a receptor 1 | 46% | ko | Purchased from Deltagen, CA
Tgfbr2 | Transforming growth factor beta receptor 2 | 39% | ko | Purchased from Deltagen, CA
Nat Genet (2005) 205:370
18. “Data Intensive” Science- Fourth Scientific Paradigm
Score Card for Medical Sciences
Equipment capable of generating massive amounts of data: A-
IT Interoperability: D
Open Information System: D-
Host evolving computational models in a “Compute Space”: F
19. We still approach much clinical research as if we were hunter-gatherers: not sharing.
26. Sage Mission
Sage Bionetworks is a non-profit organization with a vision to
create a commons where integrative bionetworks are evolved by
contributor scientists with a shared vision to accelerate the
elimination of human disease
Building Disease Maps Data Repository
Commons Pilots Discovery Platform
Sagebase.org
28. ALZHEIMER’S
What is this? Bayesian networks enriched in inflammation genes correlated with disease severity in pre-frontal cortex of 250 Alzheimer’s patients.
What does it mean? Inflammation in AD is an interactive multi-pathway system. More broadly, network structure organizes complex disease effects into coherent sub-systems and can prioritize key genes.
Are you joking? Gene validation shows novel key drivers increase Abeta uptake and decrease neurite length through an ROS burst. (highly relevant to AD pathology)
29. A multi-tissue immune-driven theory of weight loss
[Figure labels: Hypothalamus; Leptin signaling; Fatty acids; Macrophage/inflammation; Liver; Adipose; M1 macrophage; Phagocytosis-induced lipolysis]
30. PLATFORM
Sage Platform and Infrastructure Builders-
( Academic Biotech and Industry IT Partners...)
PILOTS= PROJECTS FOR COMMONS
Data Sharing Commons Pilots-
(Federation, CCSB, Inspire2Live....)
[Diagram labels: MAPS, PLATFORM, NEW RULES, GOVERN]
31.
32. Why not share clinical/genomic data and model building in the
ways currently used by the software industry
(power of tracking workflows and versioning)?
37. Sage Metagenomics Project
Processed Data
(S3)
• > 10k genomic and expression standardized datasets indexed in SCR
• Error detection, normalization in mG
• Access raw or processed data via download or API in downstream analysis
• Building towards open, continuous community curation
38. Sage Metagenomics using Amazon Simple Workflow
Full case study at http://aws.amazon.com/swf/testimonials/swfsagebio/
39. Synapse Roadmap
Timeline: Internal Alpha, Public Beta Testing, Synapse 1.0, Synapse 1.5, Future
Q1-2012 Q2-2012 Q3-2012 Q4-2012 Q1-2013 Q2-2013 Q3-2013 Q4-2013
Synapse Platform Functionality
• Data Repository
• Projects and security
• R integration
• Analysis provenance
• Publishing figures
• Search
• Controlled Vocabularies
• Governance of restricted data
• Workflow templates
• Social networking
• Wiki & collaboration tools
• Integrated management of cloud resources
• User-customized dashboards
• R Studio integration
• Curation tool integration
Data / Analysis Capabilities
• TCGA
• METABRIC breast cancer challenge
• 40+ manually curated clinical studies
• 8000+ GEO / Array Express datasets
• Clinical, genomic, compound sensitivity
• Bioconductor and custom R analysis
• Predictive modeling workflows
• Automated processing of common genomics platforms
• TBD: Integrations with other visualization and analysis packages
40. Six Pilots involving Sage Bionetworks
CTCAP
Arch2POCM
The Federation
Portable Legal Consent
Sage Congress Project
BRIDGE
[Diagram labels: MAPS, PLATFORM, NEW RULES, GOVERN]
41. Clinical Trial Comparator Arm
Partnership (CTCAP)
Description: Collate, Annotate, Curate and Host Clinical Trial Data
with Genomic Information from the Comparator Arms of Industry and
Foundation Sponsored Clinical Trials: Building a Site for Sharing
Data and Models to evolve better Disease Maps.
Public-Private Partnership of leading pharmaceutical companies,
clinical trial groups and researchers.
Neutral Conveners: Sage Bionetworks and Genetic Alliance
[nonprofits].
Initiative to share existing trial data (molecular and clinical) from
non-proprietary comparator and placebo arms to create powerful
new tool for drug development.
Started Sept 2010
42. Shared clinical/genomic data sharing and analysis will
maximize clinical impact and enable discovery
• Graphic of curated to QC'd to models
43. Arch2POCM
Restructuring the Precompetitive Space for Drug Discovery
How to potentially De-Risk High-Risk Therapeutic Areas
44.
45. Arch2POCM: scale and scope
• Proposed Goal: Initiate 2 programs. One for Oncology/Epigenetics/
Immunology. One for Neuroscience/Schizophrenia/Autism. Both
programs will have 8 drug discovery projects (targets) - ramped up
over a period of 2 years
– It is envisioned that Arch2POCM’s funding partners will select targets
that are judged as slightly too risky to be pursued at the top of pharma’s
portfolio, but that have significant scientific potential that could benefit
from Arch2POCM’s crowdsourcing effort
• These will be executed over a period of 5 years making a total of 16
drug discovery projects
– Projected pipeline attrition by Year 5 (assuming 12 targets loaded in
early discovery)
• 30% will enter Phase 1
• 20% will deliver Ph 2 POCM data
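The attrition percentages above can be turned into expected project counts. The following is a minimal sketch, assuming the percentages apply directly to the 12 targets loaded in early discovery; the function and variable names are illustrative and not from the slides.

```python
def projected_counts(targets_loaded, rates):
    """Expected number of projects reaching each milestone."""
    return {milestone: targets_loaded * rate for milestone, rate in rates.items()}


# Figures from the slide: 12 targets loaded in early discovery,
# 30% entering Phase 1 and 20% delivering Ph 2 POCM data by Year 5.
ATTRITION_RATES = {"enter_phase_1": 0.30, "deliver_ph2_pocm": 0.20}

counts = projected_counts(12, ATTRITION_RATES)
for milestone, expected in counts.items():
    print(f"{milestone}: {expected:.1f} projects")
```

Under that assumption the projection works out to roughly 3.6 projects entering Phase 1 and 2.4 delivering Ph 2 POCM data.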
46. Arch2POCM: Highlights
A PPP To De-Risk Novel Targets That The Pharmaceutical Industry Can
Then Use To Accelerate The Development of New and Effective Medicines
• The Arch2POCM will be a charitable Public Private Partnership (PPP) that will file no patents and
whose scientific plan (including target selection) will be endorsed by its pharmaceutical, private
and public funders
• Arch2POCM will de-risk novel targets by developing and using pairs of test compounds (two
different chemotypes) that interact with the selected targets: the compounds will be developed
through Phase IIb clinical trials to determine if the selected target plays a role in the biology of
human disease
• Arch2POCM will work with and leverage patient groups and clinical CROs to enable patient
recruitment, and with regulators to design novel studies and to validate novel biomarkers
• Arch2POCM will make its GMP test compounds available to academic groups and foundations so
they can use them to perform clinical studies and publish on a multitude of additional indications
• Arch2POCM will release all reagents and data to the public at pre-defined stages in its drug
development process. To ensure scientific quality, data and reagents will be released once they
have been vetted by an independent scientific committee
• Arch2POCM will publish all negative POCM data immediately in order to reduce the number of
ongoing redundant proprietary studies (in pharma, biotech and academia) on an invalidated
target and thereby
– minimize unnecessary patient exposure
– provide significant economic savings for the pharmaceutical industry
• In the rare instance in which a molecule achieves positive POCM, Arch2POCM will ensure that
the compound has the ability to reach the market by arranging for exclusive access to the
proprietary IND database for the molecule
47. Arch2POCM: proposed funding strategy
– $160-200M over five years is projected as necessary to advance
up to 8 drug discovery projects within each of the two therapeutic
programs
– Arch2POCM funding will come from a combination of public
funding from governments and private sector funding from
pharmaceutical and biotechnology companies and from private
philanthropists
– By investing $1.6 M annually into one or both of Arch2POCM’s
selected disease areas, partnered pharmaceutical companies:
1. obtain a vote on Arch2POCM target selection
2. have the opportunity to donate existing compounds from their
abandoned clinical programs for re-purposing on Arch2POCM’s
targets
3. gain real time data access to Arch2POCM’s 16 drug discovery
projects
4. have the strategic opportunity to expand their overall portfolio
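As a rough check on the scale of these figures, the projected budget can be divided across the 16 projects. This is a minimal sketch; it assumes the full $160-200M is spread evenly over all 16 projects and that a partner contributes $1.6M in each of the five years, neither of which the slide states explicitly.

```python
PROJECTS = 8 * 2  # up to 8 projects in each of the two therapeutic programs
YEARS = 5


def per_project_cost(total_musd):
    """Return (cost per project, cost per project per year) in $M, assuming an even split."""
    per_project = total_musd / PROJECTS
    return per_project, per_project / YEARS


low = per_project_cost(160)
high = per_project_cost(200)
print(f"Per project over five years: ${low[0]:.1f}M to ${high[0]:.1f}M")
print(f"Per project per year: ${low[1]:.1f}M to ${high[1]:.1f}M")

# Assumption: a pharma partner pays $1.6M in each of the five years.
partner_total = 1.6 * YEARS
print(f"One partner over five years: ${partner_total:.1f}M")
```

On these assumptions each project costs about $10-12.5M over five years, and a single partner's five-year contribution of $8M is below the cost of one project.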
48. Pipeline flow for Arch2POCM
Five Year Objective: Initiate ≈ 8 drug discovery projects with 6 entering in Early Discovery, one entering in
pre-clinical and one entering in Ph I
[Timeline chart: months 0-60 in six-month intervals]
Year #1 Arch2POCM Target Load: Early discovery (2), then Pre-clinical, Ph 1, Ph 2; Pre-clinical (1), then Ph 1, Ph 2
Year #2 Arch2POCM Target Load: Early discovery (4), then Pre-clinical, Ph 1; Ph 1 (1), then Ph 2
Early discovery (45% PTRS)
Pre-clinical (70% PTRS)
Ph I (65% PTRS)
Ph II (10% PTRS)
Arch2POCM Snapshot at Year 5
Targets Loaded: 8
Projected INDs filed: 3-4
Ph 1 or 2 Trials In Progress: 2
Projected Complete Ph 2 (POCM) Data Sets: 1
*PTRS = Probability of technical and regulatory success
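The PTRS figures can be combined with the entry mix in the five-year objective to estimate how many projects reach a given stage. The sketch below assumes the four PTRS values are independent per-stage success probabilities, that an IND is filed when a project reaches Ph I, and that the five-year time limit does not bind; none of these assumptions is stated on the slide, and all names are illustrative.

```python
# Per-stage probabilities of technical and regulatory success (PTRS) from the slide.
PTRS = {"early_discovery": 0.45, "pre_clinical": 0.70, "ph_1": 0.65, "ph_2": 0.10}
STAGES = ["early_discovery", "pre_clinical", "ph_1", "ph_2"]

# Entry mix from the five-year objective: 6 early discovery, 1 pre-clinical, 1 Ph I.
ENTRIES = {"early_discovery": 6, "pre_clinical": 1, "ph_1": 1}


def prob_of_reaching(entry_stage, target_stage):
    """Probability that a project entering at entry_stage passes every stage before target_stage."""
    start, end = STAGES.index(entry_stage), STAGES.index(target_stage)
    prob = 1.0
    for stage in STAGES[start:end]:
        prob *= PTRS[stage]
    return prob


def expected_reaching(target_stage):
    """Expected number of projects reaching target_stage, summed over the entry mix."""
    return sum(n * prob_of_reaching(stage, target_stage) for stage, n in ENTRIES.items())


# Assumption: an IND is filed when a project reaches Ph I.
expected_inds = expected_reaching("ph_1")
print(f"Expected projects reaching Ph I: {expected_inds:.2f}")
```

Under these assumptions the estimate is about 3.59 projects, which is in line with the 3-4 projected INDs in the Year 5 snapshot.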
49. The case for epigenetics/chromatin biology
1. There are epigenetic oncology drugs on the market (HDACs)
2. A growing number of links to oncology, notably many genetic links (i.e.
fusion proteins, somatic mutations)
3. A pioneer area: More than 400 targets amenable to small molecule
intervention - most of which only recently shown to be “druggable”, and
only a few of which are under active investigation
4. Open access, early-stage science is developing quickly – significant
collaborative efforts (e.g. SGC, NIH) to generate proteins, structures,
assays and chemical starting points
50. Arch2POCM epigenetics program:
Assumptions for launch and completion of Year 1
• Funding necessary to prosecute 8 epigenetic target-based projects
o ≈$85M for five years with $15M available for Year 1
• $1.6M from each of 3 pharma partners ($4.8M)
• $5M from public funders and $5M from philanthropists
o Year 1: load 3 targets with 2 in Early Discovery and 1 in pre-clinical stage of development
o Year 2: load 5 targets with at least one late stage clinical asset from a pharma partner
• Partners
– In kind partners
o GE Healthcare (imaging): open sharing of its experimental oncology biomarkers
o CRUK: through some of its drug discovery and development resources participating in Arch2POCM
– Potential academic partner sites
• Institutions that have indicated willingness to let their scientists participate without patent filing: UCSF,
Massachusetts General Hospital, University of North Carolina, University of Toronto, Oxford University,
Karolinska Institute
• Costs to fund Arch2POCM academic partners will be defrayed by crowd-sourcing: each funded
investigator will use their own network to amplify what they can do and publish on Arch2POCM targets
– Patient groups will enable patient recruitment and reduce costs for clinical studies
– FDA and EMEA team of regulators available
o Oncology experts available
o Can provide in vitro screening assays for toxicities and biomarker development to improve patient
selection
o FDA to help build and host a compliant Arch2POCM data-sharing site
o Infrastructure that needs to be in place to execute on time
o Align vendors and CROs prior to initiation of Arch2POCM projects
o IT and patient database management: harmonization of data-entry across participating clinical collaborators
in place well before start of first Arch2POCM trial
51. General benefits of Arch2POCM for drug
development
1. Arch2POCM’s use of test compounds to de-risk previously unexplored
biology enables drug developers to initiate proprietary drug
development starting from an array of unbiased, clinically validated
targets
2. Arch2POCM’s crowdsourced research and trials provide the
pharmaceutical industry with parallel shots on goal by aligning test
compounds to the most promising unmet medical needs
3. The positive and negative clinical trial data that Arch2POCM and the
crowd produce and publish will increase clinical success rates (as one
can pick targets and indications more smartly) and will save the
pharmaceutical industry money by reducing redundant proprietary
efforts on failed targets
52. Why is Arch2POCM a “smart bet” for Pharma investment?
Arch2POCM: an external epigenetic think tank from which Pharma can load the most likely to succeed targets as proprietary programs or leverage Arch2POCM results for its other internal efforts
• A front row seat on the progression of 8 epigenetic targets means that:
– Pharma can select the epigenetic targets that best complement their internal portfolio and for which there is the greatest interest
– Pharma can structure Arch2POCM’s projects so that key objectives line up with internal go/no-go decisions
– Pharma can use Arch2POCM data to trigger its internal level of investment on a particular target
• Pharma can use Arch2POCM resources to enrich their internal epigenetics effort: active chemotypes, assays, pre-clinical models, biomarkers, genetic and phenotypic data for patient stratification, relationships to epigenetic experts
• Pharma can use Arch2POCM’s lead compound chemotypes to:
– inform their proprietary medicinal chemistry efforts on the target
– identify chemical scaffolds that impact epigenetic pathways: a proprietary combination therapy opportunity
• Toxicity screening of Arch2POCM compounds with FDA tools can be used to guide internal proprietary chemistry efforts in oncology, inflammation and beyond
• Arch2POCM’s crowd of scientists and clinicians provides its Pharma partners with parallel shots on goal at the best context for Arch2POCM’s compounds/targets
53. How will Arch2POCM provide “line of sight” to new
medicines?
• Arch2POCM’s Ph II validation of high risk high opportunity targets
focuses Pharma’s NME efforts
• Positive POCM data: De-risked validated targets for Pharma development
• Negative POCM data: public release of this data minimizes the amount of time
and money that Pharma and the industry place on failed targets
• Arch2POCM’s clinical candidate compounds provide Pharma with
multiple paths to new medicines
• Arch2POCM compounds that achieve POCM can be advanced into Ph 3 by
Arch2POCM Members
• The purchaser of Arch2POCM’s IND database obtains a significant time advantage
over competitors to generate Phase III data and proceed to market
• NMEs that derive from Arch2POCM will launch with database exclusivity protections:
5-8 years to garner a return on investment
• The crowd’s testing of Arch2POCM compounds may identify alternative/better
contexts for agonizing/antagonizing the disease biology target
• indications
• patient stratification
• combination therapy options
55. How can we accelerate the pace of scientific discovery?
2008
2009
2010
2011
Ways to move beyond
“traditional” collaborations?
Intra-lab vs Inter-lab
Communication
Colrain/ Industrial PPPs Academic
Unions
57. Sage Federation: model of biological age
[Figure: Predicted Age (liver expression) vs. Chronological Age (years), with regions labeled Faster Aging and Slower Aging]
Age Differential:
- Clinical Association (Gender, BMI, Disease)
- Genotype Association
- Gene Pathway Expression
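One way to read this slide is that each sample's age differential is its expression-predicted age minus its chronological age, and that the differential is then tested for clinical, genotype, and pathway associations. The sketch below illustrates only the first step under that reading; the function names and the sample values are invented for illustration and are not from the Federation project.

```python
def age_differential(predicted_age, chronological_age):
    """Predicted (expression-based) age minus chronological age, in years."""
    return predicted_age - chronological_age


def aging_label(differential, tolerance=0.0):
    """Label a sample by the sign of its age differential."""
    if differential > tolerance:
        return "faster aging"
    if differential < -tolerance:
        return "slower aging"
    return "on trend"


# Toy values invented for illustration: (sample id, predicted age, chronological age).
samples = [("s1", 62.0, 55.0), ("s2", 41.0, 48.0), ("s3", 50.0, 50.0)]

labels = {}
for sample_id, predicted, chronological in samples:
    diff = age_differential(predicted, chronological)
    labels[sample_id] = aging_label(diff)
    print(f"{sample_id}: differential {diff:+.1f} years ({labels[sample_id]})")
```

A nonzero `tolerance` would treat small differentials as on trend, which may be preferable when the age predictor itself is noisy.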
58. Reproducible science == shareable science
Sweave: combines programmatic analysis with narrative
Friedrich Leisch. Sweave: Dynamic generation of statistical reports using literate data analysis. In Wolfgang Härdle and Bernd Rönz, editors, Compstat 2002 – Proceedings in Computational Statistics, pages 575-580. Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9
59. Federated Aging Project: Combining analysis + narrative = Sweave Vignette
[Diagram labels: Sage Lab; Califano Lab; Ideker Lab; R code + narrative; PDF (plots + text + code snippets); HTML; Data objects; Submitted Paper; Shared Data Repository; JIRA: Source code repository & wiki]
60. 1) Data management APIs to load standardized objects, e.g. R ExpressionSets (Matt Furia):
ccleFeatureData <- getEntity(ccleFeatureDataId)
ccleResponseData <- getEntity(ccleResponseDataId)
tcgaFeatureData <- getEntity(tcgaFeatureDataId)
tcgaResponseData <- getEntity(tcgaResponseDataId)
2) Automated, standardized workflows for curation and QC of large-scale datasets (Brig Mecham).
A. TCGA: Automated cloud-based processing.
B. GEO / Array Express: Normalization workflows, curation of phenotype using standard ontologies.
C. Additional studies with genetic and phenotypic data in Sage repository (e.g. CCLE and Sanger cell line datasets)
Observed Data = Systematic Variation + Random Variation
Normalization: Remove the influence of adjustment variables on data...
3) Pluggable API to implement predictive modeling algorithms.
A) Support for all commonly used machine learning methods (for automated benchmarking against new methods)
B) Pluggable custom methods as R classes implementing customTrain() and customPredict() methods.
  A) Can be arbitrarily complex (e.g. pathway and other priors)
  B) Support for parallelization in foreach loops.
[Diagram labels: custom model 1, custom model 2, custom model N]
4) Statistical performance assessment across models.
5) Output of candidate biomarkers and feature evaluation (e.g. GSEA, pathway analysis)
6) Experimental follow-up on top predictions (TBD)
E.g. for cell lines: medium throughput suppressor / enhancer screens of drug sensitivity for knockdown / overexpression of predicted biomarkers.
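The pluggable-model idea in items 3 and 4 of this slide can be illustrated with a small sketch. The slide describes R classes implementing customTrain() and customPredict(); the Python below only mirrors that plug-in pattern and is not the Sage API. The two model classes, the `benchmark` helper, and the toy data are all assumptions made for illustration, and the method names are kept in camelCase purely to match the slide.

```python
from statistics import mean


class MeanModel:
    """Baseline custom model: predicts the mean training response for every sample."""

    def customTrain(self, features, response):
        self.mean_ = mean(response)
        return self

    def customPredict(self, features):
        return [self.mean_ for _ in features]


class SingleFeatureModel:
    """Custom model: least-squares line fitted on one feature column."""

    def __init__(self, column=0):
        self.column = column

    def customTrain(self, features, response):
        x = [row[self.column] for row in features]
        x_bar, y_bar = mean(x), mean(response)
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, response))
        self.slope_ = sxy / sxx if sxx else 0.0
        self.intercept_ = y_bar - self.slope_ * x_bar
        return self

    def customPredict(self, features):
        return [self.intercept_ + self.slope_ * row[self.column] for row in features]


def mean_squared_error(predicted, observed):
    return mean([(p - o) ** 2 for p, o in zip(predicted, observed)])


def benchmark(models, train, test):
    """Train every plugged-in model on the same data and score it on held-out data."""
    (x_train, y_train), (x_test, y_test) = train, test
    scores = {}
    for name, model in models.items():
        predictions = model.customTrain(x_train, y_train).customPredict(x_test)
        scores[name] = mean_squared_error(predictions, y_test)
    return scores


# Toy data invented for illustration: the response is twice the single feature.
train = ([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0])
test = ([[4.0], [5.0]], [8.0, 10.0])
scores = benchmark({"mean": MeanModel(), "linear": SingleFeatureModel()}, train, test)
print(scores)
```

Because `benchmark` relies only on the two method names, any object that provides them can be compared against the others on the same data, which is the point of a pluggable interface.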
64. Sage Congress Project
April 20 2012
RealNames Parkinson’s Project
Revisiting Breast Cancer Prognosis
Fanconi’s Anemia
(Responders Competitions- IBM-DREAM)