Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Friend UCSF 2012-07-27
1. Why not use data intensive science to build
better models of diseases together?
– beyond current rewards
Stephen
H
Friend
MD
PhD
Sage
Bionetworks
(non-‐profit)
July
27,
2012
UCSF
2.
3. So
what
is
the
problem?
Most
approved
therapies
were
assumed
to
be
monotherapies
for
diseases
represen4ng
homogenous
popula4ons
Our
exis4ng
disease
models
o9en
assume
pathway
knowledge
sufficient
to
infer
correct
therapies
8. “Data Intensive” Science- Fourth Scientific Paradigm
Equipment capable of generating
massive amounts of data
IT Interoperability
Open Information System
Host evolving computational models
in a “Compute Space”
9.
10. WHY
NOT
USE
“DATA
INTENSIVE”
SCIENCE
TO
BUILD
BETTER
DISEASE
MAPS?
11. what will it take to understand disease?
DNA
RNA
PROTEIN
(dark
maVer)
MOVING
BEYOND
ALTERED
COMPONENT
LISTS
13. Preliminary Probabalistic Models- Rosetta /Schadt
Networks facilitate direct
identification of genes that are
causal for disease
Evolutionarily tolerated weak spots
Gene symbol Gene name Variance of OFPM Mouse Source
explained by gene model
expression*
Zfp90 Zinc finger protein 90 68% tg Constructed using BAC transgenics
Gas7 Growth arrest specific 7 68% tg Constructed using BAC transgenics
Gpx3 Glutathione peroxidase 3 61% tg Provided by Prof. Oleg
Mirochnitchenko (University of
Medicine and Dentistry at New
Jersey, NJ) [12]
Lactb Lactamase beta 52% tg Constructed using BAC transgenics
Me1 Malic enzyme 1 52% ko Naturally occurring KO
Gyk Glycerol kinase 46% ko Provided by Dr. Katrina Dipple
(UCLA) [13]
Lpl Lipoprotein lipase 46% ko Provided by Dr. Ira Goldberg
(Columbia University, NY) [11]
C3ar1 Complement component 46% ko Purchased from Deltagen, CA
3a receptor 1
Tgfbr2 Transforming growth 39% ko Purchased from Deltagen, CA
Nat Genet (2005) 205:370 factor beta receptor 2
17. “Data Intensive” Science- Fourth Scientific Paradigm
Score Card for Medical Sciences
Equipment capable of generating
massive amounts of data A-
IT Interoperability D
Open Information System D-
Host evolving computational models
in a “Compute Space F
18. We still consider much clinical research as if we were
hunter gathers - not sharing
.
26. Sage Mission
Sage Bionetworks is a non-profit organization with a vision to
create a commons where integrative bionetworks are evolved by
contributor scientists with a shared vision to accelerate the
elimination of human disease
Building Disease Maps Data Repository
Commons Pilots Discovery Platform
Sagebase.org
29. Products/Approaches
Technology Platform
Governance
Impactful Models Better Models of
Disease:
INFORMATION
COMMONS
Challenges
30. IT
Cons4tuencies
Individual
Pa4ents
Technology
PlaHorm
Biotech
Governance
ImpacHul
Models
BeVer
Models
Pa4ent
of
Disease:
Founda4ons
INFORMATION
COMMONS
Pharma
Challenges
Joint
Pa4ent/
Scien4st
Academic
Communi4es
Consor4a
31. Ongoing
Sage
Bionetworks
Ini4a4ves
IT
Individual
Pharma
EU
Pa4ents
Roche PARTICIPATION
RNDP/FA/MEL
Takeda
Technology
PlaHorm
Communi2es
engaging
COMMONS
PLATFORM
Governance
WPP
ImpacHul
Models
Pa4ent
BeVer
Models
of
Founda4ons
Biotech
Disease:
INFORMATION
COMMONS
Common
Mind/
BrCA/Challenges
Mt.
Sinai
Neuro
Cell
Line
Challenges
Challenge
TCGA/Challenge
Joint
Pa4ent/
SB/Gates
Academic
Scien4st
Discovery
Consor4a
ClearScience
Communi4es
Sage CCSB Network
32. IT
Cons4tuencies
Individual
Pa4ents
Technology
PlaHorm
Biotech
Governance
ImpacHul
Models
BeVer
Models
Pa4ent
of
Disease:
Founda4ons
INFORMATION
COMMONS
Pharma
Challenges
Joint
Pa4ent/
Scien4st
Academic
Communi4es
Consor4a
33. Impactful Models
Breast Cancer: Co-expression Networks
A) Miller 159 samples B) Christos 189 samples
NKI: N Engl J Med. 2002 Dec 19;347(25):1999.
Wang: Lancet. 2005 Feb 19-25;365(9460):671.
Miller: Breast Cancer Res. 2005;7(6):R953.
Christos: J Natl Cancer Inst. 2006 15;98(4):262.
C) NKI 295 samples
E) Super modules
Cell
cycle
Pre-mRNA
ECM
D) Wang 286 samples Blood vessel
Immune
response
33
Zhang B et al., Towards a global picture of breast cancer (manuscript).
34. CHRIS
GAITERI-‐ALZHEIMER’S
What
is
this?
Bayesian
networks
enriched
in
inflammaon
genes
correlated
with
disease
severity
in
pre-‐frontal
cortex
of
250
Alzheimer’s
paents.
What
does
it
mean?
Inflammaon
in
AD
is
an
interacve
mul-‐pathway
system.
More
broadly,
network
structure
organizes
complex
disease
effects
into
coherent
sub-‐systems
and
can
priorize
key
genes.
Are
you
joking?
Gene
validaon
shows
novel
key
drivers
increase
Abeta
uptake
and
decrease
neurite
length
through
an
ROS
burst.
(highly
relevant
to
AD
pathology)
35. IMPACTFUL
MODELS
A
mul-‐ssue
immune-‐driven
theory
of
weight
loss
Hypothalamus
Lep4n
signaling
FaNy
acids
Macrophage/
inflamma4on
Liver
Adipose
M1
macrophage
Phagocytosis-‐
Phagocytosis-‐
induced
lipolysis
induced
lipolysis
36. IT
Cons4tuencies
Individual
Pa4ents
Technology
PlaHorm
Biotech
Governance
ImpacHul
Models
BeVer
Models
Pa4ent
of
Disease:
Founda4ons
INFORMATION
COMMONS
Pharma
Challenges
Joint
Pa4ent/
Scien4st
Academic
Communi4es
Consor4a
37. Two approaches to building common
scientific and technical knowledge
Every code change versioned
Every issue tracked
Text summary of the completed project Every project the starting point for new work
Assembled after the fact All evolving and accessible in real time
Social Coding
38. Synapse is GitHub for Biomedical Data
Every code change versioned
Every issue tracked
Data and code versioned Every project the starting point for new work
Analysis history captured in real time All evolving and accessible in real time
Work anywhere, and share the results with anyone Social Coding
Social Science
43. Data Analysis with Synapse
Run Any Tool
On Any Platform
Record in Synapse
Share with Anyone
44. • Automated
workflows
for
curaon,
QC,
and
sharing
of
1%/2* 53,'6%(* !7"(%,2/"* large-‐scale
datasets.
-./#"++0%(* (3&4"#*
• All
of
TCGA,
GEO,
and
user-‐submiVed
data
processed
with
standard
normalizaon
methods.
1%/2* 53,'6%(* !7"(%,2/"* • Searchable
TCGA
data:
-./#"++0%(* (3&4"#* • 23
cancers
• 11
data
plahorms
• Standardized
meta-‐data
ontologies
-./#"++0%(* -./#"++0%(*
!7"(%,2/"* !7"(%,2/"*
1%/2* 1%/2*
(3&4"#* (3&4"#*
53,'6%(* 53,'6%(*
!#"80)69"*&%8":*
;"("#'6%(*
!"#$%#&'()"*
'++"++&"(,*
45. 1%/2* 53,'6%(* !7"(%,2/"* • Comparison
of
many
modeling
approaches
applied
-./#"++0%(* (3&4"#*
to
the
same
data.
• Models
transparently
shared
and
reusable
through
-./#"++0%(*
1%/2* 53,'6%(* !7"(%,2/"* Synapse.
(3&4"#*
• Displayed
is
comparison
of
6
modeling
approaches
to
predict
sensivity
to
130
drugs.
• Extending
pipeline
to
evaluate
predicon
of
-./#"++0%(* -./#"++0%(*
!7"(%,2/"* !7"(%,2/"* TCGA
phenotypes.
1%/2* 1%/2*
(3&4"#* (3&4"#* • Hosng
of
collaborave
compeons
to
compare
53,'6%(* 53,'6%(* models
from
many
groups.
1--'&2-3$4567$
!#"80)69"*&%8":*
*&+%,-./0$
;"("#'6%(*
!"#$%#&'()"*
'++"++&"(,*
!"#$%&'()$
47. IT
Cons4tuencies
Individual
Pa4ents
Technology
PlaHorm
Biotech
Governance
ImpacHul
Models
BeVer
Models
Pa4ent
of
Disease:
Founda4ons
INFORMATION
COMMONS
Pharma
Challenges
Joint
Pa4ent/
Scien4st
Academic
Communi4es
Consor4a
48. Clinical Trial Comparator Arm
Partnership (CTCAP)
Description: Collate, Annotate, Curate and Host Clinical Trial Data
with Genomic Information from the Comparator Arms of Industry and
Foundation Sponsored Clinical Trials: Building a Site for Sharing
Data and Models to evolve better Disease Maps.
Public-Private Partnership of leading pharmaceutical companies,
clinical trial groups and researchers.
Neutral Conveners: Sage Bionetworks and Genetic Alliance
[nonprofits].
Initiative to share existing trial data (molecular and clinical) from
non-proprietary comparator and placebo arms to create powerful
new tool for drug development.
Started Sept 2010
49. Shared clinical/genomic data sharing and analysis will
maximize clinical impact and enable discovery
• Graphic
of
curated
to
qced
to
models
50. Arch2POCM
Restructuring
the
Precompeve
Space
for
Drug
Discovery
How
to
potenally
De-‐Risk
High-‐Risk
Therapeuc
Areas
53. How can we accelerate the pace of scientific discovery?
2008
2009
2010
2011
Ways to move beyond
“traditional” collaborations?
Intra-lab vs Inter-lab
Communication
Pfizer CTI/ Industrial PPPs
Academic Unions
55. sage federation:
model of biological age
Faster Aging
Predicted
Age
(liver
expression)
Slower Aging
Clinical Association
- Gender
- BMI
- Disease
Age Differential Genotype Association
Gene Pathway Expression
Chronological
Age
(years)
56. Reproducible
science==shareable
science
Sweave: combines programmatic analysis with narrative
Dynamic generation of statistical reports
using literate data analysis
Sweave.Friedrich Leisch. Sweave: Dynamic generation of statistical reports
using literate data analysis. In Wolfgang Härdle and Bernd Rönz,editors, Compstat 2002 –
Proceedings in Computational Statistics,pages 575-580.
Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9
62. IT
Cons4tuencies
Individual
Pa4ents
Technology
PlaHorm
Biotech
Governance
ImpacHul
Models
BeVer
Models
Pa4ent
of
Disease:
Founda4ons
INFORMATION
COMMONS
Pharma
Challenges
Joint
Pa4ent/
Scien4st
Academic
Communi4es
Consor4a
63. What
is the problem?
Our current models of disease biology are primitive and
limit doctor’s understanding and ability to treat patients
Current incentives reward those who
silo information and work in closed
systems
64. The Solution: Competitions to crowd-source
research in biology and other fields
Why competitions?
• Objective assessments
• Acceleration of progress
• Transparency
• Reproducibility
• Extensible, reusable models
Competitions in biomedical research
• CASP (protein structure)
• Fold it / EteRNA (protein / RNA structure)
• CAGI (genome annotation)
• Assemblethon / alignathon (genome assembly / alignment)
• SBV Improver (industrial methodology benchmarking)
• DREAM (co-organizer of Sage/DREAM competition)
Generic competition platforms
• Kaggle, Innocentive, MLComp
65. The Sage/DREAM breast cancer prognosis
challenge
Goal: Challenge to assess the accuracy of computational models designed to
predict breast cancer survival using patient clinical and genomic data
Why this is unique:
This Sage/DREAM Challenge is a pre-collated cohort: 2000 breast cancer samples
from the Metabric cohort
Accessible to all: A cloud-based common compute architecture is being made
available by Google to support the computational models needed to develop and test
challenge models
New Rigor:
• Contestants will evaluate their models on a validation data set composed of newly generated
data (provided by Dr. Anne-Lise Borreson Dale)
• Contestants must demonstrate their models can be reproduced by others
New incentives: leaderboard to energize participants, Science Translational Medicine
publication for winning team
Breast cancer patients, funders and researchers can track this Challenge on BRIDGE,
an open source online community being built by Sage and Ashoka Changemakers and
affiliated with this Challenge
66. Sage/DREAM Challenge: Details and Timing
Phase
1: July thru end-Sep 2012 Phase
2:
Oct 1 thru Nov 12, 2012
Training data: 2,000 breast cancer
samples from METABRIC cohort Evaluation of models in novel
dataset.
• Gene expression
• Copy number Validation data: ~500 fresh frozen
• Clinical covariates tumors from Norway group with:
• 10 year survival • Clinical covariates
Supporting data: Other Sage-curated • 10 year survival
breast cancer datasets
• >1,000 samples from GEO Gene expression and copy number
data to be generated for model
• ~800 samples from TCGA evaluation
• ~500 additional samples from • Sent to Cancer Research UK to
Norway group generate data at same facility as
• Curated and available on METABRIC
Synapse, Sage’s compute • Models built on training data
platform evaluated on newly generated
data
Data released in phases on Synapse
from now through end-September
Winners announced at November
12 DREAM conference
Will evaluate accuracy of models built
on METABRIC data to predict survival
in:
• Held out samples from
METABRIC
• Other datasets
67. METABRIC
cohort:
1%/2* 53,'6%(* !7"(%,2/"* 997
breast
cancer
samples
-./#"++0%(* (3&4"#*
Clinical
covariates
1%/2* 53,'6%(* !7"(%,2/"*
-./#"++0%(* (3&4"#* Gene
expression
(Illumina
HT12v3)
Copy
number
(Affy
SNP
6.0)
-./#"++0%(* -./#"++0%(*
!7"(%,2/"* !7"(%,2/"*
1%/2* 1%/2*
(3&4"#* (3&4"#*
53,'6%(* 53,'6%(*
10
year
survival
!#"80)69"*&%8":*
;"("#'6%(*
!"#$%#&'()"*
'++"++&"(,*
Loaded
through
Synapse
R
client
as
Bioconductor
objects.
68. 1%/2* 53,'6%(* !7"(%,2/"*
-./#"++0%(* (3&4"#*
1%/2* 53,'6%(* !7"(%,2/"*
-./#"++0%(* (3&4"#*
-./#"++0%(* -./#"++0%(*
!7"(%,2/"* !7"(%,2/"*
1%/2* 1%/2* Custom
models
implement
train()
and
(3&4"#* (3&4"#* predict()
API.
53,'6%(* 53,'6%(*
!#"80)69"*&%8":*
;"("#'6%(*
!"#$%#&'()"*
'++"++&"(,*
Implementa)on
of
simple
clinical-‐only
survival
model
used
as
baseline
predictor.
70. Summary
hVps://synapse.sagebase.org/
-‐
BCCOverview:0
Transparency,
Valida4on
in
novel
reproducibility
-./#"++0%(*
1%/2*
(3&4"#* 53,'6%(* !7"(%,2/"*
dataset
1%/2* 53,'6%(* !7"(%,2/"*
-./#"++0%(* (3&4"#*
-./#"++0%(* -./#"++0%(*
!7"(%,2/"* !7"(%,2/"*
1%/2* 1%/2*
(3&4"#* (3&4"#*
53,'6%(* 53,'6%(*
!#"80)69"*&%8":*
;"("#'6%(*
!"#$%#&'()"*
'++"++&"(,*
Publica4on
in
Science
Dona4on
of
Google-‐
Transla4onal
Medicine
scale
compute
space.
For
the
goal
of
promo4ng
democra4za4on
of
medicine…
Registra4on
star4ng
NOW…
sign
up
at:
synapse.sagebase.org
71.
72.
73. SUMMARY
These
new
data
intensive
models
of
disease
will
be
strikingly
powerful
They
will
not
arise
within
the
current
academic/industrial
loop
They
will
be
harder
and
more
expensive
than
we
can
accept
Cizenss
as
donors
of
data
insights
and
funds
will
be
crical
For
these
benefits
to
be
realized
-‐
must
become
affordable
therefore
we
willl
need:
Compute
spaces
A
Commons
New
ways
of
being
rewarded
More
eyeballs
working
without
being
paid
More
willingness
to
share
ll
aoer
Clinical
Proof
of
Concept
74. Alignments between the UCSF Strategic Plan
and the matchstick pilots for the Information
Commons being pursued by Sage Bionetworks
• Invest in infrastructure that enables UCSF to excel in basic,
clinical and population science- SYNAPSE as a compute space
-links with Michael Wiener/ OneMind
• Build a Bioinformatics initiative across all school by
June
2014- Impactful Models / Challenges with Laura Esserman/
LauraVant’Veer
• Enhance existing data repositories and mining tools by June
2012 - Sage collaborations with Alex Pico, Geoff Manley
• Develop infrastructure to support new team-based,
interdisciplinary learning models- PLC (The Federation?)
• Accelerate translation of groundbreaking science into
therapies Arch2POCM