23. How often are we hurt by going from
the particular to the general
in very complex systems driven by context?
Is this going from the particular to the general
a central problem in
Hypothesis Driven Biomedical Research?
How often do we inappropriately praise
findings that go on to have awkward adjacencies?
27. BUILDING PRECISION MEDICINE
Extensions of Current Institutions
Proprietary Short term Solutions
Open Systems of Sharing in a Commons
28. NRNB Investigators
Trey Ideker, PhD
Principal Investigator, NRNB
Departments of Medicine and Bioengineering
University of California, San Diego
Dr. Ideker uses genome-scale measurements to construct network models of DNA damage response and cancer. He was the 2009 recipient of the Overton Prize from the International Society for Computational Biology.
Gary Bader, PhD
Assistant Professor, Terrence Donnelly Centre for Cellular & Biomolecular Research
University of Toronto
Dr. Bader works on biological network analysis and pathway information resources.
Alex Pico, PhD
Executive Director, NRNB
Gladstone Institute of Cardiovascular Disease
Staff Research Scientist
University of California, San Francisco
Dr. Pico develops software tools and resources that help analyze, visualize and explore biomedical data in the context of these networks.
James Fowler, PhD
Associate Professor, CalIT2 Center for Wireless & Population Health Systems and Political Science
University of California, San Diego
Dr. Fowler’s research concerns social networks, behavioral economics, evolutionary game theory, and genopolitics (the study of the genetic basis of political behavior). His research on social networks has been featured in Time’s Year in Medicine.
Chris Sander, PhD
Chair, Computational Biology Center, Tri-Institutional Professor
Memorial Sloan-Kettering Cancer Center
Dr. Sander’s research focuses on Computational and Systems Biology of molecules, pathways, and processes.
Benno Schwikowski, PhD
Chef du Laboratoire/Group Leader
Pasteur Institute
Dr. Schwikowski’s expertise lies in combinatorial algorithms for Computational and Systems Biology.
29. The National Resource for Network Biology:
Integrating genomes & networks to understand health & disease
NIH NCRR / NIGMS P41 GM103504
Draft Network Assembly
Patient genotype
Genome sequencing
Phenotype: disease diagnosis, response to therapy/drug, side effects, developmental outcome, rate of aging, etc.
Gene expression & other large scale molecular state measurements
1) How to assemble and visualize network models of the cell?
2) How to use networks in healthcare?
32. Now possible to generate massive amounts of human “omics” data
36. Open Social Media allows citizens and experts to use gaming to solve problems
37. 1- Now possible to generate massive amounts of human “omics” data
2- Network Modeling Approaches for Diseases are emerging
3- IT Infrastructure and Cloud compute capacity allow a generative open approach to biomedical problem solving
4- Nascent Movement for patients to Control Sensitive information, allowing sharing
5- Open Social Media allows citizens and experts to use gaming to solve problems
A HUGE OPPORTUNITY -- A HUGE RESPONSIBILITY
38. We focus on a world where biomedical research is about
to fundamentally change. We think it will often be
conducted in an open, collaborative way, where teams of
teams far beyond the current guilds of experts will
contribute to making better, faster, more relevant discoveries.
40. Two recurring problems in Alzheimer’s disease research
Ambiguous pathology
Are disease-associated molecular systems &
genes destructive, adaptive, or both?
Bottom line: We need to identify causal factors
vs correlative or adaptive features of disease.
Diverse mechanisms
How do diverse mutations and environmental
factors combine into a core pathology?
Bottom line: There is no rigorous / consistent global
framework that integrates diverse disease factors.
41. Identifying key disease systems and genes- Gaiteri et al.
1.) Identify groups of genes that move together – coexpressed “modules”
- correlated expression of multiple genes across many patients
- coexpression calculated separately for Disease/healthy groups
- these gene groups are often coherent cellular subsystems, enriched in one or
more GO functions
Example “modules” of coexpressed genes, color-coded
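A minimal sketch of step 1, assuming the expression data is a genes-by-samples NumPy array. It groups genes whose pairwise Pearson correlation exceeds a cutoff by taking connected components of the thresholded correlation graph. This is a deliberately simple stand-in, not the module-detection method used in the study; the function name `coexpression_modules`, the `threshold` value and the toy data are all illustrative assumptions.

```python
import numpy as np

def coexpression_modules(expr, threshold=0.8):
    """Group genes into modules: connected components of the graph whose
    edges join gene pairs with |Pearson r| >= threshold.

    expr: array of shape (n_genes, n_samples).
    Returns a list of sorted lists of gene indices (one list per module).
    """
    corr = np.corrcoef(expr)                 # gene-by-gene correlation matrix
    adjacent = np.abs(corr) >= threshold
    n_genes = expr.shape[0]
    label = [-1] * n_genes
    modules = []
    for start in range(n_genes):
        if label[start] != -1:
            continue
        # Depth-first search collects every gene reachable from `start`.
        stack, members = [start], []
        label[start] = len(modules)
        while stack:
            g = stack.pop()
            members.append(g)
            for h in np.flatnonzero(adjacent[g]):
                if label[h] == -1:
                    label[h] = len(modules)
                    stack.append(int(h))
        modules.append(sorted(members))
    return modules

# Toy data (made up): genes 0-2 follow one hidden signal, genes 3-4 another.
rng = np.random.default_rng(0)
signal_a, signal_b = rng.normal(size=(2, 50))
expr = np.vstack([signal_a + 0.1 * rng.normal(size=50) for _ in range(3)] +
                 [signal_b + 0.1 * rng.normal(size=50) for _ in range(2)])
modules = coexpression_modules(expr, threshold=0.8)
```

To follow the slide, the same function would be run separately on the disease samples and on the healthy samples, giving one set of modules per group.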
42. Identifying key disease systems and genes
1.) Identify groups of genes that move together – coexpressed “modules”
2.) Prioritize the disease-relevance of the modules by clinical and network measures
Prioritize modules through expression synchrony with clinical measures
or tendency to reconfigure themselves in disease
vs
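One way to put a number on a module's "tendency to reconfigure" is to compare its gene-gene correlations in disease and healthy samples. The sketch below assumes genes-by-samples arrays for each group and scores a module by the mean absolute change in pairwise correlation. The name `module_rewiring` and the score itself are illustrative assumptions, not the measure used in the study.

```python
import numpy as np

def module_rewiring(expr_disease, expr_healthy, module):
    """Mean absolute change in pairwise correlation among a module's genes
    between disease and healthy samples (0 means identical coexpression)."""
    idx = np.asarray(module)
    r_d = np.corrcoef(expr_disease[idx])
    r_h = np.corrcoef(expr_healthy[idx])
    upper = np.triu_indices(len(idx), k=1)   # count each gene pair once
    return float(np.mean(np.abs(r_d[upper] - r_h[upper])))

# Toy data (made up): three genes that move together in healthy samples
# but independently in disease samples.
rng = np.random.default_rng(0)
shared = rng.normal(size=50)
healthy = np.vstack([shared + 0.1 * rng.normal(size=50) for _ in range(3)])
disease = rng.normal(size=(3, 50))
unchanged = module_rewiring(healthy, healthy, [0, 1, 2])
rewired = module_rewiring(disease, healthy, [0, 1, 2])
```

Modules with a high score lose or gain coexpression in disease and would rank higher under this criterion.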
43. Identifying key disease systems and genes
1.) Identify groups of genes that move together – coexpressed “modules”
2.) Prioritize the disease-relevance of the modules by clinical and network measures
3.) Incorporate genetic information to find directed relationships between genes
Infer directed/causal relationships and clear hierarchical structure by incorporating eSNP information (no hair-balls here)
Prioritize modules through expression synchrony with clinical measures or tendency to reconfigure themselves in disease
vs
44. Example network finding: microglia activation in AD
Module selection – what identifies these modules as relevant to Alzheimer’s disease?
The eigengene of a module of ~400 probes correlates with Braak score, age, cognitive disease severity
and cortical atrophy. Members of this module are on average differentially expressed (both up- and
down-regulated).
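The module eigengene mentioned above is commonly taken to be the first principal component of the module's standardized expression matrix. The sketch below assumes that definition and a probes-by-samples NumPy array; the clinical score is simulated, and the names `module_eigengene` and `severity` are illustrative.

```python
import numpy as np

def module_eigengene(expr):
    """First principal component of a module across samples.
    expr: (n_probes, n_samples). Returns a vector of length n_samples."""
    # Standardize each probe so that no single probe dominates.
    z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    eigengene = vt[0]
    # The sign of an SVD component is arbitrary; align it with the
    # module's mean expression so the direction is interpretable.
    if np.corrcoef(eigengene, z.mean(axis=0))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene

# Toy data (made up): 40 probes that track a simulated clinical score.
rng = np.random.default_rng(1)
severity = rng.normal(size=60)
loadings = rng.uniform(0.5, 1.5, size=40)
expr = np.outer(loadings, severity) + 0.3 * rng.normal(size=(40, 60))
eg = module_eigengene(expr)
r = np.corrcoef(eg, severity)[0, 1]          # eigengene vs. clinical score
```

The same correlation could be computed against each clinical measure (Braak score, age, atrophy) to rank modules by disease relevance.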
Evidence these modules are related to microglia function
The members of this module are enriched with GO categories (p < 0.001) such as “response to biotic
stimulus” that indicate an immunologic function for this module.
The microglia markers CD68 and CD11b/ITGAM are contained in the module (this is rare – even when a
module appears to represent a specific cell-type, the histological markers may be lacking).
Numerous key drivers (SYK, TREM2, DAP12, FC1R, TLR2) are important elements of microglia signaling.
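GO enrichment p-values like the one quoted above are often obtained from a one-sided hypergeometric test. The sketch below assumes that test; the slide does not say which test was actually used. The function name `enrichment_p` and the example counts are made up for illustration.

```python
from math import comb

def enrichment_p(n_genome, n_category, n_module, n_overlap):
    """P(overlap >= n_overlap) when n_module genes are drawn without
    replacement from n_genome genes, n_category of which are annotated."""
    upper = min(n_category, n_module)
    total = comb(n_genome, n_module)
    tail = sum(comb(n_category, k) * comb(n_genome - n_category, n_module - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

# Hypothetical counts: a 400-gene module, a 300-gene GO category,
# a 20,000-gene background and 25 genes in common.
p_value = enrichment_p(20000, 300, 400, 25)
```

With these made-up counts the expected overlap is 6 genes, so an overlap of 25 gives a very small p-value. In practice the result would also need correction for the number of GO categories tested.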
Alzgene hits found in co-regulated microglia module:
45. Figure key:
Five main immunologic families found in Alzheimer’s-associated module.
Square nodes in surrounding network denote literature-supported nodes.
Node size is proportional to connectivity in the full module.
Core family members are shaded.
(Interior circle) Width of connections between the 5 immune families is linearly scaled to the number of inter-family connections.
Labeled nodes are either highly connected in the original network, implicated by at least 2 papers as associated with Alzheimer’s disease, or core members of one of the 5 immune families.
48. Design-stage AD projects at Sage
Fusing our expertise in…
Gene regulatory networks
Diffusion Spectrum Imaging
Feedback
Microcircuits & neuronal diversity
Join us in uniting genes, circuits and regions
to build multi-scale biophysical disease models.
Contact chris.gaiteri@sagebase.org
49. PORTABLE LEGAL CONSENT
Control of Private information by Citizens allows sharing
weconsent.us
John Wilbanks
TED Talk: “Let’s pool our medical data”
• Online educational wizard
• Tutorial video
• Legal Informed Consent Document
• Profile registration
• Data upload
50. two approaches to building common scientific knowledge
Text summary of the completed project
Assembled after the fact

Social Coding
Every code change versioned
Every issue tracked
Every project the starting point for new work
All evolving and accessible in real time
51. Synapse is GitHub for Biomedical Data
• Every code change versioned
• Every issue tracked
• Every project the starting point for new work
• Social/Interactive Coding

• Data and code versioned
• Analysis history captured in real time
• Work anywhere, and share the results with anyone
• Social/Interactive Science
52. Data Analysis with Synapse
Run Any Tool
On Any Platform
Record in Synapse
Share with Anyone
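To make "record" concrete, here is a minimal sketch of a provenance record for one analysis step, built only from the Python standard library. It does not use the Synapse client or reflect its API; the function `provenance_record`, the field names and the file names are hypothetical.

```python
import hashlib
import json
import platform
import sys

def provenance_record(tool, args, inputs, outputs):
    """Summarize one analysis step so that it can be logged and shared.
    inputs/outputs: dict mapping a name to the raw bytes of that artifact."""
    def digest(blob):
        return hashlib.sha256(blob).hexdigest()
    return {
        "tool": tool,
        "args": list(args),
        "python": sys.version.split()[0],
        "platform": platform.system(),
        # Content hashes identify the exact data versions that were used.
        "inputs": {name: digest(blob) for name, blob in inputs.items()},
        "outputs": {name: digest(blob) for name, blob in outputs.items()},
    }

record = provenance_record(
    tool="cluster.py",                       # hypothetical script name
    args=["--k", "5"],
    inputs={"expression.tsv": b"gene\ts1\ts2\n"},
    outputs={"clusters.tsv": b"gene\tcluster\n"},
)
serialized = json.dumps(record, sort_keys=True)   # ready to store or share
```

Because the record depends only on the tool name, its arguments and content hashes, it can be produced by any tool on any platform and compared across collaborators.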
53. “Synapse is a nascent compute
platform for transparent, reproducible,
and modular collaborative research.”
55. Download analysis and meta-analysis
Download another Cluster Result Download Evaluation and view more stats
• Perform Model averaging
• Compare/contrast models
• Find consensus clusters
• Visualize in Cytoscape
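A consensus across several downloaded cluster results can be sketched with a co-association matrix: for each pair of items, the fraction of clusterings that place them together. The code below assumes each result is a list of integer cluster labels over the same items. The function names and the 0.5 agreement cutoff are illustrative choices, not a description of what Synapse does.

```python
import numpy as np

def coassociation(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster.
    labelings: (n_runs, n_items) integer cluster labels."""
    labelings = np.asarray(labelings)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)

def consensus_clusters(labelings, min_agreement=0.5):
    """Connected components over pairs that co-cluster in more than
    `min_agreement` of the runs."""
    agree = coassociation(labelings) > min_agreement
    n = agree.shape[0]
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        stack = [start]
        labels[start] = current
        while stack:
            i = stack.pop()
            # Visit unlabeled items that usually cluster with item i.
            for j in np.flatnonzero(agree[i] & (labels == -1)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels

runs = [[0, 0, 1, 1, 1],
        [1, 1, 0, 0, 0],   # same partition, different label names
        [0, 0, 0, 1, 1]]   # one run disagrees about item 2
consensus = consensus_clusters(runs)
```

Comparing co-membership rather than raw labels makes the result independent of how each run happened to name its clusters.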
57. Objective assessment of factors influencing model
performance (>1 million predictions evaluated)
Datasets: Sanger (130 compounds), CCLE (24 compounds)
Metric: cross validation prediction accuracy (R²)
Prediction accuracy improved by…
Not discretizing data
Including expression data
Elastic net regression
In Sock Jang
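The sketch below shows how a cross-validated R² for elastic net regression can be computed, using a small coordinate-descent solver written in NumPy. It is an illustration under stated assumptions, not the pipeline behind the evaluation on this slide: the data is simulated, and the names `elastic_net` and `cv_r2` and the penalty settings are arbitrary choices.

```python
import numpy as np

def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for the objective
    (1/2n)||y - Xb||^2 + alpha*l1_ratio*||b||_1 + 0.5*alpha*(1-l1_ratio)*||b||^2.
    Assumes X and y are already centered (no intercept is fitted)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y.copy()                      # residual for b = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            residual += X[:, j] * b[j]       # remove feature j's contribution
            rho = X[:, j] @ residual / n
            soft = np.sign(rho) * max(abs(rho) - alpha * l1_ratio, 0.0)
            b[j] = soft / (col_sq[j] + alpha * (1 - l1_ratio))
            residual -= X[:, j] * b[j]       # add the updated contribution back
    return b

def cv_r2(X, y, k=5, **kwargs):
    """Mean held-out R^2 over k folds."""
    folds = np.array_split(np.random.default_rng(0).permutation(len(y)), k)
    scores = []
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        # Center with training statistics only, to avoid leaking test data.
        mu_x, mu_y = X[train].mean(axis=0), y[train].mean()
        b = elastic_net(X[train] - mu_x, y[train] - mu_y, **kwargs)
        pred = (X[test] - mu_x) @ b + mu_y
        ss_res = np.sum((y[test] - pred) ** 2)
        ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
        scores.append(1 - ss_res / ss_tot)
    return float(np.mean(scores))

# Simulated stand-ins: 20 expression features, response driven by two of them.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 20))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=100)
score = cv_r2(X, y, alpha=0.05)
```

Keeping the response continuous, as here, corresponds to the "not discretizing data" point; the effect of adding or removing feature types can be checked by calling `cv_r2` on different column subsets of `X`.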
59. Sage-DREAM Breast Cancer Prognosis Challenge #1
Building better disease models together
Caldas/Aparicio
breast cancer data
154 participants; 27 countries
334 participants; >35 countries
Sep 26 Status
Challenge Launch: July 17
>500 models posted to Leaderboard
60. How to accelerate and make affordable
the efforts required to build better
models of disease?
64. How to incentivize the joint evolution of ideas in a rapid
learning space, prepublication?
How to fund, repeatedly, work where data generators and analysts are
not always the same people?
Should we consider
Centralized Guilds vs Distributed Dynamic Teams?