Analyzing the Human Gut Microbiome Dynamics in Health and Disease Using Supercomputers and Supernetworks
1. “Analyzing the Human Gut Microbiome Dynamics in Health
and Disease Using Supercomputers and Supernetworks”
Invited Presentation
ESnet CrossConnects Bioinformatics Conference
Lawrence Berkeley National Laboratory
April 12, 2016
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net
2. Abstract
To truly understand the state of the human body in health or disease, we now realize that we must consider a much more
complex system than medical science has considered before. The human body is host
to 100 trillion microorganisms, ten times the number of DNA-bearing cells in the human body, and these microbes contain
300 times as many genes as the human genome does. The microbial component of our “superorganism”
comprises hundreds of species with immense biodiversity. The exponential decrease in the cost of genetic sequencing
and supercomputing has enabled scientists to finally "read out" the changes in microbial ecology in
people in health and with disease. We use the fiber optic network of the Pacific Research Platform to rapidly move these
large datasets. To put a more personal face on the “patient of the future,” I have been collecting massive amounts of
data from my own body over the last five years. These data reveal detailed examples of the episodic excursions of my
coupled immune-microbial system. As similar techniques become more widely applied, we can look forward to
revolutionary changes in medical practice over the next decade.
3. From One to a Trillion Data Points Defining Me in 15 Years:
The Exponential Rise in Body Data
Weight
Blood Biomarker
Time Series
Human Genome
SNPs
Microbial Genome
Time Series
Improving Body
Discovering Disease
Human Genome
4. As a Model for the Precision Medicine Initiative,
I Have Tracked My Internal Biomarkers To Understand My Body’s Dynamics
My Quarterly
Blood Draw
Calit2 64 Megapixel VROOM
5. Only One of My Blood Measurements
Was Far Out of Range--Indicating Chronic Inflammation
Normal Range <1 mg/L
27x Upper Limit
C-Reactive Protein (CRP) is a Blood Biomarker
for Detecting Presence of Inflammation
Episodic Peaks in Inflammation
Followed by Spontaneous Drops
6. Adding Stool Tests Revealed
Oscillatory Behavior in an Immune Variable Which is Antibacterial
Normal Range
<7.3 µg/mL
124x Upper Limit for Healthy
Lactoferrin is a Protein Shed from Neutrophils -
An Antibacterial that Sequesters Iron
Typical
Lactoferrin Value for
Active Inflammatory
Bowel Disease
(IBD)
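The "x Upper Limit" labels on the CRP and lactoferrin slides can be turned back into approximate measured values. The following is a minimal sketch, assuming each label means the peak value divided by the top of the stated normal range; the function name is hypothetical and the results are implied by the slide labels, not read from the underlying lab reports.

```python
def value_from_fold(upper_limit, fold):
    """Return the value implied by a 'fold x upper limit' label (assumed meaning)."""
    return upper_limit * fold

# CRP slide: normal range < 1 mg/L, peak labeled 27x the upper limit.
crp_peak_mg_per_l = value_from_fold(1.0, 27)

# Lactoferrin slide: normal range < 7.3 µg/mL, peak labeled 124x the upper limit.
lactoferrin_peak_ug_per_ml = value_from_fold(7.3, 124)

print(f"Implied CRP peak: about {crp_peak_mg_per_l:.0f} mg/L")
print(f"Implied lactoferrin peak: about {lactoferrin_peak_ug_per_ml:.0f} µg/mL")
```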
7. Descending Colon
Sigmoid Colon
Threading Iliac Arteries
Major Kink
Confirming the IBD (Colonic Crohn’s) Hypothesis:
Finding the “Smoking Gun” with MRI Imaging
I Obtained the MRI Slices
From UCSD Medical Services
and Converted to Interactive 3D
Working With Calit2 Staff
Transverse Colon
Liver
Small Intestine
Diseased Sigmoid Colon
Cross Section
MRI Jan 2012
Severe Colon
Wall Swelling
8. Why Did I Have an Autoimmune Disease
like Crohn’s Disease?
Despite decades of research,
the etiology of Crohn's disease
remains unknown.
Its pathogenesis may involve
a complex interplay between
host genetics,
immune dysfunction,
and microbial or environmental factors.
--The Role of Microbes in Crohn's Disease
Paul B. Eckburg & David A. Relman
Clin Infect Dis. 44:256-262 (2007)
I Have Been Quantifying All Three
9. I Found I Had One of the Earliest Known SNPs
Associated with Crohn’s Disease
From www.23andme.com
SNPs Associated with CD
Polymorphism in
Interleukin-23 Receptor Gene
— 80% Higher Risk
of Pro-inflammatory
Immune Response
NOD2
IRGM
ATG16L1
23andme is Now Collecting
10,000 IBD Patients’ SNPs
10. I Reasoned That The Driver of My Gut Autoimmune Disease
Was a Disturbance in My Gut Microbiome Ecology
Inclusion of the “Dark Matter” of the Body
Will Radically Alter Medicine
99% of Your
DNA Genes
Are in Microbe Cells
Not Human Cells
Your Body Has 10 Times
As Many Microbe Cells As DNA-Bearing
Human Cells
11. The Carl Woese Tree of Life
Shows That Most Life on Earth Is Bacterial
Nature Microbiology
Hug, et al.
Source: Carl Woese, et al (1990)
12. The Human Gut
as a Super-Evolutionary Microbial Cauldron
• Enormous Density
– 1000x Ocean Water
• Highly Dynamic Microbial Ecology
– Hundreds to Thousands of Species
• Horizontal Gene Transfer
• Phages
• Adaptive Selection Pressures (Immune System)
– Innate Immune System
– Adaptive Immune System
– Macrophages and Antimicrobial proteins
• Constantly Changing Environmental Pressures
– Diet
– Antibiotics
– Pharmaceuticals
13. To Map Out the Dynamics of Autoimmune Microbiome Ecology
Requires Coupling Next Generation Genome Sequencers to Big Data Supercomputers
Source: Weizhong Li, UCSD
Our Team Used 25 CPU-years
to Compute
Comparative Gut Microbiomes
Starting From
2.7 Trillion DNA Bases
of My Samples
and Healthy and IBD Controls
Illumina HiSeq 2000 at JCVI
SDSC Gordon Data Supercomputer
14. We Gathered Raw Illumina Reads on 275 Humans
and Generated a Time Series of My Gut Microbiome
5 Ileal Crohn’s Patients,
3 Points in Time
2 Ulcerative Colitis Patients,
6 Points in Time
“Healthy” Individuals
Source: Jerry Sheehan, Calit2
Weizhong Li, Sitao Wu, CRBS, UCSD
Total of 27 Billion Reads
Or 2.7 Trillion Bases
Inflammatory Bowel Disease (IBD) Patients
250 Subjects
1 Point in Time
7 Points in Time
Each Sample Has 100-200 Million Illumina Short Reads (100 bases)
Larry Smarr
(Colonic Crohn’s)
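The read and base totals on this slide can be cross-checked with simple arithmetic. This is a sketch using only the figures quoted on the slide (27 billion reads, 100 bases per read, 100 to 200 million reads per sample); the variable names are illustrative.

```python
READ_LENGTH_BASES = 100          # "Illumina Short Reads (100 bases)"
TOTAL_READS = 27_000_000_000     # "Total of 27 Billion Reads"

# Total sequenced bases: reads times bases per read.
total_bases = TOTAL_READS * READ_LENGTH_BASES
print(f"Total bases: {total_bases:.1e}")  # on the order of trillions

# Number of samples the total would correspond to if every sample sat at
# the low or high end of the quoted per-sample range.
samples_if_100m_reads = TOTAL_READS // 100_000_000
samples_if_200m_reads = TOTAL_READS // 200_000_000
print(f"Implied sample count: {samples_if_200m_reads} to {samples_if_100m_reads}")
```

The product matches the "2.7 Trillion Bases" figure quoted on the slide.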
15. Computational NextGen Sequencing Pipeline:
From Sequence to Taxonomy and Function
PI: Weizhong Li, CRBS, UCSD
NIH R01HG005978 (2010-2013, $1.1M)
16. Results Include Relative Abundance of Hundreds of Microbial Species
Average Over 250 Healthy People
From NIH Human Microbiome Project
Note Log Scale
Clostridium difficile
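Relative abundance, as plotted on a log scale here, is each species' share of the reads assigned in a sample. The sketch below shows the calculation on made-up counts; the species labels and numbers are hypothetical placeholders, not data from the study, and the actual pipeline's normalization may differ.

```python
import math

def relative_abundance(counts):
    """Convert per-species read counts into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no assigned reads")
    return {name: count / total for name, count in counts.items()}

# Hypothetical read counts for one sample (illustration only).
example_counts = {"species_a": 900_000, "species_b": 99_000, "species_c": 1_000}

abundance = relative_abundance(example_counts)

# A log scale keeps rare species visible next to dominant ones.
log10_abundance = {name: math.log10(frac) for name, frac in abundance.items() if frac > 0}

for name in example_counts:
    print(f"{name}: {abundance[name]:.4f} (log10 = {log10_abundance[name]:.2f})")
```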
18. We Found Major State Shifts in Microbial Ecology Phyla
Between Healthy and Three Forms of IBD
Most
Common
Microbial
Phyla
Average HE
Average
Ulcerative Colitis
Average LS
Colonic Crohn’s Disease
Average
Ileal Crohn’s Disease
19. Time Series Reveals Oscillations in Immune Biomarkers
Associated with Time Progression of Autoimmune Disease
Immune &
Inflammation
Variables
Weekly
Symptoms
Pharma
Therapies
Stool
Samples
Time axis (years): 2009 2010 2011 2012 2013 2014 2015
20. In 2016 We Are Extending My Stool Time Series by
Collaborating with the UCSD Knight Lab
Larry’s 40 Stool Samples Over 3.5 Years
to Rob’s lab on April 30, 2015
21. Precision Medicine: Coupling Longitudinal Phenotypic Changes
to Longitudinal Microbiome Evolution
Time Period of 16S
Microbial Sequences
Source: Larry Smarr, UCSD
Larry Smarr’s Weight Over 15 Years
22. Larry Smarr Gut Microbiome Ecology Shifted After Drug Therapy
Between Two Time-Stable Equilibriums Correlated to Physical Symptoms
Lialda
&
Uceris
12/1/13 to 1/1/14
Frequent IBD Symptoms
Weight Loss
5/1/12 to 12/1/14
Blue Balls on Diagram
to the Right
Few IBD Symptoms
Weight Gain
1/1/14 to 1/1/16
Red Balls on Diagram
to the Right
Principal Coordinate Analysis of
Microbiome Ecology
PCoA by Justine Debelius and Jose Navas,
Knight Lab, UCSD
Weight Data from Larry Smarr, Calit2, UCSD
Antibiotics
Prednisone
1/1/12 to 5/1/12
5/1/12
Weekly Weight (Red Dots Stool Sample)
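Principal Coordinate Analysis starts from a matrix of pairwise dissimilarities between samples and then embeds the samples in a few dimensions. The slide does not say which dissimilarity measure was used, so the sketch below assumes Bray-Curtis purely for illustration and computes only the input matrix; the sample names and counts are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator

# Hypothetical per-taxon counts for four stool samples (illustration only).
samples = {
    "t1": [10, 0, 5],
    "t2": [10, 0, 5],
    "t3": [0, 15, 0],
    "t4": [5, 5, 5],
}

names = list(samples)
# Symmetric matrix of pairwise dissimilarities; PCoA would take this as input.
distance_matrix = [
    [bray_curtis(samples[a], samples[b]) for b in names] for a in names
]

for name, row in zip(names, distance_matrix):
    print(name, [round(d, 3) for d in row])
```

Samples with similar community composition get small dissimilarities, which is what lets an ordination separate two stable states into distinct clusters.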
23. To Expand IBD Project the Knight/Smarr Labs Were Awarded
~ 1 CPU-Century Supercomputing Time
• Smarr Gut Microbiome Time Series
– From 7 Samples Over 1.5 Years
– To 50 Samples Over 4 Years
• IBD Patients: From 5 Crohn’s Disease and 2 Ulcerative Colitis
Patients to ~100 Patients
– 50 Carefully Phenotyped Patients Drawn from Sandborn BioBank
– 43 Metagenomes from the RISK Cohort of Newly Diagnosed IBD patients
• New Software Suite from Knight Lab
– Re-annotation of Reference Genomes, Functional / Taxonomic Variations
– Novel Compute-Intensive Assembly Algorithms from Pavel Pevzner
8x Compute Resources
Over Prior Study
24. Cancer Genomics Hub (UCSC) Demonstrates Need for SuperNetworks:
Large Data Flows to End Users at UCSC, UCB, UCSF, …
1G
8G
Data Source: David Haussler,
Brad Smith, UCSC
15G
Jan 2016
30,000 TB
Per Year
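To put 30,000 TB per year in network terms, the annual volume can be converted to an average sustained bit rate. This is a rough sketch, assuming decimal terabytes (10^12 bytes) and a 365-day year; real traffic is bursty, so peak rates would be higher than this average.

```python
TB_IN_BYTES = 10**12                 # assumption: decimal terabytes
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: 365-day year

def average_gbps(terabytes_per_year):
    """Average sustained rate in Gbps for a given yearly data volume."""
    bits_per_year = terabytes_per_year * TB_IN_BYTES * 8
    return bits_per_year / SECONDS_PER_YEAR / 1e9

cghub_average_gbps = average_gbps(30_000)
print(f"30,000 TB/year is about {cghub_average_gbps:.1f} Gbps sustained")
```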
25. Building a UC San Diego High Performance Cyberinfrastructure
to Support Distributed Integrative Omics
FIONA
12 Cores/GPU
128 GB RAM
3.5 TB SSD
48TB Disk
10Gbps NIC
Knight Lab
10Gbps
Gordon
Prism@UCSD
Data Oasis
7.5PB,
200GB/s
Knight 1024 Cluster
In SDSC Co-Lo
CHERuB
100Gbps
Emperor & Other Vis Tools
64Mpixel Data Analysis Wall
120Gbps
40Gbps
1.3Tbps
PRP/
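The link speeds on this diagram translate directly into how long a large dataset takes to move. As a rough sketch, the code below estimates the ideal time to transfer a full 48 TB FIONA disk at 10 Gbps and at 100 Gbps, assuming decimal units and ignoring protocol overhead and disk limits; the function name is illustrative.

```python
TB_IN_BYTES = 10**12   # assumption: decimal terabytes
GBPS_IN_BITS = 10**9   # assumption: decimal gigabits per second

def transfer_hours(size_bytes, link_bits_per_second):
    """Ideal transfer time in hours at full line rate (no overhead)."""
    return size_bytes * 8 / link_bits_per_second / 3600

hours_at_10g = transfer_hours(48 * TB_IN_BYTES, 10 * GBPS_IN_BITS)
hours_at_100g = transfer_hours(48 * TB_IN_BYTES, 100 * GBPS_IN_BITS)

print(f"48 TB at 10 Gbps:  {hours_at_10g:.1f} hours")
print(f"48 TB at 100 Gbps: {hours_at_100g:.1f} hours")
```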
26. Based on Community Input and on ESnet’s Science DMZ Concept,
NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways
Red 2012 CC-NIE Awardees
Yellow 2013 CC-NIE Awardees
Green 2014 CC*IIE Awardees
Blue 2015 CC*DNI Awardees
Purple Multiple Time Awardees
Source: NSF
27. The Pacific Wave Platform
Creates a Regional Science-Driven “Big Data Freeway System”
Source:
John Hess, CENIC
Funded by NSF $5M Oct 2015-2020
Flash Disk to Flash Disk File Transfer Rate
PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS,
• Tom DeFanti, UC San Diego Calit2,
• Philip Papadopoulos, UC San Diego SDSC,
• Frank Wuerthwein, UC San Diego Physics
and SDSC
28. The Emergence of Precision or P4 Medicine --
Predictive, Preventive, Personalized, Participatory
Systems Biology &
Systems Medicine
Consumer-Driven
Social Networks
P4
MEDICINE
Digital Revolution
Big Data
How Will the Quantified Consumer
Be Integrated into Healthcare Systems?
Lee Hood, Director ISB
29. Thanks to Our Great Team!
Calit2@UCSD
Future Patient Team
Jerry Sheehan
Tom DeFanti
Joe Keefe
John Graham
Kevin Patrick
Mehrdad Yazdani
Jurgen Schulze
Andrew Prudhomme
Philip Weber
Fred Raab
Ernesto Ramirez
JCVI Team
Karen Nelson
Shibu Yooseph
Manolito Torralba
Ayasdi
Devi Ramanan
Pek Lum
UCSD Metagenomics Team
Weizhong Li
Sitao Wu
SDSC Team
Michael Norman
Mahidhar Tatineni
Robert Sinkovits
UCSD Health Sciences Team
David Brenner
Rob Knight Lab
Justine Debelius
Jose Navas
Gail Ackermann
Greg Humphrey
William J. Sandborn Lab
Elisabeth Evans
John Chang
Brigid Boland
Dell/R Systems
Brian Kucic
John Thompson