What Can Happen when Genome
Sciences Meets Data Sciences?
Philip E. Bourne PhD, FACMI
Stephenson Chair of Data Science
Director, Data Science Institute
Professor of Biomedical Engineering
peb6a@virginia.edu
https://www.slideshare.net/pebourne
02/14/18 UVA Genome Sciences 1
I am more interested in having a
discussion than giving a lecture …
This is not about my research
specifically but what is happening
more broadly
Agenda
• Some context
– My definition of data science
– What drives my thinking
– What is the NIH thinking?
• Relevant examples
• The DSI and what is happening at UVA
• Together, where do we go from here?
What Do I Mean by Big Data/Data
Science?
• Use of the ever increasing amount of open,
complex, diverse digital data
• Finding ways to ask and then answer relevant
questions by combining such diverse data sets
• Arriving at statistically significant conclusions
not otherwise obtainable
• Sharing such findings in a useful way
• Translating such findings into actions that
improve the human condition
What Drives my Thinking?
Disruption:
• Digitization
• Deception
• Disruption
• Demonetization
• Dematerialization
• Democratization
[Figure axes: Time; Volume, Velocity, Variety]
Example - Photography
• Digital camera invented by Kodak but shelved
• Megapixels & quality improve slowly; Kodak slow to react
• Film market collapses; Kodak goes bankrupt
• Phones replace cameras
• Instagram, Flickr become the value proposition
• Digital media becomes bona fide form of communication
From a presentation to the Advisory Board to the NIH Director
A Few Random Data {Science} Facts
• There are ~2.7 Zettabytes (2.7 x 10^6 PB) of digital
data currently
– = US population tweeting 3x/min for 26,976 years
• Big data currently estimated as a $50bn business
– could save $3.1tn – private sector research
• 40% growth in data/yr; 5% growth in IT
expenditure – undervalued
• US: 140,000-190,000 unfilled deep data analytics
jobs – competition for skilled researchers high
• DSI has 600 applicants this year for 50 spots;
MSDS/MBA highly sought – large human capital
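The unit conversion and growth figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where 1 ZB = 10^21 bytes and 1 PB = 10^15 bytes, and assumes the quoted annual growth rates compound; the function names are illustrative and do not come from the talk.

```python
import math

BYTES_PER_PB = 10**15  # SI petabyte (assumption: decimal units)
BYTES_PER_ZB = 10**21  # SI zettabyte (assumption: decimal units)


def zettabytes_to_petabytes(zb: float) -> float:
    """Convert zettabytes to petabytes using SI prefixes."""
    return zb * (BYTES_PER_ZB // BYTES_PER_PB)


def doubling_time_years(annual_growth: float) -> float:
    """Years for a quantity to double at a compounding annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


if __name__ == "__main__":
    print(f"2.7 ZB = {zettabytes_to_petabytes(2.7):.2e} PB")
    print(f"Data doubling time at 40%/yr: {doubling_time_years(0.40):.1f} years")
    print(f"IT spend doubling time at 5%/yr: {doubling_time_years(0.05):.1f} years")
```

Under these assumptions data volume doubles roughly every two years while IT expenditure takes about fourteen years to double, which is one way to read the "undervalued" annotation.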
How Much Biomedical Data?
• Big Data
– Total data from NIH-funded research in 2016
estimated at 650 PB*
– 20 PB of that is in NCBI/NLM (3%) and it is
expected to grow by 10 PB in 2016
• Dark Data
– Only 12% of data described in published papers is
in recognized archives – 88% is dark data^
• Cost
– 2007-2014: NIH spent ~$1.2Bn extramurally on
maintaining data archives
* In 2012 Library of Congress was 3 PB
^ http://www.ncbi.nlm.nih.gov/pubmed/26207759
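As a quick consistency check on the figures above, the following sketch recomputes the percentages from the slide's own numbers. The variable names are illustrative; the values are taken directly from the slide and its footnote, and nothing here is an independent estimate.

```python
NIH_TOTAL_PB = 650            # estimated total data from NIH-funded research, 2016
NCBI_NLM_PB = 20              # portion held in NCBI/NLM
LIBRARY_OF_CONGRESS_PB = 3    # footnote: Library of Congress size in 2012
ARCHIVED_PERCENT = 12         # published data found in recognized archives


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


ncbi_share = percent(NCBI_NLM_PB, NIH_TOTAL_PB)          # share held at NCBI/NLM
dark_percent = 100 - ARCHIVED_PERCENT                    # remainder is "dark data"
loc_equivalents = NIH_TOTAL_PB / LIBRARY_OF_CONGRESS_PB  # scale comparison

print(f"NCBI/NLM share of NIH data: {ncbi_share:.1f}%")
print(f"Dark data: {dark_percent}%")
print(f"Library of Congress (2012) equivalents: {loc_equivalents:.0f}")
```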
Consider Some Current High Profile
NIH Examples Where Data Science is
Being Applied
• Moonshot - Bringing together 5 petabytes of homogenized data within the
Genomic Data Commons (GDC) to explore genotype-phenotype
relationships
• MODs – Multiple high value high cost genomic resources
• Human Microbiome Project – microbe characterization and analysis
• TOPMed – Genomic, proteomic, metabolomic, image and EHR data
• All-of-Us Precision Medicine - Building a platform to support data on >1M
individuals with extensive and constantly updated health profiles
• ECHO – Effects of Environmental Exposures on Child Health and
Development - Integration of child health and environmental data
• BRAIN - Temporal and spatial analysis of neural circuits
How is Data Science Being Applied?
• Moonshot – new ways to analyze genotype-phenotype associations
• MODs – new curation and integration tools
• Human Microbiome Project – new cloud based tools
• TOPMed – large scale storage and analysis; data harmonization
• All-of-Us Precision Medicine – security; analysis of sensor data; EHR
integration
• ECHO – metadata descriptions of health and environmental data;
application of geospatial methods
• BRAIN – methods for network analysis, visualization
All:
Analytics, the Commons, FAIR, sustainability, workforce
Wilkinson et al The FAIR Guiding Principles for
scientific data management and stewardship. Sci
Data. 2016 Mar 15;3:160018
https://datascience.nih.gov/TheCommons
Some underlying concerns at NIH…
Reproducibility…
Conformance to data sharing policies
& governance more generally
Why a More Open Process?
Use case:
Diffuse Intrinsic Pontine Gliomas (DIPG)
• Occur 1:100,000
individuals
• Peak incidence 6-8 years
of age
• Median survival 9-12
months
• Surgery is not an option
• Chemotherapy ineffective
and radiotherapy only
transient
From Adam Resnick
Timeline of genomic studies in DIPG
• Landmark studies identify
histone mutations as
recurrent driver mutations in
DIPG ~2012
• Almost 3 years later, in
largely the same datasets,
but partially expanded, the
same two groups and 2
others identify ACVR1
mutations as a secondary, co-
occurring mutation
From Adam Resnick
What do we need to do differently to
reveal ACVR1?
• ACVR1 is a targetable kinase
• Inhibition of ACVR1 inhibited tumor
progression in vitro
• ~300 DIPG patients a year
• ~60 are predicted to have ACVR1
• If only large-scale data sets had been
integrated with TCGA and/or rare
disease data in 2012, ACVR1 mutations
would have been identified
• 60 patients/year x 3 years = 180
children’s lives (who likely succumbed to
the disease during that time) could have
been impacted if only data were FAIR
From Adam Resnick
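The arithmetic behind the 180 figure can be written out explicitly. This is a minimal sketch that only reproduces the slide's own calculation: the 20% fraction is inferred from "~60 of ~300", the three-year delay from the timeline on the previous slide, and the function name is hypothetical.

```python
def patients_affected(cases_per_year: int, mutation_fraction: float, delay_years: int) -> int:
    """Patients carrying the mutation who were diagnosed during the delay."""
    carriers_per_year = round(cases_per_year * mutation_fraction)
    return carriers_per_year * delay_years


DIPG_CASES_PER_YEAR = 300   # ~300 DIPG patients a year (from the slide)
ACVR1_FRACTION = 60 / 300   # ~60 predicted to carry ACVR1 mutations (inferred ratio)
DELAY_YEARS = 3             # ~2012 histone findings to ACVR1 findings

if __name__ == "__main__":
    total = patients_affected(DIPG_CASES_PER_YEAR, ACVR1_FRACTION, DELAY_YEARS)
    print(f"Patients potentially impacted by the delay: {total}")
```

The result is sensitive to all three inputs, each of which the slide gives only approximately, so it is best read as an order-of-magnitude illustration.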
Both funders and some institutions
see the need to move from pipes to
platforms to accelerate research…
https://blog.lexicata.com/wp-content/uploads/2015/03/platform-model-750x410.png
If platforms are the answer we could
ask the question…
Will biomedical research become more
like Airbnb?
Vivien Bonazzi
Should biomedical research be Like Airbnb?
doi: 10.1371/journal.pbio.2001818
I am not crazy, hear me out
• Airbnb is a platform that supports a trusted relationship
between consumer (renter) and supplier (host)
• The platform focuses on maximizing the exchange of services
between supplier and consumer and maximizing the amount
of trust associated with a given stakeholder
• It seems to be working:
– 60 million users searching 2 million listings in 192 countries
– Average of 500,000 stays per night.
– Valuation of US $25bn
Should biomedical research be Like Airbnb?
doi: 10.1371/journal.pbio.2001818
Platforms will ultimately digitally
integrate the scholarly workflow for
human and machine analysis
Why a comparison to Airbnb is not fair
• Airbnb was born digital
• The exchange of services on Airbnb is
simple compared to what is required of a
platform to support biomedical research
Nevertheless there is much to be
learnt
Impediments to a biomedical platform
• Current work practices by all stakeholders
• Entrenched business models
• Size of the undertaking aka resources
needed
• Trust
• Incentives to use the platform
http://www.forbes.com/sites/johnhall/2013/04/29/10-barriers-to-employee-innovation/#8bdbaa811133
In summary there is not currently a
widely adopted single platform for
the exchange of services in
biomedical research. Either there is a
platform per service or no platform
at all….
Funders and the institutions they
fund need to work more closely to
implement platforms
Example: NSF and NIH Approaches
How is the DSI responding to these
various needs?
Hallmarks
Working across the grounds
to break down traditional silos
• Currently sustainable
• Planning for where the academical village meets Google – an
ecosystem in which students, faculty, staff, visitors, private sector
reps, entrepreneurs live and work
• Open UVA and open data
• Not owning anything; only working through collaboration e.g.
– Dual degrees
– Research projects across disciplines
• MS DS focusing on practical training
• Dual degrees
• Soon PhD and undergraduate major
• Wikimedian in residence (March, 2018)
Emergent DSI Organization
Data Integration
& Engineering
Machine Learning
& Analytics
Visualization
Data Acquisition
& Dissemination
Ethics, Law,
Policy,
Social Implications
Biomedical Data Sciences
Data Acquisition & Dissemination
Pilot Open Data Lab underway
Supplier / Consumer
• Paper Author / Paper Reader
• Data Provider / Data Consumer
• Employer / Employee
• Reagent Provider / Reagent Consumer
• Software Provider / Software Consumer
• Grant Writer / Grant Reviewer
• Educator / Student
Platform
• MS Project
• Google Drive
• Coursera
• ResearchGate
• Academia.edu
• Open Science Framework
• Synapse
• F1000
• Rio
Data Integration and
Engineering
• Ontologies
• Object identifiers
• Indexing schemes
• Common data models
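To make the four items above concrete, here is a minimal sketch of how they might fit together: a shared record type stands in for a common data model, each record carries a persistent object identifier and ontology term IDs, and a small inverted index makes records findable by term. Every name, identifier, and term ID below is hypothetical and chosen for illustration only; none of it describes a DSI system.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DatasetRecord:
    """A toy common data model for describing a dataset."""
    identifier: str                  # persistent object identifier (hypothetical)
    title: str
    ontology_terms: Tuple[str, ...]  # controlled-vocabulary term IDs (hypothetical)


def build_term_index(records: List[DatasetRecord]) -> Dict[str, List[str]]:
    """Map each ontology term to the identifiers of records annotated with it."""
    index: Dict[str, List[str]] = {}
    for record in records:
        for term in record.ontology_terms:
            index.setdefault(term, []).append(record.identifier)
    return index


# Hypothetical example records; the identifiers and terms are placeholders.
records = [
    DatasetRecord("id:example-0001", "Tumor exome calls", ("EX:0001", "EX:0002")),
    DatasetRecord("id:example-0002", "Rare disease panel", ("EX:0002",)),
]
term_index = build_term_index(records)

if __name__ == "__main__":
    for term, identifiers in sorted(term_index.items()):
        print(term, "->", identifiers)
```

Because both records share a term ID, a query on that term returns both identifiers, which is the kind of cross-dataset lookup that shared ontologies and identifiers are meant to enable.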
Machine Learning &
Analytics
• Neural nets
• Deep learning
• NLP
• Gene expression &
neurological disease (Kipnis)
• Predicting opioid overdose
(VA Health)
• Predicting escalating care
and mortality risk of
cirrhosis patients (UVA HS)
• Human microbiome &
mental health in maternal
health (Psychology &
Nursing)
Visualization
• VR
• Networks
• Sonics
• Visualizing microbial
stability (Biology &
Systems)
Ethics, Law,
Policy & Social
Implications
• Data sharing
• Privacy
• Normativity
Wendy Novicoff, Ph.D
Points of Interaction
• Dual degrees with an MSDS
• Specific projects for:
– Presidential fellows (due March 19, 2018)
– Capstones (due June 29, 2018)
• Thoughts on biomedical data science cluster hires
• Data Science Internship program with NIH, Inova, GMU, VT,
GWU, UMD…
• Join the DSI faculty
• Join the mailing list
– Lunch and learn
– Distinguished lectures
– Special events
References
• Dunn and Bourne Building the Biomedical Data Science
Workforce PLoS Biol. 2017 Jul 17;15(7):e2003082.
• Bonazzi and Bourne Should Biomedical Research be like
Airbnb? PLoS Biol. 2017 Apr 7;15(4):e2001818.
• McKiernan et al How Open Science Helps Researchers
Succeed eLife. 2016 Jul 7;5. pii: e16800
• Wilkinson et al The FAIR Guiding Principles for scientific
data management and stewardship. Sci Data. 2016
Mar 15;3:160018.
• https://datascience.nih.gov/TheCommons
Acknowledgements
The BD2K Team at NIH
My New Colleagues at UVA
The 150 folks who have passed through my laboratory
https://docs.google.com/spreadsheets/d/1QZ48UaKcwDl_iFCvBmJsT03FK-bMchdfuIHe9Oxc-rw/edit#gid=0
Scott and Beth Stephenson
Anonymous donors for the DSI endowment
Thank You
peb6a@virginia.edu
3602/14/18 UVA Genome Sciences

Más contenido relacionado

La actualidad más candente

Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Philip Bourne
 
PhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhilip Bourne
 
Smart Data in Health – How we will exploit personal, clinical, and social “Bi...
Smart Data in Health – How we will exploit personal, clinical, and social “Bi...Smart Data in Health – How we will exploit personal, clinical, and social “Bi...
Smart Data in Health – How we will exploit personal, clinical, and social “Bi...Amit Sheth
 
Data, Responsibly: The Next Decade of Data Science
Data, Responsibly: The Next Decade of Data ScienceData, Responsibly: The Next Decade of Data Science
Data, Responsibly: The Next Decade of Data ScienceUniversity of Washington
 
Data at the NIH: Some Early Thoughts
Data at the NIH: Some Early ThoughtsData at the NIH: Some Early Thoughts
Data at the NIH: Some Early ThoughtsPhilip Bourne
 
Biomedical Research as Part of the Digital Enterprise
Biomedical Research as Part of the Digital EnterpriseBiomedical Research as Part of the Digital Enterprise
Biomedical Research as Part of the Digital EnterprisePhilip Bourne
 
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionUniversity of Washington
 
There is No Intelligent Life Down Here
There is No Intelligent Life Down HereThere is No Intelligent Life Down Here
There is No Intelligent Life Down HerePhilip Bourne
 
A SWOT Analysis of Data Science @ NIH
A SWOT Analysis of Data Science @ NIHA SWOT Analysis of Data Science @ NIH
A SWOT Analysis of Data Science @ NIHPhilip Bourne
 
Research Metadata Mechanics - Simon Porter
Research Metadata Mechanics - Simon PorterResearch Metadata Mechanics - Simon Porter
Research Metadata Mechanics - Simon PorterCASRAI
 
Highlights from NIH Data Science
Highlights from NIH Data ScienceHighlights from NIH Data Science
Highlights from NIH Data SciencePhilip Bourne
 
Data Science in Biomedicine - Where Are We Headed?
Data Science in Biomedicine - Where Are We Headed?Data Science in Biomedicine - Where Are We Headed?
Data Science in Biomedicine - Where Are We Headed?Philip Bourne
 
Benefits of Open Data and Policy Developments, perspectives from research ins...
Benefits of Open Data and Policy Developments, perspectives from research ins...Benefits of Open Data and Policy Developments, perspectives from research ins...
Benefits of Open Data and Policy Developments, perspectives from research ins...Academy of Science of South Africa (ASSAf)
 

La actualidad más candente (20)

Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?
 
PhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhRMA Some Early Thoughts
PhRMA Some Early Thoughts
 
Smart Data in Health – How we will exploit personal, clinical, and social “Bi...
Smart Data in Health – How we will exploit personal, clinical, and social “Bi...Smart Data in Health – How we will exploit personal, clinical, and social “Bi...
Smart Data in Health – How we will exploit personal, clinical, and social “Bi...
 
Science Data, Responsibly
Science Data, ResponsiblyScience Data, Responsibly
Science Data, Responsibly
 
Data, Responsibly: The Next Decade of Data Science
Data, Responsibly: The Next Decade of Data ScienceData, Responsibly: The Next Decade of Data Science
Data, Responsibly: The Next Decade of Data Science
 
2015 Kno.e.sis Center Annual Review
2015 Kno.e.sis Center Annual Review2015 Kno.e.sis Center Annual Review
2015 Kno.e.sis Center Annual Review
 
Kno.e.sis Review: late 2012 to mid 2013
Kno.e.sis Review: late 2012 to mid 2013Kno.e.sis Review: late 2012 to mid 2013
Kno.e.sis Review: late 2012 to mid 2013
 
Urban Data Science at UW
Urban Data Science at UWUrban Data Science at UW
Urban Data Science at UW
 
Data at the NIH: Some Early Thoughts
Data at the NIH: Some Early ThoughtsData at the NIH: Some Early Thoughts
Data at the NIH: Some Early Thoughts
 
Biomedical Research as Part of the Digital Enterprise
Biomedical Research as Part of the Digital EnterpriseBiomedical Research as Part of the Digital Enterprise
Biomedical Research as Part of the Digital Enterprise
 
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data Interaction
 
There is No Intelligent Life Down Here
There is No Intelligent Life Down HereThere is No Intelligent Life Down Here
There is No Intelligent Life Down Here
 
A SWOT Analysis of Data Science @ NIH
A SWOT Analysis of Data Science @ NIHA SWOT Analysis of Data Science @ NIH
A SWOT Analysis of Data Science @ NIH
 
Ashutosh Jadhav PhD Defense: Knowledge Driven Search Intent Mining
Ashutosh Jadhav PhD Defense: Knowledge Driven Search Intent MiningAshutosh Jadhav PhD Defense: Knowledge Driven Search Intent Mining
Ashutosh Jadhav PhD Defense: Knowledge Driven Search Intent Mining
 
Data and communication of research: incentives and disincentives
Data and communication of research: incentives and disincentivesData and communication of research: incentives and disincentives
Data and communication of research: incentives and disincentives
 
Research Metadata Mechanics - Simon Porter
Research Metadata Mechanics - Simon PorterResearch Metadata Mechanics - Simon Porter
Research Metadata Mechanics - Simon Porter
 
Highlights from NIH Data Science
Highlights from NIH Data ScienceHighlights from NIH Data Science
Highlights from NIH Data Science
 
Data Science in Biomedicine - Where Are We Headed?
Data Science in Biomedicine - Where Are We Headed?Data Science in Biomedicine - Where Are We Headed?
Data Science in Biomedicine - Where Are We Headed?
 
Benefits of Open Data and Policy Developments, perspectives from research ins...
Benefits of Open Data and Policy Developments, perspectives from research ins...Benefits of Open Data and Policy Developments, perspectives from research ins...
Benefits of Open Data and Policy Developments, perspectives from research ins...
 
Data Science and Urban Science @ UW
Data Science and Urban Science @ UWData Science and Urban Science @ UW
Data Science and Urban Science @ UW
 

Similar a What Can Happen when Genome Sciences Meets Data Sciences?

Are Funders and Academic Institutions Approaches to Data Science Aligned
Are Funders and Academic Institutions Approaches to Data Science AlignedAre Funders and Academic Institutions Approaches to Data Science Aligned
Are Funders and Academic Institutions Approaches to Data Science AlignedPhilip Bourne
 
Bioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big DataBioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big DataPhilip Bourne
 
Data Science Meets Academia - What Comes Next?
Data Science Meets Academia - What Comes Next?Data Science Meets Academia - What Comes Next?
Data Science Meets Academia - What Comes Next?Philip Bourne
 
Open Data in a Global Ecosystem
Open Data in a Global EcosystemOpen Data in a Global Ecosystem
Open Data in a Global EcosystemPhilip Bourne
 
Diabetes Data Science
Diabetes Data ScienceDiabetes Data Science
Diabetes Data SciencePhilip Bourne
 
Rda nitrd 2015 berman - final
Rda nitrd 2015 berman  - finalRda nitrd 2015 berman  - final
Rda nitrd 2015 berman - finalKathy Fontaine
 
What's up at Kno.e.sis?
What's up at Kno.e.sis? What's up at Kno.e.sis?
What's up at Kno.e.sis? Amit Sheth
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangePhilip Bourne
 
The Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHThe Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHPhilip Bourne
 
A Successful Academic Medical Center Must be a Truly Digital Enterprise
A Successful Academic Medical Center Must be a Truly Digital EnterpriseA Successful Academic Medical Center Must be a Truly Digital Enterprise
A Successful Academic Medical Center Must be a Truly Digital EnterprisePhilip Bourne
 
Finding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsFinding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsManuel Corpas
 
Big Data and Data Science: Opportunities for Biomedical Engineering
Big Data and Data Science: Opportunities for Biomedical EngineeringBig Data and Data Science: Opportunities for Biomedical Engineering
Big Data and Data Science: Opportunities for Biomedical EngineeringPhilip Bourne
 
Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...African Open Science Platform
 
Why the food sector needs a research infrastructure on Food and Health Consum...
Why the food sector needs a research infrastructure on Food and Health Consum...Why the food sector needs a research infrastructure on Food and Health Consum...
Why the food sector needs a research infrastructure on Food and Health Consum...e-ROSA
 
Big Data in Biomedicine: Where is the NIH Headed
Big Data in Biomedicine: Where is the NIH HeadedBig Data in Biomedicine: Where is the NIH Headed
Big Data in Biomedicine: Where is the NIH HeadedPhilip Bourne
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Susanna-Assunta Sansone
 

Similar a What Can Happen when Genome Sciences Meets Data Sciences? (20)

Are Funders and Academic Institutions Approaches to Data Science Aligned
Are Funders and Academic Institutions Approaches to Data Science AlignedAre Funders and Academic Institutions Approaches to Data Science Aligned
Are Funders and Academic Institutions Approaches to Data Science Aligned
 
Bioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big DataBioinformatics in the Era of Open Science and Big Data
Bioinformatics in the Era of Open Science and Big Data
 
Data Science Meets Academia - What Comes Next?
Data Science Meets Academia - What Comes Next?Data Science Meets Academia - What Comes Next?
Data Science Meets Academia - What Comes Next?
 
Open Data in a Global Ecosystem
Open Data in a Global EcosystemOpen Data in a Global Ecosystem
Open Data in a Global Ecosystem
 
Diabetes Data Science
Diabetes Data ScienceDiabetes Data Science
Diabetes Data Science
 
Rda nitrd 2015 berman - final
Rda nitrd 2015 berman  - finalRda nitrd 2015 berman  - final
Rda nitrd 2015 berman - final
 
Data at the NIH
Data at the NIHData at the NIH
Data at the NIH
 
What's up at Kno.e.sis?
What's up at Kno.e.sis? What's up at Kno.e.sis?
What's up at Kno.e.sis?
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything Change
 
The Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHThe Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIH
 
A Successful Academic Medical Center Must be a Truly Digital Enterprise
A Successful Academic Medical Center Must be a Truly Digital EnterpriseA Successful Academic Medical Center Must be a Truly Digital Enterprise
A Successful Academic Medical Center Must be a Truly Digital Enterprise
 
Finding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsFinding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics Datasets
 
Data Analytics
Data AnalyticsData Analytics
Data Analytics
 
Big Data and Data Science: Opportunities for Biomedical Engineering
Big Data and Data Science: Opportunities for Biomedical EngineeringBig Data and Data Science: Opportunities for Biomedical Engineering
Big Data and Data Science: Opportunities for Biomedical Engineering
 
Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...
 
Why the food sector needs a research infrastructure on Food and Health Consum...
Why the food sector needs a research infrastructure on Food and Health Consum...Why the food sector needs a research infrastructure on Food and Health Consum...
Why the food sector needs a research infrastructure on Food and Health Consum...
 
Big Data in Biomedicine: Where is the NIH Headed
Big Data in Biomedicine: Where is the NIH HeadedBig Data in Biomedicine: Where is the NIH Headed
Big Data in Biomedicine: Where is the NIH Headed
 
2016 davis-biotech
2016 davis-biotech2016 davis-biotech
2016 davis-biotech
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014
 
ACRL STS Liaisons Forum - AIBS
ACRL STS Liaisons Forum - AIBSACRL STS Liaisons Forum - AIBS
ACRL STS Liaisons Forum - AIBS
 

Más de Philip Bourne

Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
AI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationAI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationPhilip Bourne
 
AI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingAI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingPhilip Bourne
 
Thoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityThoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityPhilip Bourne
 
What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?Philip Bourne
 
Data Science Meets Drug Discovery
Data Science Meets Drug DiscoveryData Science Meets Drug Discovery
Data Science Meets Drug DiscoveryPhilip Bourne
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AlonePhilip Bourne
 
BIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchBIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchPhilip Bourne
 
AI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data ScienceAI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data SciencePhilip Bourne
 
What Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewWhat Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewPhilip Bourne
 
Novo Nordisk 080522.pptx
Novo Nordisk 080522.pptxNovo Nordisk 080522.pptx
Novo Nordisk 080522.pptxPhilip Bourne
 
Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Philip Bourne
 
COVID and Precision Education
COVID and Precision EducationCOVID and Precision Education
COVID and Precision EducationPhilip Bourne
 
Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Philip Bourne
 
Data to Advance Sustainability
Data to Advance SustainabilityData to Advance Sustainability
Data to Advance SustainabilityPhilip Bourne
 
Frontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesFrontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesPhilip Bourne
 
Social Responsibility in Research
Social Responsibility in ResearchSocial Responsibility in Research
Social Responsibility in ResearchPhilip Bourne
 
SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?Philip Bourne
 
The Analytics and Data Science Landscape
The Analytics and Data Science LandscapeThe Analytics and Data Science Landscape
The Analytics and Data Science LandscapePhilip Bourne
 

Más de Philip Bourne (20)

Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed


What Can Happen when Genome Sciences Meets Data Sciences?

  • 1. What Can Happen when Genome Sciences Meets Data Sciences?
      Philip E. Bourne PhD, FACMI
      Stephenson Chair of Data Science; Director, Data Science Institute; Professor of Biomedical Engineering
      peb6a@virginia.edu
      https://www.slideshare.net/pebourne
      02/14/18 UVA Genome Sciences
  • 2. I am more interested in having a discussion than giving a lecture … This is not about my research specifically but what is happening more broadly.
  • 3. Agenda
      • Some context
          – My definition of data science
          – What drives my thinking
          – What is the NIH thinking?
      • Relevant examples
      • The DSI and what is happening at UVA
      • Together, where do we go from here?
  • 4. What Do I Mean by Big Data/Data Science?
      • Use of the ever-increasing amount of open, complex, diverse digital data
      • Finding ways to ask and then answer relevant questions by combining such diverse data sets
      • Arriving at statistically significant conclusions not otherwise obtainable
      • Sharing such findings in a useful way
      • Translating such findings into actions that improve the human condition
  • 5. What Drives my Thinking? Disruption
      [Figure: Example - Photography. Axes: Time; Volume, Velocity, Variety. From a presentation to the Advisory Board to the NIH Director.]
      • Stages: Digitization, Deception, Disruption, Demonetization, Dematerialization, Democratization
      • Digital camera invented by Kodak but shelved
      • Megapixels & quality improve slowly; Kodak slow to react
      • Film market collapses; Kodak goes bankrupt
      • Phones replace cameras
      • Instagram, Flickr become the value proposition
      • Digital media becomes bona fide form of communication
  • 6. A Few Random Data {Science} Facts
      • There are ~2.7 zettabytes (2.7 x 10^6 PB) of digital data currently
          – = US population tweeting 3x/min for 26,976 years
      • Big data currently estimated as a $50bn business
          – could save $3.1tn
      • 40% growth in data/yr; 5% growth in IT expenditure
      • US: 140,000-190,000 unfilled deep data analytics jobs
      • DSI has 600 applicants this year for 50 spots; MSDS/MBA highly sought
  • 7. A Few Random Data {Science} Facts (annotated)
      • There are ~2.7 zettabytes (2.7 x 10^6 PB) of digital data currently
          – = US population tweeting 3x/min for 26,976 years
      • Big data currently estimated as a $50bn business
          – could save $3.1tn
          – private sector research
      • 40% growth in data/yr; 5% growth in IT expenditure – undervalued
      • US: 140,000-190,000 unfilled deep data analytics jobs – competition for skilled researchers high
      • DSI has 600 applicants this year for 50 spots; MSDS/MBA highly sought – large human capital
  • 8. How Much Biomedical Data?
      • Big Data
          – Total data from NIH-funded research in 2016 estimated at 650 PB*
          – 20 PB of that is in NCBI/NLM (3%) and it is expected to grow by 10 PB in 2016
      • Dark Data
          – Only 12% of data described in published papers is in recognized archives
          – 88% is dark data^
      • Cost
          – 2007-2014: NIH spent ~$1.2Bn extramurally on maintaining data archives
      * In 2012 Library of Congress was 3 PB
      ^ http://www.ncbi.nlm.nih.gov/pubmed/26207759
  • 9. Consider Some Current High Profile NIH Examples Where Data Science is Being Applied
      • Moonshot – Bringing together 5 petabytes of homogenized data within the Genomic Data Commons (GDC) to explore genotype-phenotype relationships
      • MODs – Multiple high value, high cost genomic resources
      • Human Microbiome Project – Microbe characterization and analysis
      • TOPMed – Genomic, proteomic, metabolomic, image and EHR data
      • All-of-Us Precision Medicine – Building a platform to support data on >1M individuals with extensive and constantly updated health profiles
      • ECHO (Effects of Environmental Exposures on Child Health and Development) – Integration of child health and environmental data
      • BRAIN – Temporal and spatial analysis of neural circuits
  • 10. How is Data Science Being Applied?
      • Moonshot – new ways to analyze genotype-phenotype associations
      • MODs – new curation and integration tools
      • Human Microbiome Project – new cloud based tools
      • TOPMed – large scale storage and analysis; data harmonization
      • All-of-Us Precision Medicine – security; analysis of sensor data; EHR integration
      • ECHO – metadata descriptions of health and environmental data; application of geospatial methods
      • BRAIN – methods for network analysis, visualization
      • All: Analytics, the Commons, FAIR, sustainability, workforce
      Sources: Wilkinson et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016 Mar 15;3:160018; https://datascience.nih.gov/TheCommons
  • 11. Some underlying concerns at NIH…
      • Reproducibility…
      • Conformance to data sharing policies & governance more generally
  • 12. Why a More Open Process? Use case: Diffuse Intrinsic Pontine Gliomas (DIPG)
      • Occur in 1:100,000 individuals
      • Peak incidence 6-8 years of age
      • Median survival 9-12 months
      • Surgery is not an option
      • Chemotherapy ineffective and radiotherapy only transitory
      From Adam Resnick
  • 13. Timeline of genomic studies in DIPG
      • Landmark studies identify histone mutations as recurrent driver mutations in DIPG (~2012)
      • Almost 3 years later, in largely the same but partially expanded datasets, the same two groups and 2 others identify ACVR1 mutations as a secondary, co-occurring mutation
      From Adam Resnick
  • 14. What do we need to do differently to reveal ACVR1?
      • ACVR1 is a targetable kinase
      • Inhibition of ACVR1 inhibited tumor progression in vitro
      • ~300 DIPG patients a year
      • ~60 are predicted to have ACVR1 mutations
      • If only large scale data sets had been integrated with TCGA and/or rare disease data in 2012, ACVR1 mutations would have been identified
      • 60 patients/year x 3 years = 180 children (who likely succumbed to the disease during that time) whose lives could have been impacted if only data were FAIR
      From Adam Resnick
  • 15. Both funders and some institutions see the need to move from pipes to platforms to accelerate research…
      https://blog.lexicata.com/wp-content/uploads/2015/03/platform-model-750x410.png
  • 16. If platforms are the answer we could ask the question… Will biomedical research become more like Airbnb?
      Vivien Bonazzi. Should biomedical research be Like Airbnb? doi: 10.1371/journal.pbio.2001818
  • 17. I am not crazy, hear me out
      • Airbnb is a platform that supports a trusted relationship between consumer (renter) and supplier (host)
      • The platform focuses on maximizing the exchange of services between supplier and consumer and maximizing the amount of trust associated with a given stakeholder
      • It seems to be working:
          – 60 million users searching 2 million listings in 192 countries
          – Average of 500,000 stays per night
          – Valuation of US $25bn
      Should biomedical research be Like Airbnb? doi: 10.1371/journal.pbio.2001818
  • 18. Platforms will ultimately digitally integrate the scholarly workflow for human and machine analysis
      Should biomedical research be Like Airbnb? doi: 10.1371/journal.pbio.2001818
  • 19. Why a comparison to Airbnb is not fair
      • Airbnb was born digital
      • The exchange of services on Airbnb is simple compared to what is required of a platform to support biomedical research
      Nevertheless there is much to be learnt
  • 20. Impediments to a biomedical platform
      • Current work practices by all stakeholders
      • Entrenched business models
      • Size of the undertaking, aka resources needed
      • Trust
      • Incentives to use the platform
      http://www.forbes.com/sites/johnhall/2013/04/29/10-barriers-to-employee-innovation/#8bdbaa811133
  • 21. In summary, there is not currently a widely adopted single platform for the exchange of services in biomedical research. Either there is a platform per service or no platform at all…. Funders and the institutions they fund need to work more closely to implement platforms.
  • 22. Example: NSF and NIH Approaches
  • 23. How is the DSI responding to these various needs?
  • 24. Working across the grounds to break down traditional silos
  • 25. Hallmarks
      • Currently sustainable
      • Planning for where the academical village meets Google – an ecosystem in which students, faculty, staff, visitors, private sector reps, entrepreneurs live and work
      • Open UVA and open data
      • Not owning anything; only working through collaboration, e.g.
          – Dual degrees
          – Research projects across disciplines
      • MS DS focusing on practical training
      • Dual degrees
      • Soon PhD and undergraduate major
      • Wikimedian in residence (March, 2018)
  • 26. Emergent DSI Organization
      • Data Integration & Engineering
      • Machine Learning & Analytics
      • Visualization
      • Data Acquisition & Dissemination
      • Ethics, Law, Policy, Social Implications
  • 27. Emergent DSI Organization
      • Data Integration & Engineering
      • Machine Learning & Analytics
      • Visualization
      • Data Acquisition & Dissemination
      • Ethics, Law, Policy, Social Implications
      • Biomedical Data Sciences
  • 28. Data Acquisition & Dissemination: Pilot Open Data Lab Underway
      [Figure: a platform connecting suppliers and consumers]
      • Supplier / Consumer pairs: Paper Author / Paper Reader; Data Provider / Data Consumer; Employer / Employee; Reagent Provider / Reagent Consumer; Software Provider / Software Consumer; Grant Writer / Grant Reviewer; Educator / Student
      • Platform examples: MS Project, Google Drive, Coursera, Researchgate, Academia.edu, Open Science Framework, Synapse, F1000, Rio
  • 29. Data Integration and Engineering
      • Ontologies
      • Object identifiers
      • Indexing schemes
      • Common data models
  • 30. Machine Learning & Analytics
      • Neural nets
      • Deep learning
      • NLP
      • Gene expression & neurological disease (Kipnis)
      • Predicting opioid overdose (VA Health)
      • Predicting escalating care and mortality risk of cirrhosis patients (UVA HS)
      • Human microbiome & mental health in maternal health (Psychology & Nursing)
  • 31. Visualization
      • VR
      • Networks
      • Sonics
      • Visualizing microbial stability (Biology & Systems)
  • 32. Ethics, Law, Policy & Social Implications
      • Data sharing
      • Privacy
      • Normativity
      Wendy Novicoff, Ph.D
  • 33. Points of Interaction
      • Dual degrees with an MSDS
      • Specific projects for:
          – Presidential fellows (due March 19, 2018)
          – Capstones (due June 29, 2018)
      • Thoughts on biomedical data science cluster hires
      • Data Science Internship program with NIH, Inova, GMU, VT, GWU, UMD…
      • Join the DSI faculty
      • Join the mailing list
          – Lunch and learn
          – Distinguished lectures
          – Special events
  • 34. References
      • Dunn and Bourne. Building the Biomedical Data Science Workforce. PLoS Biol. 2017 Jul 17;15(7):e2003082.
      • Bonazzi and Bourne. Should Biomedical Research be like Airbnb? PLoS Biol. 2017 Apr 7;15(4):e2001818.
      • McKiernan et al. How Open Science Helps Researchers Succeed. Elife. 2016 Jul 7;5. pii: e16800.
      • Wilkinson et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016 Mar 15;3:160018.
      • https://datascience.nih.gov/TheCommons
  • 35. Acknowledgements
      • The BD2K Team at NIH
      • My New Colleagues at UVA
      • The 150 folks who have passed through my laboratory: https://docs.google.com/spreadsheets/d/1QZ48UaKcwDl_iFCvBmJsT03FK-bMchdfuIHe9Oxc-rw/edit#gid=0
      • Scott and Beth Stephenson
      • Anonymous donors for the DSI endowment
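
Several slides quote figures that follow from simple arithmetic: the zettabyte-to-petabyte conversion on slide 6, the NCBI/NLM share of NIH-funded data on slide 8, and the DIPG patient estimate on slide 14. The sketch below re-derives them as a sanity check. It is not part of the talk: the function names are hypothetical, the inputs are copied from the slides, and decimal (SI) prefixes are assumed.

```python
# Sanity check of figures quoted on slides 6, 8 and 14.
# Inputs are copied from the slides; function names are illustrative only.

# Assumes decimal (SI) prefixes: 1 ZB = 10^21 bytes, 1 PB = 10^15 bytes.
PB_PER_ZB = 10**6


def zettabytes_to_petabytes(zb: float) -> float:
    """Convert zettabytes to petabytes using decimal prefixes."""
    return zb * PB_PER_ZB


def share_percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def patients_over_period(per_year: int, years: int) -> int:
    """Return the number of patients accumulated over a period of years."""
    return per_year * years


if __name__ == "__main__":
    # Slide 6: ~2.7 ZB of digital data expressed in PB.
    print(f"2.7 ZB = {zettabytes_to_petabytes(2.7):.1e} PB")
    # Slide 8: 20 PB in NCBI/NLM out of an estimated 650 PB.
    print(f"NCBI/NLM share = {share_percent(20, 650):.1f}%")
    # Slide 14: ~60 of ~300 DIPG patients per year, over 3 years.
    print(f"ACVR1 share of DIPG patients = {share_percent(60, 300):.0f}%")
    print(f"Patients over 3 years = {patients_over_period(60, 3)}")
```

The 3% on slide 8 is the rounded form of 20/650, and the 180 on slide 14 is an upper-bound style estimate that takes the predicted ~60 patients per year at face value.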

Editor's notes

  1. $1.25bn per year to capture all data. After a significant effort at reduction, intramural data is still spread across more than 60 data centers; imagine the extramural situation.