Data Science Meets Structural Biology
Philip E. Bourne, Cam Mura & Eli Draizen
(Open Team Science)
https://www.slideshare.net/pebourne
08/31/18 DSI Lunch & Learn 1
https://arxiv.org/abs/1807.09247
We are more interested in having a
discussion than giving a lecture …
Let's start with a couple of definitions…
What Do We Mean by Data Science?
• Use of the ever-increasing amount of open, complex, diverse digital data
• Finding ways to ask and then answer relevant questions by combining such diverse data sets
• Arriving at statistically significant conclusions not otherwise obtainable
• Sharing such findings in a useful way
• Translating such findings into actions that improve the human condition
What Do We Mean by Structural Biology?
Structure… What’s it good for??
Classic structural biology example
A point mutation (E6→V) in the Hb β globin chain results in sickle
cell anemia
Structural biology success stories
microtubule
Atomic-resolution studies of cellular-scale systems have become increasingly possible, with immense explanatory power!
Figure panel dates: 1960-70s; 1986; early 1990s; mid-1990s; ~2002
Why Do We Care About this Intersection?
Stepping back…
Data are transforming how we think about
everything, including biomedical research…
Most folks just do not realize it yet…
Your reading of this slide relies on structural
biology (a photoreceptor called rhodopsin!)
Example: Photography
Stages: Digitization, Deception, Disruption, Demonetization, Dematerialization, Democratization
Axes: Time; Volume, Velocity, Variety
• Digital camera invented by Kodak but shelved
• Megapixels & quality improve slowly; Kodak slow to react
• Film market collapses; Kodak goes bankrupt
• Phones replace cameras
• Instagram, Flickr become the value proposition
• Digital media becomes bona fide form of communication
From a presentation to the Advisory Board to the NIH Director
How is the DSI Responding to this Change?
• Societal good
• Interdisciplinary
• Practical experience
• Ethical conduct
• Openness and transparency
Surge in publications involving machine
learning in the biosciences ('J-curve')
Example of Why More Openness:
Diffuse Intrinsic Pontine Gliomas (DIPG)
• Occur in about 1 in 100,000 individuals
• Peak incidence at 6-8 years of age
• Median survival of 9-12 months
• Surgery is not an option
• Chemotherapy is ineffective, and radiotherapy offers only transitory benefit
From Adam Resnick
Timeline of genomic studies in DIPG
• Landmark studies identify histone mutations as recurrent driver mutations in DIPG (~2012)
• Almost 3 years later, in largely the same (but partially expanded) datasets, the same two groups and two others identify ACVR1 mutations as a secondary, co-occurring mutation
From Adam Resnick
What do we need to do differently to
reveal ACVR1?
• ACVR1 is a targetable kinase
• Inhibition of ACVR1 inhibited tumor progression in vitro
• ~300 DIPG patients a year
• ~60 are predicted to have ACVR1 mutations
• If only large-scale data sets had been integrated with TCGA and/or rare disease data in 2012, ACVR1 mutations would have been identified then
• 60 patients/year × 3 years = 180 children’s lives (children who likely succumbed to the disease during that time) that could have been impacted if only the data had been FAIR (findable, accessible, interoperable, reusable)
From Adam Resnick
Working across the Grounds
to break down traditional silos
Hallmarks Reflecting Those Principles
• Sustainable
• Designing for where the academical village meets Google: an ecosystem in which students, faculty, staff, visitors, private sector reps, and entrepreneurs live and work
• Open UVA and open data – Wikimedian in Residence
• Collaboration
– Dual degrees
– Research projects across disciplines
– Sister institutions
• MS DS focusing on practical training
• PhD program
• Undergraduate major
• Undergraduate certificate
Under development
DSI Organization
Structural Biology is one of Many Cross-Cutting Initiatives
• Data Acquisition
• Data Integration & Engineering
• Machine Learning & Analytics
• Visualization & Dissemination
• Ethics, Law, Policy, Social Implications
Structural Biology
DSI Organization
Structural Biology is one of Many Cross-Cutting Initiatives
Structural Biology mapped onto the five pillars of Data Science
Let's Briefly Focus on those Five Points of Intersection in the Context of Structural Biology …
Data Acquisition
The data production issue (the V’s of Big Data): Experimentally
• An estimated (2017) ≈2.5 quintillion (2.5×10¹⁸) bytes of data are generated daily, with 90% of all the world’s data having been created in the past two years.
• Plaintext PDB files are typically a few hundred KB (…but that’s just the start!)
Data Acquisition
The data production issue (the V’s of Big Data): Computationally
• Here are some 2D RMSD matrices from a µs-scale biomolecular simulation.
• On the order of a mole (6.02×10²³) of calculations!
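A 2D RMSD matrix of the kind mentioned above can be sketched in a few lines. This is a minimal illustration, assuming a trajectory held as a NumPy array of shape (frames, atoms, 3) whose frames are already superimposed on a common reference. The random-walk trajectory and the function name `pairwise_rmsd` are hypothetical stand-ins, not the simulation shown on the slide.

```python
import numpy as np

def pairwise_rmsd(traj):
    """Return the (n_frames, n_frames) matrix of RMSDs between all frame pairs.

    Assumes traj has shape (n_frames, n_atoms, 3) and that frames are already
    aligned; no superposition is performed here.
    """
    # Broadcast to every pair of frames: shape (n_frames, n_frames, n_atoms, 3)
    diff = traj[:, None, :, :] - traj[None, :, :, :]
    # Squared distance per atom, averaged over atoms, then square root
    return np.sqrt((diff ** 2).sum(axis=-1).mean(axis=-1))

# Hypothetical toy trajectory: 100 frames of 50 atoms doing a random walk
rng = np.random.default_rng(0)
steps = rng.normal(scale=0.05, size=(100, 50, 3))
traj = np.cumsum(steps, axis=0)

rmsd = pairwise_rmsd(traj)
print(rmsd.shape)
```

The number of frame pairs grows quadratically with trajectory length, which is one reason such analyses become expensive for long simulations.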
Data Acquisition
The data reduction issue (the V’s of Big Data): Computationally
• The produce/spawn/consume idiom (MapReduce)
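The map/reduce pattern that the slide associates with this idiom can be illustrated with the Python standard library alone. This is a minimal sketch, assuming the data are plain numbers split into chunks. The chunking scheme and the helper names `map_chunk` and `combine` are hypothetical, and a real workflow would distribute the map step across workers.

```python
from functools import reduce

def map_chunk(chunk):
    """Map step: summarize one chunk of values as a partial (sum, count)."""
    return (sum(chunk), len(chunk))

def combine(a, b):
    """Reduce step: merge two partial (sum, count) results."""
    return (a[0] + b[0], a[1] + b[1])

# Hypothetical stand-in for per-frame measurements from a simulation
values = list(range(1, 101))
chunks = [values[i:i + 10] for i in range(0, len(values), 10)]

partials = map(map_chunk, chunks)  # each chunk is independent of the others
total, count = reduce(combine, partials, (0, 0))
mean_value = total / count
print(mean_value)
```

Because each chunk is summarized independently and the merge step is associative, the partial results can be produced in any order and combined at the end.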
Data Integration and
Engineering
• Data are structured
– Ontologies
– Object identifiers
– Indexing schemes
– Common data models
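The four ingredients listed above can be made concrete with a toy example. This is a minimal sketch under stated assumptions: the record fields, the identifiers `EX01` to `EX03`, and the free-text terms are all hypothetical placeholders, not entries from any real database or ontology.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class StructureRecord:
    """A tiny common data model shared by every record."""
    identifier: str              # stable object identifier (made up here)
    title: str
    ontology_terms: tuple = ()   # controlled-vocabulary terms (made up here)

def build_index(records):
    """Indexing scheme: map each term to the identifiers annotated with it."""
    index = defaultdict(set)
    for rec in records:
        for term in rec.ontology_terms:
            index[term].add(rec.identifier)
    return index

records = [
    StructureRecord("EX01", "example oxygen carrier", ("oxygen transport",)),
    StructureRecord("EX02", "example photoreceptor",
                    ("signal transduction", "membrane protein")),
    StructureRecord("EX03", "example channel", ("membrane protein",)),
]

index = build_index(records)
print(sorted(index["membrane protein"]))
```

Once records share one model and stable identifiers, data from different sources can be joined on the identifier and queried by term.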
Machine Learning & Analytics
• Structure → Function
• Sequence → Structure
• Protein-protein interactions
• Protein-ligand interactions
• Binding sites
Machine Learning & Analytics
• Neural nets
• Deep learning
Machine Learning & Analytics
• Deep Learning for Object Recognition/Segmentation
Features in Image Slice → Predicted Classes
Badrinarayanan, et al. 2016. arXiv:1511.00561v3
Machine Learning & Analytics
• Deep Learning for Object Recognition/Segmentation
Features in Volume Slice → Predicted Classes
Badrinarayanan, et al. 2016. arXiv:1511.00561v3
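The last step of a segmentation network, turning per-class scores into a predicted class for every pixel or voxel, can be sketched with NumPy. This is a minimal illustration in which random scores stand in for a network's output. It does not implement the cited architecture, and the array shapes and the name `predict_classes` are assumptions.

```python
import numpy as np

def softmax(scores, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def predict_classes(scores):
    """scores has shape (n_classes, *spatial); returns (labels, confidence)."""
    probs = softmax(scores, axis=0)
    return probs.argmax(axis=0), probs.max(axis=0)

rng = np.random.default_rng(0)
image_scores = rng.normal(size=(4, 8, 8))      # 4 classes over a 2D image slice
volume_scores = rng.normal(size=(4, 8, 8, 8))  # same idea over a 3D volume

labels_2d, conf_2d = predict_classes(image_scores)
labels_3d, conf_3d = predict_classes(volume_scores)
print(labels_2d.shape, labels_3d.shape)
```

The same function covers both the image and the volume case because the class axis comes first and the spatial axes are left untouched.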
Visualization
• VR
• Networks
• Sonics
Starting point: Structure of a bacterial protein involved in RNA-associated regulatory circuits (e.g., virulence)
Visualization
Visualization
• VR
• Networks
• Sonics
What about dynamics (life not at T=0)?
Visualization
What about physics (of RNA-binding)?
Visualization
What about statistics (log-odds here)?
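One common meaning of a log-odds score is the logarithm of an observed frequency divided by a background frequency. The sketch below uses that definition with made-up counts. The slide does not say how its log-odds values were computed, so the categories, the counts, the pseudocount, and the function name `log_odds` are all assumptions.

```python
import math

def log_odds(observed, background, pseudocount=1.0):
    """Return log2(observed frequency / background frequency) per category."""
    keys = sorted(set(observed) | set(background))
    obs_total = sum(observed.get(k, 0) + pseudocount for k in keys)
    bg_total = sum(background.get(k, 0) + pseudocount for k in keys)
    scores = {}
    for k in keys:
        p_obs = (observed.get(k, 0) + pseudocount) / obs_total
        p_bg = (background.get(k, 0) + pseudocount) / bg_total
        scores[k] = math.log2(p_obs / p_bg)
    return scores

# Hypothetical residue counts at a binding site vs. the whole protein surface
site_counts = {"positive": 30, "negative": 5, "neutral": 15}
surface_counts = {"positive": 200, "negative": 200, "neutral": 400}

scores = log_odds(site_counts, surface_counts)
for category, score in scores.items():
    print(category, round(score, 2))
```

A positive score marks a category that is enriched relative to background and a negative score marks one that is depleted, which makes such values natural to map onto a structure as colors.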
Visualization
What about cellular-scale systems?
Kozlikova et al., 2016; Comp Graph Forum
Ethics, Law, Policy & Social Implications
• A Story of Fraud
Thank You
peb6a@virginia.edu
