The document summarizes data science education resources developed by researchers at Oregon Health & Science University. It describes the challenges of managing vast amounts of biomedical data and the goal of providing training to address this issue. The team developed skills courses and open educational resources (OERs) on topics across the data science life cycle. Courses included introductory, advanced, and targeted workshops. OER modules covered a range of data science topics and mapped to competencies for health sciences librarians. The team seeks to disseminate the resources broadly while addressing challenges around customization for different users and protection of intellectual property.
Global Lehigh Strategic Initiatives (without descriptions)
Data science education resources for everyone
1. Data science education resources for
everyone
Nicole Vasilevsky, Jackie Wirz, Bjorn Pederson, Ted Laderas, Shannon McWeeney,
William Hersh, David A. Dorr, Melissa Haendel
Oregon Health & Science University
MLA/PNC 2016
2. The problem
Major challenge: how to manage, analyze
and interpret vast amounts of data being
generated in biomedical research
One goal of NIH Big Data to Knowledge
(BD2K) initiative: provide training for
students and researchers to address this
Research team in the Library and
Department of Medical Informatics and
Clinical Epidemiology (DMICE) is
developing skills courses and open
educational resources (OERs)
http://whelf.ac.uk/activity-data-delivering-benefits-from-the-data-deluge/
3. Our Approach
Skills courses and OERs connect the dots that help researchers understand how to apply
data science techniques in the context of their whole research life cycle
Skills courses and OER topics are aimed to fill specific gaps
4. BD2K Skills Courses
Taught by BD2K Faculty,
Post-doc and Staff
In person format Targeted to a variety of
students
5. Defining The Problem Wrangling Data
Data Identification And Resources
Problems amenable to analytics
Importance of question
Team definitions
Scope
When we do this wrong: methods don't match
Finding the right data
Search methods
Use of metadata
Data management
Exploratory Data Analysis
Data Dictionary
As you touch data, what can go wrong?
Methods, Tools And Analysis Scientific Communication
Visualization
Matching algorithms to problems
…
Reporting Findings and Limitations
Giving “Elevator Speech” on ideas of how to
approach problem
Critique of related problem
6. Course Offerings
Course Length Who WhatWhen
Intro Course
Week long course
(~40 hrs)
July
2015
Interns and
undergraduates
Taught basics of data
science in the context
of the research life
cycle
Data After Dark
2 evening course
(4 hrs/nt)
January
2016
OHSU students,
staff and faculty
Emerging data science
activities/research
impact
Data and Donuts
2 morning course
(3 hrs/day)
June
2016
OHSU Summer
interns
Basics of data science
Advanced Course
4 evening course
(2 hrs/nt)
May
2016
OHSU students,
staff and faculty
Hands on Data viz /
Data wrangling
Data and Donuts
West
4 hour course
July
2016
OHSU summer
interns (West
Campus)
Basics of data science
7. Think like a data scientis
t
- the Data and Donuts workshop
will provide an introduction to data science for those new
to research. Summer interns encouraged to attend!
Topics covered will include
• What is Big Data?
• Asking the right question and getting
the right
answers from your data
• Finding data resources in the real world
• Data handling 101
• Ethics of data• Communicating your science for maximal
impact
June 2 8 & 2 9 | 9 - 1 2 PM | D onut s!
Fr ee Wor kshop!
DataAndDonuts
Interested?
Register at http
:
// bit.ly/ 1sfDeXz
or email wirzj@ohsu.eduw w w .ohsu.edu/ bd2 k
Hands-on! Learn by Doing!
Join us for a 4 evening workshop:
· Data Wrangling with Python and Pandas
· Interactive visualization with R/ Shiny
· Supervised Learning Algorithms + Kaggle Challenge
Familiarity with R and Git is required. Bring your laptop!
!
May 23-26th
5-7pm
Register at http:/ / bit.ly/ 1pFVvLv
Department of Medical Informatics + Clinical Epidemiology + OHSU Library
Funding: NIH 5R25EB020379
For more information, e-mail bd2k@ohsu.edu
FREE OHSU BD2K ADVANCED
DATA AFTER DARK WORKSHOP
8. Evaluation of Skills Courses
0% 20% 40% 60% 80% 100%
Evaluation Summary from Beginnner
Students
Beginner Percent 6 & 7 Beginner Percent 3, 4 & 5
Beginner Percent 1 & 2
0% 20% 40% 60% 80% 100%
Evaluation Summary from Advanced
Students
Advanced Percent 6 & 7 Advanced Percent 3, 4 & 5
Advanced Percent 1 & 2
The instructors clearly presented the
skills to be learned
The instructors presented
content in an organized manner
The instructors effectively presented
concepts and techniques
9. OER Modules
01 | Biomedical Big Data Science
02 | Introduction to Big Data in Biology and Medicine
03 | Ethical Issues in Use of Big Data
04 | Clinical Standards Related to Big Data
05 | Basic Research Data Standards
06 | Public Health and Big Data
07 | Team Science
08 | Secondary Use (Reuse) of Clinical Data
09 | Publication and Peer Review
10 | Information Retrieval
11 | Version Control and Identifiers
12 | Data annotation and curation
13 | Data Tools and Landscape
14 | Ontologies 101
15 | Data metadata and provenance
16 | Semantic data interoperability
17 | Choice of Algorithms and Algorithm Dynamics
18 | Visualization and Interpretation
19 | Replication, Validation and the spectrum of
Reproducibility Semantic data interoperability
20 | Regulatory Issues in Big Data for Genomics and Health
Semantic Web data
21 | Hosting data dissemination and data stewardship
workshops
22 | Hosting data dissemination and data stewardship
workshops
23 | Terminology of Biomedical, Clinical, and Translational
Research
24 | Computing Concepts for Big Data
25 | Data modeling
26 | Semantic Web data
27 | Context-based selection of data
28 | Translating the Question
29 | Implications of Provenance and Pre-processing
30 | Data tells a story
31 | Statistical Significance, P-hacking and Multiple-testing
32 | Displaying Confidence and Uncertainty
https://dmice.ohsu.edu/bd2k/topics.html
10. What is available in the modules?
Module Overview Online viewing Powerpoint files Audio files
Exercises References Resources
11. MLA- Professional Competencies For Health Sciences Librarians
https://dmice.ohsu.edu/bd2k/mapping_MLA.html
Competency #1
Understand the health sciences and
health care environment and the policies,
issues, and trends that impact that
environment
BDK02 - Introduction
To Big Data In Biology
And Medicine
BDK03 - Ethical Issues
In Use Of Big Data
BDK07- Team Science
Competency #3
Understand the principles and practices
related to providing information services
to meet users' needs
BDK10 - Information
Retrieval
BDK22 - Guidelines For
Reporting, Publications,
And Data Sharing
Competency #4
Have the ability to manage health
information resources in a broad range of
formats
BDK09 - Publication And Peer
Review
BDK12 - Data Annotation And
Curation
BDK14 - Ontologies 101
BDK15 - Data Metadata And
Provenance
Competency #5
Understand and use technology and
systems to manage all forms of
information
BDK10 - Information Retrieval
BDK12 - Data Annotation And
Curation
BDK13 - Data and tools
landscape
BDK14 - Ontologies 101
BDK26 - Introduction to
Semantic Web data
Competency #6
Understand curricular design and
instruction and have the ability to teach
ways to access, organize, and use
information
BDK21 - Hosting Data
Dissemination And Data
Stewardship Workshops
Competency #7
Understand scientific research methods
and have the ability to critically examine
and filter research literature from many
related disciplines
BDK07- Team Science
BDK18 - Visualization And
Interpretation
BDK19 - Replication,
Validation And The
Spectrum Of Reproducibility
BDK01 - Biomedical Big Data
Science
BDK04 - Clinical Data And Standards
Related To Big Data
BDK05 - Basic Research Data Standards
BDK04 - Clinical Data And Standards
Related To Big Data
BDK05 - Basic Research Data Standards
12. Challenges
Scope
Images
Style
Dissemination
How to scope generic curricula for different
levels of users
How to translate diverse teaching
styles into general materials
How to maximize dissemination
while protecting intellectual
property
How to incorporate images and other
copyrighted materials into open
resources
13. Who are these resources for? EVERYONE!
thenounproject.com
Undergraduate
Students
Graduate
Students
Clinicians Post-docs
Librarians Staff Faculty
14. Help review our modules:
https://dmice.ohsu.edu/bd2k/topics.html
15. Acknowledgements
Bill Hersh, PI Melissa Haendel, PI Shannon McWeeney, PI David Dorr, PI
Ted Laderas,
Instructor
Jackie Wirz,
Instructor
Nicole Vasilevsky,
Instructor
Bjorn Pederson,
Instructional Designer
This work is supported by NIH Grants 1R25EB020379-01 and 1R25GM114820-01.
16. You can find me at:
@n_vasilevsky
vasilevs@ohsu.edu
Thanks!
Notas del editor
I won’t read all the content on this slide, the point will just be that we mapped the MLA professional competencies to the BD2K modules. For 6 of the 7 MLA professional competencies, there are BD2K modules that could help train Librarians in these areas.