A large part of the NECDMC curriculum uses case studies to teach best practices in data management across many science disciplines. This presentation covers the case study methodology, explains how to develop a case study, and presents an actual research case study.
Developing a Research Case Study
1. Developing a Research
Case Study
Julie Goldman, MLIS
@jgolds2
Library Fellow
Lamar Soutter Library
UMass Medical School
New England Collaborative
Data Management Curriculum
Scientific Research Data Management by Lamar Soutter Library, University of Massachusetts Medical
School is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
#teachingNECDMC
2. Outline
• Case Study Methodology
• Developing Case Study
• NECDMC Research Case Study
• Teaching NECDMC
4. “The case method packs more experience into every
hour of learning than any other instructional approach.”
Harvard Business Publishing,
Hints for Case Teaching
http://www.expand2web.com/blog/marketing-case-study-how-weight-watchers-dominated-the-weight-loss-industry/
5. Case Study Method
• Present problems
• Focus on a topic or
“teachable moment”
• Prepare users
http://blog.tradeshift.com/tradeshift-case-study-broadway-design-company/
6. Creating a Case Study
• Background Research
• Data Interview
• Case Narrative
• Discussion Questions
• Data Management Plan
http://amitkaps.com/bring-the-right-brain-at-work/
13. Using Zebrafish as a Model
System for Studying Motor Axon
Guidance & Motoneuron Disease
14. Research Questions
• What is the biological basis of the
motoneuron disease SMA?
• How can modeling ALS in zebrafish be
useful as a tool for drug and genetic
screening?
• What genes define motor axon
outgrowth?
16. Data Interview
• Draw out information about project
• Questions focus on the data story
https://www.youtube.com/user/nealsciencebootcamp/feed
17. Tips and Reminders
• Do your homework
• Use follow-up questions
• Make the meeting about the
researcher not the library
• Establishes relationships
• Associates the library with data
https://yellowdoggraphics.wordpress.com/page/3/
19. Initial Interview
1. As a research-focused university,
what kind of NIH grant do you have?
2. What other funding sources do
you have?
3. How long has this research project
been going on?
4. What is the overarching purpose
of this research?
5. What is your role in the research
process?
6. What kinds of experiments are
you doing with the zebrafish?
7. Who else works on this project?
8. What types of data products are
being produced?
9. What file formats are your data
produced in?
10. How is data analyzed?
11. How is the data managed?
12. Does your lab have naming
conventions for files/data?
13. Where and for how long are data and
notebooks stored?
14. What kind of backup and security
protocols does your lab have?
15. What is shared publicly and with
the neuroscience community?
16. Who is allowed access to the data?
17. Are there security concerns within
the lab?
18. Who owns and is responsible for the
research data?
19. With a long research project, there is
personnel turnover within the lab. How
is data passed down among the research
team?
20. How is your lab ensuring long-term
preservation of your research and data?
20. Follow up Email
1. What lab instruments do you use?
2. You mentioned .TIFF files, what
other file formats do these
instruments create?
3. Do you have to change file formats
to make them accessible to everyone?
4. Any idea how many files are being
produced daily?
5. In terms of metadata, is there a
data dictionary to go along with it?
6. Who exactly is reusing the research
data? Other OSU labs? US labs?
International?
7. You told me about your lab notebook
and I am aware your lab is very low tech,
but has the university or the lab thought
about implementing electronic lab
notebooks?
8. How much interaction is there between
your PI and you and everyone else working
in the lab?
21. Second Interview
1. When you collect data about your fish,
how do you make sure that information is
linked to that fish?
2. Where do all of the image files you
are collecting end up? Only in your lab
notebook and on your personal
computer?
3. Can you explain the scoring system
you use to quantify the defect in the
motor neurons that are imaged on
the fluorescent microscope?
4. Do you know how often the programs you
use are updated for new versions?
5. Only zebrafish genetic lines are sent to
the international registry. What happens
with all the other data about the fish?
6. Does your lab submit data or publications
to Ohio State University’s institutional
repository?
7. Do you know if Ohio State has policies for
data sharing or data preservation?
24. Research
• Motoneuron diseases SMA and ALS
• Genetic and molecular cues
• Genetic models of zebrafish
• Research since 1996
https://science.nichd.nih.gov/confluence/display/zfig/Home
26. SMA
• Spinal muscular atrophy
• Caused by mutations in the
survival motoneuron gene (SMN)
• SMN protein is critical to the
health and survival of nerve cells in
the spinal cord responsible for
muscle contraction
• Occurs early in life
http://en.wikipedia.org/wiki/Spinal_muscular_atrophy
27. SMA
• Protein knockdown technology
• What function of SMN leads to motoneuron dysfunction
• Cell death in SMA caused by motor neuron defects
• Use scoring system on fluorescent microscope images
http://www.smasupportuk.org.uk/blog/research/sma-support-uk-at-the-cure-sma-conference-2013
28. ALS
• Amyotrophic lateral sclerosis
or Lou Gehrig’s disease
• Muscle weakness and atrophy
• Defect on chromosome 21
which codes for superoxide
dismutase (SOD1) enzyme
http://en.academic.ru/dic.nsf/enwiki/17085
29. ALS
• Genetic mutation :
SOD1 gene to
generate SOD G93A
and G85R transgenic
zebrafish
• Drug screens with
zebrafish larvae
• Rescue motor neurons
early in development
http://oncampus.osu.edu/article.php?id=1645
33. Zebrafish Facility
• Facility supports three labs
• 1200 sq ft
• 1234 tanks & 40,000 fish
• Tank labels : researcher’s name,
fish name, DOB, stock number
http://medicine.osu.edu/neuroscience/neuroscience-core-
services/core-b-genetics/ii-zebrafish-and-genome-manipulation-
facility/pages/index.aspx
34. General Lab Work
• PCR : polymerase chain reaction
• Agarose gel electrophoresis: separate DNA
• Western Blot : detect protein levels in tissue
• Microscopy : scoring system (axon morphology)
doi: 10.1083/jcb.200303168
35. Equipment and Products
• Bio-Rad (RT)-qPCR : Microsoft™ Excel™ files
• Thermo Scientific™ NanoDrop™ : Excel™ files
• Western Blots : film developed in a dark room
• Agarose gels : read on a gel box and
printed/scanned for densitometry quantification
• Microscopes : TIFF and JPEG files
• Data analysis : Excel™ or SPSS™
36. Programs
• SPSS® : statistics software
• ImageJ™ : public domain, Java™-based image
processing program developed by NIH
• Adobe® Photoshop® : photo editing
• Microsoft® Office : Word™, Excel™, PowerPoint™
37. Data Flow
• Data produced on old computers attached to equipment
• Transferred to the big (old) lab computer for processing
and data analysis
Example: fluorescent microscopy images
are saved on the computer attached to
the microscope which are then printed
out and sent to other computers
www.labx.com
https://u.osu.edu/beattie.24/
38. File Naming Conventions
• No standardization
• Personal
• Names become more professional when files
are sent to the PI and go to publication
http://dilbert.com/strip/2011-04-23
39. Lab Notebooks
• Paper lab notebooks for non-digital data
• Personal data keeping techniques
• Records detailed descriptions of experiments
• Notebooks stay in the lab
http://2012.igem.org/Team:LMU-Munich/Lab_Notebook
40. Backup and Security
• Use personal computers
• Responsible for keeping
external hard drives
• Security: passwords and
key access to lab
http://d7.library.gatech.edu/research-data/home
41. Sharing
• Sharing via Dropbox™ and Google Drive™
• Data from previous graduate students
passed down through the use of CDs
http://www.creativewomenscircle.com.au/social-media-using-dropbox/
42. Access
• Once published: public access to data
• Anyone can ask for reagents and animals
• Fish genetic lines are submitted to
international database for zebrafish
http://www.ru.nl/library/services/research/researchdata/finding/
43. • Nature
• Science
• PubMed
• Anyone can ask for reagents,
antibodies, enzymes, and/or fish that
were used in any published study
• OSU: get anything pre-publication
45. Preservation
• Archive: duration of the grant
• NIH: 3 years to have access to it
https://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
NIH Data Sharing Policy and Implementation Guidance
http://wiki.dpconline.org/
53. Data Management Plan
• Breakout Activity
• Use SDMP
• Create DMP
http://www.ru.nl/library/services/research/researchdata/dmp/
54. Creating a Case Study
• Background Research
• Data Interview
• Case Narrative
• Discussion Questions
• Data Management Plan
http://amitkaps.com/bring-the-right-brain-at-work/
55. Develop Your Own
• Use the case study methodology
• Understand the process and steps involved
• Follow this format: teaching points, narrative
and discussion questions
• NECDMC research case
http://www.dtpli.vic.gov.au/planning/urban-design-and-development/design-case-studies
56. Next Up…
• Research Case Study
• Identify Data Management Needs
• Create Data Management Plan
• Teaching with NECDMC
57. References
Ferguson (2012) Lurking in the Lab: Analysis of Data from Molecular Biology Laboratory Instruments:
http://escholarship.umassmed.edu/jeslib/vol1/iss3/5/
The Beattie Lab at OSU, Department of Neuroscience:
https://u.osu.edu/beattie.24/
Ohio State University Neuroscience Graduate Program:
http://ngsp.osu.edu/
Ohio State University Library:
http://library.osu.edu/staff/admin-plus/AdminPlusNotes_20110427.pdf
Johns Hopkins University Data Management Services:
http://dmp.data.jhu.edu/sites/default/files/Questionnaire.doc
National Institute of Environmental Health Sciences:
http://www.niehs.nih.gov/news/newsletter/2013/9/science-ntptalk/
58. Developing a Research
Case Study
Julie Goldman, MLIS
@jgolds2
Library Fellow
Lamar Soutter Library
UMass Medical School
New England Collaborative
Data Management Curriculum
Scientific Research Data Management by Lamar Soutter Library, University of Massachusetts Medical
School is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
#teachingNECDMC
Editor's notes
Current library fellow
Graduate of Simmons – took scientific research data management taught by UMMS
Contributed to the curriculum – biological lab case study
Joined NECDMC teaching team
Methodology – what it is and how it works for NECDMC
Developing – background, interview, narrative, discussion questions, DMP
Case Study – zebrafish developed case study which will be used for the breakout activity
Teaching – Donna will then expand on the case study method and how it relates to data management education and teaching using NECDMC case studies and materials
A large part of the NECDMC curriculum uses case studies to teach best practices in data management for many different science disciplines
Used for education in many areas – science, business, law
Present problems users must attempt to solve within acceptable practices
Teachable moment useful for educating others
highlights specific data management practices or needs of a specific discipline or type of research
builds an understanding of scientific research
Prepare users for similar situations in the future
First step is to identify the environment: background information on institution, researcher, research project (who, what, when, where, why)
Hold a data interview with a researcher
Write narrative of research project
NECDMC case studies include the narrative and questions, and highlight data management issues
Develop discussion questions
Craft a mock DMP
Typical research lifecycle through a project – data flow in the center
Librarians typically downstream – collecting books/journals/datasets – data discovery/archiving
Should be more involved upstream at the iteration/project start – project planning with data collection/organization/description
Case study example in NECDMC
Research project using model organisms in a neuroscience research lab
My researcher (at the time) was a graduate student in the neuroscience program at OSU
Specifically interested in neurological disorders and gene therapy
Research lab at the Ohio State University
The Beattie Lab at OSU
Use zebrafish as model organism for research
Using zebrafish as a model organism for studying motor axons in motoneuron diseases
Research Questions:
Biological basis of SMA
Modeling ALS in zebrafish for drug and genetic screening
Identifying genes for motor axon growth
Go through the steps in setting up, conducting and analyzing a data interview
Use this information so the librarian can address data lifecycle issues and begin to implement a data management plan
Acquire as much information as possible from the researcher
Questions on data story, purpose, and life span
Great resource from New England Science Boot Camp – videos featuring UMMS librarians discussing “how to talk to researchers”
Homework on researcher and their research – impress them with your knowledge
Follow up during and after the interview – ask more questions to fully understand the project workflow
Make the meeting about the researcher – not about the library and what the library can do (yet)
Establishes relationships with institutional researchers
Researchers will see that librarians are interested in the research process and understanding research needs/issues/challenges
Associates the library with data – once the librarian understands the research project – make recommendations on how the library can help with the data needs…
Create an interview instrument – lots of options for templates
Digital Curation Centre’s Checklist for a Data Management Plan
Purdue Data Information Literacy Interview Instruments
University of Virginia Data Interview Initiative
Johns Hopkins
In addition to using a template – create your own questions related to the specific researcher/research
One question at a time
Avoid yes/no questions – open ended (#12 & 17 bad)
Limit questions to CORE aspects (I might have had too many questions…)
Set up interview – I asked for 30 minutes of her time to talk about her research
Skype interview and recorded
Use follow-up questions – #10 (You have a lot of digital files being created, what file formats are those generated in? → Is there non-digital data?)
Offer check-ins/copy of interview transcript
More follow up – multiple meetings
I sent very specific questions via email
More follow-up
Make sure you fully understand the research – shows your dedication to the project but do not become overbearing/disruptive
Questions really tailored to the research projects
This is what I learned: could have done more investigation beforehand!
Who what when where why
Construct narrative from the data interview transcript
Telling the story of the research project and the research data lifecycle
Include as many details as possible and point out missing pieces/challenges researcher expresses
Now share my research case…
Neuroscience research lab
Investigating the biological basis of motoneuron diseases SMA and ALS
Genetic and molecular cues that guide motor axons to their target muscle
Using zebrafish as a genetic model for these diseases (click)
Research project since 1996 – multiple grants
NIH R01 – award made to support a discrete, specified, circumscribed project
Government is strict about data keeping and can ask to see data and notebooks any time
NIH has the legal right to audit and examine records relevant to any research grant award
Private funding: SMA & ALS families, foundations/organizations and private companies
Overview of SMA helps to understand the research experiments and data collected
Low levels of “survival of motor neuron” (SMN) protein lead to muscle atrophy and weakness
Occurs early in life - is the leading genetic cause of death in infants and toddlers
How zebrafish serve as genetic model of SMA
Protein knockdown technology in zebrafish development
Cell death in SMA caused by motor neuron defects during early development
Use scoring system on fluorescent microscope images to determine conditions and development of motor neurons (severe, moderate, mild, no defect)
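A minimal sketch of how scores in the four categories named above could be tallied for one group of imaged fish. Only the category names come from the notes; the function name and the sample scores are hypothetical and do not describe the lab's actual workflow.

```python
from collections import Counter

# Categories taken from the notes above, ordered from worst to best.
CATEGORIES = ("severe", "moderate", "mild", "no defect")


def tally_scores(scores):
    """Count how many scored fish fall into each category.

    Unknown labels raise ValueError so that typos do not silently
    disappear from the totals.
    """
    counts = Counter()
    for score in scores:
        label = score.strip().lower()
        if label not in CATEGORIES:
            raise ValueError(f"unknown score: {score!r}")
        counts[label] += 1
    # Report every category, including those with zero fish.
    return {category: counts[category] for category in CATEGORIES}


# Hypothetical scores for one group of imaged fish.
example = tally_scores(["severe", "mild", "Mild", "no defect"])
```

Keeping the category list in one place means a spreadsheet export or a statistics package receives the same labels every time.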
Overview of ALS to understand experiments and data collection
Muscle weakness and atrophy throughout the body due to degeneration of the upper and lower motor neurons
Superoxide dismutase 1, soluble (SOD1) is a gene on chromosome 21 that codes for an enzyme that protects the body from free radicals
Free radical accumulation can damage DNA and proteins produced within cells
Looking at 2 specific gene mutations – correcting the effects of the mutant SOD1 gene
Use zebrafish larvae to understand defects in the early stages of neuron development
Example of zebrafish microscopy images:
Optineurin (OPTN) is a 577 amino acid protein of versatile functions which interacts with a variety of proteins
Mutations in OPTN gene have been associated with ALS
OPTN interacts with aggregating proteins (SOD1 and G93A) involved in ALS
OPTN depletion in zebrafish causes motor axonopathy and mutant SOD1 increases motor axonopathy
Image shows:
Zebrafish injected with OPTN-specific translation blocking (OPTN ATG-AMO) morpholino showed a phenotype (curved tail indicated by the arrow)
This research project is just focusing on SOD1
The overexpressing SOD1 G93A at 48 hours after fertilization in comparison with non-injected zebrafish or zebrafish injected with control morpholino (control AMO)
Looking at the interaction of these proteins in motor axon development – causing axonopathy or axon degradation – curved spinal cord and loss of mobility
From the previous image you can see that zebrafish larvae are easy to use as an experimental model organism – easy to grow and see through
Used for many kinds of research
Why zebrafish?
Model established
Genome fully sequenced
Well-understood, easily observable
Testable developmental behaviors
Rapid embryonic development
Large, robust, transparent embryos
Develop outside mother
Similar to mammalian models and humans
Important factor in this research project is the zebrafish facility
Facility supports three research labs – 10-12 people using fish for different projects
Each person has their own fish – labeled accordingly
Common stock for breeding and controls
Facility manager oversees breeding, the facility and IACUC compliance (Institutional Animal Care and Use Committee)
Google Drive database for logging information and updates on fish/experiments – used to have a paper log book in the facility
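One way to keep data linked to a specific fish is to treat the tank label as a record and attach file names to it. The sketch below uses the four label fields from the slide (researcher's name, fish name, DOB, stock number); the class name, the `attach` method, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class TankLabel:
    """The four fields written on a tank label, plus linked data files."""

    researcher: str
    fish_name: str
    date_of_birth: date
    stock_number: str
    data_files: list = field(default_factory=list)

    def attach(self, filename):
        """Record that a data file belongs to this stock of fish."""
        if filename not in self.data_files:
            self.data_files.append(filename)


# Hypothetical label and image file, for illustration only.
label = TankLabel("J. Researcher", "line-A", date(2015, 1, 12), "ZF-0042")
label.attach("20150309_jr_smn-knockdown_zf-0042_007.tiff")
label.attach("20150309_jr_smn-knockdown_zf-0042_007.tiff")  # ignored duplicate
```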
Researcher described “general lab work”:
PCR: polymerase chain reaction amplifies copies of a particular DNA sequence
Running agarose gel electrophoresis - DNA manipulation/separation
Western blots to detect protein levels
Microscope imaging
Bio-Rad (RT)-qPCR machine – amplifies a single or a few copies of a piece of DNA across several orders of magnitude, produce Microsoft™ Excel™ files
Thermo Scientific™ NanoDrop™ spectrophotometer – measures light transmittance or reflection intensity as a function of the light source wavelength
Western blot images developed are scanned
Agarose gels are read on a gel box and images are scanned and printed for densitometry quantification, the measure of light absorption through the medium
Microscopes produce Tagged Image File Format (TIFF) and JPEG (.jpeg) files
The team uses Excel™ and SPSS™ for data analysis
SPSS™ statistical software for data analysis
Edits image files in ImageJ™ - Java™-based image processing program developed by the NIH - or Adobe™ Photoshop™
Uses Portable Document Format (PDF) (.pdf) and PowerPoint™ (.pptx) files for figures and publications.
Typical data flow involves analysis of direct microscope images of manipulated fish samples, or gel images of DNA/protein analysis
These image files are produced on computers attached to the lab equipment
Files are analyzed on a computer and/or sent to the experimenter’s personal computer
The researcher always prints out a hard copy of any images and pastes them in her paper lab notebook
Lab does not have any standardized way to document its data
No naming conventions for saving and locating documents and images
No data dictionary
Files usually just involve the person’s name and a title or description meaningful to that person
These files are re-named for the PI and/or publication
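A minimal sketch of what a shared naming convention could look like, so that files do not need to be renamed for the PI or for publication. The pattern (date, initials, experiment, stock number, sequence number) is an assumption chosen for illustration, not a convention the lab uses; the function name and sample values are hypothetical.

```python
import re
from datetime import date


def build_filename(day, initials, experiment, stock_number, sequence, extension):
    """Build a file name such as 20150309_jg_smn-knockdown_zf-0042_007.tiff."""

    def clean(text):
        # Lowercase, then collapse anything that is not a letter or digit
        # into a single hyphen so names stay portable across systems.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    parts = [
        day.strftime("%Y%m%d"),   # ISO-style date sorts chronologically
        clean(initials),
        clean(experiment),
        clean(stock_number),
        f"{sequence:03d}",        # zero-padded so 007 sorts before 010
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


# Hypothetical values, for illustration only.
name = build_filename(date(2015, 3, 9), "JG", "SMN knockdown", "ZF-0042", 7, ".TIFF")
```

Putting the date first and padding the sequence number keeps a folder listing in the order the images were taken.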
Pastes hard copy images (gels, microscope) in notebook
Graduate student feels her notebook is comprehensible and easy to follow/better than post doc
Team uses their personal computers for lab work
She feels each person is responsible for his or her own data management
She backs up the lab files on her computer to an external hard drive
Security: passwords and key access to building and lab
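A backup to an external hard drive is only useful if the copies are complete and intact. The sketch below compares a working folder with its backup using SHA-256 checksums from Python's standard library; the function names and folder layout are hypothetical, and this is one possible check rather than the lab's procedure.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_backup_problems(source_dir, backup_dir):
    """List files that are missing from the backup or differ from the source."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            problems.append((str(relative), "missing"))
        elif sha256_of(source_file) != sha256_of(backup_file):
            problems.append((str(relative), "changed"))
    return problems
```

An empty list means every file in the working folder has an identical copy in the backup; files that exist only in the backup are not reported.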
When the graduate student began working in the lab, she was given CDs with previous data
CDs containing images and data analysis
Now the lab team shares its data with each other using Google Drive and Dropbox on university server
Ultimately PI: responsible for data
Graduate student feels she can get any data she needs pre-publication
The graduate student feels that once the research is published, then anyone who wants it has access to the relevant data in the article
Anyone can ask for reagents and animals used in published study
Places of publication – science/nature/pubmed
Anyone can ask for reagents, antibodies, enzymes, and/or fish that were used in any published study
Share and use data via repositories that house data on genetically modified zebrafish
ZFIN – NIH-funded zebrafish model organism database
Zebrafish Gene Collection (ZGC) – NIH initiative that supports the production of cDNA libraries, clones and sequences of expressed genes for zebrafish – publicly accessible to the biomedical research community – all ZGC sequences are deposited in GenBank
ZF-HEALTH – is a Large-scale Integrating Project funded by the European Commission
Publications are considered the primary “electronic” form of data conservation in her lab
At the time – NIH’s Data Sharing Policy – no formal data management plan beyond sharing
Expected to archive for the duration of the project/grant as the NIH could ask for data/lab notebooks/etc.
Case study on site
Teaching points
Case narrative
Discussion Questions
Teaching points – integrate the data story into the simplified data management plan & highlight where the researcher is doing something well, not doing something, or doing something that is not of best practice
Narrative – overview of research project (background, researcher) & data flow (collection, storage, sharing, preservation)
Discussion questions – highlight the data management topics and needs within a research case, help when teaching with the case study, prompt people to identify the data flow within a project, understand the components of an SDMP and the areas where librarians can help researchers
Case study example in NECDMC
Breakout activity after lunch
Create a data management plan using the case study (case narrative, teaching points, discussion questions) and the SDMP that Donna will expand on in her presentation
First step is to identify the environment: background information on institution, researcher, research project (who, what, when, where, why)
Hold a data interview with a researcher
Write narrative
Develop discussion questions
Craft a mock DMP
NECDMC case studies include the narrative and questions, and highlight data management issues
Now that you have the methodology for why case studies are useful, and the steps to create one – create your own
As you can see this case is not long
Go through the process – identify a researcher at your institution or other place, understand the institution or environment, conduct an interview, create a narrative, create discussion questions to highlight the data management needs and challenges
We can add your research case to our list of examples on the NECDMC site – we are always looking for new areas of science and disciplines
After the research story and understanding the data flow throughout the project…
Librarians can identify the data management flaws/challenges and needs
Make recommendations for fixing/avoiding problems
Use the Simplified Data Management Plan to develop a formalized plan for the research team to follow
Donna will work through another research case study and talk about how to identify the data management needs, then how to develop a DMP, and how to teach with NECDMC…