Presentation given at ACRL 2015, with Christine Murray, on teaching undergraduate students to discover and evaluate datasets for secondary data analysis.
Promoting Data Literacy at the Grassroots (ACRL 2015, Portland, OR)
1. Promoting Data Literacy
at the Grassroots
Teaching & Learning with Data
in the Undergraduate Classroom
Adam Beauchamp
Christine Murray
2. Instruction Between
Data Reference & Data Management
• Data Reference
– Locate statistics
– Access data sets
– Citation help
• Presumes an
understanding of what
to do with data once
discovered
• Data Management
– Project Planning
– Metadata
– Storage & Sharing
• Presumes an
understanding of the
research process and
data analysis
3. Novices in the Data Life Cycle
Image: Research Data Services, University of Virginia Library
4. Pedagogical Models
•Statistical literacy
– Start with Exploratory Data Analysis (EDA)
• Cobb & Moore, 1997
•History teaching
– Heuristics of reading primary sources
• Wineburg, 2001
5. Three Lesson Plans
• Discovering Data Through the Literature
– Modeling the experts
• Evaluating Data Sets
– Exploring data and formulating questions
• Research Design
– Operationalizing a research question
6. LESSON 1: DISCOVERING DATA
THROUGH THE LITERATURE
7. Sociological Research
Methods
Students will be able to…
•Find data from a citation
•Understand the value of
secondary data analysis
•See the inextricable
connection between data
and the scholarly literature
•Use documentation to
evaluate a dataset
Goal
Students require an introduction to
sources of quantitative data. The
course is focused on attitudes toward
immigration in the U.S.
Library session
•Mid-semester
•45 minutes
8. Hoelter, L. F., Leclere, F. B., Pienta, A. M., Barlow, R. E., & McNally, J. W.
(2008). Using ICPSR Resources to Teach Sociology. Teaching Sociology,
36(1), 17-25. doi: 10.1177/0092055X0803600103
10. Reading the Study Description
• What is available, and how?
• What is the universe, data collection period,
and study methodology?
• What variables are available?
11. Activity
Search for one of the articles below by title or author in the
ICPSR Bibliography of Data-Related Literature. Locate the
dataset related to this article, and from the study page identify:
•the year the data were collected
•the universe of the study
•a variable related to attitudes toward immigration
Alba, R., Rumbaut, R.G., & Marotz, K. (2005). A distorted nation: Perceptions
of racial/ethnic group sizes and attitudes toward immigrants and other
minorities. Social Forces, 84(2), 901-919. doi: 10.1353/sof.2006.0002
12. Reflections
• Relies heavily on ICPSR, to the exclusion of
other data sources
• Citations are provided, rather than arising out
of the students’ own research
• May emphasize relying on some highly used
datasets, rather than seeking out novel
sources
14. Health and Populations
Students will be able to…
•Understand the complex
web of data producers
•Use documentation to
evaluate a dataset
•Articulate how variables
relate to research questions
Goal
Students in an introductory
demography course must write a short
paper based on their original data
analysis, using a dataset of their
choosing that relates to population
dynamics, health disparities, etc.
Library session
•Mid-semester
•90 minutes
15. Who cares?
• Which organizations/entities care enough
about this topic to collect data about it?
• Who has the authority to collect data of
this type?
16. What can the data show?
• What will you use the data to investigate, and
what variables will you need to do so?
– a trend: a temporal variable
– a disparity: different populations
– a comparison: a geographic variable
– a spatial pattern: smaller regions within a larger
geographic area
• How could this data be collected?
17. Activity
• Small groups are assigned a pre-selected data
source
• By examining the documentation and the
data itself, students report back on the
characteristics of the data, and suggest
possible questions the data could be used to
answer
19. Reflections
• For datasets not discussed, students must
make the leap themselves between topics
and possible data producers
• The investigation starts with the data, rather
than the research question
21. SOCI 3040:
Research Analysis
Students will be able to…
•Identify potential collectors
and disseminators of data.
•Describe the accessibility
issues associated with data
sources
•Operationalize a research
question in order to develop
a data search strategy
Group Project
Present a research proposal that uses
pre-existing quantitative data to test a
hypothesis relevant to sociology.
Library session
•Early in semester
•75 minutes
•All sociology majors have library
session focused on the literature
review in previous course
(prerequisite)
22. Part 1: Collectors of Data
Who collects data?
• As a class, students suggest
potential creators of data.
Is it accessible to us?
• For each “collector,”
students consider reasons to
provide or to block access.
Photo: U.S. Census Bureau
23. Part 2: Operationalization
Group Exercise
•What variables can you
use to measure
gentrification in New
Orleans?
•Where might you find this
data?
Photo: Infrogmation, New Orleans
24. Part 3: Data Search
• Introduce LibGuide of Data
Sources
• Allow time for student
groups to work on their
projects
Photo: TrekCore
25. Reflections
Thoughts
•Students will benefit
from more practice with
operationalization
•Instructor collaboration
critical to effective
integration &
reinforcement
Assessments
•Formative assessments
•One-minute paper
•Professor feedback
26. Bibliography
Cobb, George W. and David S. Moore. 1997. “Mathematics, Statistics, and Teaching.”
The American Mathematical Monthly 104(9):801–23.
Hoelter, L. F., Leclere, F. B., Pienta, A. M., Barlow, R. E., & McNally, J. W. (2008). Using
ICPSR Resources to Teach Sociology. Teaching Sociology, 36(1), 17-25. doi:
10.1177/0092055X0803600103
Wineburg, Samuel S. 2001. Historical Thinking and Other Unnatural Acts: Charting the
Future of Teaching the Past. Philadelphia: Temple University Press.
27. Image Credits
Infrogmation of New Orleans. 2008. “BywaterKeepOffHipstersStepsB.jpg.” Wikimedia
Commons. Available at
http://commons.wikimedia.org/wiki/File:BywaterKeepOffHipstersStepsB.jpg.
Accessed 9 March 2015.
Research Data Services, University of Virginia Library. “Steps in the Data Life Cycle.”
Available at http://data.library.virginia.edu/data-management/lifecycle. Accessed 2 March
2015.
TrekCore. “Elementary, Dear Data, No. 130.” Star Trek The Next Generation HD
Screencaps (Season 2, episode 3). Available at
http://tng.trekcore.com/hd/thumbnails.php?album=36. Accessed 9 March 2015.
United States. Census Bureau. 2010. “1940 Census.” Slideshow. Available at
http://www.census.gov/1940census/. Accessed 9 March 2015.
Adam Beauchamp
Research & Instruction Librarian (Social Sciences)
Tulane University
New Orleans, Louisiana
[email_address]
Christine Murray
Social Science Librarian
Bates College
Lewiston, Maine
[email_address]
Data reference is probably the most familiar data service. It is part of traditional reference services: helping users find statistical information and, more recently, data sets for secondary analysis. Examples: Statistical Abstracts and Census data; World Development Indicators.
Attention now is largely focused on data management and data curation services, which have been among the top trends identified by ACRL in both 2012 & 2014.
For us, asked to provide instruction with data in the library, the challenge is to translate some of the reference skills and resources into a pedagogically fruitful classroom setting for students who are new to the research endeavor and may have never used quantitative data before.
Where to begin with the data novice?
The data life cycle, a common feature of data management services and training, models the research process and shows where data management fits into it. There are numerous other models, but I like this one from the University of Virginia because it adopts the user's (researcher's) perspective and language, and the website describes the data management tasks associated with each step.
This approach to data management services assumes the user understands the place of data in the research process, a perfectly reasonable assumption to make of faculty and experienced students. It also presents a logical sequence of events in the research process, from planning, to collection, to analysis, to communication.
But does this work for the data novice? If you don’t know how to analyze quantitative data, how do you know what to look for or collect?
This is not limited to undergrads. I had a grad student ask me for help finding data on how easy it is to start a business in West Africa. When I asked how he would measure this “ease,” he looked at me, confused. Having never done this before, he hadn’t thought about what kinds of data he needed to analyze, hence his difficulty in trying to find and collect them.
Image credit:
Research Data Services, University of Virginia Library. “Steps in the Data Life Cycle.” Available at http://data.library.virginia.edu/data-management/lifecycle. Accessed 2 March 2015.
Pedagogical approaches from two different disciplines offer insight.
First, from statistics education, Cobb and Moore suggest we begin with data analysis. Provide students with quantitative data, and teach them to interpret and ask questions of them. Once students are comfortable with exploring data, and have learned to recognize the potentials and limitations of any given dataset, then they can appreciate the need for careful design and collection techniques, and be able to think ahead to their analytical needs when planning for and implementing a data collection project.
Note: EDA is a specific statistical teaching method developed by John Tukey in the 1970s, but we can adopt this approach in the library through a lens of hands-on exploration and brainstorming.
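To make the "explore first" idea concrete, here is a minimal sketch of the kind of first look students might take at a dataset before formulating a question. The records, variable names (`region`, `year`, `rate`), and helper functions are all hypothetical and invented for illustration; they do not come from any real survey or from the lessons described here.

```python
from statistics import mean, median

# Toy records invented for illustration only; not drawn from any real dataset.
records = [
    {"region": "North", "year": 2000, "rate": 4.0},
    {"region": "North", "year": 2010, "rate": 6.0},
    {"region": "South", "year": 2000, "rate": 5.0},
    {"region": "South", "year": 2010, "rate": 9.0},
    {"region": "South", "year": 2010, "rate": None},  # a missing value
]


def summarize(rows, variable):
    """Describe one variable: how many values, how many missing, and its spread."""
    values = [r[variable] for r in rows if r[variable] is not None]
    return {
        "n": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


def mean_by_group(rows, group, variable):
    """Average a variable within each category of a grouping variable."""
    groups = {}
    for r in rows:
        if r[variable] is not None:
            groups.setdefault(r[group], []).append(r[variable])
    return {key: mean(vals) for key, vals in groups.items()}


summary = summarize(records, "rate")                  # what does the variable look like?
by_year = mean_by_group(records, "year", "rate")      # is there a trend over time?
by_region = mean_by_group(records, "region", "rate")  # is there a disparity between groups?
```

Each call corresponds to a question a novice can ask of the data itself (what is here, what is missing, what differs across time or groups) before committing to a research question.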
We might also draw inspiration from history education. Sam Wineburg studied the different ways in which experts (historians) and novices (students) read primary source texts. The experts engage critically, considering the author’s intentions, audience reception, etc., while the students read for basic comprehension (as they’ve been taught to do). This led Wineburg to call for discipline- or context-specific teaching methods, implying students should learn how to engage with primary sources as a first step to learning history content, and certainly before being asked to locate and work with primary sources independently.
References:
Cobb, George W., and David S. Moore. 1997. “Mathematics, Statistics, and Teaching.” American Mathematical Monthly 104(9):801-823.
Wineburg, Samuel S. 2001. Historical Thinking and Other Unnatural Acts: Charting the Future of Teaching the Past. Philadelphia: Temple University Press.
This lesson was developed for a sociology research methods course. Methods courses are an important point of intervention for data skills and understanding.
This lesson is in part a simplified version of Rachel Barlow’s “Exploring Data through Research Literature,” a resource available via ICPSR.
The previous lesson illustrated how to approach data through the scholarly literature. This lesson, on the other hand, challenges students to think about where data come from.
Learning objectives correlate to two class activities.
Objectives 1 & 2 go together. The implied “in order to” for objective 1 is the same as for objective 3: to develop a data search strategy.
The implied “in order to” for objective 2 is to establish realistic expectations.
Objective 3 attaches to the second class activity.
Sample of student responses, with reasons for sharing/not sharing data collected.
Researchers:
YES: institutional mandates (e.g. repositories), support research findings, contribute to collective knowledge
NO: embargo until publication, not stored in accessible space (e.g. hard drives, personal files)
Government
YES: taxpayer funded, democracy, transparency
NO: security, privacy
NGOs, Think tanks, Other organizations
YES: potential public mission, advocacy/persuasion
NO: competitive advantage?, limited to subscribers only as revenue source (e.g. ACRL)
Businesses
YES: required disclosures (e.g. SEC filings)
NO: competitive advantage, profits
Photo credit:
United States. Census Bureau. 2010. “1940 Census.” Slideshow. Available at http://www.census.gov/1940census/. Accessed 9 March 2015.
In groups, students came up with potential variables that could be used to measure gentrification in neighborhoods.
After allowing adequate time, they shared their findings with the class. Variables, which were listed on the board, included:
House values, rents, crime levels, businesses in the neighborhood (number and type).
Household incomes, education levels, occupations (could indicate status).
Bonus: # of hipsters – would require an operational definition of hipster, and perhaps an observational strategy to count them.
For each variable, consider whether this data is being collected and is available to us.
Re: housing values, of interest to real estate industry, tax collecting gov’t entities, banks. Unless house is actively for sale, real estate data may be limited, and bank data likely unavailable for privacy and proprietary reasons. Gov’t data most likely to be available.
Source of gov’t data in aggregate: Census (historic data available, but only in aggregate, at the block group or tract level)
Source of gov’t data for specific properties: Orleans Parish Tax Assessor’s office (only recent data)
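One way to picture the outcome of this exercise is as a small lookup structure that pairs each proposed variable with the sources identified for it. The sketch below is illustrative only: the class and function names (`Source`, `Indicator`, `needs_source`) are hypothetical, and only the house values entry has sources filled in, because that is the only variable whose sources were worked through above.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    name: str
    level: str      # unit at which the data are reported
    note: str = ""


@dataclass
class Indicator:
    variable: str
    sources: list = field(default_factory=list)


# Variables are the ones students listed on the board; sources are filled in
# only where the class discussion identified them.
gentrification = [
    Indicator("house values", [
        Source("Census", "aggregate (block group or tract)", "historic data available"),
        Source("Orleans Parish Tax Assessor's office", "specific properties", "only recent data"),
    ]),
    Indicator("rents"),
    Indicator("crime levels"),
    Indicator("businesses in the neighborhood"),
    Indicator("household incomes"),
    Indicator("education levels"),
    Indicator("occupations"),
]


def needs_source(indicators):
    """List the variables for which no data source has been identified yet."""
    return [i.variable for i in indicators if not i.sources]


remaining = needs_source(gentrification)
```

Here `remaining` holds the variables that still need a data search, which is roughly the to-do list each student group carries into the final part of the session.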
Photo credit
Infrogmation of New Orleans. 2008. “BywaterKeepOffHipstersStepsB.jpg.” Wikimedia Commons. Available at http://commons.wikimedia.org/wiki/File:BywaterKeepOffHipstersStepsB.jpg. Accessed 9 March 2015.
While students work on their group topics, the professor and I circulate and:
Assist with operationalization
Recommend data sources
Encourage literature review search to generate ideas
Photo credit
TrekCore. “Elementary, Dear Data, No. 130.” Star Trek The Next Generation HD Screencaps (Season 2, episode 3). Available at http://tng.trekcore.com/hd/thumbnails.php?album=36. Accessed 9 March 2015.
Formative assessments:
Student participation during class & group activities
Individualized assistance during data search time (offer clarifications on what remains unclear)
One-minute paper: (one thing you learned, one thing that remains unclear, additional comments)
Most note having learned about helpful sources of data
Several note having learned about operationalization
Professor feedback:
Pleased with session
Used the student feedback from one-minute paper to reinforce & review some concepts in next class meeting
Professor has continued to come to the library for variations of this lesson plan
For the last two semesters, I have handed this over to her grad student TA (train the trainer)