Data Management Lab: Session 3 Data Entry Best Practices (more details at http://ulib.iupui.edu/digitalscholarship/dataservices/datamgmtlab)
What you will learn:
1. Build awareness of research data management issues associated with digital data.
2. Introduce methods to address common data management issues and facilitate data integrity.
3. Introduce institutional resources supporting effective data management methods.
4. Build proficiency in applying these methods.
5. Build strategic skills that enable attendees to solve new data management problems.
IUPUI University Library Center for Digital Scholarship
Data Management Lab: Spring 2014
Data Entry Best Practices
Data Entry
1. Dataset creation and integrity
a. Separate the coding and data entry tasks as much as possible
b. Perform coding in a setting that minimizes distractions
c. Arrange for particularly complex tasks to be carried out by people specially trained for
the task
d. Use a data-entry program that is designed to catch typing errors (for example, one that is
pre-programmed to detect out-of-range values)
e. Perform double entry of data
f. Carefully check the first 5-10 percent of the data records created, then choose random
records for quality-control checks throughout the process
g. Let the computer do complex coding and recoding, if possible
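Two of the practices above, catching out-of-range values at entry time and performing double entry, can be sketched in a few lines of Python. This is a minimal illustration, not a recommended tool: the variable names, the ranges in VALID_RANGES, and both function names are hypothetical, and a missing value is treated as a problem only for the sake of the example.

```python
# Hypothetical codebook: allowed numeric range for each variable.
VALID_RANGES = {"age": (18, 99), "q1_satisfaction": (1, 5)}


def out_of_range(record, valid_ranges=VALID_RANGES):
    """Return the variables in one record that are missing or outside their range."""
    problems = []
    for name, (low, high) in valid_ranges.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems


def double_entry_mismatches(first_pass, second_pass):
    """Compare two independent entries of the same records, keyed by record ID.

    Returns (record_id, variable, first_value, second_value) for each disagreement.
    """
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(record_id, {})
        b = second_pass.get(record_id, {})
        for name in sorted(set(a) | set(b)):
            if a.get(name) != b.get(name):
                mismatches.append((record_id, name, a.get(name), b.get(name)))
    return mismatches
```

Each mismatch reported by the second function would then be resolved by going back to the original paper form or source record, not by picking one of the two typed values.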
2. Things to check
a. Wild codes and out-of-range values
b. Consistency checks - comparisons across variables
c. Record matches and counts - relevant in longitudinal studies where subjects may have
more than one record and varying numbers of records
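The three checks above could look roughly like the following sketch. The codebook in ALLOWED_CODES, the smoker/cigs_per_day consistency rule, and the field name subject_id are all made-up examples; a real study would substitute its own codebook and rules.

```python
from collections import Counter

# Hypothetical codebook of allowed codes for categorical variables.
ALLOWED_CODES = {"sex": {1, 2, 9}, "smoker": {0, 1, 9}}


def wild_codes(records, allowed=ALLOWED_CODES):
    """Return (row_index, variable, value) for each value not listed in the codebook."""
    return [
        (i, name, rec.get(name))
        for i, rec in enumerate(records)
        for name, codes in allowed.items()
        if rec.get(name) not in codes
    ]


def inconsistent_rows(records):
    """Cross-variable check: a non-smoker (smoker == 0) should report 0 cigarettes per day."""
    return [
        i
        for i, rec in enumerate(records)
        if rec.get("smoker") == 0 and rec.get("cigs_per_day", 0) > 0
    ]


def records_per_subject(records, id_field="subject_id"):
    """Count records per subject, to compare against the expected number of waves."""
    return Counter(rec[id_field] for rec in records)
```

In a longitudinal study, the counts from the last function can be compared with the number of waves each subject was expected to complete.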
3. Variable names
a. A prefix-root-suffix system is a systematic approach (compared to one-up numbers,
question numbers, and mnemonic names)
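As an illustration of a prefix, root, suffix system, the sketch below builds names from three parts: who the item is about (prefix), the topic (root), and the data collection wave (suffix). The specific abbreviations and the helper name variable_name are assumptions chosen for the example, not a required convention.

```python
# Hypothetical abbreviation tables; a project would define its own.
PREFIXES = {"mother": "MO", "father": "FA"}
ROOTS = {"education": "ED", "occupation": "OCC"}


def variable_name(who, topic, wave):
    """Build a name as prefix + root + wave suffix, e.g. MOED_W1."""
    return f"{PREFIXES[who]}{ROOTS[topic]}_W{wave}"


# Names for every combination of person and topic in waves 1 and 2.
names = [
    variable_name(who, topic, wave)
    for wave in (1, 2)
    for who in PREFIXES
    for topic in ROOTS
]
```

Because every name is generated from the same tables, related variables share a root and sort together, which is harder to guarantee with mnemonic names typed by hand.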
4. Variable labels
a. Should provide three pieces of information
i. The item or question number in the original data collection instrument
ii. A clear indication of the variable's content
iii. An indication of whether the variable is constructed from other items
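One way to make sure every label carries these three pieces of information is to generate labels from structured fields. The function below is a minimal sketch; its name, its parameters, and the label layout are assumptions, not a standard format.

```python
def variable_label(item, content, constructed_from=None):
    """Build a label from the item number, the content, and any source items.

    item: question or item number in the original instrument (e.g. "Q12")
    content: short description of what the variable holds
    constructed_from: list of source items if the variable is derived, else None
    """
    label = f"{item}: {content}"
    if constructed_from:
        label += " (constructed from " + ", ".join(constructed_from) + ")"
    return label
```

For example, variable_label("Q12", "Highest degree completed") gives "Q12: Highest degree completed", while passing constructed_from=["Q12", "Q13"] appends "(constructed from Q12, Q13)" to the label.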
5. Variable groups
a. Groups are recommended if a dataset contains a large number of variables
b. Can effectively organize a dataset and enable secondary analysts to get an overview of a
dataset quickly
6. Over the long term, store data in a consistent format
References
1. ICPSR. (2012). Guide to Social Science Data Preparation and Archiving, University of Michigan,
Ann Arbor, MI. From http://www.icpsr.umich.edu/files/deposit/dataprep.pdf.
2. Scott, T. (2012). Guidelines for data collection and entry.
From http://www.mc.vanderbilt.edu/gcrc/workshop_files/2012-09-07.pdf
3. DataONE Education Module: Data Entry and Manipulation. DataONE.
From http://www.dataone.org/sites/all/documents/L04_DataEntryManipulation.pptx
Heather Coates, 2013