IUPUI University Library Center for Digital Scholarship
Data Management Lab: Pilot [January 2014]
Data Coding Best Practices
Data Coding
Guidelines adapted from the ICPSR Guide to Social Science Data Preparation and Archiving
1. Use common coding conventions
a. Assure that all statistical software packages can handle the data
b. Promote greater measurement comparability
2. Standard schemes - consult existing standard coding schemes, such as Federal Information Processing Standards (FIPS) codes.
3. Identification variables - provide fields at the beginning of each record to accommodate all
identification variables (e.g., unique study number and respondent number).
4. Code categories - should be mutually exclusive, exhaustive, and precisely defined.
5. Preserving original information - code as much detail as possible; recording original data, such as
age and income, is more useful than collapsing or bracketing the information.
6. Closed-ended questions - responses to survey questions that are pre-coded in the questionnaire
should retain the coding scheme to avoid errors and confusion.
7. Open-ended questions - either use a predetermined coding scheme or review the initial survey
responses to construct a coding scheme based on major categories that emerge; any coding
scheme and its derivation should be reported in study documentation.
8. User-coded responses - must be reviewed for disclosure risk and, if necessary, treated to protect
confidentiality prior to dissemination.
9. Check-coding - it's a good idea to verify or check-code some cases during the coding process;
i.e., repeat the process with an independent coder.
10. Series of responses - if a series of responses requires more than one field, organizing the
responses into meaningful major classifications is helpful; this permits analysis of the data using
either broad groupings or more detailed categories.
11. Missing data
a. Codes should match the content of the field (i.e., numeric or alphanumeric).
b. Codes should be standardized such that the same code is used for each type of
missing data for all variables.
c. Blanks should not be used as missing data codes unless there is no need to differentiate
types of missing data, such as "don't know" or "refused".
d. If an entire sequence of variables is blank due to inapplicability or another reason, an
indicator field should be used.
e. Skip patterns & "not applicable" - codes for "not applicable" or "inapplicable" responses
should be distinct from other missing data codes.
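
Example (illustrative only): the sketch below shows one way the missing data guidelines in item 11 might be applied to a single closed-ended survey item. The code values (-7, -8, -9), the yes/no categories, and the function names classify and summarize are assumptions made for this example; they do not come from the ICPSR guide. The sketch uses numeric missing codes for a numeric field, keeps "not applicable" distinct from "refused" and "don't know", and flags any value that is not in the codebook.

```python
# Hypothetical codebook: these values are assumptions for illustration,
# not codes prescribed by the ICPSR guide.
MISSING_CODES = {
    -7: "refused",
    -8: "don't know",
    -9: "not applicable (skipped)",  # kept distinct from other missing codes
}

# Hypothetical closed-ended item; keeps the questionnaire's pre-coded scheme.
VALID_CODES = {1: "yes", 2: "no"}


def classify(value):
    """Return (kind, label) for one coded value.

    kind is 'valid', 'missing', or 'error' (a code absent from the codebook).
    """
    if value in VALID_CODES:
        return ("valid", VALID_CODES[value])
    if value in MISSING_CODES:
        return ("missing", MISSING_CODES[value])
    return ("error", "undocumented code")


def summarize(values):
    """Count how often each (kind, label) pair occurs in a column of codes."""
    counts = {}
    for value in values:
        key = classify(value)
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    # A blank (None) and an out-of-range code (5) are both reported as errors.
    column = [1, 2, -7, -7, -9, None, 5]
    for (kind, label), n in sorted(summarize(column).items()):
        print(kind, label, n)
```

Reusing the same MISSING_CODES mapping for every variable is one way to satisfy 11b; a field coded as text would need alphanumeric missing codes instead (11a).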
References
1. ICPSR. (2012). Guide to Social Science Data Preparation and Archiving, University of Michigan,
Ann Arbor, MI. From http://www.icpsr.umich.edu/files/deposit/dataprep.pdf.
Heather Coates, 2013

Data Management Lab: Session 3 Data Coding Best Practices
