SlideShare una empresa de Scribd logo
1 de 55
Descargar para leer sin conexión
Introduction to Reproducible Research
in R and R Studio.
Susan Johnston
April 1, 2016
What is Reproducible Research?
Reproducibility is the ability of an entire experiment or
study to be reproduced, either by the researcher or by
someone else working independently, [and] is one of the
main principles of the scientific method.
Wikipedia
In the lab:
Many of us are clicking, copying and pasting...
Can you repeat all of this again. . .
. . . and would you get the same results every time?
Worst Case Scenario
Scenarios that benefit from reproducibility
New raw data becomes available.
You return to the project after a period of time.
Project gets handed to new PhD student/postdoc.
Working collaboratively.
A reviewer wants you to change a model parameter.
When you find an error, but not sure where you went wrong.
Four rules for reproducibility.
1. Create a portable project.
2. Avoid manual data manipulation steps - use code!
3. Connect results to text.
4. Version control all custom scripts and documents.
Disclaimer
Many solutions to the same problem!
The Environment: http://www.rstudio.com
Reproducible Research in
1. Creating a Portable Project (.Rproj)
2. Automate analyses - stop clicking and start typing.
3. Dynamic report writing with R Markdown and knitr
4. Version control using git
Reproducible Research in .
1. Creating a Portable Project (.Rproj)
2. Automate analyses - stop clicking and start typing.
3. Dynamic report writing with R Markdown and knitr
4. Version control using git
Structuring an R Project.
http://nicercode.github.io/blog/2013-05-17-organising-my-project/
Structuring an R Project.
http://nicercode.github.io/blog/2013-05-17-organising-my-project/
All data, scripts and output should be kept within the same project
directory.
Structuring an R Project.
http://nicercode.github.io/blog/2013-04-05-projects/
R/ Contains functions relevant to analysis.
data/ Contains raw data as read only.
doc/ Contains the paper.
figs/ Contains the figures.
output/ Contains analysis output
(processed data, logs, etc. Treat as disposable).
.R Code for the analysis.
Structuring an R Project.
http://robjhyndman.com/hyndsight/workflow-in-r/
load.R - read in data from files
clean.R - pre-processing and cleaning
functions.R - define what you need for anlaysis
do.R - do the analysis!
Bad habits can hinder portability.
https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects
setwd("C:/Users/susjoh/Desktop/SalmoAnalysis")
setwd("C:/Users/Susan Johnston/Desktop/SalmoAnalysis")
setwd("C:/Users/sjohns10/Drive/SalmoAnalysis")
source("../../OvisAnalysis/GWASplotfunctions.R")
An analysis should be contained within a directory, and it should
be easy to move it or pass on to someone new.
Solution: using Projects.
https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects
Establishes a directory with associated .Rproj file.
Automatically sets the working directory.
Can save and source .Rprofile, .Rhistory, .Rdata files.
Allows version control within R Studio.
Creating a Portable Project (.Rproj)
Creating a Portable Project (.Rproj)
Creating a Portable Project (.Rproj)
Creating a Portable Project (.Rproj)
Reproducible Research in .
1. Creating a Portable Project (.Rproj)
2. Automate analyses - stop clicking and start typing.
3. Dynamic report writing with R Markdown and knitr
4. Version control using git
This is R. There is no “if”. Only “how”.
CRAN, Bioconductor, github
Reading in data and functions
read.table(), read.csv(), read.xlsx(), source()
Reorganising data
reshape, plyr, dplyr
Generate figures
plot(), library(ggplot2)
Running external programmes with system()
Unix/Mac: system("plink -file OvGen --freq")
Windows: system("cmd", input = "plink -file OvGen --freq")
Reproducible Research in .
1. Creating a Portable Project (.Rproj)
2. Automate analyses - stop clicking and start typing.
3. Dynamic report writing with R Markdown and knitr
4. Version control using git
The knitr package allows R code and document templates to be
compiled into a single report containing text, results and figures.
Output script as Notebook
Write reports directly in R
Write reports directly in R
Creating an R Markdown Script (.Rmd).
Creating an R Markdown Script (.Rmd).
A Quick Start Guide
http://nicercode.github.io/guides/reports/
1. Type report text into .Rmd file
Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
2. Enclose code to be evaluated in chunks
```{r}
model1 <- lm(speed ~ dist, data = cars)
```
3. Evaluate code inline
The slope of the model is `r coefficients(model1)[2]`
The slope of the model is 0.16557
4. Compile report as .html, .pdf or .doc
A Quick Start Guide
http://nicercode.github.io/guides/reports/
NB. PDF and Word docs require additional software.
http://rmarkdown.rstudio.com/?version=0.98.1103&mode=desktop
A Quick Start Guide
http://nicercode.github.io/guides/reports/
http://rmarkdown.rstudio.com/?version=0.98.1103&mode=desktop
Advanced Tips
Control how chunks are reported and evaluated
```{r echo = F, warning = F, fig.width = 3}
model1 <- lm(speed ~ dist, data = cars)
plot(model1)
```
spin(): compile .R files using #’, #+ and #-
http://deanattali.com/2015/03/24/knitrs-best-hidden-gem-spin/
LATEXdocuments, Presentations, Shiny, etc.
Reproducible Research in .
1. Creating a Portable Project (.Rproj)
2. Automate analyses - stop clicking and start typing.
3. Dynamic report writing with R Markdown and knitr
4. Version control using git
Version Control can revert a document to a previous
version.
Version Control can revert a document to a previous
version.
Version Control can revert a document to a previous
version.
Version Control can revert a document to a previous
version.
Version Control Using git.
https://support.rstudio.com/hc/en-us/articles/200532077-Version-Control-with-Git-and-SVN
Git can be installed on all platforms, and can be used to implement
version control within an R Studio Project.
http://git-scm.com/downloads
Version Control in R Studio
Tools >Project Options allows setup of git version control.
Version Control in R Studio
Select git as a version control system
Version Control in R Studio
Select git as a version control system
Version Control in R Studio
git information will appear in the top-right frame.
Version Control in R Studio
git information will appear in the top-right frame.
Version Control in R Studio
Select files to version control, write a meaningful commit message
>Commit
Version Control in R Studio
Select files to version control, write a meaningful commit message
>Commit
Version Control in R Studio
After modifying the file, repeat the process.
Version Control in R Studio
After modifying the file, repeat the process.
Version Control in R Studio
Previous versions can be viewed and restored from the History tab.
Version Control in R Studio
Previous versions can be viewed and restored from the History tab.
Advanced Steps: Github
Forking projects
All scripts are backed up online
Facilitates collaboration and working on different computers
Take home messages
Manage projects reproducibly: The first researcher who will
need to reproduce the results is likely to be YOU.
Time invested in learning to code pays off - do it.
Supervisors should be patient and encourage students to code.
Online Resources
RStudio: Idiot-proof guides and cheat sheets
http://www.rstudio.com/
Nice R Code: How-tos and advice on good coding practice
http://nicercode.github.io/guide.html
Ten Simple Rules for Reproducible Computational Research
http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003285
Yihui Xie’s blog (knitr) http://yihui.name/en/categories/
R Bloggers: http://www.r-bloggers.com/
StackOverflow questions on R and knitr
http://stackoverflow.com/questions/tagged/r+knitr

Más contenido relacionado

La actualidad más candente

R studio practical file
R studio  practical file R studio  practical file
R studio practical file Ketan Khaira
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...PyData
 
Zelditchetal workbookgeomorphoanalyses
Zelditchetal workbookgeomorphoanalysesZelditchetal workbookgeomorphoanalyses
Zelditchetal workbookgeomorphoanalysesWagner M. S. Sampaio
 
Mining Code Examples with Descriptive Text from Software Artifacts
Mining Code Examples with Descriptive Text from Software ArtifactsMining Code Examples with Descriptive Text from Software Artifacts
Mining Code Examples with Descriptive Text from Software ArtifactsPreetha Chatterjee
 
Version Control in Machine Learning + AI (Stanford)
Version Control in Machine Learning + AI (Stanford)Version Control in Machine Learning + AI (Stanford)
Version Control in Machine Learning + AI (Stanford)Anand Sampat
 
Finding Help with Programming Errors: An Exploratory Study of Novice Software...
Finding Help with Programming Errors: An Exploratory Study of Novice Software...Finding Help with Programming Errors: An Exploratory Study of Novice Software...
Finding Help with Programming Errors: An Exploratory Study of Novice Software...Preetha Chatterjee
 

La actualidad más candente (6)

R studio practical file
R studio  practical file R studio  practical file
R studio practical file
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
 
Zelditchetal workbookgeomorphoanalyses
Zelditchetal workbookgeomorphoanalysesZelditchetal workbookgeomorphoanalyses
Zelditchetal workbookgeomorphoanalyses
 
Mining Code Examples with Descriptive Text from Software Artifacts
Mining Code Examples with Descriptive Text from Software ArtifactsMining Code Examples with Descriptive Text from Software Artifacts
Mining Code Examples with Descriptive Text from Software Artifacts
 
Version Control in Machine Learning + AI (Stanford)
Version Control in Machine Learning + AI (Stanford)Version Control in Machine Learning + AI (Stanford)
Version Control in Machine Learning + AI (Stanford)
 
Finding Help with Programming Errors: An Exploratory Study of Novice Software...
Finding Help with Programming Errors: An Exploratory Study of Novice Software...Finding Help with Programming Errors: An Exploratory Study of Novice Software...
Finding Help with Programming Errors: An Exploratory Study of Novice Software...
 

Destacado

Data Manipulation Using R (& dplyr)
Data Manipulation Using R (& dplyr)Data Manipulation Using R (& dplyr)
Data Manipulation Using R (& dplyr)Ram Narasimhan
 
R-Studio, diferencia estadísticamente significativa 1
R-Studio, diferencia estadísticamente significativa 1R-Studio, diferencia estadísticamente significativa 1
R-Studio, diferencia estadísticamente significativa 1Jose Quezada
 
Manual r commander By Juan Guarangaa
Manual r commander By Juan GuarangaaManual r commander By Juan Guarangaa
Manual r commander By Juan GuarangaaJuan Ete
 
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016Penn State University
 
20160611 kintone Café 高知 Vol.3 LT資料
20160611 kintone Café 高知 Vol.3 LT資料20160611 kintone Café 高知 Vol.3 LT資料
20160611 kintone Café 高知 Vol.3 LT資料安隆 沖
 
Análisis espacial con R (asignatura de Master - UPM)
Análisis espacial con R (asignatura de Master - UPM)Análisis espacial con R (asignatura de Master - UPM)
Análisis espacial con R (asignatura de Master - UPM)Vladimir Gutierrez, PhD
 
R Brown-bag seminars : Seminar-8
R Brown-bag seminars : Seminar-8R Brown-bag seminars : Seminar-8
R Brown-bag seminars : Seminar-8Muhammad Nabi Ahmad
 
Paquete ggplot - Potencia y facilidad para generar gráficos en R
Paquete ggplot - Potencia y facilidad para generar gráficos en RPaquete ggplot - Potencia y facilidad para generar gráficos en R
Paquete ggplot - Potencia y facilidad para generar gráficos en RNestor Montaño
 
Learn to use dplyr (Feb 2015 Philly R User Meetup)
Learn to use dplyr (Feb 2015 Philly R User Meetup)Learn to use dplyr (Feb 2015 Philly R User Meetup)
Learn to use dplyr (Feb 2015 Philly R User Meetup)Fan Li
 
WF ED 540, Class Meeting 3 - mutate and summarise, 2016
WF ED 540, Class Meeting 3 - mutate and summarise, 2016WF ED 540, Class Meeting 3 - mutate and summarise, 2016
WF ED 540, Class Meeting 3 - mutate and summarise, 2016Penn State University
 
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016Penn State University
 
Data and donuts: Data Visualization using R
Data and donuts: Data Visualization using RData and donuts: Data Visualization using R
Data and donuts: Data Visualization using RC. Tobin Magle
 
Chunked, dplyr for large text files
Chunked, dplyr for large text filesChunked, dplyr for large text files
Chunked, dplyr for large text filesEdwin de Jonge
 
Next Generation Programming in R
Next Generation Programming in RNext Generation Programming in R
Next Generation Programming in RFlorian Uhlitz
 
R and Rcmdr Statistical Software
R and Rcmdr Statistical SoftwareR and Rcmdr Statistical Software
R and Rcmdr Statistical Softwarearttan2001
 

Destacado (20)

Data Manipulation Using R (& dplyr)
Data Manipulation Using R (& dplyr)Data Manipulation Using R (& dplyr)
Data Manipulation Using R (& dplyr)
 
R-Studio, diferencia estadísticamente significativa 1
R-Studio, diferencia estadísticamente significativa 1R-Studio, diferencia estadísticamente significativa 1
R-Studio, diferencia estadísticamente significativa 1
 
Setup R and R Studio
Setup R and R StudioSetup R and R Studio
Setup R and R Studio
 
Manual r commander By Juan Guarangaa
Manual r commander By Juan GuarangaaManual r commander By Juan Guarangaa
Manual r commander By Juan Guarangaa
 
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
WF ED 540, Class Meeting 3 - Introduction to dplyr, 2016
 
20160611 kintone Café 高知 Vol.3 LT資料
20160611 kintone Café 高知 Vol.3 LT資料20160611 kintone Café 高知 Vol.3 LT資料
20160611 kintone Café 高知 Vol.3 LT資料
 
Rlecturenotes
RlecturenotesRlecturenotes
Rlecturenotes
 
Análisis espacial con R (asignatura de Master - UPM)
Análisis espacial con R (asignatura de Master - UPM)Análisis espacial con R (asignatura de Master - UPM)
Análisis espacial con R (asignatura de Master - UPM)
 
R Brown-bag seminars : Seminar-8
R Brown-bag seminars : Seminar-8R Brown-bag seminars : Seminar-8
R Brown-bag seminars : Seminar-8
 
Paquete ggplot - Potencia y facilidad para generar gráficos en R
Paquete ggplot - Potencia y facilidad para generar gráficos en RPaquete ggplot - Potencia y facilidad para generar gráficos en R
Paquete ggplot - Potencia y facilidad para generar gráficos en R
 
Learn to use dplyr (Feb 2015 Philly R User Meetup)
Learn to use dplyr (Feb 2015 Philly R User Meetup)Learn to use dplyr (Feb 2015 Philly R User Meetup)
Learn to use dplyr (Feb 2015 Philly R User Meetup)
 
WF ED 540, Class Meeting 3 - mutate and summarise, 2016
WF ED 540, Class Meeting 3 - mutate and summarise, 2016WF ED 540, Class Meeting 3 - mutate and summarise, 2016
WF ED 540, Class Meeting 3 - mutate and summarise, 2016
 
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
WF ED 540, Class Meeting 3 - select, filter, arrange, 2016
 
R seminar dplyr package
R seminar dplyr packageR seminar dplyr package
R seminar dplyr package
 
Dplyr and Plyr
Dplyr and PlyrDplyr and Plyr
Dplyr and Plyr
 
Data and donuts: Data Visualization using R
Data and donuts: Data Visualization using RData and donuts: Data Visualization using R
Data and donuts: Data Visualization using R
 
Chunked, dplyr for large text files
Chunked, dplyr for large text filesChunked, dplyr for large text files
Chunked, dplyr for large text files
 
Next Generation Programming in R
Next Generation Programming in RNext Generation Programming in R
Next Generation Programming in R
 
R and Rcmdr Statistical Software
R and Rcmdr Statistical SoftwareR and Rcmdr Statistical Software
R and Rcmdr Statistical Software
 
Tokyor36
Tokyor36Tokyor36
Tokyor36
 

Similar a Reproducible Research in R and R Studio

Data Science - Part II - Working with R & R studio
Data Science - Part II -  Working with R & R studioData Science - Part II -  Working with R & R studio
Data Science - Part II - Working with R & R studioDerek Kane
 
DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v...
DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v...DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v...
DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v...Felipe Prado
 
Data Analysis and Visualization: R Workflow
Data Analysis and Visualization: R WorkflowData Analysis and Visualization: R Workflow
Data Analysis and Visualization: R WorkflowOlga Scrivner
 
Introduction to R and R Studio
Introduction to R and R StudioIntroduction to R and R Studio
Introduction to R and R StudioRupak Roy
 
Introduction to r
Introduction to rIntroduction to r
Introduction to rgslicraf
 
Reproducible research: practice
Reproducible research: practiceReproducible research: practice
Reproducible research: practiceC. Tobin Magle
 
Language-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchLanguage-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchAndrew Lowe
 
Reproducible research (and literate programming) in R
Reproducible research (and literate programming) in RReproducible research (and literate programming) in R
Reproducible research (and literate programming) in Rliz__is
 
.NET Recommended Resources
.NET Recommended Resources.NET Recommended Resources
.NET Recommended ResourcesGreg Sohl
 
Intro to Reproducible Research
Intro to Reproducible ResearchIntro to Reproducible Research
Intro to Reproducible ResearchC. Tobin Magle
 
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTBUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTHaritikaChhatwal1
 
Up your data game: How to use R to wrangle, analyze, and visualize data faste...
Up your data game: How to use R to wrangle, analyze, and visualize data faste...Up your data game: How to use R to wrangle, analyze, and visualize data faste...
Up your data game: How to use R to wrangle, analyze, and visualize data faste...Charles Guedenet
 
Study of R Programming
Study of R ProgrammingStudy of R Programming
Study of R ProgrammingIRJET Journal
 
Why your build matters
Why your build mattersWhy your build matters
Why your build mattersPeter Ledbrook
 
What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics PipelineWhat We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics PipelineWork-Bench
 

Similar a Reproducible Research in R and R Studio (20)

20150422 repro resr
20150422 repro resr20150422 repro resr
20150422 repro resr
 
Data Science - Part II - Working with R & R studio
Data Science - Part II -  Working with R & R studioData Science - Part II -  Working with R & R studio
Data Science - Part II - Working with R & R studio
 
DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v...
DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v...DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v...
DEF CON 27 - workshop - ISAAC EVANS - discover exploit and eradicate entire v...
 
Data Analysis and Visualization: R Workflow
Data Analysis and Visualization: R WorkflowData Analysis and Visualization: R Workflow
Data Analysis and Visualization: R Workflow
 
Introduction to R and R Studio
Introduction to R and R StudioIntroduction to R and R Studio
Introduction to R and R Studio
 
Introduction to r
Introduction to rIntroduction to r
Introduction to r
 
Reproducible research: practice
Reproducible research: practiceReproducible research: practice
Reproducible research: practice
 
Language-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchLanguage-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible research
 
Reproducible research (and literate programming) in R
Reproducible research (and literate programming) in RReproducible research (and literate programming) in R
Reproducible research (and literate programming) in R
 
R tutorial
R tutorialR tutorial
R tutorial
 
.NET Recommended Resources
.NET Recommended Resources.NET Recommended Resources
.NET Recommended Resources
 
Pyhton-1a-Basics.pdf
Pyhton-1a-Basics.pdfPyhton-1a-Basics.pdf
Pyhton-1a-Basics.pdf
 
Django by rj
Django by rjDjango by rj
Django by rj
 
Intro to Reproducible Research
Intro to Reproducible ResearchIntro to Reproducible Research
Intro to Reproducible Research
 
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTBUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
 
Up your data game: How to use R to wrangle, analyze, and visualize data faste...
Up your data game: How to use R to wrangle, analyze, and visualize data faste...Up your data game: How to use R to wrangle, analyze, and visualize data faste...
Up your data game: How to use R to wrangle, analyze, and visualize data faste...
 
Study of R Programming
Study of R ProgrammingStudy of R Programming
Study of R Programming
 
Reproducible research
Reproducible researchReproducible research
Reproducible research
 
Why your build matters
Why your build mattersWhy your build matters
Why your build matters
 
What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics PipelineWhat We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
What We Learned Building an R-Python Hybrid Predictive Analytics Pipeline
 

Último

CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 

Último (20)

CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 

Reproducible Research in R and R Studio

  • 1. Introduction to Reproducible Research in R and R Studio. Susan Johnston April 1, 2016
  • 2. What is Reproducible Research? Reproducibility is the ability of an entire experiment or study to be reproduced, either by the researcher or by someone else working independently, [and] is one of the main principles of the scientific method. Wikipedia
  • 4. Many of us are clicking, copying and pasting... Can you repeat all of this again. . . . . . and would you get the same results every time?
  • 6. Scenarios that benefit from reproducibility New raw data becomes available. You return to the project after a period of time. Project gets handed to new PhD student/postdoc. Working collaboratively. A reviewer wants you to change a model parameter. When you find an error, but not sure where you went wrong.
  • 7. Four rules for reproducibility. 1. Create a portable project. 2. Avoid manual data manipulation steps - use code! 3. Connect results to text. 4. Version control all custom scripts and documents.
  • 8. Disclaimer Many solutions to the same problem!
  • 10. Reproducible Research in 1. Creating a Portable Project (.Rproj) 2. Automate analyses - stop clicking and start typing. 3. Dynamic report writing with R Markdown and knitr 4. Version control using git
  • 11. Reproducible Research in . 1. Creating a Portable Project (.Rproj) 2. Automate analyses - stop clicking and start typing. 3. Dynamic report writing with R Markdown and knitr 4. Version control using git
  • 12. Structuring an R Project. http://nicercode.github.io/blog/2013-05-17-organising-my-project/
  • 13. Structuring an R Project. http://nicercode.github.io/blog/2013-05-17-organising-my-project/ All data, scripts and output should be kept within the same project directory.
  • 14. Structuring an R Project. http://nicercode.github.io/blog/2013-04-05-projects/ R/ Contains functions relevant to analysis. data/ Contains raw data as read only. doc/ Contains the paper. figs/ Contains the figures. output/ Contains analysis output (processed data, logs, etc. Treat as disposable). .R Code for the analysis.
  • 15. Structuring an R Project. http://robjhyndman.com/hyndsight/workflow-in-r/ load.R - read in data from files clean.R - pre-processing and cleaning functions.R - define what you need for anlaysis do.R - do the analysis!
  • 16. Bad habits can hinder portability. https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects setwd("C:/Users/susjoh/Desktop/SalmoAnalysis") setwd("C:/Users/Susan Johnston/Desktop/SalmoAnalysis") setwd("C:/Users/sjohns10/Drive/SalmoAnalysis") source("../../OvisAnalysis/GWASplotfunctions.R") An analysis should be contained within a directory, and it should be easy to move it or pass on to someone new.
  • 17. Solution: using Projects. https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects Establishes a directory with associated .Rproj file. Automatically sets the working directory. Can save and source .Rprofile, .Rhistory, .Rdata files. Allows version control within R Studio.
  • 18. Creating a Portable Project (.Rproj)
  • 19. Creating a Portable Project (.Rproj)
  • 20. Creating a Portable Project (.Rproj)
  • 21. Creating a Portable Project (.Rproj)
  • 22. Reproducible Research in . 1. Creating a Portable Project (.Rproj) 2. Automate analyses - stop clicking and start typing. 3. Dynamic report writing with R Markdown and knitr 4. Version control using git
  • 23. This is R. There is no “if”. Only “how”. CRAN, Bioconductor, github Reading in data and functions read.table(), read.csv(), read.xlsx(), source() Reorganising data reshape, plyr, dplyr Generate figures plot(), library(ggplot2) Running external programmes with system() Unix/Mac: system("plink -file OvGen --freq") Windows: system("cmd", input = "plink -file OvGen --freq")
  • 24. Reproducible Research in . 1. Creating a Portable Project (.Rproj) 2. Automate analyses - stop clicking and start typing. 3. Dynamic report writing with R Markdown and knitr 4. Version control using git
  • 25. The knitr package allows R code and document templates to be compiled into a single report containing text, results and figures.
  • 26. Output script as Notebook
  • 27.
  • 30. Creating an R Markdown Script (.Rmd).
  • 31. Creating an R Markdown Script (.Rmd).
  • 32. A Quick Start Guide http://nicercode.github.io/guides/reports/ 1. Type report text into .Rmd file Lorem ipsum dolor sit amet, consectetuer adipiscing elit. 2. Enclose code to be evaluated in chunks ```{r} model1 <- lm(speed ~ dist, data = cars) ``` 3. Evaluate code inline The slope of the model is `r coefficients(model1)[2]` The slope of the model is 0.16557 4. Compile report as .html, .pdf or .doc
  • 33. A Quick Start Guide http://nicercode.github.io/guides/reports/ NB. PDF and Word docs require additional software. http://rmarkdown.rstudio.com/?version=0.98.1103&mode=desktop
  • 34. A Quick Start Guide http://nicercode.github.io/guides/reports/ http://rmarkdown.rstudio.com/?version=0.98.1103&mode=desktop
  • 35. Advanced Tips Control how chunks are reported and evaluated ```{r echo = F, warning = F, fig.width = 3} model1 <- lm(speed ~ dist, data = cars) plot(model1) ``` spin(): compile .R files using #’, #+ and #- http://deanattali.com/2015/03/24/knitrs-best-hidden-gem-spin/ LATEXdocuments, Presentations, Shiny, etc.
  • 36. Reproducible Research in . 1. Creating a Portable Project (.Rproj) 2. Automate analyses - stop clicking and start typing. 3. Dynamic report writing with R Markdown and knitr 4. Version control using git
  • 37. Version Control can revert a document to a previous version.
  • 38. Version Control can revert a document to a previous version.
  • 39. Version Control can revert a document to a previous version.
  • 40. Version Control can revert a document to a previous version.
  • 41. Version Control Using git. https://support.rstudio.com/hc/en-us/articles/200532077-Version-Control-with-Git-and-SVN Git can be installed on all platforms, and can be used to implement version control within an R Studio Project. http://git-scm.com/downloads
  • 42. Version Control in R Studio Tools >Project Options allows setup of git version control.
  • 43. Version Control in R Studio Select git as a version control system
  • 44. Version Control in R Studio Select git as a version control system
  • 45. Version Control in R Studio git information will appear in the top-right frame.
  • 46. Version Control in R Studio git information will appear in the top-right frame.
  • 47. Version Control in R Studio Select files to version control, write a meaningful commit message >Commit
  • 48. Version Control in R Studio Select files to version control, write a meaningful commit message >Commit
  • 49. Version Control in R Studio After modifying the file, repeat the process.
  • 50. Version Control in R Studio After modifying the file, repeat the process.
  • 51. Version Control in R Studio Previous versions can be viewed and restored from the History tab.
  • 52. Version Control in R Studio Previous versions can be viewed and restored from the History tab.
  • 53. Advanced Steps: Github Forking projects All scripts are backed up online Facilitates collaboration and working on different computers
  • 54. Take home messages Manage projects reproducibly: The first researcher who will need to reproduce the results is likely to be YOU. Time invested in learning to code pays off - do it. Supervisors should be patient and encourage students to code.
  • 55. Online Resources RStudio: Idiot-proof guides and cheat sheets http://www.rstudio.com/ Nice R Code: How-tos and advice on good coding practice http://nicercode.github.io/guide.html Ten Simple Rules for Reproducible Computational Research http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003285 Yihui Xie’s blog (knitr) http://yihui.name/en/categories/ R Bloggers: http://www.r-bloggers.com/ StackOverflow questions on R and knitr http://stackoverflow.com/questions/tagged/r+knitr