Advanced Data Analytics:
 Moving Data Around

         Jeffrey Stanton
  School of Information Studies
      Syracuse University
R and the File System
• R maintains a current working directory to simplify the
  process of reading and saving files

getwd() # shows the pathname of current folder
setwd("pathname") # Sets a new path
history() # shows most recent commands

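# Creates a CSV file using data from a dataframe
# (row.names=FALSE keeps R's row labels out of the file)
write.table(dataFr, sep=",", file="filename.csv", row.names=FALSE)

# Reads a CSV file into a dataframe, using the first line as column names
targetFrame = read.table("filename.csv", sep=",", header=TRUE)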
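
The round trip above can be sketched end to end. This is a minimal illustration, not the course's own script: the contents of `dataFr` and the name `csvPath` are hypothetical, and the file goes into R's temporary directory so the working directory is left alone. It uses `write.csv()` and `read.csv()`, which are wrappers around `write.table()` and `read.table()` with comma-separated defaults.

```r
# Hypothetical example data; any data frame would do
dataFr <- data.frame(Subject = 1:6,
                     Code = c(1L, 0L, 1L, 0L, 1L, 0L))

# Build a path in the session's temporary directory instead of
# changing the working directory with setwd()
csvPath <- file.path(tempdir(), "filename.csv")

# row.names = FALSE keeps R's row labels out of the file
write.csv(dataFr, file = csvPath, row.names = FALSE)

# read.csv() assumes a header row and comma separators
targetFrame <- read.csv(csvPath)

print(targetFrame)
```

Writing without row names also keeps the header aligned with the data columns when the file is opened in Excel.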

R and the Windows Clipboard
• For small chunks of data, it may be
  convenient to “cut and paste”
• Create a small rectangle of data in
  Excel and copy it to the clipboard
• Then, in R:
    > read.DIF("clipboard",transpose=TRUE)
      V1 V2
    1  1  1
    2  2  0
    3  3  1
    4  4  0
    5  5  1
    6  6  0




Include Variable Names
• You can pull in the variable names (the
  column headings) as well
• Then, in R:
   > read.DIF("clipboard",transpose=TRUE,header=TRUE)
     Subject Code
   1       1    1
   2       2    0
   3       3    1
   4       4    0
   5       5    1
   6       6    0




Best Option: Put Clipboard into Dataframe
 > newDF =
    read.DIF("clipboard",transpose=TRUE,header=TRUE)
 > newDF
   Subject Code
 1       1    1
 2       2    0
 3       3    1
 4       4    0
 5       5    1
 6       6    0
 > class(newDF)
 [1] "data.frame"
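
Reading from "clipboard" is specific to Windows, so the following is a hedged sketch of the same idea that runs on any platform. It assumes the copied cells arrive as plain tab-separated text, and it parses a hypothetical sample string (`clipText`) with `read.table()` rather than reading a live clipboard with `read.DIF()`.

```r
# Hypothetical sample: tab-separated text, the form in which Excel
# typically places copied cells on the clipboard
clipText <- "Subject\tCode\n1\t1\n2\t0\n3\t1\n4\t0\n5\t1\n6\t0"

# text= lets read.table() parse a string, so this runs on any platform
newDF <- read.table(text = clipText, sep = "\t", header = TRUE)

# On Windows, the live clipboard could be read the same way (assumption:
# the copied cells are plain tab-separated values):
# newDF <- read.table("clipboard", sep = "\t", header = TRUE)

str(newDF)     # column names and types
class(newDF)   # confirms the result is a data frame
```

Checking the result with `str()` right after import is a quick way to spot columns that arrived as the wrong type.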




An Explanation of Data Frames
• The basic building block of data in R is the “vector”: an ordered sequence of
  “scalar” values, all of the same mode
    – Scalar just means a single element or value, like the number 5
    – R vectors can have any number of elements, including just one;
      R stores a scalar as a vector of length one
    – The mode of a vector can be numeric, character, or logical, among others
• Like a sheet in Excel or a data set in SPSS, a vector in R can be
  two dimensional, with a certain number of rows and a certain number of
  columns; a two dimensional vector is called a matrix
• But, being a vector, a matrix must contain elements all of the same
  mode, so a matrix cannot always hold a typical spreadsheet or data set,
  because these often have a different type in each column
• This is where the data frame comes in: A data frame is a list of vectors,
  all of the same length, each of which can be a different type
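
The distinctions above can be checked at the R console. The following is a small sketch with made-up values; all variable names (`nums`, `mixed`, `m`, `df`) are hypothetical.

```r
# A vector holds values of a single mode; mixing modes coerces them
nums  <- c(1, 2, 3)
mixed <- c(1, "a", TRUE)   # every element becomes character
mode(nums)
mode(mixed)

# A single value is just a vector of length one
length(5)

# A matrix is a vector with two dimensions, so it has one mode too
m <- matrix(1:6, nrow = 3, ncol = 2)
dim(m)

# A data frame is a list of equal-length vectors that may differ in type
df <- data.frame(Subject = 1:3,
                 Name = c("Ann", "Bob", "Cy"),
                 Passed = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)
sapply(df, class)   # one class per column
is.list(df)
```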

read.DIF also works with files
> setwd("C:/DataMining/DataFiles")
> newDF =
   read.DIF("excelExport.dif",
   transpose=TRUE,header=TRUE)
> class(newDF)
[1] "data.frame"
> attach(newDF)

#   Note that Excel, DIF, and R
#   don’t always agree on data
#   formats. For example, cells with
#   currency formatting in Excel may
#   not arrive as numeric values in R,
#   so remove as much formatting as
#   possible before exporting.
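
When formatting cannot be removed before export, the values can often be cleaned up after import. The following is a minimal sketch, assuming a column arrived as character strings with dollar signs and thousands separators; the `price` values are hypothetical.

```r
# Hypothetical column as it might arrive from a currency-formatted
# Excel export: character strings rather than numbers
price <- c("$1,200.50", "$35.00", "$7")

# Strip dollar signs and thousands separators, then convert to numeric
priceNum <- as.numeric(gsub("[$,]", "", price))

priceNum
is.numeric(priceNum)
```

The pattern `[$,]` only covers the two characters named here; other currencies or locale-specific separators would need a different pattern.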


Demonstrating Mastery
• Create or find data in an Excel spreadsheet and export as a
  CSV file
• Import data into R from a CSV or TXT file
• Export a data frame into a CSV file
• Read the CSV file into Excel
• Advanced: Use data interchange format (“DIF”) to
  exchange files between R and Excel
• Advanced: Use a data frame in R to store data obtained from
  a spreadsheet




