Pickling & CSV
Preservation through Serialization and Tabulation
Pickle
Module for (de)serialization: Storing complete Python objects into files and later
loading them back.
● Supports almost all data types – good.
● Works only with Python – bad.
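import pickle
# obj is any picklable object; the name avoids shadowing the built-in "object"
pickle.dump(obj, open_binary_file) # Save obj to a file opened in binary write mode ("wb")
obj = pickle.load(open_binary_file) # Restore an object from a file opened in binary read mode ("rb")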
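A minimal round-trip sketch of the two calls above. The `scores` dictionary and the file name `scores.pickle` are hypothetical, chosen only for illustration; the file lives in a temporary directory so nothing is left behind.

```python
import os
import pickle
import tempfile

# Hypothetical object to preserve; any picklable Python object would do
scores = {"alice": [90, 85], "bob": [72, 88]}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "scores.pickle")

    # Pickle files are binary: open with "wb" to save...
    with open(path, "wb") as outfile:
        pickle.dump(scores, outfile)

    # ...and with "rb" to restore
    with open(path, "rb") as infile:
        restored = pickle.load(infile)

# The restored copy has equal content but is a separate object
print(restored == scores)
print(restored is scores)
```

Opening the file in text mode instead of binary mode is a common mistake and makes `pickle.dump` fail.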
What Is CSV?
● “Comma Separated Values”
● Tabular file with rows and columns.
● All rows have the same number of fields.
● Fields separated by commas.
○ “Commas” do not have to be commas. Any other character can be used, such as TAB (TSV,
“tab separated values”), vertical bar, space...
● The first row often serves as headers.
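As a small illustration of the delimiter point above, the sketch below parses tab-separated text with the standard `csv` module. The sample data is hypothetical and is held in memory with `io.StringIO` instead of a real file.

```python
import csv
import io

# Hypothetical tab-separated data standing in for a .tsv file
tsv_text = "name\tlevel\nAda\tSenior\nLin\tJunior\n"

# The only change from comma-separated input is the delimiter argument
reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
headers = next(reader)  # The first row serves as headers
rows = list(reader)     # The remaining rows, each a list of strings

print(headers)
print(rows)
```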
CSV Example
Student, ID, E-mail Address, Phone Number, Class, Academic Level
"Almarar, Hassan A", 16897**, halmarar2@suffolk.edu, , Junior, UG
"Arakelyan, Artur", 17577**, aarakelyan@suffolk.edu, , Sophomore, UG
"Batista, Christopher A", 16357**, cbatista@suffolk.edu, , Senior, UG
Complete file...
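The names in the example contain commas, which is why they are quoted. The sketch below shows the difference between a CSV-aware parser and a plain `split(",")`. The row is adapted from the example above as an assumption for illustration: spaces after commas are removed and the phone number field is left empty.

```python
import csv
import io

# One data row adapted from the example (held in memory for illustration)
line = '"Almarar, Hassan A",16897**,halmarar2@suffolk.edu,,Junior,UG\n'

# csv.reader honors the quotes, so the name stays in one field
row = next(csv.reader(io.StringIO(line)))
print(len(row), row[0])

# A naive split breaks the quoted name into two pieces
naive = line.strip().split(",")
print(len(naive), naive[0])
```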
Reading CSV
import csv
with open("path-words.csv", newline="") as csvfileIn:
    reader = csv.reader(csvfileIn, delimiter=',', quotechar='"')
    # next() returns the next row, parsed as a list of strings; here, the header row
    headers = next(reader)
    # Process the rest of the file
    for row in reader:
        do_something(row)
    # Or, instead of the loop: reader is an iterator, so list() collects the remaining rows
    all_rows = list(reader)
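When the first row holds headers, `csv.DictReader` from the same module can replace the manual `next(reader)` step and give access to fields by name. This is a sketch on hypothetical in-memory data, not the contents of `path-words.csv`.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for a file on disk
csv_text = "word,length\npath,4\nwords,5\n"

# DictReader consumes the header row itself and yields one dict per data row
reader = csv.DictReader(io.StringIO(csv_text))

# Every value is read as a string; convert explicitly when a number is needed
lengths = {row["word"]: int(row["length"]) for row in reader}
print(lengths)
```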
Writing CSV
import csv
with open("path-words.csv", "w", newline="") as csvfileOut:
    writer = csv.writer(csvfileOut, delimiter=',', quotechar='"')
    writer.writerow([..., ..., ...]) # Write headers
    # Write the rest of the file; each row is a list of strings or numbers
    writer.writerows([row1, row2, row3 ...])
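A runnable sketch of the writing pattern above, using an in-memory buffer in place of a file. The two names come from the earlier example; the two-column layout is an assumption made for brevity. It also shows that the writer adds quotes on its own when a field contains the delimiter.

```python
import csv
import io

rows = [
    ["Batista, Christopher A", "Senior"],
    ["Arakelyan, Artur", "Sophomore"],
]

buffer = io.StringIO()  # In-memory stand-in for a file opened with newline=""
writer = csv.writer(buffer)
writer.writerow(["Student", "Class"])  # Headers
writer.writerows(rows)                 # The rest of the file

text = buffer.getvalue()
print(text)

# Reading the text back recovers the original rows
buffer.seek(0)
reader = csv.reader(buffer)
next(reader)  # Skip headers
round_trip = list(reader)
print(round_trip == rows)
```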
Example: Who Are the Students? (students.py)
import csv, collections
with open("class-2017.csv", newline="") as mystudents:
    reader = csv.reader(mystudents)
    headers = next(reader)
    class_position = headers.index("Class") # Where is the Class column?
    class_levels = [row[class_position] for row in reader]
who_s_who = collections.Counter(class_levels) # Summary
with open("class-summary.csv", "w", newline="") as levels:
    writer = csv.writer(levels)
    writer.writerow(['Class', 'count']) # New headers
    writer.writerows(who_s_who.items()) # New content
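A possible variation on the summary step: `Counter.most_common()` returns the pairs sorted by descending count, which may be preferable to the unsorted `items()`. The list of class levels below is hypothetical, not taken from `class-2017.csv`, and the output goes to an in-memory buffer.

```python
import collections
import csv
import io

# Hypothetical class levels, as they might come out of the "Class" column
class_levels = ["Junior", "Senior", "Junior", "Sophomore", "Junior", "Senior"]

who_s_who = collections.Counter(class_levels)
summary = who_s_who.most_common()  # (level, count) pairs, largest count first

buffer = io.StringIO()  # Stand-in for class-summary.csv
writer = csv.writer(buffer)
writer.writerow(["Class", "count"])
writer.writerows(summary)

print(buffer.getvalue())
```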
Whenever possible, use Pandas
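For comparison, a rough Pandas sketch of the same kind of per-class count. It assumes the third-party `pandas` package is installed, and it uses a shortened, hypothetical two-column version of the student table held in memory.

```python
import io

import pandas as pd

# Shortened, hypothetical version of the student table
csv_text = (
    "Student,Class\n"
    '"Almarar, Hassan A",Junior\n'
    '"Arakelyan, Artur",Sophomore\n'
    '"Batista, Christopher A",Senior\n'
)

# read_csv handles the header row and the quoted names in one call
df = pd.read_csv(io.StringIO(csv_text))

# Number of students per class level
counts = df["Class"].value_counts()
print(counts)
```

The result is a Series indexed by class level; its `to_csv` method can write the summary back out.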