Introduction to 
computational thinking
Module 11 : File Management
Asst Prof Michael Lees
Office: N4‐02c‐76
email: mhlees[at]ntu.edu.sg
Module 11 : File Management 1 of 53
Contents
• File basics
• File interaction
– writing, reading
• File facts
• Other file operations
• OS module
Chapter 5 & 14
FILE BASICS
What is a File?
• A file is a collection of data that is 
stored on secondary storage like 
a disk or a thumb drive.
• Accessing a file means 
establishing a connection 
between the file and the program
and moving data between the 
two. 
Two types of file
• Files come in two general types:
– Text files: files where control characters such as 
"\n" are translated. These are generally human 
readable.
– Binary files: all the information is taken directly 
without translation. These are generally not human 
readable.
Binary vs. plain text
• Plain text 
+ human readable, useful for certain file types.
‐ inefficient storage (each character requires one or more 
bytes): 256 combinations for 8‐bit ASCII = 8 bits = 1 byte. 
Unicode could need up to 32 bits = 4 bytes
• Binary
+ More efficient storage, custom format
‐ Not human readable
Example
• Storing all the ages of the class (assume 500 students)
ASCII (text): '20', '18', '21', '19', ...
How many bytes per entry?  ASCII = 2 bytes
2 x 500 = 1000 bytes
Binary: 20, 18, 21, 19, ...
How many bytes per entry?  Binary = 1 byte
1 byte holds 0‐255 (enough for an age)
1 x 500 = 500 bytes
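The arithmetic above can be checked with a short sketch. The list of 500 ages is made up for illustration (the four slide values repeated), and the names as_text and as_binary are not from the slides.

```python
# Hypothetical data: 500 ages built by repeating the four values on the slide.
ages = [20, 18, 21, 19] * 125

# Text storage: each two-digit age becomes two ASCII characters.
as_text = "".join(str(age) for age in ages).encode("ascii")

# Binary storage: each age (0-255) fits in a single byte.
as_binary = bytes(ages)

print(len(as_text))    # 1000
print(len(as_binary))  # 500
```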
File Objects or Stream
• When opening a file, you create a file object 
or file stream that is a connection between 
the file information on disk and the program.
• The stream contains a “buffer” of the 
information from the file, and provides the 
information to the program
Buffering
• Reading from a disk is very slow. Thus the 
computer will read a lot of data from a file in 
the hope that, if you need the data in the 
future, it will be “buffered” in the file object.
• This means that the file object contains a copy 
of information from the file called a cache
(pronounced “cash”).
[Figure: Program <-> File Buffer (Cache) <-> Disk, with read and write arrows between each pair]
Creating a file object
myFile = open("myFile.txt", "r")
• myFile is the file object. 
• It contains the buffer of information. 
• The open function creates the connection 
between the disk file and the file object. 
• The first quoted string is the file name on disk, 
the second is the mode to open it (here,“r” 
means to read).
File location
• When opened, the name of the file can come 
in one of two forms:
– “file.txt” assumes the file name is file.txt, and it is 
located in the current program directory.
– “c:\bill\file.txt” is the fully qualified file name and 
includes the directory information.
– ‘/Users/michaellees/python/CZCE1003’ is a fully qualified 
path in the OS X / Linux style.
File modes
Mode   Description
‘r’    read a text file
‘w’    write a text file (wipes existing contents)
‘a’    append to an existing file
‘b’    binary file
‘+’    both read and write
• Be careful if you open a file with the ‘w’ mode. It sets an 
existing file’s contents to be empty, destroying any existing 
data.
• The ‘a’ mode is nicer, allowing you to write to the end of an 
existing file without changing the existing contents.
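A minimal sketch of the difference between ‘w’ and ‘a’. The scratch file notes.txt and the variable names are assumptions made for this example; it runs in a temporary directory so no real data is touched.

```python
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

f = open(path, "w")      # creates the file (or empties an existing one)
f.write("first\n")
f.close()

f = open(path, "a")      # append: existing contents are kept
f.write("second\n")
f.close()

f = open(path, "r")
after_append = f.read()  # "first\nsecond\n"
f.close()

f = open(path, "w")      # reopening with "w" wipes everything
f.close()

f = open(path, "r")
after_wipe = f.read()    # ""
f.close()
```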
FILE INTERACTION
Everything is a string
• If you are interacting with plain text files 
(which is all we will do for this semester), 
remember that everything is a string:
–everything read is a string
–if you write to a file, you can only write a 
string
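In practice this means converting with str() before writing and with int() (or float()) after reading. A small sketch, with a made-up file name age.txt:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "age.txt")  # hypothetical file name

age = 20
f = open(path, "w")
f.write(str(age))      # write() needs a string, so convert the integer first
f.close()

f = open(path, "r")
text = f.read()        # comes back as the string "20"
f.close()

restored = int(text)   # convert back before doing arithmetic
print(restored + 1)    # 21
```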
File contents
• Once you have a file object:
• fileObject.read()
– Reads the entire contents of the file as a string 
and returns it. It can take an optional integer 
argument to limit the read to N characters (N bytes 
in binary mode), that is fileObject.read(N).
• fileObject.readline()
– Delivers the next line as a string.
More contents
• fileObject.readlines()
–Returns a single list of all the lines from the 
file.
• for line in fileObject:
–Iterator to go through the lines of a file.
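The four ways of reading can be compared on one small file. This is a sketch only; the file lines.txt and its three lines are invented for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # hypothetical file
f = open(path, "w")
f.write("one\ntwo\nthree\n")
f.close()

f = open(path, "r")
first_four = f.read(4)      # "one\n": at most 4 characters
next_line = f.readline()    # "two\n": the next full line
rest = f.readlines()        # ["three\n"]: whatever lines remain
f.close()

f = open(path, "r")
stripped = [line.rstrip("\n") for line in f]  # iterate line by line
f.close()
print(stripped)             # ['one', 'two', 'three']
```

Note that each line keeps its trailing newline unless you strip it.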
Close the door behind you
• When done, you close the file. Closing is 
important because the information in the 
fileObject buffer is “flushed” out of the buffer 
and into the file on disk, making sure that no 
information is lost.
fileObject.close()
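A short sketch of closing a file, first by hand and then with a with block, which is not covered on the slide but is a standard Python feature that calls close() for you. The file name log.txt is an assumption.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # hypothetical file

f = open(path, "w")
f.write("saved on close\n")
f.close()                 # flushes the buffer to disk
print(f.closed)           # True

# A with block calls close() automatically when the block ends.
with open(path, "r") as g:
    data = g.read()
print(g.closed)           # True
```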
Python elegance
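• Python is often praised for its simplicity and 
elegance.
for line in open("fileToRead.txt"):
    print(line)
• File is automatically opened (by open( ); the file( ) 
built-in exists only in Python 2).
• File is closed automatically once the file object is no 
longer in use after the for loop (promptly in CPython, but 
not guaranteed by every Python implementation).
• Defaults are read and text.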
Writing
• Once opened, you can write to a file (if 
the mode is appropriate):
• fileObject.write(s)
–writes the string s to the file
• fileObject.writelines(list)
–write a list of strings (one at a time) to the 
file
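A sketch of both calls, assuming a made-up file out.txt. One detail worth seeing in code: writelines() does not add newlines, so they are appended explicitly here.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.txt")  # hypothetical file
lines = ["alpha", "beta", "gamma"]

f = open(path, "w")
f.write("header\n")                          # one string
f.writelines(line + "\n" for line in lines)  # many strings, newlines added by us
f.close()

f = open(path, "r")
written = f.read()
f.close()
print(written)
```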
Errors?
• What if the file doesn’t exist?
• Your program should behave gracefully if the 
file can’t be opened.
• When writing software, treat others as you 
would like to be treated.
• In later chapters we will describe exceptions, 
but for now we will just assume that you can 
get the file.
Challenge 11.1 File Copy
Write a program to copy the contents of a file but removing all vowels
Thought process
• Open input and output file (‘r’, ‘w’)
• Process each line of the input file.
• Take each line and replace any vowel with 
empty string “”
• Write new string to output file
• Close both files!!
Copy without vowels
vowels = ['a', 'e', 'i', 'o', 'u', 'A', 'E', 'I', 'O', 'U']
# File reading and writing
inFile = open("input.txt", "r")
outFile = open("output.txt", "w")
for line in inFile:
    for letter in line:
        if letter in vowels:
            line = line.replace(letter, '')
    outFile.write(line)  # written to the output file
inFile.close()
outFile.close()
FILE FACTS
Newline character
• Each operating system (Windows, OS X, Linux) 
developed certain standards for representing 
text.
• In particular, they chose different ways to 
represent the end of a file, the end of a line, 
etc.
• This can confuse our text readers!
Universal new line
• To get around this, Python provides a special 
file option to deal with variations of OS text 
encoding.
• In Python 2, the ‘U’ option means that Python deals with 
the problem so you don’t have to! In Python 3, text mode 
does this by default, and ‘U’ was removed in Python 3.11.
fileObj = open('myFile.txt', 'r')
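A sketch of newline translation as it behaves in Python 3, where text mode converts all three line-ending styles to "\n" by default. The file mixed.txt and its contents are invented; the file is written in binary mode so the raw line endings are stored exactly as given.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mixed.txt")  # hypothetical file

# Write raw bytes with three different line endings.
f = open(path, "wb")
f.write(b"unix\nwindows\r\nold mac\r")
f.close()

# Reading in text mode translates every ending to "\n".
f = open(path, "r")
lines = f.readlines()
f.close()
print(lines)  # ['unix\n', 'windows\n', 'old mac\n']
```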
Current file position
• Every file maintains a “current file position.” 
• It is the current position in the file and 
indicates what the file will read next.
• Its starting value depends on the mode (see the mode table 
above): the beginning of the file for ‘r’ and ‘w’, the end for ‘a’.
File buffer
• When the disk file is opened, the contents of 
the file are copied into the buffer of the file 
object.
• Think of the file object as a very big list where 
every index is one of the pieces of information 
of the file.
• The current position is the present index in 
that list.
OTHER FILE OPERATIONS
tell()
• The tell() method tells you the current file 
position.
• The positions are in bytes (think characters for 
ASCII) from the beginning of the file:
fileObject.tell() => 42
seek()
• The seek() method updates the current file 
position to where you like (in bytes offset from 
the beginning of the file):
– fd.seek(0) # to the beginning of the file
– fd.seek(100) # 100 bytes from beginning
• Counting bytes is a pain. 
• Seek has an optional second argument:
– 0: count from the beginning
– 1: count from the current file position
– 2: count from the end (backwards)
e.g., fd.seek(-100, 2) # 100 bytes from end of file
• In Python 3, a nonzero offset from the current position or 
from the end only works on files opened in binary mode (‘b’).
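A sketch of tell() and seek() on a ten-byte file. The file digits.bin is made up, and it is opened in binary mode because Python 3 text files reject nonzero seeks relative to the current position or the end.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "digits.bin")  # hypothetical file
f = open(path, "wb")
f.write(b"0123456789")
f.close()

f = open(path, "rb")
start = f.tell()         # 0: a file opened for reading starts at the beginning
f.read(3)
after_read = f.tell()    # 3: reading moved the position forward
f.seek(2, 1)             # 2 bytes on from the current position
from_current = f.tell()  # 5
f.seek(-4, 2)            # 4 bytes back from the end
tail = f.read()          # b"6789"
f.close()
```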
Reading forward
• Every read/readline/readlines moves the 
current pos forward.
• When you hit the end, every read will just 
yield “” since you are at the end.
• You need to seek to the beginning to start 
again (or close and open, seek is easier).
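A sketch of hitting the end of a file and rewinding, using a made-up file abc.txt:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "abc.txt")  # hypothetical file
f = open(path, "w")
f.write("abc\n")
f.close()

f = open(path, "r")
first = f.read()    # "abc\n": the whole file
second = f.read()   # "": already at the end
f.seek(0)           # back to the beginning
third = f.read()    # "abc\n" again
f.close()
```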
The power of pickle
• Everything is a string. So how about things 
that aren’t?
• Python provides a standard module called 
pickle. This is an amazing module that can 
take almost any Python object and transfer 
it to a file (converting it to a stream of bytes in the 
process); this process is called pickling. 
(import pickle – it’s a module)
pickle.dump(x, f): pickle object x to file f
x = pickle.load(f): unpickle object x from file f
Remember: str(2) => '2'
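A round-trip sketch with pickle. The dictionary and the file name record.pkl are invented for illustration; in Python 3 the file must be opened in binary mode because pickle reads and writes bytes.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "record.pkl")  # hypothetical file
record = {"name": "Bill", "ages": [20, 18, 21, 19]}     # made-up object

f = open(path, "wb")       # binary mode: pickle writes bytes
pickle.dump(record, f)
f.close()

f = open(path, "rb")
restored = pickle.load(f)  # rebuilds an equal object, types included
f.close()

print(restored == record)  # True
```

Only unpickle files you trust, since loading a pickle can run code.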
Challenge 11.2 MP3 ID3 tag
Take an mp3 file and output the song name and artist name from the ID3 
tag
Thought process
• MP3 is a binary file (can see that if you load in 
text editor)
• Generally file headers (like ID3) are at specific 
locations.
• Use the internet to find this location (Wiki)
• Open file, seek to correct bytes and then print 
out.
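One possible sketch of these steps, assuming the older ID3v1 layout: the tag is the last 128 bytes of the file, starts with the marker "TAG", and is followed by a 30-byte title and a 30-byte artist. ID3v2 tags sit at the start of the file in a different format and are not handled here. The function name read_id3v1 and the fake file are made up for this example.

```python
import os
import tempfile

def read_id3v1(path):
    """Return (title, artist) from an ID3v1 tag, or None if there is no tag.

    Assumes the file is at least 128 bytes long.
    """
    f = open(path, "rb")
    f.seek(-128, 2)          # the tag occupies the last 128 bytes
    tag = f.read(128)
    f.close()
    if tag[:3] != b"TAG":    # ID3v1 tags start with this marker
        return None
    title = tag[3:33].decode("latin-1").rstrip("\x00 ")
    artist = tag[33:63].decode("latin-1").rstrip("\x00 ")
    return title, artist

# Build a fake file: made-up "audio" bytes followed by a 128-byte tag.
fake_tag = (b"TAG"
            + b"My Song".ljust(30, b"\x00")
            + b"Some Artist".ljust(30, b"\x00")
            + b"\x00" * 65)  # album, year, comment and genre left empty
path = os.path.join(tempfile.mkdtemp(), "fake.mp3")
f = open(path, "wb")
f.write(b"\xff" * 500 + fake_tag)
f.close()

print(read_id3v1(path))      # ('My Song', 'Some Artist')
```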
OS MODULE
What is the OS module
• The os module in Python is an interface 
between the operating system and the Python 
language.
• As such, it has many sub‐functionalities 
dealing with various aspects.
• We will look mostly at the file‐related stuff.
import os # to use os
What is a directory/folder
• Whether in Windows, Linux or on OS X, all 
OS’s maintain a directory structure.
• A directory is a container of files and other 
directories.
• These directories are arranged in a hierarchy 
or tree.
Different path styles
• It turns out that each OS has its own way of 
specifying a path:
– C:\bill\python\myFile.py
– /Users/bill/python/myFile.py
• Nicely, Python knows that and translates to 
the appropriate OS.
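One way to stay independent of the path style is to build paths with os.path.join, which uses the separator of the OS the program runs on. A sketch, with the path parts borrowed from the example above:

```python
import os

# Joined with "/" on Linux and OS X, with "\" on Windows.
path = os.path.join("Users", "bill", "python", "myFile.py")
print(path)

# Split the path back into its directory part and file name.
folder, name = os.path.split(path)
print(name)  # myFile.py
```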
Some OS methods
• os.getcwd(): Returns the full path of the 
current working directory.
• os.chdir(pathString): Change the current 
directory to the path provided.
• os.listdir(pathString): Return a list of the files 
and directories in the path (not including the special 
entries ‘.’ and ‘..’).
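A sketch of these three functions, run against a temporary directory so nothing real is changed. The file a.txt is made up.

```python
import os
import tempfile

original = os.getcwd()            # remember where we started

folder = tempfile.mkdtemp()       # hypothetical working directory
f = open(os.path.join(folder, "a.txt"), "w")
f.close()

os.chdir(folder)                  # relative paths now resolve inside folder
names = os.listdir(".")           # ['a.txt']
os.chdir(original)                # go back to the starting directory
print(names)
```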
More OS methods
• os.rename(sourcePathStr, destPathStr): 
Renames a file or directory.
• os.mkdir(pathStr): make a new directory. So 
os.mkdir(‘/Users/bill/python/new’) creates 
the directory new under the directory python.
• os.remove(pathStr). Removes the file.
• os.rmdir(pathStr). Removes the directory, but 
the directory must be empty.
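The four calls above can be tried in sequence. This is a sketch in a temporary directory; the names new, draft.txt and final.txt are assumptions for the example.

```python
import os
import tempfile

base = tempfile.mkdtemp()                 # hypothetical parent directory
new_dir = os.path.join(base, "new")
os.mkdir(new_dir)                         # create the directory "new"

old_name = os.path.join(new_dir, "draft.txt")
f = open(old_name, "w")
f.close()

new_name = os.path.join(new_dir, "final.txt")
os.rename(old_name, new_name)             # rename the file
listing = os.listdir(new_dir)             # ['final.txt']

os.remove(new_name)                       # delete the file first...
os.rmdir(new_dir)                         # ...so the now-empty directory can go
still_there = os.path.exists(new_dir)     # False
```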
Take home lessons
• Files are important! (Obviously)
• Binary vs. Plain text
• Buffers – why and how.
• Reading/Writing files (binary & plain text)
• Graceful degradation (more in next module)
Further reading
• http://docs.python.org/tutorial/inputoutput.html
• http://diveintopython3.org/files.html