SlideShare una empresa de Scribd logo
1 de 29
Descargar para leer sin conexión
Functions and modules
       Karin Lagesen

  karin.lagesen@bio.uio.no
Homework:
            TranslateProtein.py
●   Input files are in
    /projects/temporary/cees-python-course/Karin
      ●   translationtable.txt - tab separated
      ●   dna31.fsa
●   Script should:
      ●   Open the translationtable.txt file and read it into a
          dictionary
      ●   Open the dna31.fsa file and read the contents.
      ●   Translates the DNA into protein using the dictionary
      ●   Prints the translation in a fasta format to the file
          TranslateProtein.fsa. Each protein line should be 60
          characters long.
Modularization
●   Programs can get big
●   Risk of doing the same thing many times
●   Functions and modules encourage
     ●   re-usability
     ●   readability
     ●   helps with maintenance
Functions
●   Most common way to modularize a
    program
●   Takes values as parameters, executes
    code on them, returns results
●   Functions also found builtin to Python:
     ●   open(filename, mode)
     ●   sum([list of numbers]
●   These do something on their parameters,
    and returns the results
Functions – how to define
    def FunctionName(param1, param2, ...):

          """ Optional Function desc (Docstring) """

          FUNCTION CODE ...

          return DATA



●
    keyword: def – says this is a function
●   functions need names
●   parameters are optional, but common
●   docstring useful, but not mandatory
●   FUNCTION CODE does something
●   keyword return results: return
Function example
     >>> def hello(name):
     ... results = "Hello World to " + name + "!"
     ... return results
     ...
     >>> hello()
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     TypeError: hello() takes exactly 1 argument (0 given)
     >>> hello("Lex")
     'Hello World to Lex!'
     >>>



●   Task: make script from this – take name
    from command line
●   Print results to screen
Function example
                    script
import sys

def hello(name):
  results = "Hello World to " + name + "!"
  return results

name = sys.argv[1]
functionresult = hello(name)
print functionresult




[karinlag@freebee]% python hello.py
Traceback (most recent call last):
  File "hello.py", line 8, in ?
   name = sys.argv[1]
IndexError: list index out of range
[karinlag@freebee]% python hello.py Lex
Hello World to Lex!
[karinlag@freebee]%
Returning values
●   Returning is not mandatory, if no return,
    None is returned by default
●   Can return more than one value - results
    will be shown as a tuple

               >>> def test(x, y):
               ... a = x*y
               ... return x, a
               ...
               >>> test(1,2)
               (1, 2)
               >>>
Function scope
●   Variables defined inside a function can
    only be seen there!
●   Access the value of variables defined
    inside of function: return variable
Scope example
>>> def test(x):
... z = 10
... print "the value of z is " + str(z)
... return x*2
...
>>> z = 50
>>> test(3)
the value of z is 10
6
>>> z
50
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>>
Parameters
●   Functions can take parameters – not
    mandatory
●   Parameters follow the order in which they
    are given
    >>> def test(x, y):
    ... print x*2
    ... print y + str(x)
    ...
    >>> test(2, "y")
    4
    y2
    >>> test("y", 2)
    yy
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in test
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>>
Named parameters
●   Can use named parameters
    >>> def test(x, y):
    ... print x*2
    ... print y + str(x)
    ...
    >>> test(2, "y")
    4
    y2
    >>> test("y", 2)
    yy
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in test
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>> test(y="y", x=2)
    4
    y2
    >>>
Default parameters
●   Parameters can be given a default value
●   With default, parameter does not have to
    be specified, default will be used
●   Can still name parameter in parameter list
>>> def hello(name = "Everybody"):
... results = "Hello World to " + name + "!"
... return results
...
>>> hello("Anna")
'Hello World to Anna!'
>>> hello()
'Hello World to Everybody!'
>>> hello(name = "Annette")
'Hello World to Annette!'
>>>
Exercise
TranslateProteinFunctions.py
●   Use script from homework
●   Create the following functions:
     ●   get_translation_table(filename)
           –   return dict with codons and protein codes
     ●   read_dna_string(filename)
           –   return tuple with (descr, DNA_string)
     ●   translate_protein(dictionary, DNA_string)
           –   return the protein version of the DNA string
     ●   pretty_print(descr, protein_string, outname)
           –   write result to outname in fasta format
TranslateProteinFunctions.py
import sys



YOUR CODE GOES HERE!!!!

translationtable = sys.argv[1]
fastafile = sys.argv[2]
outfile      = sys.argv[3]

translation_dict = get_translation_table(translationtable)
description, DNA_string = read_dna_string(fastafile)
protein_string = translate_protein(translation_dict, DNA_string)
pretty_print(description, protein_string, outfile)
get_translation_table
def get_translation_table(translationtable):
  fh = open('translationtable.txt' , 'r')
  trans_dict = {}
  for line in fh:
     codon = line.split()[0]
         aa = line.split()[1]
     trans_dict[codon] = aa
  fh.close()
  return trans_dict
read_dna_string

def read_dna_string(fastafile):
  fh = open(fastafile, "r")
  line = fh.readline()
  header_line = line[1:-1]

  seq = ""
  for line in fh:
     seq += line[:-1]
  fh.close()
  return (header_line, seq)
translate_protein

def translate_protein(translation_dict, DNA_string):
  aa_seq = ""

  for i in range(0, len(DNA_string)-3, 3):
         codon = DNA_string[i:i+3]
         one_letter = translation_dict[codon]
         aa_seq += one_letter
  return aa_seq
pretty_print

def pretty_print(description, protein_string, outfile):
  fh = open(outfile, "w")
  fh.write(">" + description + "n")

  for i in range(0, len(protein_string), 60):
     fh.write(protein_string[i:i+60] + "n")
  fh.close()
Modules
●   A module is a file with functions, constants
    and other code in it
●   Module name = filename without .py
●   Can be used inside another program
●   Needs to be import-ed into program
●   Lots of builtin modules: sys, os, os.path....
●   Can also create your own
Using module
●   One of two import statements:
         1: import modulename
         2: from module import function/constant
●   If method 1:
     ●   modulename.function(arguments)
●   If method 2:
     ●   function(arguments) – module name not
         needed
     ●   beware of function name collision
Operating system modules –
      os and os.path
●   Modules dealing with files and operating
    system interaction
●   Commonly used methods:
     ●   os.getcwd() - get working directory
     ●   os.chdir(path) – change working directory
     ●   os.listdir([dir = .]) - get a list of all files in this
         directory
     ●   os.mkdir(path) – create directory
     ●   os.path.join(dirname, dirname/filename...)
Your own modules
●   Three steps:
       1. Create file with functions in it. Module
       name is same as filename without .py
       2. In other script, do
         import modulename
       3. In other script, use function like this:
         modulename.functionname(args)
Separating module use and
             main use
●   Files containing python code can be:
     ●   script file
     ●   module file
●   Module functions can be used in scripts
●   But: modules can also be scripts
●   Question is – how do you know if the code
    is being executed in the module script or
    an external script?
Module use / main use
●   When a script is being run, within that
    script a variable called __name__ will be
    set to the string “__main__”
●   Can test on this string to see if this script is
    being run
●   Benefit: can define functions in script that
    can be used in module mode later
Module mode / main mode
import sys



<code as before>
                                                     When this script is being used,
translationtable = sys.argv[1]                       this will always run, no matter what!
fastafile = sys.argv[2]
outfile      = sys.argv[3]

translation_dict = get_translation_table(translationtable)
description, DNA_string = read_dna_string(fastafile)
protein_string = translate_protein(translation_dict, DNA_string)
pretty_print(description, protein_string, outfile)
Module use / main use
# this is a script
import sys
import TranslateProteinFunctions

description, DNA_string = read_dna_string(sys.argv[1])
print description


[karinlag@freebee]% python modtest.py dna31.fsa
Traceback (most recent call last):
  File "modtest.py", line 2, in ?
   import TranslateProteinFunctions
  File "TranslateProteinFunctions.py", line 44, in ?
   fastafile = sys.argv[2]
IndexError: list index out of range
[karinlag@freebee]Karin%
TranslateProteinFuctions.py
         with main
import sys



<code as before>
if __name__ == “__main__”:
      translationtable = sys.argv[1]
      fastafile = sys.argv[2]
      outfile      = sys.argv[3]

      translation_dict = get_translation_table(translationtable)
      description, DNA_string = read_dna_string(fastafile)
      protein_string = translate_protein(translation_dict, DNA_string)
      pretty_print(description, protein_string, outfile)
ConcatFasta.py
●   Create a script that has the following:
      ●   function get_fastafiles(dirname)
            –   gets all the files in the directory, checks if they are
                fasta files (end in .fsa), returns list of fasta files
            –   hint: you need os.path to create full relative file
                names
      ●   function concat_fastafiles(filelist, outfile)
            –   takes a list of fasta files, opens and reads each of
                them, writes them to outfile
      ●   if __name__ == “__main__”:
            –   do what needs to be done to run script
●   Remember imports!

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Structure in C
Structure in CStructure in C
Structure in C
 
Presentation on Function in C Programming
Presentation on Function in C ProgrammingPresentation on Function in C Programming
Presentation on Function in C Programming
 
Recursive Function
Recursive FunctionRecursive Function
Recursive Function
 
Python: Modules and Packages
Python: Modules and PackagesPython: Modules and Packages
Python: Modules and Packages
 
File handling in c
File handling in cFile handling in c
File handling in c
 
File and directories in python
File and directories in pythonFile and directories in python
File and directories in python
 
Array ppt
Array pptArray ppt
Array ppt
 
Dynamic memory allocation
Dynamic memory allocationDynamic memory allocation
Dynamic memory allocation
 
Python list
Python listPython list
Python list
 
Function overloading(c++)
Function overloading(c++)Function overloading(c++)
Function overloading(c++)
 
Parameter passing to_functions_in_c
Parameter passing to_functions_in_cParameter passing to_functions_in_c
Parameter passing to_functions_in_c
 
Data Structures in Python
Data Structures in PythonData Structures in Python
Data Structures in Python
 
Constructor and Types of Constructors
Constructor and Types of ConstructorsConstructor and Types of Constructors
Constructor and Types of Constructors
 
Python programming : Files
Python programming : FilesPython programming : Files
Python programming : Files
 
STACKS IN DATASTRUCTURE
STACKS IN DATASTRUCTURESTACKS IN DATASTRUCTURE
STACKS IN DATASTRUCTURE
 
Introduction to Python
Introduction to Python Introduction to Python
Introduction to Python
 
Unit 4 python -list methods
Unit 4   python -list methodsUnit 4   python -list methods
Unit 4 python -list methods
 
Data types in python
Data types in pythonData types in python
Data types in python
 
Command line arguments
Command line argumentsCommand line arguments
Command line arguments
 
Programming in c Arrays
Programming in c ArraysProgramming in c Arrays
Programming in c Arrays
 

Similar a Functions and modules in python

PYTHON -Chapter 2 - Functions, Exception, Modules and Files -MAULIK BOR...
PYTHON -Chapter 2 - Functions,   Exception, Modules  and    Files -MAULIK BOR...PYTHON -Chapter 2 - Functions,   Exception, Modules  and    Files -MAULIK BOR...
PYTHON -Chapter 2 - Functions, Exception, Modules and Files -MAULIK BOR...
Maulik Borsaniya
 
System Calls.pptxnsjsnssbhsbbebdbdbshshsbshsbbs
System Calls.pptxnsjsnssbhsbbebdbdbshshsbshsbbsSystem Calls.pptxnsjsnssbhsbbebdbdbshshsbshsbbs
System Calls.pptxnsjsnssbhsbbebdbdbshshsbshsbbs
ashukiller7
 

Similar a Functions and modules in python (20)

Functions2.pptx
Functions2.pptxFunctions2.pptx
Functions2.pptx
 
Functions and Modules.pptx
Functions and Modules.pptxFunctions and Modules.pptx
Functions and Modules.pptx
 
3. functions modules_programs (1)
3. functions modules_programs (1)3. functions modules_programs (1)
3. functions modules_programs (1)
 
PYTHON -Chapter 2 - Functions, Exception, Modules and Files -MAULIK BOR...
PYTHON -Chapter 2 - Functions,   Exception, Modules  and    Files -MAULIK BOR...PYTHON -Chapter 2 - Functions,   Exception, Modules  and    Files -MAULIK BOR...
PYTHON -Chapter 2 - Functions, Exception, Modules and Files -MAULIK BOR...
 
Porting to Python 3
Porting to Python 3Porting to Python 3
Porting to Python 3
 
Python_Functions_Unit1.pptx
Python_Functions_Unit1.pptxPython_Functions_Unit1.pptx
Python_Functions_Unit1.pptx
 
System Calls.pptxnsjsnssbhsbbebdbdbshshsbshsbbs
System Calls.pptxnsjsnssbhsbbebdbdbshshsbshsbbsSystem Calls.pptxnsjsnssbhsbbebdbdbshshsbshsbbs
System Calls.pptxnsjsnssbhsbbebdbdbshshsbshsbbs
 
Functions2.pdf
Functions2.pdfFunctions2.pdf
Functions2.pdf
 
functions modules and exceptions handlings.ppt
functions modules and exceptions handlings.pptfunctions modules and exceptions handlings.ppt
functions modules and exceptions handlings.ppt
 
Introduction to c ++ part -2
Introduction to c ++   part -2Introduction to c ++   part -2
Introduction to c ++ part -2
 
Python programming
Python programmingPython programming
Python programming
 
Functions2.pdf
Functions2.pdfFunctions2.pdf
Functions2.pdf
 
Advance python
Advance pythonAdvance python
Advance python
 
What's new in Python 3.11
What's new in Python 3.11What's new in Python 3.11
What's new in Python 3.11
 
Functions_21_22.pdf
Functions_21_22.pdfFunctions_21_22.pdf
Functions_21_22.pdf
 
Chapter Functions for grade 12 computer Science
Chapter Functions for grade 12 computer ScienceChapter Functions for grade 12 computer Science
Chapter Functions for grade 12 computer Science
 
Built in function
Built in functionBuilt in function
Built in function
 
Python basic
Python basicPython basic
Python basic
 
Functions.pdf
Functions.pdfFunctions.pdf
Functions.pdf
 
Functionscs12 ppt.pdf
Functionscs12 ppt.pdfFunctionscs12 ppt.pdf
Functionscs12 ppt.pdf
 

Último

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Último (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 

Functions and modules in python

  • 1. Functions and modules Karin Lagesen karin.lagesen@bio.uio.no
  • 2. Homework: TranslateProtein.py ● Input files are in /projects/temporary/cees-python-course/Karin ● translationtable.txt - tab separated ● dna31.fsa ● Script should: ● Open the translationtable.txt file and read it into a dictionary ● Open the dna31.fsa file and read the contents. ● Translates the DNA into protein using the dictionary ● Prints the translation in a fasta format to the file TranslateProtein.fsa. Each protein line should be 60 characters long.
  • 3. Modularization ● Programs can get big ● Risk of doing the same thing many times ● Functions and modules encourage ● re-usability ● readability ● helps with maintenance
  • 4. Functions ● Most common way to modularize a program ● Takes values as parameters, executes code on them, returns results ● Functions also found builtin to Python: ● open(filename, mode) ● sum([list of numbers] ● These do something on their parameters, and returns the results
  • 5. Functions – how to define def FunctionName(param1, param2, ...): """ Optional Function desc (Docstring) """ FUNCTION CODE ... return DATA ● keyword: def – says this is a function ● functions need names ● parameters are optional, but common ● docstring useful, but not mandatory ● FUNCTION CODE does something ● keyword return results: return
  • 6. Function example >>> def hello(name): ... results = "Hello World to " + name + "!" ... return results ... >>> hello() Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: hello() takes exactly 1 argument (0 given) >>> hello("Lex") 'Hello World to Lex!' >>> ● Task: make script from this – take name from command line ● Print results to screen
  • 7. Function example script import sys def hello(name): results = "Hello World to " + name + "!" return results name = sys.argv[1] functionresult = hello(name) print functionresult [karinlag@freebee]% python hello.py Traceback (most recent call last): File "hello.py", line 8, in ? name = sys.argv[1] IndexError: list index out of range [karinlag@freebee]% python hello.py Lex Hello World to Lex! [karinlag@freebee]%
  • 8. Returning values ● Returning is not mandatory, if no return, None is returned by default ● Can return more than one value - results will be shown as a tuple >>> def test(x, y): ... a = x*y ... return x, a ... >>> test(1,2) (1, 2) >>>
  • 9. Function scope ● Variables defined inside a function can only be seen there! ● Access the value of variables defined inside of function: return variable
  • 10. Scope example >>> def test(x): ... z = 10 ... print "the value of z is " + str(z) ... return x*2 ... >>> z = 50 >>> test(3) the value of z is 10 6 >>> z 50 >>> x Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'x' is not defined >>>
  • 11. Parameters ● Functions can take parameters – not mandatory ● Parameters follow the order in which they are given >>> def test(x, y): ... print x*2 ... print y + str(x) ... >>> test(2, "y") 4 y2 >>> test("y", 2) yy Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 3, in test TypeError: unsupported operand type(s) for +: 'int' and 'str' >>>
  • 12. Named parameters ● Can use named parameters >>> def test(x, y): ... print x*2 ... print y + str(x) ... >>> test(2, "y") 4 y2 >>> test("y", 2) yy Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 3, in test TypeError: unsupported operand type(s) for +: 'int' and 'str' >>> test(y="y", x=2) 4 y2 >>>
  • 13. Default parameters ● Parameters can be given a default value ● With default, parameter does not have to be specified, default will be used ● Can still name parameter in parameter list >>> def hello(name = "Everybody"): ... results = "Hello World to " + name + "!" ... return results ... >>> hello("Anna") 'Hello World to Anna!' >>> hello() 'Hello World to Everybody!' >>> hello(name = "Annette") 'Hello World to Annette!' >>>
  • 14. Exercise TranslateProteinFunctions.py ● Use script from homework ● Create the following functions: ● get_translation_table(filename) – return dict with codons and protein codes ● read_dna_string(filename) – return tuple with (descr, DNA_string) ● translate_protein(dictionary, DNA_string) – return the protein version of the DNA string ● pretty_print(descr, protein_string, outname) – write result to outname in fasta format
  • 15. TranslateProteinFunctions.py import sys YOUR CODE GOES HERE!!!! translationtable = sys.argv[1] fastafile = sys.argv[2] outfile = sys.argv[3] translation_dict = get_translation_table(translationtable) description, DNA_string = read_dna_string(fastafile) protein_string = translate_protein(translation_dict, DNA_string) pretty_print(description, protein_string, outfile)
  • 16. get_translation_table def get_translation_table(translationtable): fh = open('translationtable.txt' , 'r') trans_dict = {} for line in fh: codon = line.split()[0] aa = line.split()[1] trans_dict[codon] = aa fh.close() return trans_dict
  • 17. read_dna_string def read_dna_string(fastafile): fh = open(fastafile, "r") line = fh.readline() header_line = line[1:-1] seq = "" for line in fh: seq += line[:-1] fh.close() return (header_line, seq)
  • 18. translate_protein def translate_protein(translation_dict, DNA_string): aa_seq = "" for i in range(0, len(DNA_string)-3, 3): codon = DNA_string[i:i+3] one_letter = translation_dict[codon] aa_seq += one_letter return aa_seq
  • 19. pretty_print def pretty_print(description, protein_string, outfile): fh = open(outfile, "w") fh.write(">" + description + "n") for i in range(0, len(protein_string), 60): fh.write(protein_string[i:i+60] + "n") fh.close()
  • 20. Modules ● A module is a file with functions, constants and other code in it ● Module name = filename without .py ● Can be used inside another program ● Needs to be import-ed into program ● Lots of builtin modules: sys, os, os.path.... ● Can also create your own
  • 21. Using module ● One of two import statements: 1: import modulename 2: from module import function/constant ● If method 1: ● modulename.function(arguments) ● If method 2: ● function(arguments) – module name not needed ● beware of function name collision
  • 22. Operating system modules – os and os.path ● Modules dealing with files and operating system interaction ● Commonly used methods: ● os.getcwd() - get working directory ● os.chdir(path) – change working directory ● os.listdir([dir = .]) - get a list of all files in this directory ● os.mkdir(path) – create directory ● os.path.join(dirname, dirname/filename...)
  • 23. Your own modules ● Three steps: 1. Create file with functions in it. Module name is same as filename without .py 2. In other script, do import modulename 3. In other script, use function like this: modulename.functionname(args)
  • 24. Separating module use and main use ● Files containing python code can be: ● script file ● module file ● Module functions can be used in scripts ● But: modules can also be scripts ● Question is – how do you know if the code is being executed in the module script or an external script?
  • 25. Module use / main use ● When a script is being run, within that script a variable called __name__ will be set to the string “__main__” ● Can test on this string to see if this script is being run ● Benefit: can define functions in script that can be used in module mode later
  • 26. Module mode / main mode import sys <code as before> When this script is being used, translationtable = sys.argv[1] this will always run, no matter what! fastafile = sys.argv[2] outfile = sys.argv[3] translation_dict = get_translation_table(translationtable) description, DNA_string = read_dna_string(fastafile) protein_string = translate_protein(translation_dict, DNA_string) pretty_print(description, protein_string, outfile)
  • 27. Module use / main use # this is a script import sys import TranslateProteinFunctions description, DNA_string = read_dna_string(sys.argv[1]) print description [karinlag@freebee]% python modtest.py dna31.fsa Traceback (most recent call last): File "modtest.py", line 2, in ? import TranslateProteinFunctions File "TranslateProteinFunctions.py", line 44, in ? fastafile = sys.argv[2] IndexError: list index out of range [karinlag@freebee]Karin%
  • 28. TranslateProteinFuctions.py with main import sys <code as before> if __name__ == “__main__”: translationtable = sys.argv[1] fastafile = sys.argv[2] outfile = sys.argv[3] translation_dict = get_translation_table(translationtable) description, DNA_string = read_dna_string(fastafile) protein_string = translate_protein(translation_dict, DNA_string) pretty_print(description, protein_string, outfile)
  • 29. ConcatFasta.py ● Create a script that has the following: ● function get_fastafiles(dirname) – gets all the files in the directory, checks if they are fasta files (end in .fsa), returns list of fasta files – hint: you need os.path to create full relative file names ● function concat_fastafiles(filelist, outfile) – takes a list of fasta files, opens and reads each of them, writes them to outfile ● if __name__ == “__main__”: – do what needs to be done to run script ● Remember imports!