SlideShare una empresa de Scribd logo
1 de 51
Introduction to Python
Chen LinChen Lin
clin@brandeis.educlin@brandeis.edu
COSI 134aCOSI 134a
Volen 110Volen 110
Office Hour: Thurs. 3-5Office Hour: Thurs. 3-5
For More Information?
http://python.org/
- documentation, tutorials, beginners guide, core
distribution, ...
Books include:
 Learning Python by Mark Lutz
 Python Essential Reference by David Beazley
 Python Cookbook, ed. by Martelli, Ravenscroft and
Ascher
 (online at
http://code.activestate.com/recipes/langs/python/)
 http://wiki.python.org/moin/PythonBooks
Python VideosPython Videos
http://showmedo.com/videotutorials/python
“5 Minute Overview (What Does Python
Look Like?)”
“Introducing the PyDev IDE for Eclipse”
“Linear Algebra with Numpy”
And many more
4 Major Versions of Python4 Major Versions of Python
“Python” or “CPython” is written in C/C++
- Version 2.7 came out in mid-2010
- Version 3.1.2 came out in early 2010
“Jython” is written in Java for the JVM
“IronPython” is written in C# for the .Net
environment
Go To Website
Development EnvironmentsDevelopment Environments
what IDE to use?what IDE to use? http://stackoverflow.com/questions/81584http://stackoverflow.com/questions/81584
1. PyDev with Eclipse
2. Komodo
3. Emacs
4. Vim
5. TextMate
6. Gedit
7. Idle
8. PIDA (Linux)(VIM Based)
9. NotePad++ (Windows)
10.BlueFish (Linux)
Pydev with EclipsePydev with Eclipse
Python Interactive ShellPython Interactive Shell
% python% python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.Type "help", "copyright", "credits" or "license" for more information.
>>>>>>
You can type things directly into a running Python sessionYou can type things directly into a running Python session
>>> 2+3*4>>> 2+3*4
1414
>>> name = "Andrew">>> name = "Andrew"
>>> name>>> name
'Andrew''Andrew'
>>> print "Hello", name>>> print "Hello", name
Hello AndrewHello Andrew
>>>>>>
 BackgroundBackground
 Data Types/StructureData Types/Structure
 Control flowControl flow
 File I/OFile I/O
 ModulesModules
 ClassClass
 NLTKNLTK
ListList
A compound data type:A compound data type:
[0][0]
[2.3, 4.5][2.3, 4.5]
[5, "Hello", "there", 9.8][5, "Hello", "there", 9.8]
[][]
Use len() to get the length of a listUse len() to get the length of a list
>>> names = [“Ben", “Chen", “Yaqin"]>>> names = [“Ben", “Chen", “Yaqin"]
>>> len(names)>>> len(names)
33
Use [ ] to index items in the listUse [ ] to index items in the list
>>> names[0]>>> names[0]
‘‘Ben'Ben'
>>> names[1]>>> names[1]
‘‘Chen'Chen'
>>> names[2]>>> names[2]
‘‘Yaqin'Yaqin'
>>> names[3]>>> names[3]
Traceback (most recent call last):Traceback (most recent call last):
File "<stdin>", line 1, in <module>File "<stdin>", line 1, in <module>
IndexError: list index out of rangeIndexError: list index out of range
>>> names[-1]>>> names[-1]
‘‘Yaqin'Yaqin'
>>> names[-2]>>> names[-2]
‘‘Chen'Chen'
>>> names[-3]>>> names[-3]
‘‘Ben'Ben'
[0] is the first item.
[1] is the second item
...
Out of range values
raise an exception
Negative values
go backwards from
the last element.
Strings share many features with listsStrings share many features with lists
>>> smiles = "C(=N)(N)N.C(=O)(O)O">>> smiles = "C(=N)(N)N.C(=O)(O)O"
>>> smiles[0]>>> smiles[0]
'C''C'
>>> smiles[1]>>> smiles[1]
'(''('
>>> smiles[-1]>>> smiles[-1]
'O''O'
>>> smiles[1:5]>>> smiles[1:5]
'(=N)''(=N)'
>>> smiles[10:-4]>>> smiles[10:-4]
'C(=O)''C(=O)'
Use “slice” notation to
get a substring
String Methods: find, splitString Methods: find, split
smiles = "C(=N)(N)N.C(=O)(O)O"smiles = "C(=N)(N)N.C(=O)(O)O"
>>> smiles.find("(O)")>>> smiles.find("(O)")
1515
>>> smiles.find(".")>>> smiles.find(".")
99
>>> smiles.find(".", 10)>>> smiles.find(".", 10)
-1-1
>>> smiles.split(".")>>> smiles.split(".")
['C(=N)(N)N', 'C(=O)(O)O']['C(=N)(N)N', 'C(=O)(O)O']
>>>>>>
Use “find” to find the
start of a substring.
Start looking at position 10.
Find returns -1 if it couldn’t
find a match.
Split the string into parts
with “.” as the delimiter
String operators: in, not inString operators: in, not in
if "Br" in “Brother”:if "Br" in “Brother”:
print "contains brother“print "contains brother“
email_address = “clin”email_address = “clin”
if "@" not in email_address:if "@" not in email_address:
email_address += "@brandeis.edu“email_address += "@brandeis.edu“
String Method: “strip”, “rstrip”, “lstrip” are ways toString Method: “strip”, “rstrip”, “lstrip” are ways to
remove whitespace or selected charactersremove whitespace or selected characters
>>> line = " # This is a comment line n">>> line = " # This is a comment line n"
>>> line.strip()>>> line.strip()
'# This is a comment line''# This is a comment line'
>>> line.rstrip()>>> line.rstrip()
' # This is a comment line'' # This is a comment line'
>>> line.rstrip("n")>>> line.rstrip("n")
' # This is a comment line '' # This is a comment line '
>>>>>>
More String methodsMore String methods
email.startswith(“c") endswith(“u”)email.startswith(“c") endswith(“u”)
True/FalseTrue/False
>>> "%s@brandeis.edu" % "clin">>> "%s@brandeis.edu" % "clin"
'clin@brandeis.edu''clin@brandeis.edu'
>>> names = [“Ben", “Chen", “Yaqin"]>>> names = [“Ben", “Chen", “Yaqin"]
>>> ", ".join(names)>>> ", ".join(names)
‘‘Ben, Chen, Yaqin‘Ben, Chen, Yaqin‘
>>> “chen".upper()>>> “chen".upper()
‘‘CHEN'CHEN'
Unexpected things about stringsUnexpected things about strings
>>> s = "andrew">>> s = "andrew"
>>> s[0] = "A">>> s[0] = "A"
Traceback (most recent call last):Traceback (most recent call last):
File "<stdin>", line 1, in <module>File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support itemTypeError: 'str' object does not support item
assignmentassignment
>>> s = "A" + s[1:]>>> s = "A" + s[1:]
>>> s>>> s
'Andrew‘'Andrew‘
Strings are read only
““” is for special characters” is for special characters
n -> newlinen -> newline
t -> tabt -> tab
 -> backslash -> backslash
......
But Windows uses backslash for directories!
filename = "M:nickel_projectreactive.smi" # DANGER!
filename = "M:nickel_projectreactive.smi" # Better!
filename = "M:/nickel_project/reactive.smi" # Usually works
Lists are mutable - some usefulLists are mutable - some useful
methodsmethods
>>> ids = ["9pti", "2plv", "1crn"]>>> ids = ["9pti", "2plv", "1crn"]
>>> ids.append("1alm")>>> ids.append("1alm")
>>> ids>>> ids
['9pti', '2plv', '1crn', '1alm']['9pti', '2plv', '1crn', '1alm']
>>>ids.extend(L)>>>ids.extend(L)
Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
>>> del ids[0]>>> del ids[0]
>>> ids>>> ids
['2plv', '1crn', '1alm']['2plv', '1crn', '1alm']
>>> ids.sort()>>> ids.sort()
>>> ids>>> ids
['1alm', '1crn', '2plv']['1alm', '1crn', '2plv']
>>> ids.reverse()>>> ids.reverse()
>>> ids>>> ids
['2plv', '1crn', '1alm']['2plv', '1crn', '1alm']
>>> ids.insert(0, "9pti")>>> ids.insert(0, "9pti")
>>> ids>>> ids
['9pti', '2plv', '1crn', '1alm']['9pti', '2plv', '1crn', '1alm']
append an element
remove an element
sort by default order
reverse the elements in a list
insert an element at some
specified position.
(Slower than .append())
Tuples:Tuples: sort of an immutable list
>>> yellow = (255, 255, 0) # r, g, b>>> yellow = (255, 255, 0) # r, g, b
>>> one = (1,)>>> one = (1,)
>>> yellow[0]>>> yellow[0]
>>> yellow[1:]>>> yellow[1:]
(255, 0)(255, 0)
>>> yellow[0] = 0>>> yellow[0] = 0
Traceback (most recent call last):Traceback (most recent call last):
File "<stdin>", line 1, in <module>File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignmentTypeError: 'tuple' object does not support item assignment
Very common in string interpolation:
>>> "%s lives in %s at latitude %.1f" % ("Andrew", "Sweden", 57.7056)
'Andrew lives in Sweden at latitude 57.7'
zipping lists togetherzipping lists together
>>> names>>> names
['ben', 'chen', 'yaqin']['ben', 'chen', 'yaqin']
>>> gender =>>> gender = [0, 0, 1][0, 0, 1]
>>> zip(names, gender)>>> zip(names, gender)
[('ben', 0), ('chen', 0), ('yaqin', 1)][('ben', 0), ('chen', 0), ('yaqin', 1)]
DictionariesDictionaries
 Dictionaries are lookup tables.
 They map from a “key” to a “value”.
symbol_to_name = {
"H": "hydrogen",
"He": "helium",
"Li": "lithium",
"C": "carbon",
"O": "oxygen",
"N": "nitrogen"
}
 Duplicate keys are not allowed
 Duplicate values are just fine
Keys can be any immutable valueKeys can be any immutable value
numbers, strings, tuples, frozensetnumbers, strings, tuples, frozenset,,
not list, dictionary, set, ...not list, dictionary, set, ...
atomic_number_to_name = {atomic_number_to_name = {
1: "hydrogen"1: "hydrogen"
6: "carbon",6: "carbon",
7: "nitrogen"7: "nitrogen"
8: "oxygen",8: "oxygen",
}}
nobel_prize_winners = {nobel_prize_winners = {
(1979, "physics"): ["Glashow", "Salam", "Weinberg"],(1979, "physics"): ["Glashow", "Salam", "Weinberg"],
(1962, "chemistry"): ["Hodgkin"],(1962, "chemistry"): ["Hodgkin"],
(1984, "biology"): ["McClintock"],(1984, "biology"): ["McClintock"],
}}
A set is an unordered collection
with no duplicate elements.
DictionaryDictionary
>>> symbol_to_name["C"]>>> symbol_to_name["C"]
'carbon''carbon'
>>> "O" in symbol_to_name, "U" in symbol_to_name>>> "O" in symbol_to_name, "U" in symbol_to_name
(True, False)(True, False)
>>> "oxygen" in symbol_to_name>>> "oxygen" in symbol_to_name
FalseFalse
>>> symbol_to_name["P"]>>> symbol_to_name["P"]
Traceback (most recent call last):Traceback (most recent call last):
File "<stdin>", line 1, in <module>File "<stdin>", line 1, in <module>
KeyError: 'P'KeyError: 'P'
>>> symbol_to_name.get("P", "unknown")>>> symbol_to_name.get("P", "unknown")
'unknown''unknown'
>>> symbol_to_name.get("C", "unknown")>>> symbol_to_name.get("C", "unknown")
'carbon''carbon'
Get the value for a given key
Test if the key exists
(“in” only checks the keys,
not the values.)
[] lookup failures raise an exception.
Use “.get()” if you want
to return a default value.
Some useful dictionary methodsSome useful dictionary methods
>>> symbol_to_name.keys()>>> symbol_to_name.keys()
['C', 'H', 'O', 'N', 'Li', 'He']['C', 'H', 'O', 'N', 'Li', 'He']
>>> symbol_to_name.values()>>> symbol_to_name.values()
['carbon', 'hydrogen', 'oxygen', 'nitrogen', 'lithium', 'helium']['carbon', 'hydrogen', 'oxygen', 'nitrogen', 'lithium', 'helium']
>>> symbol_to_name.update( {"P": "phosphorous", "S": "sulfur"} )>>> symbol_to_name.update( {"P": "phosphorous", "S": "sulfur"} )
>>> symbol_to_name.items()>>> symbol_to_name.items()
[('C', 'carbon'), ('H', 'hydrogen'), ('O', 'oxygen'), ('N', 'nitrogen'), ('P',[('C', 'carbon'), ('H', 'hydrogen'), ('O', 'oxygen'), ('N', 'nitrogen'), ('P',
'phosphorous'), ('S', 'sulfur'), ('Li', 'lithium'), ('He', 'helium')]'phosphorous'), ('S', 'sulfur'), ('Li', 'lithium'), ('He', 'helium')]
>>> del symbol_to_name['C']>>> del symbol_to_name['C']
>>> symbol_to_name>>> symbol_to_name
{'H': 'hydrogen', 'O': 'oxygen', 'N': 'nitrogen', 'Li': 'lithium', 'He': 'helium'}{'H': 'hydrogen', 'O': 'oxygen', 'N': 'nitrogen', 'Li': 'lithium', 'He': 'helium'}
 BackgroundBackground
 Data Types/StructureData Types/Structure
list, string, tuple, dictionarylist, string, tuple, dictionary
 Control flowControl flow
 File I/OFile I/O
 ModulesModules
 ClassClass
 NLTKNLTK
Control FlowControl Flow
Things that are FalseThings that are False
 The boolean value False
 The numbers 0 (integer), 0.0 (float) and 0j (complex).
 The empty string "".
 The empty list [], empty dictionary {} and empty set set().
Things that are TrueThings that are True
 The boolean value TrueThe boolean value True
 All non-zero numbers.All non-zero numbers.
 Any string containing at least one character.Any string containing at least one character.
 A non-empty data structure.A non-empty data structure.
IfIf
>>> smiles = "BrC1=CC=C(C=C1)NN.Cl">>> smiles = "BrC1=CC=C(C=C1)NN.Cl"
>>> bool(smiles)>>> bool(smiles)
TrueTrue
>>> not bool(smiles)>>> not bool(smiles)
FalseFalse
>>> if not smiles>>> if not smiles::
... print "The SMILES string is empty"... print "The SMILES string is empty"
......
 The “else” case is always optional
Use “elif” to chain subsequent testsUse “elif” to chain subsequent tests
>>> mode = "absolute">>> mode = "absolute"
>>> if mode == "canonical":>>> if mode == "canonical":
...... smiles = "canonical"smiles = "canonical"
... elif mode == "isomeric":... elif mode == "isomeric":
...... smiles = "isomeric”smiles = "isomeric”
...... elif mode == "absolute":elif mode == "absolute":
...... smiles = "absolute"smiles = "absolute"
... else:... else:
...... raise TypeError("unknown mode")raise TypeError("unknown mode")
......
>>> smiles>>> smiles
' absolute '' absolute '
>>>>>>
“raise” is the Python way to raise exceptions
Boolean logicBoolean logic
Python expressions can have “and”s andPython expressions can have “and”s and
“or”s:“or”s:
if (benif (ben <=<= 5 and chen5 and chen >=>= 10 or10 or
chenchen ==== 500 and ben500 and ben !=!= 5):5):
print “Ben and Chen“print “Ben and Chen“
Range TestRange Test
if (3if (3 <= Time <=<= Time <= 5):5):
print “Office Hour"print “Office Hour"
ForFor
>>> names = [“Ben", “Chen", “Yaqin"]>>> names = [“Ben", “Chen", “Yaqin"]
>>> for name in names:>>> for name in names:
...... print smilesprint smiles
......
BenBen
ChenChen
YaqinYaqin
Tuple assignment in for loopsTuple assignment in for loops
data = [ ("C20H20O3", 308.371),data = [ ("C20H20O3", 308.371),
("C22H20O2", 316.393),("C22H20O2", 316.393),
("C24H40N4O2", 416.6),("C24H40N4O2", 416.6),
("C14H25N5O3", 311.38),("C14H25N5O3", 311.38),
("C15H20O2", 232.3181)]("C15H20O2", 232.3181)]
forfor (formula, mw)(formula, mw) in data:in data:
print "The molecular weight of %s is %s" % (formula, mw)print "The molecular weight of %s is %s" % (formula, mw)
The molecular weight of C20H20O3 is 308.371The molecular weight of C20H20O3 is 308.371
The molecular weight of C22H20O2 is 316.393The molecular weight of C22H20O2 is 316.393
The molecular weight of C24H40N4O2 is 416.6The molecular weight of C24H40N4O2 is 416.6
The molecular weight of C14H25N5O3 is 311.38The molecular weight of C14H25N5O3 is 311.38
The molecular weight of C15H20O2 is 232.3181The molecular weight of C15H20O2 is 232.3181
Break, continueBreak, continue
>>> for value in [3, 1, 4, 1, 5, 9, 2]:>>> for value in [3, 1, 4, 1, 5, 9, 2]:
...... print "Checking", valueprint "Checking", value
...... if value > 8:if value > 8:
...... print "Exiting for loop"print "Exiting for loop"
...... breakbreak
...... elif value < 3:elif value < 3:
...... print "Ignoring"print "Ignoring"
...... continuecontinue
...... print "The square is", value**2print "The square is", value**2
......
Use “break” to stopUse “break” to stop
the for loopthe for loop
Use “continue” to stopUse “continue” to stop
processing the current itemprocessing the current item
Checking 3Checking 3
The square is 9The square is 9
Checking 1Checking 1
IgnoringIgnoring
Checking 4Checking 4
The square is 16The square is 16
Checking 1Checking 1
IgnoringIgnoring
Checking 5Checking 5
The square is 25The square is 25
Checking 9Checking 9
Exiting for loopExiting for loop
>>>>>>
Range()Range()
 ““range” creates a list of numbers in a specified rangerange” creates a list of numbers in a specified range
 range([start,] stop[, step]) -> list of integersrange([start,] stop[, step]) -> list of integers
 When step is given, it specifies the increment (or decrement).When step is given, it specifies the increment (or decrement).
>>> range(5)>>> range(5)
[0, 1, 2, 3, 4][0, 1, 2, 3, 4]
>>> range(5, 10)>>> range(5, 10)
[5, 6, 7, 8, 9][5, 6, 7, 8, 9]
>>> range(0, 10, 2)>>> range(0, 10, 2)
[0, 2, 4, 6, 8][0, 2, 4, 6, 8]
How to get every second element in a list?
for i in range(0, len(data), 2):
print data[i]
 BackgroundBackground
 Data Types/StructureData Types/Structure
 Control flowControl flow
 File I/OFile I/O
 ModulesModules
 ClassClass
 NLTKNLTK
Reading filesReading files
>>> f = open(“names.txt")>>> f = open(“names.txt")
>>> f.readline()>>> f.readline()
'Yaqinn''Yaqinn'
Quick WayQuick Way
>>> lst= [ x for x in open("text.txt","r").readlines() ]>>> lst= [ x for x in open("text.txt","r").readlines() ]
>>> lst>>> lst
['Chen Linn', 'clin@brandeis.edun', 'Volen 110n', 'Office['Chen Linn', 'clin@brandeis.edun', 'Volen 110n', 'Office
Hour: Thurs. 3-5n', 'n', 'Yaqin Yangn',Hour: Thurs. 3-5n', 'n', 'Yaqin Yangn',
'yaqin@brandeis.edun', 'Volen 110n', 'Offiche Hour:'yaqin@brandeis.edun', 'Volen 110n', 'Offiche Hour:
Tues. 3-5n']Tues. 3-5n']
Ignore the header?Ignore the header?
for (i,line) in enumerate(open(‘text.txt’,"r").readlines()):for (i,line) in enumerate(open(‘text.txt’,"r").readlines()):
if i == 0: continueif i == 0: continue
print lineprint line
Using dictionaries to countUsing dictionaries to count
occurrencesoccurrences
>>> for line in open('names.txt'):>>> for line in open('names.txt'):
...... name = line.strip()name = line.strip()
...... name_count[name] = name_count.get(name,0)+ 1name_count[name] = name_count.get(name,0)+ 1
......
>>> for (name, count) in name_count.items():>>> for (name, count) in name_count.items():
...... print name, countprint name, count
......
Chen 3Chen 3
Ben 3Ben 3
Yaqin 3Yaqin 3
File OutputFile Output
input_file = open(“in.txt")input_file = open(“in.txt")
output_file = open(“out.txt", "w")output_file = open(“out.txt", "w")
for line in input_file:for line in input_file:
output_file.write(line)output_file.write(line)
“w” = “write mode”
“a” = “append mode”
“wb” = “write in binary”
“r” = “read mode” (default)
“rb” = “read in binary”
“U” = “read files with Unix
or Windows line endings”
 BackgroundBackground
 Data Types/StructureData Types/Structure
 Control flowControl flow
 File I/OFile I/O
 ModulesModules
 ClassClass
 NLTKNLTK
ModulesModules
When a Python program starts it only has
access to a basic functions and classes.
(“int”, “dict”, “len”, “sum”, “range”, ...)
“Modules” contain additional functionality.
Use “import” to tell Python to load a
module.
>>> import math
>>> import nltk
import the math moduleimport the math module
>>> import math>>> import math
>>> math.pi>>> math.pi
3.14159265358979313.1415926535897931
>>> math.cos(0)>>> math.cos(0)
1.01.0
>>> math.cos(math.pi)>>> math.cos(math.pi)
-1.0-1.0
>>> dir(math)>>> dir(math)
['__doc__', '__file__', '__name__', '__package__', 'acos', 'acosh',['__doc__', '__file__', '__name__', '__package__', 'acos', 'acosh',
'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos','asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos',
'cosh', 'degrees', 'e', 'exp', 'fabs', 'factorial', 'floor', 'fmod','cosh', 'degrees', 'e', 'exp', 'fabs', 'factorial', 'floor', 'fmod',
'frexp', 'fsum', 'hypot', 'isinf', 'isnan', 'ldexp', 'log', 'log10','frexp', 'fsum', 'hypot', 'isinf', 'isnan', 'ldexp', 'log', 'log10',
'log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan','log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan',
'tanh', 'trunc']'tanh', 'trunc']
>>> help(math)>>> help(math)
>>> help(math.cos)>>> help(math.cos)
““import” and “from ... import ...”import” and “from ... import ...”
>>> import math>>> import math
math.cosmath.cos
>>> from math import cos, pi
cos
>>> from math import *
 BackgroundBackground
 Data Types/StructureData Types/Structure
 Control flowControl flow
 File I/OFile I/O
 ModulesModules
 ClassClass
 NLTKNLTK
ClassesClasses
class ClassName(object):class ClassName(object):
<statement-1><statement-1>
. . .. . .
<statement-N><statement-N>
class MyClass(object):class MyClass(object):
"""A simple example class""""""A simple example class"""
i = 1234512345
def f(self):def f(self):
return self.ireturn self.i
class DerivedClassName(BaseClassName):class DerivedClassName(BaseClassName):
<statement-1><statement-1>
. . .. . .
<statement-N><statement-N>
 BackgroundBackground
 Data Types/StructureData Types/Structure
 Control flowControl flow
 File I/OFile I/O
 ModulesModules
 ClassClass
 NLTKNLTK
http://www.nltk.org/bookhttp://www.nltk.org/book
NLTK is on berry patch machines!NLTK is on berry patch machines!
>>>from nltk.book import *>>>from nltk.book import *
>>> text1>>> text1
<Text: Moby Dick by Herman Melville 1851><Text: Moby Dick by Herman Melville 1851>
>>> text1.name>>> text1.name
'Moby Dick by Herman Melville 1851''Moby Dick by Herman Melville 1851'
>>> text1.concordance("monstrous")>>> text1.concordance("monstrous")
>>> dir(text1)>>> dir(text1)
>>> text1.tokens>>> text1.tokens
>>> text1.index("my")>>> text1.index("my")
46474647
>>> sent2>>> sent2
['The', 'family', 'of', 'Dashwood', 'had', 'long', 'been', 'settled', 'in',['The', 'family', 'of', 'Dashwood', 'had', 'long', 'been', 'settled', 'in',
'Sussex', '.']'Sussex', '.']
Classify TextClassify Text
>>> def gender_features(word):>>> def gender_features(word):
...... return {'last_letter': word[-1]}return {'last_letter': word[-1]}
>>> gender_features('Shrek')>>> gender_features('Shrek')
{'last_letter': 'k'}{'last_letter': 'k'}
>>> from nltk.corpus import names>>> from nltk.corpus import names
>>> import random>>> import random
>>> names = ([(name, 'male') for name in names.words('male.txt')] +>>> names = ([(name, 'male') for name in names.words('male.txt')] +
... [(name, 'female') for name in names.words('female.txt')])... [(name, 'female') for name in names.words('female.txt')])
>>> random.shuffle(names)>>> random.shuffle(names)
Featurize, train, test, predictFeaturize, train, test, predict
>>> featuresets = [(gender_features(n), g) for (n,g) in names]>>> featuresets = [(gender_features(n), g) for (n,g) in names]
>>> train_set, test_set = featuresets[500:], featuresets[:500]>>> train_set, test_set = featuresets[500:], featuresets[:500]
>>> classifier = nltk.NaiveBayesClassifier.train(train_set)>>> classifier = nltk.NaiveBayesClassifier.train(train_set)
>>> print nltk.classify.accuracy(classifier, test_set)>>> print nltk.classify.accuracy(classifier, test_set)
0.7260.726
>>> classifier.classify(gender_features('Neo'))>>> classifier.classify(gender_features('Neo'))
'male''male'
fromfrom nltknltk.corpus import.corpus import reutersreuters
 Reuters Corpus:Reuters Corpus:10,788 news10,788 news
1.3 million words.1.3 million words.
 Been classified intoBeen classified into 9090 topicstopics
 Grouped into 2 sets, "training" and "test“Grouped into 2 sets, "training" and "test“
 Categories overlap with each otherCategories overlap with each other
http://nltk.googlecode.com/svn/trunk/doc/bohttp://nltk.googlecode.com/svn/trunk/doc/bo
ok/ch02.htmlok/ch02.html
ReutersReuters
>>> from nltk.corpus import reuters>>> from nltk.corpus import reuters
>>> reuters.fileids()>>> reuters.fileids()
['test/14826', 'test/14828', 'test/14829', 'test/14832', ...]['test/14826', 'test/14828', 'test/14829', 'test/14832', ...]
>>> reuters.categories()>>> reuters.categories()
['acq', 'alum', 'barley', 'bop', 'carcass', 'castor-oil', 'cocoa', 'coconut',['acq', 'alum', 'barley', 'bop', 'carcass', 'castor-oil', 'cocoa', 'coconut',
'coconut-oil', 'coffee', 'copper', 'copra-cake', 'corn', 'cotton', 'cotton-'coconut-oil', 'coffee', 'copper', 'copra-cake', 'corn', 'cotton', 'cotton-
oil', 'cpi', 'cpu', 'crude', 'dfl', 'dlr', ...]oil', 'cpi', 'cpu', 'crude', 'dfl', 'dlr', ...]

Más contenido relacionado

La actualidad más candente

Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to In...
Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to In...Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to In...
Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to In...Matt Harrison
 
Python Workshop Part 2. LUG Maniapl
Python Workshop Part 2. LUG ManiaplPython Workshop Part 2. LUG Maniapl
Python Workshop Part 2. LUG ManiaplAnkur Shrivastava
 
Python dictionary : past, present, future
Python dictionary: past, present, futurePython dictionary: past, present, future
Python dictionary : past, present, futuredelimitry
 
4 b file-io-if-then-else
4 b file-io-if-then-else4 b file-io-if-then-else
4 b file-io-if-then-elseMalik Alig
 
Python Datatypes by SujithKumar
Python Datatypes by SujithKumarPython Datatypes by SujithKumar
Python Datatypes by SujithKumarSujith Kumar
 
Datastructures in python
Datastructures in pythonDatastructures in python
Datastructures in pythonhydpy
 
15. Streams Files and Directories
15. Streams Files and Directories 15. Streams Files and Directories
15. Streams Files and Directories Intro C# Book
 
13. Java text processing
13.  Java text processing13.  Java text processing
13. Java text processingIntro C# Book
 
Introduction to R
Introduction to RIntroduction to R
Introduction to Rvpletap
 
Datatypes in python
Datatypes in pythonDatatypes in python
Datatypes in pythoneShikshak
 
Introduction to python programming 1
Introduction to python programming   1Introduction to python programming   1
Introduction to python programming 1Giovanni Della Lunga
 
Kotlin, Spek and tests
Kotlin, Spek and testsKotlin, Spek and tests
Kotlin, Spek and testsintive
 
java 8 Hands on Workshop
java 8 Hands on Workshopjava 8 Hands on Workshop
java 8 Hands on WorkshopJeanne Boyarsky
 
awesome groovy
awesome groovyawesome groovy
awesome groovyPaul King
 
The Curious Clojurist - Neal Ford (Thoughtworks)
The Curious Clojurist - Neal Ford (Thoughtworks)The Curious Clojurist - Neal Ford (Thoughtworks)
The Curious Clojurist - Neal Ford (Thoughtworks)jaxLondonConference
 
Python fundamentals - basic | WeiYuan
Python fundamentals - basic | WeiYuanPython fundamentals - basic | WeiYuan
Python fundamentals - basic | WeiYuanWei-Yuan Chang
 

La actualidad más candente (20)

Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to In...
Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to In...Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to In...
Analysis of Fatal Utah Avalanches with Python. From Scraping, Analysis, to In...
 
Python Workshop Part 2. LUG Maniapl
Python Workshop Part 2. LUG ManiaplPython Workshop Part 2. LUG Maniapl
Python Workshop Part 2. LUG Maniapl
 
Python dictionary : past, present, future
Python dictionary: past, present, futurePython dictionary: past, present, future
Python dictionary : past, present, future
 
Begin with Python
Begin with PythonBegin with Python
Begin with Python
 
4 b file-io-if-then-else
4 b file-io-if-then-else4 b file-io-if-then-else
4 b file-io-if-then-else
 
Python Datatypes by SujithKumar
Python Datatypes by SujithKumarPython Datatypes by SujithKumar
Python Datatypes by SujithKumar
 
Datastructures in python
Datastructures in pythonDatastructures in python
Datastructures in python
 
15. Streams Files and Directories
15. Streams Files and Directories 15. Streams Files and Directories
15. Streams Files and Directories
 
13. Java text processing
13.  Java text processing13.  Java text processing
13. Java text processing
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
Python 101
Python 101Python 101
Python 101
 
Datatypes in python
Datatypes in pythonDatatypes in python
Datatypes in python
 
Python Modules and Libraries
Python Modules and LibrariesPython Modules and Libraries
Python Modules and Libraries
 
Introduction to python programming 1
Introduction to python programming   1Introduction to python programming   1
Introduction to python programming 1
 
Kotlin, Spek and tests
Kotlin, Spek and testsKotlin, Spek and tests
Kotlin, Spek and tests
 
java 8 Hands on Workshop
java 8 Hands on Workshopjava 8 Hands on Workshop
java 8 Hands on Workshop
 
awesome groovy
awesome groovyawesome groovy
awesome groovy
 
1. python
1. python1. python
1. python
 
The Curious Clojurist - Neal Ford (Thoughtworks)
The Curious Clojurist - Neal Ford (Thoughtworks)The Curious Clojurist - Neal Ford (Thoughtworks)
The Curious Clojurist - Neal Ford (Thoughtworks)
 
Python fundamentals - basic | WeiYuan
Python fundamentals - basic | WeiYuanPython fundamentals - basic | WeiYuan
Python fundamentals - basic | WeiYuan
 

Similar a Python tutorial

Python tutorial
Python tutorialPython tutorial
Python tutorialRajiv Risi
 
Introduction to Python Programming.ppt
Introduction to Python Programming.pptIntroduction to Python Programming.ppt
Introduction to Python Programming.pptUmeshKumarSingh31
 
python chapter 1
python chapter 1python chapter 1
python chapter 1Raghu nath
 
Python chapter 2
Python chapter 2Python chapter 2
Python chapter 2Raghu nath
 
Becoming a Pythonist
Becoming a PythonistBecoming a Pythonist
Becoming a PythonistRaji Engg
 
Python Workshop - Learn Python the Hard Way
Python Workshop - Learn Python the Hard WayPython Workshop - Learn Python the Hard Way
Python Workshop - Learn Python the Hard WayUtkarsh Sengar
 
Write better python code with these 10 tricks | by yong cui, ph.d. | aug, 202...
Write better python code with these 10 tricks | by yong cui, ph.d. | aug, 202...Write better python code with these 10 tricks | by yong cui, ph.d. | aug, 202...
Write better python code with these 10 tricks | by yong cui, ph.d. | aug, 202...amit kuraria
 
Τα Πολύ Βασικά για την Python
Τα Πολύ Βασικά για την PythonΤα Πολύ Βασικά για την Python
Τα Πολύ Βασικά για την PythonMoses Boudourides
 
An Introduction to gensim: "Topic Modelling for Humans"
An Introduction to gensim: "Topic Modelling for Humans"An Introduction to gensim: "Topic Modelling for Humans"
An Introduction to gensim: "Topic Modelling for Humans"sandinmyjoints
 
R Programming: Export/Output Data In R
R Programming: Export/Output Data In RR Programming: Export/Output Data In R
R Programming: Export/Output Data In RRsquared Academy
 
Python language data types
Python language data typesPython language data types
Python language data typesHarry Potter
 
Python language data types
Python language data typesPython language data types
Python language data typesHoang Nguyen
 
Python language data types
Python language data typesPython language data types
Python language data typesLuis Goldster
 
Python language data types
Python language data typesPython language data types
Python language data typesTony Nguyen
 
Python language data types
Python language data typesPython language data types
Python language data typesFraboni Ec
 
Python language data types
Python language data typesPython language data types
Python language data typesJames Wong
 

Similar a Python tutorial (20)

Python tutorial
Python tutorialPython tutorial
Python tutorial
 
Introduction to Python Programming.ppt
Introduction to Python Programming.pptIntroduction to Python Programming.ppt
Introduction to Python Programming.ppt
 
Python material
Python materialPython material
Python material
 
Python tutorial
Python tutorialPython tutorial
Python tutorial
 
python chapter 1
python chapter 1python chapter 1
python chapter 1
 
Python chapter 2
Python chapter 2Python chapter 2
Python chapter 2
 
Python Workshop
Python  Workshop Python  Workshop
Python Workshop
 
Becoming a Pythonist
Becoming a PythonistBecoming a Pythonist
Becoming a Pythonist
 
Python Workshop - Learn Python the Hard Way
Python Workshop - Learn Python the Hard WayPython Workshop - Learn Python the Hard Way
Python Workshop - Learn Python the Hard Way
 
Write better python code with these 10 tricks | by yong cui, ph.d. | aug, 202...
Write better python code with these 10 tricks | by yong cui, ph.d. | aug, 202...Write better python code with these 10 tricks | by yong cui, ph.d. | aug, 202...
Write better python code with these 10 tricks | by yong cui, ph.d. | aug, 202...
 
P3 2018 python_regexes
P3 2018 python_regexesP3 2018 python_regexes
P3 2018 python_regexes
 
Τα Πολύ Βασικά για την Python
Τα Πολύ Βασικά για την PythonΤα Πολύ Βασικά για την Python
Τα Πολύ Βασικά για την Python
 
An Introduction to gensim: "Topic Modelling for Humans"
An Introduction to gensim: "Topic Modelling for Humans"An Introduction to gensim: "Topic Modelling for Humans"
An Introduction to gensim: "Topic Modelling for Humans"
 
R Programming: Export/Output Data In R
R Programming: Export/Output Data In RR Programming: Export/Output Data In R
R Programming: Export/Output Data In R
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Python language data types
Python language data typesPython language data types
Python language data types
 

Más de Shani729

Interaction design _beyond_human_computer_interaction
Interaction design _beyond_human_computer_interactionInteraction design _beyond_human_computer_interaction
Interaction design _beyond_human_computer_interactionShani729
 
Fm lecturer 13(final)
Fm lecturer 13(final)Fm lecturer 13(final)
Fm lecturer 13(final)Shani729
 
Lecture slides week14-15
Lecture slides week14-15Lecture slides week14-15
Lecture slides week14-15Shani729
 
Frequent itemset mining using pattern growth method
Frequent itemset mining using pattern growth methodFrequent itemset mining using pattern growth method
Frequent itemset mining using pattern growth methodShani729
 
Dwh lecture slides-week15
Dwh lecture slides-week15Dwh lecture slides-week15
Dwh lecture slides-week15Shani729
 
Dwh lecture slides-week10
Dwh lecture slides-week10Dwh lecture slides-week10
Dwh lecture slides-week10Shani729
 
Dwh lecture slidesweek7&8
Dwh lecture slidesweek7&8Dwh lecture slidesweek7&8
Dwh lecture slidesweek7&8Shani729
 
Dwh lecture slides-week5&6
Dwh lecture slides-week5&6Dwh lecture slides-week5&6
Dwh lecture slides-week5&6Shani729
 
Dwh lecture slides-week3&4
Dwh lecture slides-week3&4Dwh lecture slides-week3&4
Dwh lecture slides-week3&4Shani729
 
Dwh lecture slides-week2
Dwh lecture slides-week2Dwh lecture slides-week2
Dwh lecture slides-week2Shani729
 
Dwh lecture slides-week1
Dwh lecture slides-week1Dwh lecture slides-week1
Dwh lecture slides-week1Shani729
 
Dwh lecture slides-week 13
Dwh lecture slides-week 13Dwh lecture slides-week 13
Dwh lecture slides-week 13Shani729
 
Dwh lecture slides-week 12&13
Dwh lecture slides-week 12&13Dwh lecture slides-week 12&13
Dwh lecture slides-week 12&13Shani729
 
Data warehousing and mining furc
Data warehousing and mining furcData warehousing and mining furc
Data warehousing and mining furcShani729
 
Lecture 40
Lecture 40Lecture 40
Lecture 40Shani729
 
Lecture 39
Lecture 39Lecture 39
Lecture 39Shani729
 
Lecture 38
Lecture 38Lecture 38
Lecture 38Shani729
 
Lecture 37
Lecture 37Lecture 37
Lecture 37Shani729
 
Lecture 35
Lecture 35Lecture 35
Lecture 35Shani729
 
Lecture 36
Lecture 36Lecture 36
Lecture 36Shani729
 

Más de Shani729 (20)

Interaction design _beyond_human_computer_interaction
Interaction design _beyond_human_computer_interactionInteraction design _beyond_human_computer_interaction
Interaction design _beyond_human_computer_interaction
 
Fm lecturer 13(final)
Fm lecturer 13(final)Fm lecturer 13(final)
Fm lecturer 13(final)
 
Lecture slides week14-15
Lecture slides week14-15Lecture slides week14-15
Lecture slides week14-15
 
Frequent itemset mining using pattern growth method
Frequent itemset mining using pattern growth methodFrequent itemset mining using pattern growth method
Frequent itemset mining using pattern growth method
 
Dwh lecture slides-week15
Dwh lecture slides-week15Dwh lecture slides-week15
Dwh lecture slides-week15
 
Dwh lecture slides-week10
Dwh lecture slides-week10Dwh lecture slides-week10
Dwh lecture slides-week10
 
Dwh lecture slidesweek7&8
Dwh lecture slidesweek7&8Dwh lecture slidesweek7&8
Dwh lecture slidesweek7&8
 
Dwh lecture slides-week5&6
Dwh lecture slides-week5&6Dwh lecture slides-week5&6
Dwh lecture slides-week5&6
 
Dwh lecture slides-week3&4
Dwh lecture slides-week3&4Dwh lecture slides-week3&4
Dwh lecture slides-week3&4
 
Dwh lecture slides-week2
Dwh lecture slides-week2Dwh lecture slides-week2
Dwh lecture slides-week2
 
Dwh lecture slides-week1
Dwh lecture slides-week1Dwh lecture slides-week1
Dwh lecture slides-week1
 
Dwh lecture slides-week 13
Dwh lecture slides-week 13Dwh lecture slides-week 13
Dwh lecture slides-week 13
 
Dwh lecture slides-week 12&13
Dwh lecture slides-week 12&13Dwh lecture slides-week 12&13
Dwh lecture slides-week 12&13
 
Data warehousing and mining furc
Data warehousing and mining furcData warehousing and mining furc
Data warehousing and mining furc
 
Lecture 40
Lecture 40Lecture 40
Lecture 40
 
Lecture 39
Lecture 39Lecture 39
Lecture 39
 
Lecture 38
Lecture 38Lecture 38
Lecture 38
 
Lecture 37
Lecture 37Lecture 37
Lecture 37
 
Lecture 35
Lecture 35Lecture 35
Lecture 35
 
Lecture 36
Lecture 36Lecture 36
Lecture 36
 

Último

Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...ranjana rawat
 

Último (20)

Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 

Python tutorial

  • 1. Introduction to Python Chen LinChen Lin clin@brandeis.educlin@brandeis.edu COSI 134aCOSI 134a Volen 110Volen 110 Office Hour: Thurs. 3-5Office Hour: Thurs. 3-5
  • 2. For More Information? http://python.org/ - documentation, tutorials, beginners guide, core distribution, ... Books include:  Learning Python by Mark Lutz  Python Essential Reference by David Beazley  Python Cookbook, ed. by Martelli, Ravenscroft and Ascher  (online at http://code.activestate.com/recipes/langs/python/)  http://wiki.python.org/moin/PythonBooks
  • 3. Python VideosPython Videos http://showmedo.com/videotutorials/python “5 Minute Overview (What Does Python Look Like?)” “Introducing the PyDev IDE for Eclipse” “Linear Algebra with Numpy” And many more
  • 4. 4 Major Versions of Python4 Major Versions of Python “Python” or “CPython” is written in C/C++ - Version 2.7 came out in mid-2010 - Version 3.1.2 came out in early 2010 “Jython” is written in Java for the JVM “IronPython” is written in C# for the .Net environment Go To Website
  • 5. Development EnvironmentsDevelopment Environments what IDE to use?what IDE to use? http://stackoverflow.com/questions/81584http://stackoverflow.com/questions/81584 1. PyDev with Eclipse 2. Komodo 3. Emacs 4. Vim 5. TextMate 6. Gedit 7. Idle 8. PIDA (Linux)(VIM Based) 9. NotePad++ (Windows) 10.BlueFish (Linux)
  • 6. Pydev with EclipsePydev with Eclipse
  • 7. Python Interactive ShellPython Interactive Shell % python% python Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin[GCC 4.2.1 (Apple Inc. build 5646)] on darwin Type "help", "copyright", "credits" or "license" for more information.Type "help", "copyright", "credits" or "license" for more information. >>>>>> You can type things directly into a running Python sessionYou can type things directly into a running Python session >>> 2+3*4>>> 2+3*4 1414 >>> name = "Andrew">>> name = "Andrew" >>> name>>> name 'Andrew''Andrew' >>> print "Hello", name>>> print "Hello", name Hello AndrewHello Andrew >>>>>>
  • 8.  BackgroundBackground  Data Types/StructureData Types/Structure  Control flowControl flow  File I/OFile I/O  ModulesModules  ClassClass  NLTKNLTK
  • 9. ListList A compound data type:A compound data type: [0][0] [2.3, 4.5][2.3, 4.5] [5, "Hello", "there", 9.8][5, "Hello", "there", 9.8] [][] Use len() to get the length of a listUse len() to get the length of a list >>> names = [“Ben", “Chen", “Yaqin"]>>> names = [“Ben", “Chen", “Yaqin"] >>> len(names)>>> len(names) 33
  • 10. Use [ ] to index items in the listUse [ ] to index items in the list >>> names[0]>>> names[0] ‘‘Ben'Ben' >>> names[1]>>> names[1] ‘‘Chen'Chen' >>> names[2]>>> names[2] ‘‘Yaqin'Yaqin' >>> names[3]>>> names[3] Traceback (most recent call last):Traceback (most recent call last): File "<stdin>", line 1, in <module>File "<stdin>", line 1, in <module> IndexError: list index out of rangeIndexError: list index out of range >>> names[-1]>>> names[-1] ‘‘Yaqin'Yaqin' >>> names[-2]>>> names[-2] ‘‘Chen'Chen' >>> names[-3]>>> names[-3] ‘‘Ben'Ben' [0] is the first item. [1] is the second item ... Out of range values raise an exception Negative values go backwards from the last element.
  • 11. Strings share many features with listsStrings share many features with lists >>> smiles = "C(=N)(N)N.C(=O)(O)O">>> smiles = "C(=N)(N)N.C(=O)(O)O" >>> smiles[0]>>> smiles[0] 'C''C' >>> smiles[1]>>> smiles[1] '(''(' >>> smiles[-1]>>> smiles[-1] 'O''O' >>> smiles[1:5]>>> smiles[1:5] '(=N)''(=N)' >>> smiles[10:-4]>>> smiles[10:-4] 'C(=O)''C(=O)' Use “slice” notation to get a substring
  • 12. String Methods: find, splitString Methods: find, split smiles = "C(=N)(N)N.C(=O)(O)O"smiles = "C(=N)(N)N.C(=O)(O)O" >>> smiles.find("(O)")>>> smiles.find("(O)") 1515 >>> smiles.find(".")>>> smiles.find(".") 99 >>> smiles.find(".", 10)>>> smiles.find(".", 10) -1-1 >>> smiles.split(".")>>> smiles.split(".") ['C(=N)(N)N', 'C(=O)(O)O']['C(=N)(N)N', 'C(=O)(O)O'] >>>>>> Use “find” to find the start of a substring. Start looking at position 10. Find returns -1 if it couldn’t find a match. Split the string into parts with “.” as the delimiter
  • 13. String operators: in, not inString operators: in, not in if "Br" in “Brother”:if "Br" in “Brother”: print "contains brother“print "contains brother“ email_address = “clin”email_address = “clin” if "@" not in email_address:if "@" not in email_address: email_address += "@brandeis.edu“email_address += "@brandeis.edu“
  • 14. String Method: “strip”, “rstrip”, “lstrip” are ways toString Method: “strip”, “rstrip”, “lstrip” are ways to remove whitespace or selected charactersremove whitespace or selected characters >>> line = " # This is a comment line n">>> line = " # This is a comment line n" >>> line.strip()>>> line.strip() '# This is a comment line''# This is a comment line' >>> line.rstrip()>>> line.rstrip() ' # This is a comment line'' # This is a comment line' >>> line.rstrip("n")>>> line.rstrip("n") ' # This is a comment line '' # This is a comment line ' >>>>>>
  • 15. More String methodsMore String methods email.startswith(“c") endswith(“u”)email.startswith(“c") endswith(“u”) True/FalseTrue/False >>> "%s@brandeis.edu" % "clin">>> "%s@brandeis.edu" % "clin" 'clin@brandeis.edu''clin@brandeis.edu' >>> names = [“Ben", “Chen", “Yaqin"]>>> names = [“Ben", “Chen", “Yaqin"] >>> ", ".join(names)>>> ", ".join(names) ‘‘Ben, Chen, Yaqin‘Ben, Chen, Yaqin‘ >>> “chen".upper()>>> “chen".upper() ‘‘CHEN'CHEN'
  • 16. Unexpected things about stringsUnexpected things about strings >>> s = "andrew">>> s = "andrew" >>> s[0] = "A">>> s[0] = "A" Traceback (most recent call last):Traceback (most recent call last): File "<stdin>", line 1, in <module>File "<stdin>", line 1, in <module> TypeError: 'str' object does not support itemTypeError: 'str' object does not support item assignmentassignment >>> s = "A" + s[1:]>>> s = "A" + s[1:] >>> s>>> s 'Andrew‘'Andrew‘ Strings are read only
  • 17. ““” is for special characters” is for special characters n -> newlinen -> newline t -> tabt -> tab -> backslash -> backslash ...... But Windows uses backslash for directories! filename = "M:nickel_projectreactive.smi" # DANGER! filename = "M:nickel_projectreactive.smi" # Better! filename = "M:/nickel_project/reactive.smi" # Usually works
  • 18. Lists are mutable - some usefulLists are mutable - some useful methodsmethods >>> ids = ["9pti", "2plv", "1crn"]>>> ids = ["9pti", "2plv", "1crn"] >>> ids.append("1alm")>>> ids.append("1alm") >>> ids>>> ids ['9pti', '2plv', '1crn', '1alm']['9pti', '2plv', '1crn', '1alm'] >>>ids.extend(L)>>>ids.extend(L) Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L. >>> del ids[0]>>> del ids[0] >>> ids>>> ids ['2plv', '1crn', '1alm']['2plv', '1crn', '1alm'] >>> ids.sort()>>> ids.sort() >>> ids>>> ids ['1alm', '1crn', '2plv']['1alm', '1crn', '2plv'] >>> ids.reverse()>>> ids.reverse() >>> ids>>> ids ['2plv', '1crn', '1alm']['2plv', '1crn', '1alm'] >>> ids.insert(0, "9pti")>>> ids.insert(0, "9pti") >>> ids>>> ids ['9pti', '2plv', '1crn', '1alm']['9pti', '2plv', '1crn', '1alm'] append an element remove an element sort by default order reverse the elements in a list insert an element at some specified position. (Slower than .append())
  • 19. Tuples:Tuples: sort of an immutable list >>> yellow = (255, 255, 0) # r, g, b>>> yellow = (255, 255, 0) # r, g, b >>> one = (1,)>>> one = (1,) >>> yellow[0]>>> yellow[0] >>> yellow[1:]>>> yellow[1:] (255, 0)(255, 0) >>> yellow[0] = 0>>> yellow[0] = 0 Traceback (most recent call last):Traceback (most recent call last): File "<stdin>", line 1, in <module>File "<stdin>", line 1, in <module> TypeError: 'tuple' object does not support item assignmentTypeError: 'tuple' object does not support item assignment Very common in string interpolation: >>> "%s lives in %s at latitude %.1f" % ("Andrew", "Sweden", 57.7056) 'Andrew lives in Sweden at latitude 57.7'
  • 20. zipping lists togetherzipping lists together >>> names>>> names ['ben', 'chen', 'yaqin']['ben', 'chen', 'yaqin'] >>> gender =>>> gender = [0, 0, 1][0, 0, 1] >>> zip(names, gender)>>> zip(names, gender) [('ben', 0), ('chen', 0), ('yaqin', 1)][('ben', 0), ('chen', 0), ('yaqin', 1)]
  • 21. DictionariesDictionaries  Dictionaries are lookup tables.  They map from a “key” to a “value”. symbol_to_name = { "H": "hydrogen", "He": "helium", "Li": "lithium", "C": "carbon", "O": "oxygen", "N": "nitrogen" }  Duplicate keys are not allowed  Duplicate values are just fine
  • 22. Keys can be any immutable valueKeys can be any immutable value numbers, strings, tuples, frozensetnumbers, strings, tuples, frozenset,, not list, dictionary, set, ...not list, dictionary, set, ... atomic_number_to_name = {atomic_number_to_name = { 1: "hydrogen"1: "hydrogen" 6: "carbon",6: "carbon", 7: "nitrogen"7: "nitrogen" 8: "oxygen",8: "oxygen", }} nobel_prize_winners = {nobel_prize_winners = { (1979, "physics"): ["Glashow", "Salam", "Weinberg"],(1979, "physics"): ["Glashow", "Salam", "Weinberg"], (1962, "chemistry"): ["Hodgkin"],(1962, "chemistry"): ["Hodgkin"], (1984, "biology"): ["McClintock"],(1984, "biology"): ["McClintock"], }} A set is an unordered collection with no duplicate elements.
  • 23. DictionaryDictionary >>> symbol_to_name["C"]>>> symbol_to_name["C"] 'carbon''carbon' >>> "O" in symbol_to_name, "U" in symbol_to_name>>> "O" in symbol_to_name, "U" in symbol_to_name (True, False)(True, False) >>> "oxygen" in symbol_to_name>>> "oxygen" in symbol_to_name FalseFalse >>> symbol_to_name["P"]>>> symbol_to_name["P"] Traceback (most recent call last):Traceback (most recent call last): File "<stdin>", line 1, in <module>File "<stdin>", line 1, in <module> KeyError: 'P'KeyError: 'P' >>> symbol_to_name.get("P", "unknown")>>> symbol_to_name.get("P", "unknown") 'unknown''unknown' >>> symbol_to_name.get("C", "unknown")>>> symbol_to_name.get("C", "unknown") 'carbon''carbon' Get the value for a given key Test if the key exists (“in” only checks the keys, not the values.) [] lookup failures raise an exception. Use “.get()” if you want to return a default value.
  • 24. Some useful dictionary methodsSome useful dictionary methods >>> symbol_to_name.keys()>>> symbol_to_name.keys() ['C', 'H', 'O', 'N', 'Li', 'He']['C', 'H', 'O', 'N', 'Li', 'He'] >>> symbol_to_name.values()>>> symbol_to_name.values() ['carbon', 'hydrogen', 'oxygen', 'nitrogen', 'lithium', 'helium']['carbon', 'hydrogen', 'oxygen', 'nitrogen', 'lithium', 'helium'] >>> symbol_to_name.update( {"P": "phosphorous", "S": "sulfur"} )>>> symbol_to_name.update( {"P": "phosphorous", "S": "sulfur"} ) >>> symbol_to_name.items()>>> symbol_to_name.items() [('C', 'carbon'), ('H', 'hydrogen'), ('O', 'oxygen'), ('N', 'nitrogen'), ('P',[('C', 'carbon'), ('H', 'hydrogen'), ('O', 'oxygen'), ('N', 'nitrogen'), ('P', 'phosphorous'), ('S', 'sulfur'), ('Li', 'lithium'), ('He', 'helium')]'phosphorous'), ('S', 'sulfur'), ('Li', 'lithium'), ('He', 'helium')] >>> del symbol_to_name['C']>>> del symbol_to_name['C'] >>> symbol_to_name>>> symbol_to_name {'H': 'hydrogen', 'O': 'oxygen', 'N': 'nitrogen', 'Li': 'lithium', 'He': 'helium'}{'H': 'hydrogen', 'O': 'oxygen', 'N': 'nitrogen', 'Li': 'lithium', 'He': 'helium'}
  • 25.  BackgroundBackground  Data Types/StructureData Types/Structure list, string, tuple, dictionarylist, string, tuple, dictionary  Control flowControl flow  File I/OFile I/O  ModulesModules  ClassClass  NLTKNLTK
  • 26. Control FlowControl Flow Things that are FalseThings that are False  The boolean value False  The numbers 0 (integer), 0.0 (float) and 0j (complex).  The empty string "".  The empty list [], empty dictionary {} and empty set set(). Things that are TrueThings that are True  The boolean value TrueThe boolean value True  All non-zero numbers.All non-zero numbers.  Any string containing at least one character.Any string containing at least one character.  A non-empty data structure.A non-empty data structure.
  • 27. IfIf >>> smiles = "BrC1=CC=C(C=C1)NN.Cl">>> smiles = "BrC1=CC=C(C=C1)NN.Cl" >>> bool(smiles)>>> bool(smiles) TrueTrue >>> not bool(smiles)>>> not bool(smiles) FalseFalse >>> if not smiles>>> if not smiles:: ... print "The SMILES string is empty"... print "The SMILES string is empty" ......  The “else” case is always optional
  • 28. Use “elif” to chain subsequent testsUse “elif” to chain subsequent tests >>> mode = "absolute">>> mode = "absolute" >>> if mode == "canonical":>>> if mode == "canonical": ...... smiles = "canonical"smiles = "canonical" ... elif mode == "isomeric":... elif mode == "isomeric": ...... smiles = "isomeric”smiles = "isomeric” ...... elif mode == "absolute":elif mode == "absolute": ...... smiles = "absolute"smiles = "absolute" ... else:... else: ...... raise TypeError("unknown mode")raise TypeError("unknown mode") ...... >>> smiles>>> smiles ' absolute '' absolute ' >>>>>> “raise” is the Python way to raise exceptions
  • 29. Boolean logicBoolean logic Python expressions can have “and”s andPython expressions can have “and”s and “or”s:“or”s: if (benif (ben <=<= 5 and chen5 and chen >=>= 10 or10 or chenchen ==== 500 and ben500 and ben !=!= 5):5): print “Ben and Chen“print “Ben and Chen“
  • 30. Range TestRange Test if (3if (3 <= Time <=<= Time <= 5):5): print “Office Hour"print “Office Hour"
  • 31. ForFor >>> names = [“Ben", “Chen", “Yaqin"]>>> names = [“Ben", “Chen", “Yaqin"] >>> for name in names:>>> for name in names: ...... print smilesprint smiles ...... BenBen ChenChen YaqinYaqin
  • 32. Tuple assignment in for loopsTuple assignment in for loops data = [ ("C20H20O3", 308.371),data = [ ("C20H20O3", 308.371), ("C22H20O2", 316.393),("C22H20O2", 316.393), ("C24H40N4O2", 416.6),("C24H40N4O2", 416.6), ("C14H25N5O3", 311.38),("C14H25N5O3", 311.38), ("C15H20O2", 232.3181)]("C15H20O2", 232.3181)] forfor (formula, mw)(formula, mw) in data:in data: print "The molecular weight of %s is %s" % (formula, mw)print "The molecular weight of %s is %s" % (formula, mw) The molecular weight of C20H20O3 is 308.371The molecular weight of C20H20O3 is 308.371 The molecular weight of C22H20O2 is 316.393The molecular weight of C22H20O2 is 316.393 The molecular weight of C24H40N4O2 is 416.6The molecular weight of C24H40N4O2 is 416.6 The molecular weight of C14H25N5O3 is 311.38The molecular weight of C14H25N5O3 is 311.38 The molecular weight of C15H20O2 is 232.3181The molecular weight of C15H20O2 is 232.3181
  • 33. Break, continueBreak, continue >>> for value in [3, 1, 4, 1, 5, 9, 2]:>>> for value in [3, 1, 4, 1, 5, 9, 2]: ...... print "Checking", valueprint "Checking", value ...... if value > 8:if value > 8: ...... print "Exiting for loop"print "Exiting for loop" ...... breakbreak ...... elif value < 3:elif value < 3: ...... print "Ignoring"print "Ignoring" ...... continuecontinue ...... print "The square is", value**2print "The square is", value**2 ...... Use “break” to stopUse “break” to stop the for loopthe for loop Use “continue” to stopUse “continue” to stop processing the current itemprocessing the current item Checking 3Checking 3 The square is 9The square is 9 Checking 1Checking 1 IgnoringIgnoring Checking 4Checking 4 The square is 16The square is 16 Checking 1Checking 1 IgnoringIgnoring Checking 5Checking 5 The square is 25The square is 25 Checking 9Checking 9 Exiting for loopExiting for loop >>>>>>
  • 34. Range()Range()  ““range” creates a list of numbers in a specified rangerange” creates a list of numbers in a specified range  range([start,] stop[, step]) -> list of integersrange([start,] stop[, step]) -> list of integers  When step is given, it specifies the increment (or decrement).When step is given, it specifies the increment (or decrement). >>> range(5)>>> range(5) [0, 1, 2, 3, 4][0, 1, 2, 3, 4] >>> range(5, 10)>>> range(5, 10) [5, 6, 7, 8, 9][5, 6, 7, 8, 9] >>> range(0, 10, 2)>>> range(0, 10, 2) [0, 2, 4, 6, 8][0, 2, 4, 6, 8] How to get every second element in a list? for i in range(0, len(data), 2): print data[i]
  • 35.  BackgroundBackground  Data Types/StructureData Types/Structure  Control flowControl flow  File I/OFile I/O  ModulesModules  ClassClass  NLTKNLTK
  • 36. Reading filesReading files >>> f = open(“names.txt")>>> f = open(“names.txt") >>> f.readline()>>> f.readline() 'Yaqinn''Yaqinn'
  • 37. Quick WayQuick Way >>> lst= [ x for x in open("text.txt","r").readlines() ]>>> lst= [ x for x in open("text.txt","r").readlines() ] >>> lst>>> lst ['Chen Linn', 'clin@brandeis.edun', 'Volen 110n', 'Office['Chen Linn', 'clin@brandeis.edun', 'Volen 110n', 'Office Hour: Thurs. 3-5n', 'n', 'Yaqin Yangn',Hour: Thurs. 3-5n', 'n', 'Yaqin Yangn', 'yaqin@brandeis.edun', 'Volen 110n', 'Offiche Hour:'yaqin@brandeis.edun', 'Volen 110n', 'Offiche Hour: Tues. 3-5n']Tues. 3-5n'] Ignore the header?Ignore the header? for (i,line) in enumerate(open(‘text.txt’,"r").readlines()):for (i,line) in enumerate(open(‘text.txt’,"r").readlines()): if i == 0: continueif i == 0: continue print lineprint line
  • 38. Using dictionaries to countUsing dictionaries to count occurrencesoccurrences >>> for line in open('names.txt'):>>> for line in open('names.txt'): ...... name = line.strip()name = line.strip() ...... name_count[name] = name_count.get(name,0)+ 1name_count[name] = name_count.get(name,0)+ 1 ...... >>> for (name, count) in name_count.items():>>> for (name, count) in name_count.items(): ...... print name, countprint name, count ...... Chen 3Chen 3 Ben 3Ben 3 Yaqin 3Yaqin 3
  • 39. File OutputFile Output input_file = open(“in.txt")input_file = open(“in.txt") output_file = open(“out.txt", "w")output_file = open(“out.txt", "w") for line in input_file:for line in input_file: output_file.write(line)output_file.write(line) “w” = “write mode” “a” = “append mode” “wb” = “write in binary” “r” = “read mode” (default) “rb” = “read in binary” “U” = “read files with Unix or Windows line endings”
  • 40.  BackgroundBackground  Data Types/StructureData Types/Structure  Control flowControl flow  File I/OFile I/O  ModulesModules  ClassClass  NLTKNLTK
  • 41. ModulesModules When a Python program starts it only has access to a basic functions and classes. (“int”, “dict”, “len”, “sum”, “range”, ...) “Modules” contain additional functionality. Use “import” to tell Python to load a module. >>> import math >>> import nltk
  • 42. import the math moduleimport the math module >>> import math>>> import math >>> math.pi>>> math.pi 3.14159265358979313.1415926535897931 >>> math.cos(0)>>> math.cos(0) 1.01.0 >>> math.cos(math.pi)>>> math.cos(math.pi) -1.0-1.0 >>> dir(math)>>> dir(math) ['__doc__', '__file__', '__name__', '__package__', 'acos', 'acosh',['__doc__', '__file__', '__name__', '__package__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos','asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'exp', 'fabs', 'factorial', 'floor', 'fmod','cosh', 'degrees', 'e', 'exp', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'hypot', 'isinf', 'isnan', 'ldexp', 'log', 'log10','frexp', 'fsum', 'hypot', 'isinf', 'isnan', 'ldexp', 'log', 'log10', 'log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan','log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc']'tanh', 'trunc'] >>> help(math)>>> help(math) >>> help(math.cos)>>> help(math.cos)
  • 43. ““import” and “from ... import ...”import” and “from ... import ...” >>> import math>>> import math math.cosmath.cos >>> from math import cos, pi cos >>> from math import *
  • 44.  BackgroundBackground  Data Types/StructureData Types/Structure  Control flowControl flow  File I/OFile I/O  ModulesModules  ClassClass  NLTKNLTK
  • 45. ClassesClasses class ClassName(object):class ClassName(object): <statement-1><statement-1> . . .. . . <statement-N><statement-N> class MyClass(object):class MyClass(object): """A simple example class""""""A simple example class""" i = 1234512345 def f(self):def f(self): return self.ireturn self.i class DerivedClassName(BaseClassName):class DerivedClassName(BaseClassName): <statement-1><statement-1> . . .. . . <statement-N><statement-N>
  • 46.  BackgroundBackground  Data Types/StructureData Types/Structure  Control flowControl flow  File I/OFile I/O  ModulesModules  ClassClass  NLTKNLTK
  • 47. http://www.nltk.org/bookhttp://www.nltk.org/book NLTK is on berry patch machines!NLTK is on berry patch machines! >>>from nltk.book import *>>>from nltk.book import * >>> text1>>> text1 <Text: Moby Dick by Herman Melville 1851><Text: Moby Dick by Herman Melville 1851> >>> text1.name>>> text1.name 'Moby Dick by Herman Melville 1851''Moby Dick by Herman Melville 1851' >>> text1.concordance("monstrous")>>> text1.concordance("monstrous") >>> dir(text1)>>> dir(text1) >>> text1.tokens>>> text1.tokens >>> text1.index("my")>>> text1.index("my") 46474647 >>> sent2>>> sent2 ['The', 'family', 'of', 'Dashwood', 'had', 'long', 'been', 'settled', 'in',['The', 'family', 'of', 'Dashwood', 'had', 'long', 'been', 'settled', 'in', 'Sussex', '.']'Sussex', '.']
  • 48. Classify TextClassify Text >>> def gender_features(word):>>> def gender_features(word): ...... return {'last_letter': word[-1]}return {'last_letter': word[-1]} >>> gender_features('Shrek')>>> gender_features('Shrek') {'last_letter': 'k'}{'last_letter': 'k'} >>> from nltk.corpus import names>>> from nltk.corpus import names >>> import random>>> import random >>> names = ([(name, 'male') for name in names.words('male.txt')] +>>> names = ([(name, 'male') for name in names.words('male.txt')] + ... [(name, 'female') for name in names.words('female.txt')])... [(name, 'female') for name in names.words('female.txt')]) >>> random.shuffle(names)>>> random.shuffle(names)
  • 49. Featurize, train, test, predictFeaturize, train, test, predict >>> featuresets = [(gender_features(n), g) for (n,g) in names]>>> featuresets = [(gender_features(n), g) for (n,g) in names] >>> train_set, test_set = featuresets[500:], featuresets[:500]>>> train_set, test_set = featuresets[500:], featuresets[:500] >>> classifier = nltk.NaiveBayesClassifier.train(train_set)>>> classifier = nltk.NaiveBayesClassifier.train(train_set) >>> print nltk.classify.accuracy(classifier, test_set)>>> print nltk.classify.accuracy(classifier, test_set) 0.7260.726 >>> classifier.classify(gender_features('Neo'))>>> classifier.classify(gender_features('Neo')) 'male''male'
  • 50. fromfrom nltknltk.corpus import.corpus import reutersreuters  Reuters Corpus:Reuters Corpus:10,788 news10,788 news 1.3 million words.1.3 million words.  Been classified intoBeen classified into 9090 topicstopics  Grouped into 2 sets, "training" and "test“Grouped into 2 sets, "training" and "test“  Categories overlap with each otherCategories overlap with each other http://nltk.googlecode.com/svn/trunk/doc/bohttp://nltk.googlecode.com/svn/trunk/doc/bo ok/ch02.htmlok/ch02.html
  • 51. ReutersReuters >>> from nltk.corpus import reuters>>> from nltk.corpus import reuters >>> reuters.fileids()>>> reuters.fileids() ['test/14826', 'test/14828', 'test/14829', 'test/14832', ...]['test/14826', 'test/14828', 'test/14829', 'test/14832', ...] >>> reuters.categories()>>> reuters.categories() ['acq', 'alum', 'barley', 'bop', 'carcass', 'castor-oil', 'cocoa', 'coconut',['acq', 'alum', 'barley', 'bop', 'carcass', 'castor-oil', 'cocoa', 'coconut', 'coconut-oil', 'coffee', 'copper', 'copra-cake', 'corn', 'cotton', 'cotton-'coconut-oil', 'coffee', 'copper', 'copra-cake', 'corn', 'cotton', 'cotton- oil', 'cpi', 'cpu', 'crude', 'dfl', 'dlr', ...]oil', 'cpi', 'cpu', 'crude', 'dfl', 'dlr', ...]