2. What is Python?
A dynamic object-oriented programming language.
Python is a programming language that lets you
work more quickly and integrate your systems more
effectively. You can learn to use Python and see
almost immediate gains in productivity and lower
maintenance costs.
– http://www.python.org/
5. Who is using Python?
spider and search engine
Yahoo Maps, Yahoo Groups
Python Success Stories
– http://www.python.org/about/success/
– Star Wars! : http://www.youtube.com/watch?v=RqhUz2vh6lA
– http://lineofthought.com/tools/python
6. My experience with Python
• 2007-2010
– Web Automation testing : MaxQ
– Crawling web news for vertical search : beautifulsoup, lxml, mechanize
• 2011
– Text/file processing – hadoop?
• 2013
– scrapy (twisted) vs. gcrawler (gevent)
7. What can Python do?
• XML processing
• Web Application
• Off-line computation
• Operation scripts
– puppet/chef(ruby)
– salt(python)
• NLP Processing
– has strong numeric processing capability : matrix
operations, etc
– Suitable for probability and machine learning code.
– NLTK : Natural Language Toolkit
8. • data analysis
• machine learning
• Big data : R: http://www.xmind.net/m/LKF2/
13. Easy to get started!
• http://www.python.org/doc/
• <<Dive into Python>> : http://www.diveintopython.net/toc/index.html
• Python standard libraries: http://docs.python.org/2/library/index.html
• Google
• PyPI : http://pypi.python.org/pypi
– There are currently 33961 packages
• PyCon : http://www.pycon.org/
• Practice, practice, practice
14. Getting started and Installation
• Windows : find the install package here
http://www.python.org/download/releases/2.7.5/
• Linux : Python generally comes installed with the operating
system; if not, try
– CentOS/Red Hat : yum install python
– Ubuntu : sudo apt-get install python2.7
– wget http://www.python.org/ftp/python/2.7.5/Python-2.7.5.tgz
(./configure && make && make install)
15. Using the python interpreter
• Indentation
• # (comment)
• """ (docstring)
• Variables are created when they are assigned.
Names are case sensitive.
• if __name__ == "__main__":
– __name__ is a built-in variable which evaluates to the name of the
current module.
– Is the module being run directly or being imported?
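The __name__ check above can be sketched as follows. The function name main and its return value are made up for illustration; only the final test on __name__ is the idiom itself.

```python
def main():
    # Hypothetical entry point, named "main" only for illustration.
    return "running as a script"

# __name__ is "__main__" only when this file is executed directly;
# when the file is imported, __name__ holds the module's own name
# and the block below is skipped.
if __name__ == "__main__":
    print(main())
```

Importing this file from another module defines main() but prints nothing.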
21. Tuple
• A tuple is an immutable list. A tuple cannot
be changed in any way once it is created.
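A small sketch of what "immutable" means in practice. The variable names are illustrative only. Assigning to an element raises TypeError, so a "changed" tuple is really a new tuple built from the old one.

```python
point = (3, 4)

try:
    point[0] = 10          # tuples do not support item assignment
except TypeError:
    changed = False
else:
    changed = True

# To get a different value, build a new tuple from pieces of the old one.
moved = (10,) + point[1:]
```

After this runs, point is still (3, 4) and moved is a separate tuple.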
23. Defining Functions
>>> def fib(start=0, n=2000):
...     "Return a Fibonacci-style series up to n, starting from start"
...     result = []
...     a, b = start, start+1
...     while b < n:
...         result.append(b)
...         a, b = b, a+b
...     return result
...
>>> f1 = fib(5)
>>> f1
[6, 11, 17, 28, 45, 73, 118, 191, 309, 500, 809, 1309]
>>> f2 = fib(0, 1000)
>>> f2
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987]
24. Lambda Functions
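Anonymous functions in Python
Type help() for help documents
Python supports an interesting syntax that lets you quickly define one-line
mini-functions. These functions, called lambda, are borrowed from Lisp and
can be used anywhere a function is required.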
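A minimal sketch of lambda in use. The names double, words and by_length are illustrative. A lambda is a single expression whose value is returned, and it can be passed wherever a function is expected, for example as a sort key.

```python
# A lambda bound to a name behaves like a one-line def.
double = lambda x: x * 2

words = ['Mary', 'had', 'a', 'little', 'lamb']

# Passing a lambda directly as the key function for sorted().
by_length = sorted(words, key=lambda w: len(w))
```

Here double(21) gives 42, and by_length orders the words from shortest to longest.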
25.
>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
>>> for i in range(len(a)):
...     print i, a[i]
...
0 Mary
1 had
2 a
3 little
4 lamb
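The same loop is often written with the built-in enumerate, which yields index and item together. This is an alternative sketch, not what the slide shows. It uses print() with a single argument so that it runs on both Python 2.7 and Python 3.

```python
a = ['Mary', 'had', 'a', 'little', 'lamb']

lines = []
for i, word in enumerate(a):      # enumerate yields (index, item) pairs
    lines.append("%d %s" % (i, word))

print("\n".join(lines))
```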
27. Modules
• A module is a file containing Python definitions and
statements. The file name is the module name with the
suffix .py appended. Within a module, the module’s name is
available as the value of the global variable __name__.
– A module can contain executable statements as well as function
definitions. These statements are intended to initialize the module.
They are executed only the first time the module name is encountered
in an import statement.
– Each module has its own private symbol table, which is used as the
global symbol table by all functions defined in the module.
– When a module named spam is imported, the interpreter first
searches for a built-in module with that name. If not found, it then
searches for a file named spam.py in a list of directories given by the
variable sys.path.
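The lookup and one-time initialization described above can be tried with a throwaway module. This is only a sketch: the module name spam_demo and the temporary directory are assumptions made for the demo, not part of the slides.

```python
import os
import sys
import tempfile

# Write a tiny module file into a fresh temporary directory.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "spam_demo.py"), "w") as f:
    f.write("print('initializing %s' % __name__)\n")   # executable statement
    f.write("def greet():\n")
    f.write("    return 'hello from ' + __name__\n")

# The interpreter searches the directories listed in sys.path.
sys.path.insert(0, module_dir)

import spam_demo      # module body runs here, printing the message once
import spam_demo      # already loaded: the body is not executed again

message = spam_demo.greet()
```

Inside the module, __name__ is the module name, so message is 'hello from spam_demo'.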
28. Packages
• Packages are a way of structuring Python's module
namespace by using "dotted module names".
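• The __init__.py files are required to make Python treat the
directories as containing packages; this is done to prevent
directories with a common name, such as string, from
unintentionally hiding valid modules that occur later on the
module search path.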
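A minimal sketch of a package on disk. The package name shapes_demo and its module square are made up for the demo. Note that Python 3.3 and later can also import directories without __init__.py as namespace packages, while Python 2, which these slides target, requires the file.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "shapes_demo")
os.mkdir(pkg_dir)

# An empty __init__.py marks the directory as a package.
open(os.path.join(pkg_dir, "__init__.py"), "w").close()

# A module inside the package.
with open(os.path.join(pkg_dir, "square.py"), "w") as f:
    f.write("def area(side):\n")
    f.write("    return side * side\n")

sys.path.insert(0, root)

# The dotted module name is shapes_demo.square.
from shapes_demo import square

result = square.area(3)
```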
29. What can you do with Excel?
• 1. Read/write a plain CSV file
• 2. Use the csv module to do it
• 3. Search PyPI for excel
– http://www.simplistix.co.uk/presentations/python-excel.pdf
Add C:\Python27 to the Windows PATH to run python from cmd
http://docs.python.org/2/library/types.html
Using lists as stacks/queues
It is also possible to use a list as a queue, where the first element added is the first element retrieved (“first-in, first-out”); however, lists are not efficient for this purpose. While appends and pops from the end of list are fast, doing inserts or pops from the beginning of a list is slow (because all of the other elements have to be shifted by one). To implement a queue, use collections.deque which was designed to have fast appends and pops from both ends.
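A short sketch of both uses. The element values are arbitrary. A plain list works well as a stack, and collections.deque serves as the queue.

```python
from collections import deque

# Stack: append and pop both work on the end of the list.
stack = [3, 4]
stack.append(5)
top = stack.pop()            # last in, first out

# Queue: deque supports fast appends and pops at both ends.
queue = deque(["Eric", "John"])
queue.append("Terry")        # enqueue on the right
first = queue.popleft()      # dequeue from the left, no shifting
```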
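Tuples are faster than lists. If you define a constant set of values and the only thing you will ever do with it is iterate over it, use a tuple instead of a list. "Write-protecting" data that does not need to change also makes your code safer. Tuples can be used as dictionary keys, but lists cannot. Dictionary keys must be immutable. A tuple itself is immutable, but a tuple that contains a list counts as mutable and is not safe to use as a dictionary key. Only tuples of strings, integers, or other dictionary-safe tuples can be used as dictionary keys.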
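The dictionary-key rule can be checked directly. The keys and values below are invented for illustration. A tuple of integers works as a key, while a list, or a tuple that contains a list, raises TypeError because it is unhashable.

```python
grid = {(0, 0): "origin", (1, 2): "marker"}
value = grid[(1, 2)]               # tuple of integers: a valid key

try:
    grid[[3, 4]] = "list key"      # lists are mutable, hence unhashable
except TypeError:
    list_key_ok = False
else:
    list_key_ok = True

try:
    grid[([3], 4)] = "tuple holding a list"   # the inner list makes it unhashable
except TypeError:
    nested_key_ok = False
else:
    nested_key_ok = True
```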
All variable assignments in a function store the value in the local symbol table, whereas variable references first look in the local symbol table, then in the global symbol table, and then in the table of built-in names.
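A small sketch of that lookup order. The function and variable names are illustrative only. Assignment inside a function creates a local name, while a plain reference falls through to the global table and then to the built-ins.

```python
counter = 0                 # lives in the global symbol table

def assign_local():
    counter = 99            # assignment: creates a new local name
    return counter

def read_global():
    return counter          # no local "counter": found in the global table

def use_builtin():
    return len("abc")       # "len" is neither local nor global: built-in
```

Calling assign_local() returns 99 but leaves the global counter at 0.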
map(...)
map(function, sequence[, sequence, ...]) -> list
Return a list of the results of applying the function to the items of the argument sequence(s). If more than one sequence is given, the function is called with an argument list consisting of the corresponding item of each sequence, substituting None for missing values when not all sequences have the same length. If the function is None, return a list of the items of the sequence (or a list of tuples if more than one sequence).
reduce(...)
reduce(function, sequence[, initial]) -> value
Apply a function of two arguments cumulatively to the items of a sequence, from left to right, so as to reduce the sequence to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). If initial is present, it is placed before the items of the sequence in the calculation, and serves as a default when the sequence is empty.
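A sketch of both functions, written so that it runs on Python 2.7 and Python 3. On Python 3, map returns an iterator (hence the list() calls) and reduce must be imported from functools. The None-padding behaviour for sequences of unequal length described above is specific to Python 2 and is not shown here.

```python
from functools import reduce   # also available in Python 2.6+

# One sequence: apply the function to each item.
squares = list(map(lambda x: x * x, [1, 2, 3, 4, 5]))

# Two sequences of equal length: the function gets one item from each.
sums = list(map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30]))

# reduce folds left to right: ((((1+2)+3)+4)+5)
total = reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])

# With an initial value: it is placed before the items ...
total_from_100 = reduce(lambda x, y: x + y, [1, 2, 3, 4, 5], 100)

# ... and serves as the result when the sequence is empty.
empty_default = reduce(lambda x, y: x + y, [], 0)
```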
http://docs.python.org/2/library/functions.html
http://docs.python.org/2/tutorial/modules.html
If you quit from the Python interpreter and enter it again, the definitions you have made (functions and variables) are lost. Therefore, if you want to write a somewhat longer program, you are better off using a text editor to prepare the input for the interpreter and running it with that file as input instead. This is known as creating a script. As your program gets longer, you may want to split it into several files for easier maintenance. You may also want to use a handy function that you’ve written in several programs without copying its definition into each program.
Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a module can be imported into other modules or into the main module.
The csv module implements classes to read and write tabular data in CSV format.
http://docs.python.org/2/library/csv.html
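A minimal round trip with csv.writer and csv.reader. The rows and the file name scores.csv are invented for the demo. This sketch assumes Python 3, where the file is opened in text mode with newline=""; under Python 2.7 the file would be opened in binary mode ("wb" / "rb") instead.

```python
import csv
import os
import tempfile

# Made-up sample data; the second name contains a comma on purpose.
rows = [["name", "score"], ["Mary", "90"], ["Lamb, Jr.", "85"]]
path = os.path.join(tempfile.mkdtemp(), "scores.csv")

# Write all rows; the writer quotes fields that contain commas.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Read them back; every field comes back as a string.
with open(path, newline="") as f:
    loaded = list(csv.reader(f))
```

The resulting file can be opened directly in Excel, which is what makes CSV a simple first option before reaching for an Excel-specific package.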