Python & Stuff

               All the things I like about Python, plus a bit more.




Friday, November 4, 11
Jacob Perkins
                         Python Text Processing with NLTK 2.0 Cookbook

                         Co-Founder & CTO @weotta

                         Blog: http://streamhacker.com

                         NLTK Demos: http://text-processing.com

                         @japerk

                         Python user for > 6 years



What I use Python for

                          web development with Django

                          web crawling with Scrapy

                          NLP with NLTK

                          argparse based scripts

                          processing data in Redis & MongoDB



Topics
                         functional programming

                         I/O

                         Object Oriented programming

                         scripting

                         testing

                         remoting

                         parsing

                         package management

                         data storage

                         performance
Functional Programming
                         list comprehensions

                         slicing

                         iterators

                         generators

                         higher order functions

                         decorators

                         default & optional arguments

                         switch/case emulation
List Comprehensions

                         >>> [i for i in range(10) if i % 2]
                         [1, 3, 5, 7, 9]
                         >>> dict([(i, i*2) for i in range(5)])
                         {0: 0, 1: 2, 2: 4, 3: 6, 4: 8}
                         >>> s = set(range(5))
                         >>> [i for i in range(10) if i in s]
                         [0, 1, 2, 3, 4]




Slicing

                         >>> range(10)[:5]
                         [0, 1, 2, 3, 4]
                         >>> range(10)[3:5]
                         [3, 4]
                         >>> range(10)[1:5]
                         [1, 2, 3, 4]
                         >>> range(10)[::2]
                         [0, 2, 4, 6, 8]
                         >>> range(10)[-5:-1]
                         [5, 6, 7, 8]



Iterators

                         >>> i = iter([1, 2, 3])
                         >>> i.next()
                         1
                         >>> i.next()
                         2
                         >>> i.next()
                         3
                         >>> i.next()
                         Traceback (most recent call last):
                           File "<stdin>", line 1, in <module>
                         StopIteration



Generators
                         >>> def gen_ints(n):
                         ...     for i in range(n):
                         ...          yield i
                         ...
                         >>> g = gen_ints(2)
                         >>> g.next()
                         0
                         >>> g.next()
                         1
                         >>> g.next()
                         Traceback (most recent call last):
                           File "<stdin>", line 1, in <module>
                         StopIteration


Higher Order Functions

                          >>> def hof(n):
                          ...      def addn(i):
                          ...          return i + n
                          ...      return addn
                          ...
                          >>> f = hof(5)
                          >>> f(3)
                          8




Decorators
               >>> def print_args(f):
               ...     def g(*args, **kwargs):
               ...         print args, kwargs
               ...         return f(*args, **kwargs)
               ...     return g
               ...
               >>> @print_args
               ... def add2(n):
               ...     return n+2
               ...
               >>> add2(5)
               (5,) {}
               7
               >>> add2(3)
               (3,) {}
               5
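
One thing the decorator above does not do is preserve the wrapped function's name and docstring. A minimal sketch of the same print_args decorator using functools.wraps from the standard library, written in Python 3 syntax (the slides use Python 2), with an illustrative docstring added to add2:

```python
import functools

def print_args(f):
    # functools.wraps copies f's __name__ and __doc__ onto the wrapper g
    @functools.wraps(f)
    def g(*args, **kwargs):
        print(args, kwargs)
        return f(*args, **kwargs)
    return g

@print_args
def add2(n):
    """Return n plus 2."""
    return n + 2

result = add2(5)  # prints the positional and keyword arguments, then returns 7
```

Without functools.wraps, add2.__name__ would report 'g' instead of 'add2'.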
Default & Optional Args
               >>> def special_arg(special=None, *args, **kwargs):
               ...     print 'special:', special
               ...     print args
               ...     print kwargs
               ...
               >>> special_arg(special='hi')
               special: hi
               ()
               {}
               >>>
               >>> special_arg('hi')
               special: hi
               ()
               {}
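
The calls above never pass anything beyond special, so args and kwargs stay empty. A small sketch of what happens with extra arguments, assuming a variant of special_arg that returns its three pieces instead of printing them (Python 3 syntax; the extra values and the flag keyword are illustrative):

```python
def special_arg(special=None, *args, **kwargs):
    # Return the pieces so they can be inspected without parsing printed output
    return special, args, kwargs

# Extra positional values land in args, unknown keywords land in kwargs
first = special_arg('hi', 1, 2, flag=True)

# With no positional values, special falls back to its default
second = special_arg(flag=True)
```

Here first is ('hi', (1, 2), {'flag': True}) and second is (None, (), {'flag': True}).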

switch/case emulation


                             OPTS = {
                                 "a": all,
                                 "b": any
                             }

                             def all_or_any(lst, opt):
                                 return OPTS[opt](lst)
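
A short usage sketch of the dispatch table, plus one possible way to emulate a "default" branch with dict.get. The all_or_any_safe helper and its default parameter are assumptions added for illustration, not part of the slide:

```python
OPTS = {
    "a": all,
    "b": any,
}

def all_or_any(lst, opt):
    # Look up the function for this option and call it
    return OPTS[opt](lst)

def all_or_any_safe(lst, opt, default=any):
    # Hypothetical variant: unknown options fall back to a default function
    return OPTS.get(opt, default)(lst)

strict = all_or_any([True, False], "a")        # all() over the list
loose = all_or_any([True, False], "b")         # any() over the list
fallback = all_or_any_safe([True, False], "z")  # unknown key, uses any()
```

The plain version raises KeyError for an unknown option, which is sometimes the behaviour you want.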




Object Oriented

                         classes

                         multiple inheritance

                         special methods

                         collections

                         defaultdict



Classes
               >>> class A(object):
               ...     def __init__(self):
               ...         self.value = 'a'
               ...
               >>> class B(A):
               ...     def __init__(self):
               ...         super(B, self).__init__()
               ...         self.value = 'b'
               ...
               >>> a = A()
               >>> a.value
               'a'
               >>> b = B()
               >>> b.value
               'b'
Multiple Inheritance
               >>> class B(object):
               ...     def __init__(self):
               ...         self.value = 'b'
               ...
               >>> class C(A, B): pass
               ...
               >>> C().value
               'a'
               >>> class C(B, A): pass
               ...
               >>> C().value
               'b'
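
The result depends on base class order because Python searches the method resolution order (MRO) left to right and neither __init__ calls super(). A self-contained sketch that inspects the MRO directly; the second subclass is named D here (instead of redefining C) so both can be compared side by side:

```python
class A(object):
    def __init__(self):
        self.value = 'a'

class B(object):
    def __init__(self):
        self.value = 'b'

class C(A, B):
    pass

class D(B, A):
    pass

# __mro__ lists the classes in the order attribute lookup visits them
mro_c = [cls.__name__ for cls in C.__mro__]
mro_d = [cls.__name__ for cls in D.__mro__]
```

mro_c is ['C', 'A', 'B', 'object'], so A.__init__ is found first; mro_d is ['D', 'B', 'A', 'object'].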


Special Methods

                         __init__

                         __len__

                         __iter__

                         __contains__

                         __getitem__
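
A minimal sketch of a class implementing the five special methods listed above. The Deck class and its card names are hypothetical, chosen only to show which built-in operation triggers which method:

```python
class Deck(object):
    def __init__(self, cards):
        # Called when the object is created: Deck([...])
        self._cards = list(cards)

    def __len__(self):
        # Called by len(deck)
        return len(self._cards)

    def __iter__(self):
        # Called by "for card in deck" and list(deck)
        return iter(self._cards)

    def __contains__(self, card):
        # Called by "card in deck"
        return card in self._cards

    def __getitem__(self, index):
        # Called by deck[index]
        return self._cards[index]

deck = Deck(['ace', 'king', 'queen'])
```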



collections

                         high performance containers

                         Abstract Base Classes

                         Iterable, Sized, Sequence, Set, Mapping

                         multi-inherit from ABC to mix & match

                         implement only a few special methods, get rest for free
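
A sketch of the "get rest for free" point using the Sequence ABC. It assumes Python 3, where the ABCs live in collections.abc (under Python 2, which the slides use, they were importable from collections directly). The Countdown class is hypothetical:

```python
from collections.abc import Sequence

class Countdown(Sequence):
    """Read-only sequence n, n-1, ..., 1 defined by two special methods."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        # Only non-negative integer indexes are handled in this sketch
        if not 0 <= index < self._n:
            raise IndexError(index)
        return self._n - index

c = Countdown(3)
# __iter__, __contains__, __reversed__, index() and count() are all
# supplied by Sequence on top of __len__ and __getitem__
items = list(c)
backwards = list(reversed(c))
```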


defaultdict
               >>> d = {}
               >>> d['a'] += 2
               Traceback (most recent call last):
                 File "<stdin>", line 1, in <module>
               KeyError: 'a'
               >>> import collections
               >>> d = collections.defaultdict(int)
               >>> d['a'] += 2
               >>> d['a']
               2
               >>> l = collections.defaultdict(list)
               >>> l['a'].append(1)
               >>> l['a']
               [1]

I/O


                         context managers

                         file iteration

                         gevent / eventlet




Context Managers



               >>> with open('myfile', 'w') as f:
               ...     f.write('hello\nworld')
               ...
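
Files are not the only context managers; you can write your own. A minimal sketch using contextlib.contextmanager from the standard library. The tracked function and the events list are hypothetical names used to record the order in which things run:

```python
import contextlib

events = []

@contextlib.contextmanager
def tracked(name):
    # Code before the yield runs on entering the with block
    events.append('enter ' + name)
    try:
        yield name
    finally:
        # Code in finally runs on exit, even if the block raised
        events.append('exit ' + name)

with tracked('job') as label:
    events.append('working on ' + label)
```

After the block, events is ['enter job', 'working on job', 'exit job'].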




File Iteration


               >>> with open('myfile') as f:
               ...     for line in f:
               ...             print line.strip()
               ...
               hello
               world




gevent / eventlet
                         coroutine networking libraries

                         greenlets: “micro-threads”

                         fast event loop

                         monkey-patch standard library

                         http://www.gevent.org/

                         http://www.eventlet.net/


Scripting


                         argparse

                         __main__

                         atexit




argparse
   import argparse

   parser = argparse.ArgumentParser(description='Train a NLTK Classifier')

   parser.add_argument('corpus', help='corpus name/path')
   parser.add_argument('--no-pickle', action='store_true',
     default=False, help="don't pickle")
   parser.add_argument('--trace', default=1, type=int,
     help='How much trace output you want')

   args = parser.parse_args()

   if args.trace:
       print 'have args'
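
parse_args() reads sys.argv by default, but it also accepts an explicit list, which is handy for trying a parser out. A sketch of the same parser wrapped in a hypothetical build_parser function; the corpus name 'movie_reviews' is just an illustrative value:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description='Train a NLTK Classifier')
    parser.add_argument('corpus', help='corpus name/path')
    parser.add_argument('--no-pickle', action='store_true',
                        default=False, help="don't pickle")
    parser.add_argument('--trace', default=1, type=int,
                        help='How much trace output you want')
    return parser

# Passing a list bypasses sys.argv; dashes in option names become underscores
args = build_parser().parse_args(['movie_reviews', '--no-pickle', '--trace', '0'])
defaults = build_parser().parse_args(['movie_reviews'])
```

Note that --no-pickle is read back as args.no_pickle.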
__main__


                         if __name__ == '__main__':
                             do_main_function()




atexit

        def goodbye(name, adjective):
            print 'Goodbye, %s, it was %s to meet you.' % (name, adjective)

        import atexit
        atexit.register(goodbye, 'Donny', 'nice')




Testing

                         doctest

                         unittest

                         nose

                         fudge

                         py.test



doctest
                         def fib(n):
                             '''Return the nth fibonacci number.
                             >>> fib(0)
                             0
                             >>> fib(1)
                             1
                             >>> fib(2)
                             1
                             >>> fib(3)
                             2
                             >>> fib(4)
                             3
                             '''
                             if n == 0: return 0
                             elif n == 1: return 1
                             else: return fib(n - 1) + fib(n - 2)
doctesting modules



                           if __name__ == '__main__':
                               import doctest
                               doctest.testmod()




unittest


                         anything more complicated than function I/O

                         clean state for each test

                         test interactions between components

                         can use mock objects
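
A minimal sketch of the "clean state for each test" point. The Counter class is a hypothetical component under test; setUp builds a fresh one before every test method, so tests cannot leak state into each other:

```python
import unittest

class Counter(object):
    """Hypothetical component under test."""
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

class CounterTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method: each test gets its own Counter
        self.counter = Counter()

    def test_starts_at_zero(self):
        self.assertEqual(self.counter.count, 0)

    def test_increment(self):
        self.counter.increment()
        self.assertEqual(self.counter.count, 1)

# Run the tests programmatically; in a script you would usually
# call unittest.main() under the __main__ guard instead
suite = unittest.TestLoader().loadTestsFromTestCase(CounterTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```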




nose

                         http://readthedocs.org/docs/nose/en/latest/

                         test runner

                         auto-discovery of tests

                         easy plugin system

                         plugins can generate XML for CI (Jenkins)



fudge


                         http://farmdev.com/projects/fudge/

                         make fake objects

                         mock thru monkey-patching




py.test


                         http://pytest.org/latest/

                         similar to nose

                         distributed multi-platform testing




Remoting Libraries



                         Fabric

                         execnet




Fabric


                         http://fabfile.org

                         run commands over ssh

                         great for “push” deployment

                         not parallel yet




fabfile.py
   from fabric.api import run

   def host_type():
       run('uname -s')


                         fab command
   $ fab -H localhost,linuxbox host_type
   [localhost] run: uname -s
   [localhost] out: Darwin
   [linuxbox] run: uname -s
   [linuxbox] out: Linux


execnet
                         http://codespeak.net/execnet/

                         open python interpreters over ssh

                         spawn local python interpreters

                         shared-nothing model

                         send code & data over channels

                         interact with CPython, Jython, PyPy

                         py.test distributed testing

execnet example
   >>> import execnet, os
   >>> gw = execnet.makegateway("ssh=codespeak.net")
   >>> channel = gw.remote_exec("""
   ...      import sys, os
   ...      channel.send((sys.platform, sys.version_info, os.getpid()))
   ... """)
   >>> platform, version_info, remote_pid = channel.receive()
   >>> platform
   'linux2'
   >>> version_info
   (2, 4, 2, 'final', 0)


Parsing


                         regular expressions

                         NLTK

                         SimpleParse
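
For the simplest of the three options, regular expressions, a short sketch with the standard library re module. The log-line format and the parse_line helper are assumptions made up for illustration:

```python
import re

# Named groups make the parsed pieces available by name
LOG_LINE = re.compile(r'(?P<level>[A-Z]+): (?P<message>.*)')

def parse_line(line):
    # Return a dict of the named groups, or None when the line does not match
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

parsed = parse_line('ERROR: disk full')
unparsed = parse_line('no level here')

# A crude regex tokenizer: runs of word characters only
words = re.findall(r'\w+', "Jacob's presentation")
```

The crude tokenizer yields ['Jacob', 's', 'presentation'], dropping the apostrophe entirely.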




NLTK Tokenization


          >>> from nltk import tokenize
          >>> tokenize.word_tokenize("Jacob's presentation")
          ['Jacob', "'s", 'presentation']
          >>> tokenize.wordpunct_tokenize("Jacob's presentation")
          ['Jacob', "'", 's', 'presentation']




nltk.grammar


                         CFGs

                         Chapter 9 of NLTK Book: http://nltk.googlecode.com/svn/trunk/doc/book/ch09.html




more NLTK


                         stemming

                         part-of-speech tagging

                         chunking

                         classification




SimpleParse

                         http://simpleparse.sourceforge.net/

                         Parser generator

                         EBNF grammars

                         Based on mxTextTools: http://www.egenix.com/products/python/mxBase/mxTextTools/ (C extensions)



Package Management


                         import

                         pip

                         virtualenv

                         mercurial




import
                 import module
                 from module import function, ClassName
                 from module import function as f




                         always make sure package directories have __init__.py




pip
                          http://www.pip-installer.org/en/latest/

                          easy_install replacement

                          install from requirements files

                         $ pip install simplejson
                         [... progress report ...]
                         Successfully installed simplejson




virtualenv


                         http://www.virtualenv.org/en/latest/

                         create self-contained python installations

                         dependency silos

                         works great with pip (same author)




mercurial

                         http://mercurial.selenic.com/

                         Python based DVCS

                         simple & fast

                         easy cloning

                         works with Bitbucket, GitHub, Google Code



Flexible Data Storage



                         Redis

                         MongoDB




Redis
                         in-memory key-value storage server

                         most operations O(1)

                         lists

                         sets

                         sorted sets

                         hash objects


MongoDB
                         memory mapped document storage

                         arbitrary document fields

                         nested documents

                         index on multiple fields

                         easier (for programmers) than SQL

                         capped collections (good for logging)


Python Performance



                         CPU

                         RAM




CPU


                         probably fast enough if I/O or DB bound

                         try PyPy: http://pypy.org/

                         use CPython optimized libraries like numpy

                         write a CPython extension




RAM


                         don’t keep references longer than needed

                         iterate over data

                         aggregate to an optimized DB
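
A sketch of the "iterate over data" advice: stream lines through a generator and keep only the running aggregate in memory, rather than loading everything into a list first. The function names are hypothetical, and io.StringIO stands in for a real file so the example is self-contained:

```python
import collections
import io

def read_lines(stream):
    # Generator: yields one line at a time, never the whole file
    for line in stream:
        yield line.rstrip('\n')

def count_words(lines):
    # Only the counts dict grows; each line is discarded after use
    counts = collections.defaultdict(int)
    for line in lines:
        for word in line.split():
            counts[word] += 1
    return counts

stream = io.StringIO(u'hello world\nhello python\n')
counts = count_words(read_lines(stream))
```

With a real file you would pass the object returned by open() in place of the StringIO.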




import this
                     >>> import this
                     The Zen of Python, by Tim Peters

                     Beautiful is better than ugly.
                     Explicit is better than implicit.
                     Simple is better than complex.
                     Complex is better than complicated.
                     Flat is better than nested.
                     Sparse is better than dense.
                     Readability counts.
                     Special cases aren't special enough to break the rules.
                     Although practicality beats purity.
                     Errors should never pass silently.
                     Unless explicitly silenced.
                     In the face of ambiguity, refuse the temptation to guess.
                     There should be one-- and preferably only one --obvious way to do it.
                     Although that way may not be obvious at first unless you're Dutch.
                     Now is better than never.
                     Although never is often better than *right* now.
                     If the implementation is hard to explain, it's a bad idea.
                     If the implementation is easy to explain, it may be a good idea.
                     Namespaces are one honking great idea -- let's do more of those!

08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 

Último (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

Python & Stuff

  • 1. Python & Stuff: All the things I like about Python, plus a bit more. Friday, November 4, 11
  • 2. Jacob Perkins: Python Text Processing with NLTK 2.0 Cookbook; Co-Founder & CTO @weotta; Blog: http://streamhacker.com; NLTK Demos: http://text-processing.com; @japerk; Python user for > 6 years
  • 3. What I use Python for: web development with Django; web crawling with Scrapy; NLP with NLTK; argparse-based scripts; processing data in Redis & MongoDB
  • 4. Topics: functional programming; I/O; object-oriented programming; scripting; testing; remoting; parsing; package management; data storage; performance
  • 5. Functional Programming: list comprehensions; slicing; iterators; generators; higher-order functions; decorators; default & optional arguments; switch/case emulation
  • 6. List Comprehensions
        >>> [i for i in range(10) if i % 2]
        [1, 3, 5, 7, 9]
        >>> dict([(i, i*2) for i in range(5)])
        {0: 0, 1: 2, 2: 4, 3: 6, 4: 8}
        >>> s = set(range(5))
        >>> [i for i in range(10) if i in s]
        [0, 1, 2, 3, 4]
  • 7. Slicing
        >>> range(10)[:5]
        [0, 1, 2, 3, 4]
        >>> range(10)[3:5]
        [3, 4]
        >>> range(10)[1:5]
        [1, 2, 3, 4]
        >>> range(10)[::2]
        [0, 2, 4, 6, 8]
        >>> range(10)[-5:-1]
        [5, 6, 7, 8]
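The slicing session above is Python 2 output, where range() returns a list. As a minimal sketch, assuming Python 3 (where range() is a lazy sequence), the same slices can be reproduced by materializing a list first. The variable names are illustrative only.

```python
# Assumes Python 3, where range() is a lazy sequence rather than a list.
digits = list(range(10))

first_five = digits[:5]      # same values as the slide's range(10)[:5]
every_other = digits[::2]    # a step of 2 keeps every second item
tail = digits[-5:-1]         # negative indices count from the end

# Slicing a range directly yields another range object, not a list.
lazy = range(10)[:5]
```

Wrapping the lazy slice in list() gives the values shown on the slide.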
  • 8. Iterators
        >>> i = iter([1, 2, 3])
        >>> i.next()
        1
        >>> i.next()
        2
        >>> i.next()
        3
        >>> i.next()
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        StopIteration
  • 9. Generators
        >>> def gen_ints(n):
        ...     for i in range(n):
        ...         yield i
        ...
        >>> g = gen_ints(2)
        >>> g.next()
        0
        >>> g.next()
        1
        >>> g.next()
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        StopIteration
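The iterator and generator sessions above call the Python 2 .next() method. A rough Python 3 equivalent, assuming the same gen_ints generator, uses the built-in next() function instead. The default argument in the last call is one way to avoid the StopIteration traceback, not something the slides show.

```python
# Assumes Python 3: the .next() method became __next__, called via next().
def gen_ints(n):
    """Yield the integers 0 .. n-1 one at a time."""
    for i in range(n):
        yield i

g = gen_ints(2)
first = next(g)
second = next(g)

# A default value is returned instead of raising StopIteration.
exhausted = next(g, None)

# list() (like a for loop) handles StopIteration automatically.
collected = list(gen_ints(3))
```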
  • 10. Higher Order Functions
        >>> def hof(n):
        ...     def addn(i):
        ...         return i + n
        ...     return addn
        ...
        >>> f = hof(5)
        >>> f(3)
        8
  • 11. Decorators
        >>> def print_args(f):
        ...     def g(*args, **kwargs):
        ...         print args, kwargs
        ...         return f(*args, **kwargs)
        ...     return g
        ...
        >>> @print_args
        ... def add2(n):
        ...     return n+2
        ...
        >>> add2(5)
        (5,) {}
        7
        >>> add2(3)
        (3,) {}
        5
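One detail the decorator slide leaves out is that the wrapper g replaces the name and docstring of the decorated function. The sketch below, assuming Python 3 print() syntax, adds functools.wraps to preserve them. The docstrings are additions for illustration.

```python
import functools

def print_args(f):
    """Print the positional and keyword arguments of each call to f."""
    @functools.wraps(f)  # copy f's __name__ and __doc__ onto the wrapper
    def g(*args, **kwargs):
        print(args, kwargs)
        return f(*args, **kwargs)
    return g

@print_args
def add2(n):
    """Return n plus two."""
    return n + 2

result = add2(5)  # prints: (5,) {}
```

Without functools.wraps, add2.__name__ would report 'g', which can make tracebacks and doctests harder to follow.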
  • 12. Default & Optional Args
        >>> def special_arg(special=None, *args, **kwargs):
        ...     print 'special:', special
        ...     print args
        ...     print kwargs
        ...
        >>> special_arg(special='hi')
        special: hi
        ()
        {}
        >>>
        >>> special_arg('hi')
        special: hi
        ()
        {}
  • 13. switch/case emulation
        OPTS = {
            "a": all,
            "b": any
        }

        def all_or_any(lst, opt):
            return OPTS[opt](lst)
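The dictionary dispatch above has no "default" branch: an unknown key raises KeyError. A possible extension, shown as a sketch, handles the missing case explicitly. Raising ValueError is an assumed design choice and is not from the slides.

```python
# Map option keys to the built-in functions that handle them.
OPTS = {
    "a": all,
    "b": any,
}

def all_or_any(lst, opt):
    """Apply all() or any() to lst depending on opt ("a" or "b")."""
    try:
        handler = OPTS[opt]
    except KeyError:
        # Plays the role of a "default" branch; raising is an assumed choice.
        raise ValueError("unknown option: %r" % (opt,))
    return handler(lst)

mixed = [True, False, True]
```

Here all_or_any(mixed, "a") evaluates all(mixed) and all_or_any(mixed, "b") evaluates any(mixed).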
  • 14. Object Oriented: classes; multiple inheritance; special methods; collections; defaultdict
  • 15. Classes
        >>> class A(object):
        ...     def __init__(self):
        ...         self.value = 'a'
        ...
        >>> class B(A):
        ...     def __init__(self):
        ...         super(B, self).__init__()
        ...         self.value = 'b'
        ...
        >>> a = A()
        >>> a.value
        'a'
        >>> b = B()
        >>> b.value
        'b'
  • 16. Multiple Inheritance
        >>> class B(object):
        ...     def __init__(self):
        ...         self.value = 'b'
        ...
        >>> class C(A, B): pass
        ...
        >>> C().value
        'a'
        >>> class C(B, A): pass
        ...
        >>> C().value
        'b'
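The two results above follow from the method resolution order (MRO): Python searches base classes left to right, so the first base listed supplies __init__. The sketch below makes that order visible. The class names AB and BA are hypothetical, since the slide reuses the name C for both orders.

```python
class A(object):
    def __init__(self):
        self.value = 'a'

class B(object):
    def __init__(self):
        self.value = 'b'

class AB(A, B):   # hypothetical name for the slide's first C
    pass

class BA(B, A):   # hypothetical name for the slide's second C
    pass

# __mro__ lists the classes searched for an attribute, in order.
ab_order = [cls.__name__ for cls in AB.__mro__]
ba_order = [cls.__name__ for cls in BA.__mro__]
```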
  • 17. Special Methods: __init__, __len__, __iter__, __contains__, __getitem__
  • 18. collections: high performance containers; Abstract Base Classes (Iterable, Sized, Sequence, Set, Mapping); multi-inherit from ABC to mix & match; implement only a few special methods, get rest for free
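To illustrate "implement only a few special methods, get rest for free", here is a minimal sketch assuming Python 3, where the ABCs live in collections.abc (in Python 2 they were importable from collections directly). The Countdown class is a hypothetical example, not from the talk.

```python
from collections.abc import Sequence  # Python 3 location of the ABCs

class Countdown(Sequence):
    """Hypothetical read-only sequence n, n-1, ..., 1."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slicing is not supported in this sketch")
        if index < 0:
            index += self._n
        if not 0 <= index < self._n:
            raise IndexError(index)
        return self._n - index

c = Countdown(3)
```

Only __len__ and __getitem__ are written by hand; iteration, membership tests, reversed(), index() and count() come from the Sequence mixin methods.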
  • 19. defaultdict
        >>> d = {}
        >>> d['a'] += 2
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        KeyError: 'a'
        >>> import collections
        >>> d = collections.defaultdict(int)
        >>> d['a'] += 2
        >>> d['a']
        2
        >>> l = collections.defaultdict(list)
        >>> l['a'].append(1)
        >>> l['a']
        [1]
  • 20. I/O: context managers; file iteration; gevent / eventlet
  • 21. Context Managers
        >>> with open('myfile', 'w') as f:
        ...     f.write('hello\nworld')
        ...
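The slide shows the built-in file context manager. Writing your own is also possible; the sketch below uses contextlib.contextmanager from the standard library. The helper name temp_text_file is hypothetical and the cleanup policy (delete on exit) is an assumption chosen for illustration.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def temp_text_file(text):
    """Hypothetical helper: yield the path of a temp file containing text."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(text)
        yield path
    finally:
        # Runs even if the body of the with block raises.
        os.remove(path)

with temp_text_file('hello\nworld') as path:
    with open(path) as f:
        lines = [line.strip() for line in f]
    saved_path = path
```

Code before the yield acts as setup, and the finally clause acts as teardown once the with block exits.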
  • 22. File Iteration
        >>> with open('myfile') as f:
        ...     for line in f:
        ...         print line.strip()
        ...
        hello
        world
  • 23. gevent / eventlet: coroutine networking libraries; greenlets: “micro-threads”; fast event loop; monkey-patch standard library; http://www.gevent.org/ and http://www.eventlet.net/
  • 24. Scripting: argparse; __main__; atexit
  • 25. argparse
        import argparse

        parser = argparse.ArgumentParser(description='Train a NLTK Classifier')
        parser.add_argument('corpus', help='corpus name/path')
        parser.add_argument('--no-pickle', action='store_true', default=False,
            help="don't pickle")
        parser.add_argument('--trace', default=1, type=int,
            help='How much trace output you want')
        args = parser.parse_args()

        if args.trace:
            print 'have args'
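parse_args() reads sys.argv by default, which makes the parser awkward to try out interactively. As a sketch, the same parser can be given an explicit argument list. The build_parser wrapper and the sample corpus name 'movie_reviews' are illustrative assumptions.

```python
import argparse

def build_parser():
    """Build the parser from the slide; wrapping it in a function is an assumption."""
    parser = argparse.ArgumentParser(description='Train a NLTK Classifier')
    parser.add_argument('corpus', help='corpus name/path')
    parser.add_argument('--no-pickle', action='store_true', default=False,
                        help="don't pickle")
    parser.add_argument('--trace', default=1, type=int,
                        help='How much trace output you want')
    return parser

# Passing a list bypasses sys.argv; 'movie_reviews' is only a sample value.
args = build_parser().parse_args(['movie_reviews', '--no-pickle', '--trace', '0'])
defaults = build_parser().parse_args(['movie_reviews'])
```

Note that argparse converts the dash in --no-pickle to an underscore, so the value is read as args.no_pickle.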
  • 26. __main__
        if __name__ == '__main__':
            do_main_function()
  • 27. atexit
        def goodbye(name, adjective):
            print 'Goodbye, %s, it was %s to meet you.' % (name, adjective)

        import atexit
        atexit.register(goodbye, 'Donny', 'nice')
  • 28. Testing: doctest; unittest; nose; fudge; py.test
  • 29. doctest
        def fib(n):
            '''Return the nth fibonacci number.
            >>> fib(0)
            0
            >>> fib(1)
            1
            >>> fib(2)
            1
            >>> fib(3)
            2
            >>> fib(4)
            3
            '''
            if n == 0:
                return 0
            elif n == 1:
                return 1
            else:
                return fib(n - 1) + fib(n - 2)
  • 30. doctesting modules
        if __name__ == '__main__':
            import doctest
            doctest.testmod()
  • 31. unittest: anything more complicated than function I/O; clean state for each test; test interactions between components; can use mock objects
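A small sketch of the unittest points above: setUp gives each test a clean state, and a mock object stands in for a collaborating component. It assumes Python 3, where unittest.mock is part of the standard library (in 2011 mock was a separate package). The Greeter class and its sender are hypothetical.

```python
import unittest
from unittest import mock  # standard library since Python 3.3

class Greeter(object):
    """Hypothetical component that depends on a sender object."""
    def __init__(self, sender):
        self.sender = sender

    def greet(self, name):
        return self.sender.send('Hello, %s' % name)

class GreeterTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, giving each one a clean state.
        self.sender = mock.Mock()
        self.sender.send.return_value = True
        self.greeter = Greeter(self.sender)

    def test_greet_sends_message(self):
        self.assertTrue(self.greeter.greet('Donny'))
        self.sender.send.assert_called_once_with('Hello, Donny')

    def test_sender_starts_unused(self):
        self.assertFalse(self.sender.send.called)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(GreeterTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The second test passes only because setUp rebuilds the mock each time, so calls made in one test do not leak into another.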
  • 32. nose (http://readthedocs.org/docs/nose/en/latest/): test runner; auto-discovery of tests; easy plugin system; plugins can generate XML for CI (Jenkins)
  • 33. fudge (http://farmdev.com/projects/fudge/): make fake objects; mock through monkey-patching
  • 34. py.test (http://pytest.org/latest/): similar to nose; distributed multi-platform testing
  • 35. Remoting Libraries: Fabric; execnet
  • 36. Fabric (http://fabfile.org): run commands over ssh; great for “push” deployment; not parallel yet
  • 37. fabfile.py
        from fabric.api import run

        def host_type():
            run('uname -s')

      fab command
        $ fab -H localhost,linuxbox host_type
        [localhost] run: uname -s
        [localhost] out: Darwin
        [linuxbox] run: uname -s
        [linuxbox] out: Linux
  • 38. execnet (http://codespeak.net/execnet/): open python interpreters over ssh; spawn local python interpreters; shared-nothing model; send code & data over channels; interact with CPython, Jython, PyPy; py.test distributed testing
  • 39. execnet example
        >>> import execnet, os
        >>> gw = execnet.makegateway("ssh=codespeak.net")
        >>> channel = gw.remote_exec("""
        ...     import sys, os
        ...     channel.send((sys.platform, sys.version_info, os.getpid()))
        ... """)
        >>> platform, version_info, remote_pid = channel.receive()
        >>> platform
        'linux2'
        >>> version_info
        (2, 4, 2, 'final', 0)
  • 40. Parsing: regular expressions; NLTK; SimpleParse
  • 41. NLTK Tokenization
        >>> from nltk import tokenize
        >>> tokenize.word_tokenize("Jacob's presentation")
        ['Jacob', "'s", 'presentation']
        >>> tokenize.wordpunct_tokenize("Jacob's presentation")
        ['Jacob', "'", 's', 'presentation']
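For the "regular expressions" item in the Parsing list, here is a regex-only sketch that needs no NLTK install. The pattern splits text into runs of word characters and runs of punctuation, which reproduces the wordpunct_tokenize output shown above for this input. Whether it agrees with NLTK in every edge case is not verified here, and the function name regex_tokenize is hypothetical.

```python
import re

# \w+ matches runs of word characters; [^\w\s]+ matches runs of punctuation.
TOKEN_RE = re.compile(r"\w+|[^\w\s]+")

def regex_tokenize(text):
    """Split text into alphanumeric tokens and punctuation tokens."""
    return TOKEN_RE.findall(text)

tokens = regex_tokenize("Jacob's presentation")
```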
  • 42. nltk.grammar: CFGs; Chapter 9 of NLTK Book: http://nltk.googlecode.com/svn/trunk/doc/book/ch09.html
  • 43. more NLTK: stemming; part-of-speech tagging; chunking; classification
  • 44. SimpleParse (http://simpleparse.sourceforge.net/): parser generator; EBNF grammars; based on mxTextTools (C extensions): http://www.egenix.com/products/python/mxBase/mxTextTools/
  • 45. Package Management: import; pip; virtualenv; mercurial
  • 46. import
        import module
        from module import function, ClassName
        from module import function as f
      always make sure package directories have __init__.py
  • 47. pip (http://www.pip-installer.org/en/latest/): easy_install replacement; install from requirements files
        $ pip install simplejson
        [... progress report ...]
        Successfully installed simplejson
  • 48. virtualenv (http://www.virtualenv.org/en/latest/): create self-contained python installations; dependency silos; works great with pip (same author)
  • 49. mercurial (http://mercurial.selenic.com/): Python-based DVCS; simple & fast; easy cloning; works with Bitbucket, GitHub, Google Code
  • 50. Flexible Data Storage: Redis; MongoDB
  • 51. Redis: in-memory key-value storage server; most operations O(1); lists; sets; sorted sets; hash objects
  • 52. MongoDB: memory-mapped document storage; arbitrary document fields; nested documents; index on multiple fields; easier (for programmers) than SQL; capped collections (good for logging)
  • 53. Python Performance: CPU; RAM
  • 54. CPU: probably fast enough if I/O or DB bound; try PyPy (http://pypy.org/); use CPython optimized libraries like numpy; write a CPython extension
  • 55. RAM: don’t keep references longer than needed; iterate over data; aggregate to an optimized DB
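The "iterate over data" advice can be sketched with a generator pipeline: each value is parsed, used and discarded, so memory use stays flat however long the input is. The function names and sample data are hypothetical, and a short list stands in for a large file object.

```python
def read_numbers(lines):
    """Yield one parsed integer at a time instead of building a list."""
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)

def running_total(numbers):
    """Consume any iterable of numbers while holding only the sum."""
    total = 0
    for n in numbers:
        total += n
    return total

# A list stands in for a large file object here; both iterate line by line.
sample_lines = ['1\n', '2\n', '\n', '3\n']
total = running_total(read_numbers(sample_lines))
```

Passing an open file instead of sample_lines would work the same way, since file objects also yield one line per iteration.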
  • 56. import this
        >>> import this
        The Zen of Python, by Tim Peters

        Beautiful is better than ugly.
        Explicit is better than implicit.
        Simple is better than complex.
        Complex is better than complicated.
        Flat is better than nested.
        Sparse is better than dense.
        Readability counts.
        Special cases aren't special enough to break the rules.
        Although practicality beats purity.
        Errors should never pass silently.
        Unless explicitly silenced.
        In the face of ambiguity, refuse the temptation to guess.
        There should be one-- and preferably only one --obvious way to do it.
        Although that way may not be obvious at first unless you're Dutch.
        Now is better than never.
        Although never is often better than *right* now.
        If the implementation is hard to explain, it's a bad idea.
        If the implementation is easy to explain, it may be a good idea.
        Namespaces are one honking great idea -- let's do more of those!