>>> F='3.5" or 5.25" Floppy?'; F
'3.5" or 5.25" Floppy?'
>>> D="F(x)=x^2, f'(x)=2x";D
"F(x)=x^2, f'(x)=2x"
You can use multi-line strings inside 3 quotes:
>>> ALongString="""First line
... second line
...
... many lines
... """
>>> print ALongString
First line
second line
many lines
>>> print "A new line:\n and a tab \t character. This \\ is one backslash!"
A new line:
and a tab character. This \ is one backslash!
Raw strings do not interpret the backslashes:
>>> print r"A new line:\n and a tab \t character. This \\ are two backslashes!"
A new line:\n and a tab \t character. This \\ are two backslashes!
#!/usr/bin/python
# -*- coding: utf-8 -*-
print "German Umlauts: ÄÖÜäüöß"
The encoding of the file and the coding in the second line have to match.
>>> previous = {0: 0, 1: 1}
>>> def fibonacci(n):
...     if previous.has_key(n):
...         return previous[n]
...     else:
...         new_value = fibonacci(n-1) + fibonacci(n-2)
...         previous[n] = new_value
...         return new_value
...
>>> fibonacci(40)
102334155
>>> fibonacci(50)
12586269025L
>>> fibonacci(60)
1548008755920L
>>> fibonacci(80)
23416728348467685L
>>> fibonacci(100)
354224848179261915075L
When a script grows, you will want to split it into several files, or at least move some functions into a separate file so that other scripts can reuse them.
This is called writing a module and importing from it. Imagine we put the definitions of fibonacci and previous into a file fib.py. The file fib.py then becomes the module named fib (the file name without its extension):
previous = {0: 0, 1: 1}
def fibonacci(n):
    if previous.has_key(n):
        return previous[n]
    else:
        new_value = fibonacci(n-1) + fibonacci(n-2)
        previous[n] = new_value
        return new_value
if __name__=="__main__": # not in module mode:
    print "Testing: ", fibonacci(100)
else:
    print "As module with name", __name__
>>> import fib
As module with name fib
>>> fib
>>> dir(fib)
['__builtins__', '__doc__', '__file__', '__name__', 'fibonacci', 'previous']
>>> fib.fibonacci(50)
12586269025L
>>> fib.previous
{0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5, 6: 8, 7: 13, 8: 21, 9: 34, 10: 55,
 11: 89, 12: 144, 13: 233, 14: 377, 15: 610, 16: 987, 17: 1597, 18: 2584,
 19: 4181, 20: 6765, 21: 10946, 22: 17711, 23: 28657, 24: 46368, 25: 75025,
 26: 121393, 27: 196418, 28: 317811, 29: 514229, 30: 832040, 31: 1346269,
 32: 2178309, 33: 3524578, 34: 5702887, 35: 9227465, 36: 14930352,
 37: 24157817, 38: 39088169, 39: 63245986, 40: 102334155, 41: 165580141,
 42: 267914296, 43: 433494437, 44: 701408733, 45: 1134903170,
 46: 1836311903, 47: 2971215073L, 48: 4807526976L, 49: 7778742049L,
 50: 12586269025L}
Note the use of fib.previous instead of previous. The module comes with its own namespace. We could also use from fib import *. Then fibonacci and previous belong to the global namespace. Another form is from fib import fibonacci which only imports fibonacci and not previous.
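The import forms described above can be tried out with a small self-contained sketch. It writes a stand-in module to a temporary directory first; the module name fibdemo and the variable names are assumptions made for this illustration, and the membership test uses `in` instead of has_key so that the sketch also runs under Python 3.

```python
import os
import sys
import tempfile

# Write a tiny stand-in module to a temporary directory. The name fibdemo
# is hypothetical, chosen so it cannot clash with a real fib.py.
module_source = '''
previous = {0: 0, 1: 1}

def fibonacci(n):
    if n not in previous:
        previous[n] = fibonacci(n - 1) + fibonacci(n - 2)
    return previous[n]
'''
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "fibdemo.py"), "w") as handle:
    handle.write(module_source)
sys.path.insert(0, tmpdir)

# Form 1: the module keeps its own namespace.
import fibdemo
value_qualified = fibdemo.fibonacci(30)

# Form 2: only the name fibonacci is bound in the current namespace.
from fibdemo import fibonacci
value_direct = fibonacci(30)

# Both names refer to the same function object and share one cache.
same_function = fibonacci is fibdemo.fibonacci
cache_size = len(fibdemo.previous)
print("%d %d %s %d" % (value_qualified, value_direct, same_function, cache_size))
```

Even after from fibdemo import fibonacci, the cache still lives in the module: it is reachable as fibdemo.previous, not as a bare previous.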
Python comes with a huge library of Standard Modules.
Sometimes (for example when building a huge CAS with Python) one needs to distribute many modules together. Python supports this with packages. You can think of a package as a directory in the filesystem containing subdirectories and modules. We have in SAGE, for example:
sage.groups.abelian_gps.abelian_group??
File: /usr/local/sage/local/lib/python2.5/site-packages/sage/groups/abelian_gps/abelian_group.py
The package sage contains a subpackage sage.groups, and
sage.groups.abelian_gps and a module abelian_group which belongs
to sage.groups.abelian_gps. And everything corresponds to
files and subdirectories of
/usr/local/sage/local/lib/python2.5/site-packages/
See section 6.4 of The Python Tutorial for more information about packages and an example with a complex directory layout.
If it walks like a duck and quacks like a duck, I would call it a duck.
If a class has the same behaviour (e.g. implements the same methods) as another class, the two are interchangeable. This is similar to Java interfaces, but in Python the check happens at runtime and only the part actually being accessed is considered.
For example, if code requires at one place a class implementing a method foo and at another place a class with a method bar, then a class A implementing both methods can be used in both places and another class B implementing the first method can only be used in the first place.
class A(object):
    def foo(self):
        print "Foo"
    def bar(self):
        print "bar"
class B(object):
    def foo(self):
        print "B's implementation of foo"
L=[ A(), A(), B()]
for obj in L:
    obj.foo()
for obj in L:
    obj.bar()
Foo
Foo
B's implementation of foo
bar
bar
Traceback (most recent call last):
  File "", line 22, in
AttributeError: 'B' object has no attribute 'bar'
Section 3.4.5 of The Python Reference Manual lists the method names involved with container types like lists.
So if we want to add list-like behaviour to one of our classes A, we need to implement methods such as __len__, __getitem__, __setitem__, __delitem__, __iter__ and __contains__.
Section 3.4.7 of The Python Reference Manual lists the method names for numeric types. These methods correspond to the different operators + - * and so on.
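As a minimal sketch of both ideas, the following class implements a few of the container methods and one numeric method. The class name Vector and its behaviour are assumptions made for this illustration, not something taken from the Reference Manual.

```python
class Vector(object):
    """Hypothetical example class with list-like and numeric behaviour."""

    def __init__(self, values):
        self.values = list(values)

    # Container methods: len(v), v[i], v[i] = x, x in v
    def __len__(self):
        return len(self.values)

    def __getitem__(self, index):
        return self.values[index]

    def __setitem__(self, index, value):
        self.values[index] = value

    def __contains__(self, value):
        return value in self.values

    # Numeric method: v + w adds element by element
    def __add__(self, other):
        return Vector(a + b for a, b in zip(self, other))

    def __repr__(self):
        return "Vector(%r)" % (self.values,)


v = Vector([1, 2, 3])
w = Vector([10, 20, 30])
v[0] = 5            # calls __setitem__
total = v + w       # calls __add__
print(repr(total))
```

Because __getitem__ raises IndexError past the end, a for loop (and zip) can iterate over a Vector even though no __iter__ method is defined.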
First make it right, then make it fast.
With a profiler, we measure the performance of our code, for example the fib module:
python -m cProfile fib.py
Testing: 354224848179261915075
         402 function calls (204 primitive calls) in 0.002 CPU seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
        1    0.000    0.000    0.002    0.002 fib.py:1(<module>)
    199/1    0.001    0.000    0.001    0.001 fib.py:2(fibonacci)
        1    0.000    0.000    0.002    0.002 {execfile}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      199    0.000    0.000    0.000    0.000 {method 'has_key' of 'dict' objects}
We can see how often a certain function is called and how much time is spent there. The Python Library Reference has an entire chapter about profiling, and a quickstart guide is available as well.
There is also another profiler called hotshot, which has a smaller performance impact (profiling slows down the program being measured). But it requires more setup and you have to put it in its own Python file. The Python Library Reference has an example.
The profiler helps us to find the function which is called the most or which takes the longest time. This function is the first candidate for optimization.
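The profiler can also be driven from inside a program, which makes it easy to sort the report by cumulative time and look only at the top entries. The following is a sketch using cProfile.Profile and pstats.Stats from the standard library; the function slow_fibonacci is a made-up stand-in for the code you want to measure.

```python
import cProfile
import pstats


def slow_fibonacci(n):
    # Deliberately naive (no cache), so the profiler has something to show.
    if n < 2:
        return n
    return slow_fibonacci(n - 1) + slow_fibonacci(n - 2)


profiler = cProfile.Profile()
profiler.enable()
result = slow_fibonacci(15)
profiler.disable()

# Sort by cumulative time and print only the five most expensive entries.
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(5)
```

The function at the top of this report is the first candidate for optimization in the sense described above.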
Here is an example of different ways to express the same function:
import cProfile
import os, string
# create some test data:
P= os.popen("man -Tascii python| col -b")
L= P.readlines()
wordList = []
for l in L:
    for w in l.split():
        if w: wordList.append(w)
# different loopings
def worker1():
    newList = []
    for w in wordList:
        newList.append(w.upper() )
# in theory 2 and 3 should be faster than 1, but they are not
def worker2():
    newList = []
    append=newList.append
    upper= string.upper
    for w in wordList:
        append( upper(w) )
# worker3 uses the same trick, but with module-level names
newList = []
append=newList.append
upper= string.upper
def worker3():
    for w in wordList:
        append( upper(w) )
def worker4():
    newList= map( string.upper, wordList )
def worker5():
    return [w.upper() for w in wordList ]
# worker6 returns a generator expression:
def worker6(): # looks much faster, but does no work until it is consumed
    return (w.upper() for w in wordList )
# worker6 only appears to be the fastest: upper() is not called
# until the generator is iterated over, so its timing is not comparable.
# worker5 is faster than workers 1 to 4.
# worker1 is not slower than 2 and 3
def f():
    for time in range(500):
        worker5()
cProfile.run( "f()" )
#for w in worker6():
#    print w
Surprisingly, worker2 and worker3 are slower than worker1.
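One caveat about worker6: it returns a generator expression, so calling it does not convert any word yet. The following sketch makes this visible by counting calls; the helper shout and the word list are assumptions made for this illustration.

```python
calls = []


def shout(word):
    calls.append(word)      # record that real work happened
    return word.upper()


words = ["spam", "eggs", "ham"]

as_list = [shout(w) for w in words]        # work happens immediately
calls_after_list = len(calls)

as_generator = (shout(w) for w in words)   # nothing has run yet
calls_after_generator = len(calls)

consumed = list(as_generator)              # now the work is done
calls_after_consuming = len(calls)

print("%d %d %d" % (calls_after_list, calls_after_generator,
                    calls_after_consuming))
```

A fair comparison therefore has to consume the generator, for example with list(worker6()), inside the timed loop.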
At the end of Python Patterns - An Optimization Anecdote (an essay by Guido van Rossum) there are a few conclusions:
(From the Glossary:)
EAFP: Easier to ask for forgiveness than
permission. This common Python coding style assumes the
existence of valid keys or attributes and catches exceptions if
the assumption proves false. This clean and fast style is
characterized by the presence of many try and except
statements. The technique contrasts with the LBYL style that is
common in many other languages such as C.
LBYL: Look before you leap. This coding style
explicitly tests for pre-conditions before making calls or
lookups. This style contrasts with the EAFP approach and is
characterized by the presence of many if statements.
The next example from Section 9 of PythonInfo Wiki: PythonSpeed/PerformanceTips shows the benefits:
import os
# create some test data:
P= os.popen("man -Tascii python| col -b")
L= P.readlines()
wordList = []
for l in L:
    for w in l.split():
        if w: wordList.append( w.upper() )
def worker1(words): # lbyl
    wdict = {}
    for word in words:
        if word not in wdict:
            wdict[word] = 0
        wdict[word] += 1
    return wdict
def worker2(words): # eafp, but twice as slow as lbyl
    wdict = {}
    for word in words:
        try:
            wdict[word] += 1
        except KeyError:
            wdict[word] = 1
    return wdict
def worker3(words): # faster than eafp, but not by much
    wdict = {}
    g = wdict.get
    for word in words:
        wdict[word] = g(word,0) + 1
    return wdict
import cProfile
def f(func):
    for time in range(1000):
        func(wordList)
cProfile.run( "f(worker3)" )
# sort by word count, which is the value
tmp= [ (v, k) for k,v in worker1(wordList).items() ]
tmp.sort() # sorting by values!
# the 50 most frequent words:
print tmp[-50:]
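Again the theory does not hold here: the EAFP version is slower. A plausible reason is that every first occurrence of a word raises a KeyError, and raising and catching an exception costs more than a failed membership test.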
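How the two styles compare may depend on how often a key is missing. The sketch below times both on two made-up data sets, one with few distinct words and one where every word is new. The function names and the test data are assumptions for this illustration, and no particular result is claimed; run it to see the numbers on your machine.

```python
import timeit


def count_lbyl(words):
    counts = {}
    for word in words:
        if word not in counts:      # look before you leap
            counts[word] = 0
        counts[word] += 1
    return counts


def count_eafp(words):
    counts = {}
    for word in words:
        try:                        # just do it, handle the failure
            counts[word] += 1
        except KeyError:
            counts[word] = 1
    return counts


# Hypothetical test data: few distinct keys versus all distinct keys.
mostly_repeats = ["spam", "eggs"] * 5000
all_distinct = [str(i) for i in range(10000)]

for label, data in (("mostly repeats", mostly_repeats),
                    ("all distinct", all_distinct)):
    for func in (count_lbyl, count_eafp):
        seconds = timeit.timeit(lambda: func(data), number=20)
        print("%-15s %-11s %.4f s" % (label, func.__name__, seconds))
```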
When a large range of numbers is required, use xrange instead of range. xrange returns a lazy object that produces each number only when it is needed, whereas range creates the whole list at once.
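The difference can be made visible by comparing the size of the two objects. This is a sketch only: the name lazy_range is an assumption introduced so that the code also runs under Python 3, where xrange no longer exists and range itself is lazy.

```python
import sys

try:
    lazy_range = xrange      # Python 2
except NameError:
    lazy_range = range       # Python 3: range is already lazy

n = 1000000
lazy = lazy_range(n)
full = list(lazy_range(n))   # what Python 2's range(n) builds

# getsizeof counts only the container itself, not the integers it holds,
# so the real memory use of the full list is even larger.
lazy_size = sys.getsizeof(lazy)
full_size = sys.getsizeof(full)

total = sum(lazy)            # iterating works the same way for both
print("lazy object: %d bytes, full list: %d bytes" % (lazy_size, full_size))
```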
This recipe shows how to overload the __rmul__ function.
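The recipe itself is not reproduced here. As a minimal sketch of the idea, with a class name Money invented for this illustration: __mul__ handles the case where the object is the left operand, and __rmul__ the case where it is the right operand.

```python
class Money(object):
    """Hypothetical class, used only to illustrate __mul__ and __rmul__."""

    def __init__(self, amount):
        self.amount = amount

    def __mul__(self, factor):
        # Handles Money * 3
        return Money(self.amount * factor)

    def __rmul__(self, factor):
        # Handles 3 * Money: the int does not know how to multiply itself
        # with a Money object, so Python tries the reflected method of the
        # right operand.
        return self.__mul__(factor)


price = Money(5)
left = price * 3
right = 3 * price
print("%d %d" % (left.amount, right.amount))
```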
This recipe shows how to create a function with an arbitrary argument list. Notice that no inheritance is used, so you cannot access the stream's write method directly (only via out.stream.write).
This recipe shows how to get every permutation of a given sequence or string. It uses recursion and generators and also demonstrates slicing (last line).
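The following is not the recipe's code but an independent sketch of the same technique: a recursive generator that uses slicing to remove one element at a time. (From Python 2.6 on, the standard library also offers itertools.permutations.)

```python
def permutations(seq):
    """Yield every permutation of a list, tuple or string."""
    if len(seq) <= 1:
        yield seq
    else:
        for i in range(len(seq)):
            # Slicing removes element i and keeps the type of seq.
            rest = seq[:i] + seq[i + 1:]
            for perm in permutations(rest):
                # seq[i:i + 1] is a one-element slice, not a single item,
                # so the concatenation works for strings and lists alike.
                yield seq[i:i + 1] + perm


result = list(permutations("abc"))
print(result)
```

Because permutations is a generator, the permutations are produced one at a time; wrapping the call in list() collects them all.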
This recipe shows how to make a faster copy of an object. (The discussion gives more details on how Python copies your own classes.)
In Python, arguments are passed as references to objects. So when you call foo(obj), the function foo can change obj if it is mutable. This is usually the right thing. But sometimes a function should work on its own copy, so that foo cannot change the original object.
Python provides the copy.copy and copy.deepcopy functions for creating copies. Consider the next example:
aList=[1,2,3]
def foo(L):
    L[1] = 42
print "before:", aList
foo(aList)
print "after:", aList
# before: [1, 2, 3]
# after: [1, 42, 3]
import copy
aList[1]=2
print "before (2):", aList
foo(copy.copy(aList))
print "after (2):", aList
# before (2): [1, 2, 3]
# after (2): [1, 2, 3]
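The example above only needs copy.copy because the list contains plain numbers. With nested objects the difference between the two functions matters, as this small sketch shows (the variable names are chosen for this illustration):

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)       # new outer list, shared inner lists
deep = copy.deepcopy(nested)      # new outer list and new inner lists

shallow.append([5, 6])            # affects only the copy's outer list
shallow[0][0] = 42                # the inner list is shared with nested

print(nested)
print(shallow)
print(deep)
```

After the two changes, nested has picked up the 42 but not the appended pair, while deep is completely unaffected.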
Because variables are references to objects in memory, there is another pitfall, object aliasing:
ripefruits={"apple": "green", "banana": "yellow"}
rottenfruits=ripefruits # only aliasing,
# ripe- and rotten point to the same place in memory
rottenfruits["apple"]="brown"
rottenfruits["banana"]="black"
print ripefruits
print rottenfruits
# {'apple': 'brown', 'banana': 'black'}
# {'apple': 'brown', 'banana': 'black'}
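One way to avoid the aliasing, sketched here with the same data, is to create a new dictionary from the old one and to check identity with the is operator. The names alias and independent are chosen for this illustration.

```python
ripefruits = {"apple": "green", "banana": "yellow"}

alias = ripefruits               # a second name for the same object
independent = dict(ripefruits)   # a new dictionary (shallow copy)

independent["apple"] = "brown"   # does not touch ripefruits

print(alias is ripefruits)
print(independent is ripefruits)
print(ripefruits["apple"])
```

dict(ripefruits) makes a shallow copy, which is enough here because the values are strings; for nested values copy.deepcopy would be needed.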