>>> F='3.5" or 5.25" Floppy?'; F
'3.5" or 5.25" Floppy?'
>>> D="F(x)=x^2, f'(x)=2x";D
"F(x)=x^2, f'(x)=2x"

You can use multi-line strings inside 3 quotes:
>>> ALongString="""First line
... second line
...
... many lines
... """
>>> print ALongString
First line
second line

many lines

>>> print "A new line:\n and a tab \t character. This \\ is one backslash!"
A new line:
 and a tab       character. This \ is one backslash!

Raw strings do not interpret the backslashes:
>>> print r"A new line:\n and a tab \t character. This \\ are two backslashes!"
A new line:\n and a tab \t character. This \\ are two backslashes!
#!/usr/bin/python
# -*- coding: utf-8 -*-
print "German Umlauts: ÄÖÜäüöß"

The encoding of the file and the coding declared in the second line have to match.
>>> previous = {0: 0, 1: 1}
>>> def fibonacci(n):
...     if previous.has_key(n):
...         return previous[n]
...     else:
...         new_value = fibonacci(n-1) + fibonacci(n-2)
...         previous[n] = new_value
...         return new_value
...
>>> fibonacci(40)
102334155
>>> fibonacci(50)
12586269025L
>>> fibonacci(60)
1548008755920L
>>> fibonacci(80)
23416728348467685L
>>> fibonacci(100)
354224848179261915075L
When you write a script and it grows, you want to split it into several files, or at least put some functions into a separate file so that other scripts can reuse them.
This is called writing a module and importing from it. Imagine we put the definitions of fibonacci and previous into a file fib.py. The file fib.py then becomes the module named fib (the file name without its extension):
previous = {0: 0, 1: 1}

def fibonacci(n):
    if previous.has_key(n):
        return previous[n]
    else:
        new_value = fibonacci(n-1) + fibonacci(n-2)
        previous[n] = new_value
        return new_value

if __name__=="__main__":
    # not in module mode:
    print "Testing: ", fibonacci(100)
else:
    print "As module with name", __name__
>>> import fib
As module with name fib
>>> fib
>>> dir(fib)
['__builtins__', '__doc__', '__file__', '__name__', 'fibonacci', 'previous']
>>> fib.fibonacci(50)
12586269025L
>>> fib.previous
{0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5, 6: 8, 7: 13, 8: 21, 9: 34, 10: 55, 11: 89, 12: 144, 13: 233, 14: 377, 15: 610, 16: 987, 17: 1597, 18: 2584, 19: 4181, 20: 6765, 21: 10946, 22: 17711, 23: 28657, 24: 46368, 25: 75025, 26: 121393, 27: 196418, 28: 317811, 29: 514229, 30: 832040, 31: 1346269, 32: 2178309, 33: 3524578, 34: 5702887, 35: 9227465, 36: 14930352, 37: 24157817, 38: 39088169, 39: 63245986, 40: 102334155, 41: 165580141, 42: 267914296, 43: 433494437, 44: 701408733, 45: 1134903170, 46: 1836311903, 47: 2971215073L, 48: 4807526976L, 49: 7778742049L, 50: 12586269025L}
Note the use of fib.previous instead of previous. The module comes with its own namespace. We could also use from fib import *. Then fibonacci and previous belong to the global namespace. Another form is from fib import fibonacci which only imports fibonacci and not previous.
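The import forms described above can be tried out without touching the real fib.py. The following sketch writes a small stand-in module into a temporary directory and imports it in two ways. The module name fib_demo and the temporary directory are assumptions made only for this illustration, and the code sticks to syntax that works in both Python 2.6+ and Python 3.

```python
import os
import sys
import tempfile

# Write a small stand-in module (hypothetical name: fib_demo) to a temp dir.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "fib_demo.py"), "w") as handle:
    handle.write(
        "previous = {0: 0, 1: 1}\n"
        "def fibonacci(n):\n"
        "    if n not in previous:\n"
        "        previous[n] = fibonacci(n - 1) + fibonacci(n - 2)\n"
        "    return previous[n]\n"
    )
sys.path.insert(0, module_dir)   # make the directory importable

import fib_demo                  # names stay inside the module's namespace
from fib_demo import fibonacci   # only 'fibonacci' enters our namespace

qualified = fib_demo.fibonacci(10)   # access through the module name
direct = fibonacci(10)               # access through the imported name

print(qualified)
print(direct)
```

Both names refer to the same function object, and the cache is still reached as fib_demo.previous because previous itself was not imported.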
Python comes with a huge library of Standard Modules.
Sometimes (for example when building a huge CAS with Python) one needs to distribute a lot of modules together. Python supports this with packages. You can think of a package as a directory in the filesystem containing subdirectories and modules. We have in SAGE, for example:
sage.groups.abelian_gps.abelian_group??
File: /usr/local/sage/local/lib/python2.5/site-packages/sage/groups/abelian_gps/abelian_group.py

The package sage contains the subpackages sage.groups and sage.groups.abelian_gps, and the module abelian_group belongs to sage.groups.abelian_gps. All of these correspond to files and subdirectories of /usr/local/sage/local/lib/python2.5/site-packages/
See section 6.4 of The Python Tutorial for more information about packages and an example with a complex directory layout.
If it walks like a duck and quacks like a duck, I would call it a duck.
If a class has the same behaviour (e.g. implements the same methods) as another class they are interchangeable. This is similar to Java Interfaces but in Python it is done at runtime and only the part being accessed is considered.
For example, if code requires at one place a class implementing a method foo and at another place a class with a method bar, then a class A implementing both methods can be used in both places and another class B implementing the first method can only be used in the first place.
class A(object):
    def foo(self):
        print "Foo"
    def bar(self):
        print "bar"

class B(object):
    def foo(self):
        print "B's implementation of foo"

L=[ A(), A(), B()]
for obj in L:
    obj.foo()
for obj in L:
    obj.bar()
Foo
Foo
B's implementation of foo
bar
bar
Traceback (most recent call last):
  File "", line 22, in
AttributeError: 'B' object has no attribute 'bar'
Section 3.4.5 of The Python Reference Manual lists the method names involved with container types like lists.
So if we want to add list-like behaviour to one of our classes A, we need to implement
Section 3.4.7 of The Python Reference Manual lists the method names for numeric types. These methods correspond to the different operators + - * and so on.
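As an illustration of such special method names, here is a minimal sketch of a class that supports len(), indexing and the + operator. The class Polynomial and its coefficient layout are hypothetical and chosen only for this example; the sketch covers just three of the methods the Reference Manual lists.

```python
class Polynomial(object):
    """Hypothetical example class: coefficients stored lowest degree first."""

    def __init__(self, coefficients):
        self.coefficients = list(coefficients)

    # container-style methods: len(p) and p[i]
    def __len__(self):
        return len(self.coefficients)

    def __getitem__(self, index):
        # an IndexError raised here also ends a for-loop over the object
        return self.coefficients[index]

    # numeric-style method: p + q
    def __add__(self, other):
        longer, shorter = self.coefficients, other.coefficients
        if len(longer) < len(shorter):
            longer, shorter = shorter, longer
        summed = list(longer)
        for i, c in enumerate(shorter):
            summed[i] += c
        return Polynomial(summed)


p = Polynomial([1, 2, 3])   # 1 + 2x + 3x^2
q = Polynomial([10, 20])    # 10 + 20x
r = p + q                   # calls p.__add__(q)

print(len(r))     # 3
print(r[0])       # 11
print(list(r))    # [11, 22, 3]
```

Because __getitem__ raises IndexError past the last coefficient, the object can also be iterated over, which is why list(r) works without a separate __iter__ method.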
First make it right, then make it fast.
With a profiler, we measure the performance of our code, for example the fib module:
python -m cProfile fib.py
Testing:  354224848179261915075
         402 function calls (204 primitive calls) in 0.002 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
        1    0.000    0.000    0.002    0.002 fib.py:1(<module>)
    199/1    0.001    0.000    0.001    0.001 fib.py:2(fibonacci)
        1    0.000    0.000    0.002    0.002 {execfile}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      199    0.000    0.000    0.000    0.000 {method 'has_key' of 'dict' objects}
We can see how often a certain function is called and how much time is spent there. The Python Library Reference has an entire chapter about profiling. Here is a link to a quickstart guide.
There is also a newer profiler called hotshot, which has a smaller performance impact (profiling slows the program down). But it requires more setup, and you have to put it in its own Python file. The Python Library Reference has an example.
The profiler helps us to find the function which is called the most or which takes the longest time. This function is the first candidate for optimization.
Here is an example of different ways to express the same function:
import cProfile
import os, string

# create some test data:
P= os.popen("man -Tascii python| col -b")
L= P.readlines()
wordList = []
for l in L:
    for w in l.split():
        if w:
            wordList.append(w)

# different loopings
def worker1():
    newList = []
    for w in wordList:
        newList.append(w.upper() )

# in theory 2 and 3 should be faster than 1, but they are not
def worker2():
    newList = []
    append=newList.append
    upper= string.upper
    for w in wordList:
        append( upper(w) )

newList = []
append=newList.append
upper= string.upper
def worker3():
    for w in wordList:
        append( upper(w) )

def worker4():
    newList= map( string.upper, wordList )

def worker5():
    return [w.upper() for w in wordList ]

# the winner is :
def worker6():
    # this is much faster than the others
    # (it only builds a generator; the words are converted when it is iterated)
    return (w.upper() for w in wordList )

# worker6 is the fastest.
# worker5 is faster than the rest but much slower than worker6
# worker1 is not slower than 2 and 3

def f():
    for time in range(500):
        worker5()

cProfile.run( "f()" )

#for w in worker6():
#    print w

Surprisingly, worker2 and worker3 are slower than worker1.
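One likely reason why worker6 looks so fast: a generator expression does not call upper() when it is created, only when something iterates over it, so the profile above measures little more than the creation of the generator object. The following sketch makes this visible by counting calls. The helper shout and the word list are made up for this illustration.

```python
calls = []

def shout(word):
    calls.append(word)      # record every real call
    return word.upper()

words = ["spam", "eggs", "ham"]

as_list = [shout(w) for w in words]   # list comprehension: work happens now
calls_after_list = len(calls)

lazy = (shout(w) for w in words)      # generator expression: nothing computed yet
calls_after_genexp = len(calls)

consumed = list(lazy)                 # iterating triggers the calls
calls_after_consume = len(calls)

print(calls_after_list)      # 3
print(calls_after_genexp)    # 3
print(calls_after_consume)   # 6
```

A fair comparison with the other workers would therefore have to consume the generator, for example with the commented-out loop at the end of the script.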
At the end of Python Patterns - An Optimization Anecdote (an essay by Guido van Rossum) there are a few conclusions:
(From the Glossary:)
EAFP: Easier to ask for forgiveness than
permission. This common Python coding style assumes the
existence of valid keys or attributes and catches exceptions if
the assumption proves false. This clean and fast style is
characterized by the presence of many try and except
statements. The technique contrasts with the LBYL style that is
common in many other languages such as C.
LBYL: Look before you leap. This coding style
explicitly tests for pre-conditions before making calls or
lookups. This style contrasts with the EAFP approach and is
characterized by the presence of many if statements.
The next example from Section 9 of PythonInfo Wiki: PythonSpeed/PerformanceTips shows the benefits:
import os

# create some test data:
P= os.popen("man -Tascii python| col -b")
L= P.readlines()
wordList = []
for l in L:
    for w in l.split():
        if w:
            wordList.append( w.upper() )

def worker1(words):
    # lbyl
    wdict = {}
    for word in words:
        if word not in wdict:
            wdict[word] = 0
        wdict[word] += 1
    return wdict

def worker2(words):
    # eafp, but twice as slow as lbyl
    wdict = {}
    for word in words:
        try:
            wdict[word] += 1
        except KeyError:
            wdict[word] = 1
    return wdict

def worker3(words):
    # faster than eafp, but not much
    wdict = {}
    g = wdict.get
    for word in words:
        wdict[word] = g(word,0) + 1
    return wdict

import cProfile
def f(func):
    for time in range(1000):
        func(wordList)

cProfile.run( "f(worker3)" )

# sort for wordcount which is the value
tmp= [ (v, k) for k,v in worker1(wordList).items() ]
tmp.sort() # sorting by values!
# the 50 most frequent words:
print tmp[-50:]

Again the theory is wrong: the EAFP version is slower.
When a large range of numbers is required, use xrange instead of range. xrange returns a lazy object that produces each number on demand, whereas range creates the whole list at once.
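The difference in memory use can be sketched as follows. The name lazy_range is an assumption introduced here so that the example runs on both Python 2 (where it is xrange) and Python 3 (where xrange is gone and range itself is lazy); sys.getsizeof needs Python 2.6 or newer.

```python
import sys

try:
    lazy_range = xrange     # Python 2
except NameError:
    lazy_range = range      # Python 3: range is already lazy

n = 1000000

# The lazy object stores only start, stop and step.
small = sys.getsizeof(lazy_range(n))

# A full list has to hold one reference per number.
big = sys.getsizeof(list(lazy_range(n)))

# Looping or summing works the same way on the lazy object.
total = sum(lazy_range(n))

print(small < big)
print(total)
```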
This recipe shows how to overload the __rmul__ method.
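The following is not the recipe itself but a minimal sketch of the idea, using a hypothetical Vector class: __mul__ handles vector * number, and __rmul__ is what Python falls back to for number * vector, when the left operand does not know how to multiply with the right one.

```python
class Vector(object):
    """Hypothetical example class for scalar multiplication."""

    def __init__(self, *components):
        self.components = tuple(components)

    def __mul__(self, scalar):
        # handles: vector * scalar
        return Vector(*[c * scalar for c in self.components])

    def __rmul__(self, scalar):
        # handles: scalar * vector (reflected operand order)
        return self * scalar

    def __repr__(self):
        return "Vector%r" % (self.components,)


v = Vector(1, 2, 3)
right = v * 2      # calls v.__mul__(2)
left = 2 * v       # int cannot handle this, so v.__rmul__(2) is called

print(right)       # Vector(2, 4, 6)
print(left)        # Vector(2, 4, 6)
```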
This recipe shows how to create a function with an arbitrary argument list. Notice that no inheritance is used, so you cannot access the stream's write method directly (only via out.stream.write).
This recipe shows how to get every permutation of a given sequence or string. It uses recursion and generators and also demonstrates slicing (last line).
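The combination of recursion, generators and slicing can be sketched like this. It is an illustration written for this text, not the code of the recipe, and the function name permutations is an assumption.

```python
def permutations(seq):
    """Yield every permutation of a string or list, one at a time."""
    if len(seq) <= 1:
        yield seq
        return
    for i in range(len(seq)):
        # slicing: everything except the element at position i
        rest = seq[:i] + seq[i + 1:]
        for tail in permutations(rest):
            # seq[i:i + 1] keeps the type (string or list) of the input
            yield seq[i:i + 1] + tail


print(list(permutations("abc")))
# ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
```

Because the function is a generator, the permutations are produced on demand, which matters since a sequence of length n has n! of them.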
This recipe shows how to make a faster copy of an object. (The discussion gives more details on how Python copies your own classes.)
In Python, variables and function arguments are references to objects. So when you call foo( obj ), the function foo can change obj. This is most often the right thing. But sometimes one wants the function to work on its own copy, so that foo cannot change the original object.
Python provides the copy.copy and copy.deepcopy functions for creating copies. Consider the next example:
aList=[1,2,3]

def foo(L):
    L[1] = 42

print "before:", aList
foo(aList)
print "after:", aList
# before: [1, 2, 3]
# after: [1, 42, 3]

import copy
aList[1]=2
print "before (2):", aList
foo(copy.copy(aList))
print "after (2):", aList
# before (2): [1, 2, 3]
# after (2): [1, 2, 3]
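The example above only needs copy.copy because the list holds plain numbers. With nested objects the difference between the two functions shows up: copy.copy makes a shallow copy whose elements are still shared with the original, while copy.deepcopy also copies the elements. A small sketch with made-up data:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)       # new outer list, same inner lists
deep = copy.deepcopy(nested)      # new outer list and new inner lists

shallow[0][0] = 42    # the inner list is shared, so nested changes too
deep[1][1] = 99       # fully independent, nested is not affected

print(nested)     # [[42, 2], [3, 4]]
print(shallow)    # [[42, 2], [3, 4]]
print(deep)       # [[1, 2], [3, 99]]
```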
Due to the fact that object variables are references to some memory location, we have another pitfall: object aliasing:
ripefruits={"apple": "green", "banana": "yellow"}
rottenfruits=ripefruits  # only aliasing,
# ripe- and rotten point to the same place in memory
rottenfruits["apple"]="brown"
rottenfruits["banana"]="black"
print ripefruits
print rottenfruits
# {'apple': 'brown', 'banana': 'black'}
# {'apple': 'brown', 'banana': 'black'}
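One way to avoid the aliasing, sketched here with the same data, is to build a new dictionary from the old one instead of assigning the reference. For a flat dictionary like this one, dict(...) or copy.copy is enough; nested values would need copy.deepcopy.

```python
ripefruits = {"apple": "green", "banana": "yellow"}
rottenfruits = dict(ripefruits)    # a new dictionary with the same entries

rottenfruits["apple"] = "brown"
rottenfruits["banana"] = "black"

print(ripefruits is rottenfruits)  # False: two separate objects
print(ripefruits["apple"])         # green
print(rottenfruits["apple"])       # brown
```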