Getting Started
Source
This is the summary of the book “A Whirlwind Tour of Python” by Jake VanderPlas.
You can view it in
- nbviewer: A Whirlwind Tour of Python, or
- GitHub: A Whirlwind Tour of Python
import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Operators
Identity and Membership
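The identity operators, `is` and `is not`, check for object identity. Object identity is different from equality: two distinct objects can hold equal values. The membership operators, `in` and `not in`, check whether a value is contained in a compound object.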
a = [1, 2, 3]
b = [1, 2, 3]
a == b
True
a is b
False
1 in a
True
4 in a
False
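As a small sketch of the distinction (the names `alias` and `clone` are only for illustration), binding a second name to a list gives the same object, while copying it gives an equal but distinct object:

```python
a = [1, 2, 3]
alias = a        # a second name bound to the same list object
clone = list(a)  # a new list object with equal contents

print(alias is a)   # True: same object
print(clone == a)   # True: equal values
print(clone is a)   # False: different objects

# A change made through one name is visible through the other,
# but the independent copy is unaffected.
alias.append(4)
print(a)      # [1, 2, 3, 4]
print(clone)  # [1, 2, 3]
```

This is why `==` is usually the right choice for comparing values, with `is` reserved for identity checks such as `x is None`.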
Built-in Types
Python’s simple types:
Type | Example | Description |
---|---|---|
int | x = 1 | Integers (i.e., whole numbers) |
float | x = 1.0 | Floating-point numbers (i.e., real numbers) |
complex | x = 1 + 2j | Complex numbers (i.e., numbers with real and imaginary part) |
bool | x = True | Boolean: True/False values |
str | x = 'abc' | String: characters or text |
NoneType | x = None | Special object indicating nulls |
Complex Numbers
complex(1, 2)
(1+2j)
Alternatively, we can use the “j” suffix in expressions to indicate the imaginary part:
1 + 2j
(1+2j)
c = 3 + 4j
c.real
3.0
c.imag
4.0
c.conjugate()
(3-4j)
abs(c)
5.0
String Type
msg = "what do you like?" # double quotes
response = 'spam' # single quotes
# length
len(response)
4
# Upper/lower case
response.upper()
'SPAM'
# Capitalize, see also str.title()
msg.capitalize()
'What do you like?'
# concatenation with +
msg + response
'what do you like?spam'
# multiplication is multiple concatenation
5 * response
'spamspamspamspamspam'
# Access individual characters (zero-based indexing, as with lists)
msg[0]
'w'
None Type
None is most commonly seen as the default return value of a function:
type(None)
NoneType
ret_val = print("abc")
abc
print(ret_val)
None
Likewise, any function in Python with no return value is, in reality, returning `None`.
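A minimal sketch of the same behavior with a user-defined function (`log_message` is a hypothetical helper, not something from the book):

```python
def log_message(text):
    """Hypothetical helper: prints its argument and has no return statement."""
    print("LOG:", text)

result = log_message("hello")  # prints: LOG: hello
print(result is None)          # True
```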
Boolean
Booleans can also be constructed using the `bool()` object constructor: values of any other type can be converted to Boolean via predictable rules:
- Any numeric type is False if equal to zero, and True otherwise
- The Boolean conversion of `None` is always False
- For strings, `bool(s)` is False for empty strings and True otherwise
- For sequences, the Boolean representation is False for empty sequences and True for any other sequences
# numeric type
bool(0)
False
bool(1)
True
a = 0
if not a:
    print("a")
a
bool(None)
False
bool("")
False
bool("Hello World!")
True
bool([])
False
bool([1])
True
l_1 = [1, 2, 3]
l_2 = []
def is_empty(l):
    if l:
        print("not empty")
        return False
    else:
        print("empty")
        return True
is_empty([1, 2, 3])
not empty
False
is_empty([])
empty
True
Built-In Data Structures
Type Name | Example | Description |
---|---|---|
list | [1, 2, 3] | Ordered collection |
tuple | (1, 2, 3) | Immutable ordered collection |
dict | {'a':1, 'b':2, 'c':3} | (key, value) mapping (keeps insertion order since Python 3.7) |
set | {1, 2, 3} | Unordered collection of unique values |
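The descriptions in the table can be checked with a short sketch; the variable names here are illustrative only:

```python
# list: ordered and mutable
numbers = [1, 2, 3]
numbers.append(4)

# tuple: ordered but immutable, so item assignment raises TypeError
point = (1, 2, 3)
try:
    point[0] = 10
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

# dict: maps keys to values
letters = {"a": 1, "b": 2}
letters["c"] = 3

# set: duplicates are discarded
unique = set([1, 2, 2, 3, 3, 3])

print(numbers)           # [1, 2, 3, 4]
print(tuple_is_mutable)  # False
print(letters["c"])      # 3
print(unique)            # {1, 2, 3}
```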
Defining and Using Functions
`*args` and `**kwargs`
Sometimes we want to write a function without knowing in advance how many arguments the user will pass.
- `*args`:
  - `*` before a variable means “expand this as a sequence”
  - `args` is short for “arguments”
- `**kwargs`:
  - `**` before a variable means “expand this as a dictionary”
  - `kwargs` is short for “keyword arguments”
def catch_all(*args, **kwargs):
    print("args = ", args)
    print("kwargs = ", kwargs)
catch_all(1, 2, 3, a=4, b=5)
args = (1, 2, 3)
kwargs = {'a': 4, 'b': 5}
inputs = (1, 2, 3)
keywords = {"one": 1, "two": 2}
catch_all(*inputs, **keywords)
args = (1, 2, 3)
kwargs = {'one': 1, 'two': 2}
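One common use of this pattern is forwarding arguments unchanged to another function. The sketch below assumes two hypothetical functions, `describe` and `with_banner`, which are not part of the book:

```python
def describe(*args, **kwargs):
    """Report how many positional and keyword arguments were received."""
    return "{} positional, {} keyword".format(len(args), len(kwargs))

def with_banner(func, *args, **kwargs):
    """Hypothetical wrapper: announce the call, then forward every argument."""
    print("calling", func.__name__)
    return func(*args, **kwargs)

summary = with_banner(describe, 1, 2, 3, a=4, b=5)
print(summary)  # 3 positional, 2 keyword
```

The wrapper never needs to know the signature of the function it wraps, since `*args` collects the positional arguments and `**kwargs` the keyword arguments.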
Iterators
enumerate
“Pythonic” way to enumerate the indices and values in a list.
l = [2, 4, 6, 8, 10]
for i, val in enumerate(l):
    print("index: {}, value: {}".format(i, val))
index: 0, value: 2
index: 1, value: 4
index: 2, value: 6
index: 3, value: 8
index: 4, value: 10
zip
Iterate over multiple lists simultaneously
L = [1, 3, 5, 7, 9]
R = [2, 4, 6, 8, 10]
for l_val, r_val in zip(L, R):
    print("L: {}, R: {}".format(l_val, r_val))
L: 1, R: 2
L: 3, R: 4
L: 5, R: 6
L: 7, R: 8
L: 9, R: 10
for i, val in enumerate(zip(L, R)):
    print("Index: {}, L: {}, R: {}".format(i, val[0], val[1]))
Index: 0, L: 1, R: 2
Index: 1, L: 3, R: 4
Index: 2, L: 5, R: 6
Index: 3, L: 7, R: 8
Index: 4, L: 9, R: 10
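Two details are worth a quick sketch: `zip` stops at the shortest input, and the zipped pair can be unpacked directly in the loop header. The shortened list `R` and the `start=1` argument are choices made for this illustration:

```python
L = [1, 3, 5, 7, 9]
R = [2, 4, 6]  # deliberately shorter than L

# zip stops as soon as the shortest input is exhausted
pairs = list(zip(L, R))
print(pairs)  # [(1, 2), (3, 4), (5, 6)]

# Unpack each pair in the loop header; start=1 makes the counter begin at 1
for i, (l_val, r_val) in enumerate(zip(L, R), start=1):
    print("Pair {}: L={}, R={}".format(i, l_val, r_val))
```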
map and filter
map: takes a function and applies it to the values in an iterator
func = lambda x: x + 1
l = [1, 2, 3, 4, 5]
list(map(func, l))
[2, 3, 4, 5, 6]
filter: only passes through values for which the filter function evaluates to True
is_even = lambda x: x % 2 == 0
list(filter(is_even, l))
[2, 4]
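For comparison, the same results can be written as list comprehensions. This is only a side-by-side sketch, and which form reads better is a matter of taste:

```python
l = [1, 2, 3, 4, 5]

# map(func, l) corresponds to applying an expression to every element
mapped = list(map(lambda x: x + 1, l))
mapped_comp = [x + 1 for x in l]

# filter(pred, l) corresponds to an `if` clause in a comprehension
filtered = list(filter(lambda x: x % 2 == 0, l))
filtered_comp = [x for x in l if x % 2 == 0]

print(mapped == mapped_comp)      # True
print(filtered == filtered_comp)  # True
```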
Iterators as function arguments
It turns out that the `*args` syntax works not just with sequences, but with any iterator:
print(*range(5))
0 1 2 3 4
list(range(3))
[0, 1, 2]
print(*map(lambda x: x + 1, range(3)))
1 2 3
L1 = [1, 2, 3, 4]
L2 = ["a", "b", "c", "d"]
z = zip(L1, L2)
print(*z)
(1, 'a') (2, 'b') (3, 'c') (4, 'd')
z = zip(L1, L2)
new_L1, new_L2 = zip(*z)
new_L1
(1, 2, 3, 4)
new_L2
('a', 'b', 'c', 'd')
Specialized Iterators: itertools
from itertools import permutations
p = permutations(range(3))
print(*p)
(0, 1, 2) (0, 2, 1) (1, 0, 2) (1, 2, 0) (2, 0, 1) (2, 1, 0)
p
<itertools.permutations at 0x10fb32710>
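Two other functions from the same module, `itertools.combinations` and `itertools.product`, follow the same pattern of returning a lazy iterator. A brief sketch:

```python
from itertools import combinations, product

# All unordered pairs drawn from 0..3
c = list(combinations(range(4), 2))
print(c)  # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

# Every pairing of an element from the first iterable with one from the second
p = list(product("ab", range(2)))
print(p)  # [('a', 0), ('a', 1), ('b', 0), ('b', 1)]
```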
List Comprehensions
l = [1, 2, 3, 4, 5]
[2 * el for el in l if el > 3]
[8, 10]
This is equivalent to the following loop, but the list comprehension is more compact and easier to read:
L = []
for el in l:
    if el > 3:
        L.append(2 * el)
L
[8, 10]
[(i, j) for i in range(2) for j in range(3)]
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
print(*range(10))
# Leave out multiples of 3, and negate all multiples of 2
[val if val % 2 else -val for val in range(10) if val % 3]
0 1 2 3 4 5 6 7 8 9
[1, -2, -4, 5, 7, -8]
L = []
for val in range(10):
    if val % 3 != 0:  # conditional on iterator
        # conditional on value
        if val % 2 != 0:
            L.append(val)
        else:
            L.append(-val)
L
[1, -2, -4, 5, 7, -8]
{n * 2 for n in range(5)}
{0, 2, 4, 6, 8}
{a % 3 for a in range(100)}
{0, 1, 2}
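The curly-brace form used for the set comprehensions above also works for dictionaries when a `key: value` pair is given. A minimal sketch:

```python
# Dict comprehension: map each n to its square
squares = {n: n ** 2 for n in range(5)}
print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
```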
Generators
Difference between list comprehensions and generator expressions:
List comprehensions use square brackets, while generator expressions use parentheses
# list comprehension:
[n * 2 for n in range(5)]
[0, 2, 4, 6, 8]
# generator
g = (n * 2 for n in range(5))
list(g)
[0, 2, 4, 6, 8]
A list is a collection of values, while a generator is a recipe for producing values
When you create a list, you are actually building a collection of values, and there is some memory cost associated with that.
When you create a generator, you are not building a collection of values, but a recipe for producing those values.
Both expose the same iterator interface.
l = [n * 2 for n in range(5)]
for val in l:
    print(val, end=" ")
0 2 4 6 8
g = (n * 2 for n in range(5))
for val in g:
    print(val, end=" ")
0 2 4 6 8
The difference is that a generator expression does not actually compute the values until they are needed. This not only leads to memory efficiency, but to computational efficiency as well! This also means that while the size of a list is limited by available memory, the size of a generator expression is unlimited!
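To illustrate the point about unbounded size, the sketch below builds a generator expression over `itertools.count()`, which never terminates, and uses `itertools.islice` to take only a few values. A list comprehension over the same source would never finish:

```python
from itertools import count, islice

# count() yields 0, 1, 2, ... forever; nothing is computed yet
evens = (n * 2 for n in count())

# Only the five requested values are ever computed
first_five = list(islice(evens, 5))
print(first_five)  # [0, 2, 4, 6, 8]

# The generator remembers its position and continues from there
next_value = next(evens)
print(next_value)  # 10
```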
A list can be iterated multiple times; a generator expression is single-use
l = [n * 2 for n in range(5)]
for val in l:
    print(val, end=" ")
print("\n")
for val in l:
    print(val, end=" ")
0 2 4 6 8
0 2 4 6 8
g = (n * 2 for n in range(5))
list(g)
[0, 2, 4, 6, 8]
list(g)
[]
This can be very useful because it means iteration can be stopped and started:
g = (n ** 2 for n in range(12))
for n in g:
    print(n, end=" ")
    if n > 30:
        break
print("\nDoing something in between...")
for n in g:
    print(n, end=" ")
0 1 4 9 16 25 36
Doing something in between...
49 64 81 100 121
This is useful when working with collections of data files on disk; it means that you can quite easily analyze them in batches, letting the generator keep track of which ones you have yet to see.
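A rough sketch of that batching idea follows. The file names are made up for illustration, and `take_batch` is a hypothetical helper; no files are actually read:

```python
from itertools import islice

# Hypothetical file names; a real script might get these from a directory listing
filenames = ("data_{:02d}.csv".format(i) for i in range(7))

def take_batch(iterator, size):
    """Hypothetical helper: pull at most `size` items from the iterator."""
    return list(islice(iterator, size))

# The generator keeps track of which names have already been handed out
batches = []
batch = take_batch(filenames, 3)
while batch:
    batches.append(batch)
    batch = take_batch(filenames, 3)

for b in batches:
    print(b)
```

Each call to `take_batch` resumes where the previous one stopped, so the last batch simply holds whatever is left over.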
Generator Functions: Using yield
# list comprehension
L1 = [n * 2 for n in range(5)]
L2 = []
for n in range(5):
    L2.append(n * 2)
print("L1:", L1)
print("L2:", L2)
L1: [0, 2, 4, 6, 8]
L2: [0, 2, 4, 6, 8]
# generator
G1 = (n * 2 for n in range(5))
# generator function
def gen():
    for n in range(5):
        yield n * 2
G2 = gen()
print(*G1)
print(*G2)
0 2 4 6 8
0 2 4 6 8
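A generator function pauses at each `yield` and resumes on the next request. The sketch below steps through one by hand with `next()`; `countdown` is a hypothetical example, not from the book:

```python
def countdown(n):
    """Hypothetical generator function: yield n, n-1, ..., 1."""
    while n > 0:
        yield n  # execution pauses here until the next value is requested
        n -= 1

g = countdown(3)
first = next(g)          # runs the body up to the first yield
rest = list(g)           # resumes and collects the remaining values
after = next(g, "done")  # exhausted: the default is returned instead

print(first)  # 3
print(rest)   # [2, 1]
print(after)  # done
```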
Example: Prime Number Generator
# Generate a list of candidates
L = [n for n in range(2, 40)]
print(L)
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
# Remove all multiples of the first value
L = [n for n in L if n == L[0] or n % L[0] > 0]
print(L)
[2, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39]
# Remove all multiples of the second value
L = [n for n in L if n == L[1] or n % L[1] > 0]
print(L)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37]
If we repeat this procedure enough times on a large enough list, we can generate as many primes as we wish.
Encapsulate this logic in a generator function:
def gen_primes(N):
    """
    Generate primes below N
    """
    primes = set()
    for n in range(2, N):
        # print("n = ", n, ":", *(n % p > 0 for p in primes))
        if all(n % p > 0 for p in primes):
            primes.add(n)
            yield n
print(*gen_primes(100))
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
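Because a generator produces values only on demand, the upper bound can be dropped altogether. The variant below is a sketch under that assumption: `gen_all_primes` is a hypothetical name, it uses `itertools.count` in place of `range`, and the caller decides how many primes to take:

```python
from itertools import count, islice

def gen_all_primes():
    """Hypothetical unbounded variant: yield primes indefinitely."""
    primes = []
    for n in count(2):  # 2, 3, 4, ... with no upper limit
        # n is prime if no previously found prime divides it
        if all(n % p > 0 for p in primes):
            primes.append(n)
            yield n

# Take only as many primes as needed
first_ten = list(islice(gen_all_primes(), 10))
print(first_ten)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```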