Loops in Python
In [1]:
import numpy as np
import matplotlib.pyplot as plt
For loops first
In [2]:
indices = ['Karl',1,True]
for index in indices:
    print(index)
Karl
1
True
Often we want to loop over a range of integers.
In [3]:
for i in range(0,8):
    print(i)
0
1
2
3
4
5
6
7
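A list comprehension is a compact way to build a list with a loop. The comprehension in the next cell is equivalent to appending to a list inside a for loop:
days = []
for i in range(0, 20):
    days.append('Day ' + str(i))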
In [4]:
days = ['Day ' + str(i) for i in range(0,20)]
days
Out[4]:
['Day 0', 'Day 1', 'Day 2', 'Day 3', 'Day 4', 'Day 5', 'Day 6', 'Day 7', 'Day 8', 'Day 9', 'Day 10', 'Day 11', 'Day 12', 'Day 13', 'Day 14', 'Day 15', 'Day 16', 'Day 17', 'Day 18', 'Day 19']
In [5]:
pow2 = [2**a for a in range(0,10)]
pow2
Out[5]:
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
Let's write a loop to take the sum of the entries in a NumPy vector:
In [6]:
x = np.array([1,4,7,2,3])
val = 0
for i in range(x.size):
    #val = val + x[i]
    val += x[i]
print(val)
17
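NumPy can compute this total without an explicit Python loop:
print(np.sum(x))  # same result from NumPy's built-in sum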
How about computing the sample variance... $$ S^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i - \bar X_n)^2 $$
In [7]:
x = np.array([1,4,7,2,3])
def sampvar(x):
    n = x.size
    sumx = 0
    for i in range(n):
        sumx += x[i]
    xbar = sumx / n
    sum2 = 0
    for i in range(n):
        sum2 += (x[i] - xbar)**2
    s2 = sum2/(n-1)
    return s2
print(sampvar(x))
print(np.var(x))
5.300000000000001
4.24
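The two answers differ because np.var divides by $n$ (its default is ddof=0), while sampvar divides by $n-1$. Passing ddof=1 makes NumPy compute the sample variance:
print(np.var(x, ddof=1))  # divides by n-1, matching sampvar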
In [8]:
%timeit sampvar(x)
1.46 μs ± 9.3 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [9]:
%timeit np.var(x)
6.74 μs ± 360 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
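Here the hand-written loop wins only because the array has just five entries, so NumPy's per-call overhead dominates. On a larger array the compiled np.var should be much faster; a quick check (timings will vary by machine):
xbig = np.random.normal(size=100000)
%timeit sampvar(xbig)
%timeit np.var(xbig)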
Make a plot with several curves on it, with a loop
In [10]:
import scipy.stats as stats
a = [1,1,1,2,5,10] # "shape" parameter values
b = [1,2,4,2,2,1/2] # "scale" parameter values
fig, ax = plt.subplots(figsize = (8,5))
x = np.linspace(0,20,201)
ax.set_ylim((0,1/2))
for i in range(len(a)):
    fx = stats.gamma.pdf(x,a[i],scale = b[i])
    ax.plot(x,fx)
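To tell the curves apart, one option is to label each curve inside the loop and add a legend; a minimal sketch of the same plot:
fig, ax = plt.subplots(figsize = (8,5))
ax.set_ylim((0,1/2))
for i in range(len(a)):
    fx = stats.gamma.pdf(x, a[i], scale = b[i])
    ax.plot(x, fx, label = f'shape {a[i]}, scale {b[i]}')
ax.legend()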
Arrange 6 plots in a grid with a for loop.
In [11]:
a = [1,1,1,2,5,10] # "shape" parameter values
b = [1,2,4,2,2,1/2] # "scale" parameter values
x = np.linspace(0,20,201)
fig, axs = plt.subplots(2,3,figsize = (12,6))
k = 0
for i in range(2):
    for j in range(3):
        fx = stats.gamma.pdf(x,a[k],scale = b[k])
        axs[i,j].plot(x,fx)
        k += 1
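An alternative to tracking k by hand is to enumerate the flattened array of axes; a sketch of the same grid:
fig, axs = plt.subplots(2,3,figsize = (12,6))
for k, ax in enumerate(axs.flat):
    fx = stats.gamma.pdf(x, a[k], scale = b[k])
    ax.plot(x, fx)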
One more example of for loops: Computing the Wilcoxon rank-sum test statistic. $$ W_{XY} = \sum_{i=1}^n \sum_{j=1}^m \mathbb{I}(X_i \leq Y_j) $$
In [12]:
n = 30
m = 20
X = np.random.normal(0,1,n)
Y = np.random.normal(0,1,m)
def wilcx(X,Y):
    W = 0
    n = X.size
    m = Y.size
    for i in range(n):
        for j in range(m):
            W += X[i] <= Y[j]
    return W
print(wilcx(X,Y))
336
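The double loop can also be replaced by NumPy broadcasting: comparing a column view of X against a row view of Y produces the whole $n \times m$ indicator matrix at once, and summing it gives the same statistic. A minimal sketch (wilcx_vec is just an illustrative name):
def wilcx_vec(X, Y):
    # X[:, None] has shape (n, 1) and Y[None, :] has shape (1, m),
    # so the comparison broadcasts to an (n, m) boolean matrix
    return np.sum(X[:, None] <= Y[None, :])
print(wilcx_vec(X, Y))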
while loops in Python
In [13]:
x = np.array([1,4,7,2,3])
val = 0
for i in range(x.size):
    #val = val + x[i]
    val += x[i]
print(val)
17
Write a while loop to do the same thing:
In [14]:
x = np.array([1,4,7,2,3])
i = 0
val = 0
while(i < x.size):
    val += x[i]
    i += 1
print(val)
17
Approximation to the natural log
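The loop below sums the series $$ \ln z = 2\operatorname{artanh}\left(\frac{z-1}{z+1}\right) = \sum_{k=0}^{\infty} \frac{2}{2k+1}\left(\frac{z-1}{z+1}\right)^{2k+1}, $$ stopping once the newest term falls below the tolerance.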
In [15]:
def ln(z,tol=1e-6):
    logz = 0
    k = 0
    conv = False
    while(not conv):
        trm = 2/(2*k+1) * ((z-1)/(z+1)) ** (2*k+1)  # k-th term of the series
        logz += trm
        k += 1
        conv = trm < tol  # stop once the latest term is below the tolerance
    return logz, k
z = 30
print(ln(z))
print(np.log(z))
(3.4011918444486384, 73)
3.4011973816621555
Newton's method for computing the reciprocal of a number: applying Newton's iteration to $f(x) = 1/x - a$, whose root is $1/a$, gives the update
$$ x_{n+1} = 2 x_n - a x_n^2 $$
In [16]:
# a = 10 # number of which I seek the reciprocal
# tol = 1e-6 # .000001
def recip(a, tol=1e-6):
    x = 0.0001 # initial guess
    conv = False
    while(not conv):
        x0 = x # store current value so we can compare it with the new value
        x = 2*x - a * x**2
        conv = abs(x - x0) < tol
    return x
print(recip(10))
print(recip(4.58723))
print(1/4.58723)
0.09999999999999942
0.21799648153676895
0.217996481536788
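A note on the initial guess: writing $e_n = 1 - a x_n$, the update gives $e_{n+1} = e_n^2$, so the iteration converges (quadratically) to $1/a$ exactly when $0 < x_0 < 2/a$. The starting value 0.0001 therefore works for any positive $a$ below $2/0.0001 = 20000$.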
Minimizing a function with Newton's method
Let $$ f(x) = x^2 + \sin x. $$
Then $$ f'(x) = 2 x + \cos x $$ and $$ f''(x) = 2 - \sin x. $$ Newton's method makes the updates $$ x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)} $$
In [17]:
tol = 1e-6
x = 0 # initial guess
conv = False
while(not conv):
    x0 = x # store current value before updating
    df = 2*x + np.cos(x)  # f'(x)
    ddf = 2 - np.sin(x)   # f''(x)
    x = x - df/ddf
    conv = abs(x - x0) < tol
xmin = x
xmin
Out[17]:
-0.45018361129487383
Let's make a plot and see
In [18]:
x = np.linspace(-1,1,201)
fx = x**2 + np.sin(x)
fig, ax = plt.subplots( figsize = (5,3))
ax.plot(x,fx)
ax.axvline(xmin)
plt.show()
Compute a maximum likelihood estimator
For the gamma distribution.
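With the scale fixed at 1, the log-likelihood is maximized at the $\alpha$ solving the score equation $$ \psi(\alpha) = \frac{1}{n}\sum_{i=1}^n \log X_i, $$ where $\psi$ is the digamma function. The loop below solves this equation with Newton's method, using the trigamma function $\psi'$ as the derivative: $$ \alpha \leftarrow \alpha - \frac{\psi(\alpha) - \frac{1}{n}\sum_{i=1}^n \log X_i}{\psi'(\alpha)} $$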
In [19]:
from scipy.special import polygamma # polygamma(n, x) is the n-th derivative of the digamma function; order 0 gives the digamma and order 1 the trigamma
In [20]:
# first let's draw a random sample from the gamma distribution.
rng = np.random.default_rng()
n = 500
alpha_true = 2
X = rng.gamma(shape = alpha_true, scale = 1, size = n)
# write a loop to estimate the shape parameter alpha
alpha = 1
conv = False
while(not conv):
    alpha0 = alpha
    alpha = alpha - (polygamma(0,alpha) - np.mean(np.log(X)))/ polygamma(1,alpha)
    conv = abs(alpha - alpha0) < 1e-6
print(alpha)
2.0567613667347953
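As a sanity check, SciPy can fit the distribution directly. Assuming the floc and fscale keyword arguments of stats.gamma.fit to fix the location at 0 and the scale at 1, only the shape is estimated, and the result should closely match the loop above:
a_hat, loc_hat, scale_hat = stats.gamma.fit(X, floc=0, fscale=1)
print(a_hat)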
In [21]:
x = np.linspace(0,10,201)
fx = stats.gamma.pdf(x,a= alpha,scale = 1)
fig, ax = plt.subplots(figsize = (8,3))
ax.plot(x,fx)
ax.hist(X,density = True)
plt.show()