A random walk in two dimensions performs a step either to the north, south, west, or east, each one with probability 1/4. To demonstrate this process, we introduce x and y coordinates of np particles and draw random numbers among 1, 2, 3, or 4 to determine the move. The positions of the particles can easily be visualized as small circles in an xy coordinate system.
8.7.1 Basic Implementation
The algorithm described above is conveniently expressed directly as a complete working program:
def random_walk_2D(np, ns, plot_step): xpositions = zeros(np) ypositions = zeros(np) # extent of the axis in the plot: xymax = 3*sqrt(ns); xymin = -xymax
NORTH = 1; SOUTH = 2; WEST = 3; EAST = 4 # constants for step in range(ns): for i in range(np):
direction = random_number.randint(1, 4) if direction == NORTH: ypositions[i] += 1 elif direction == SOUTH:
ypositions[i] -= 1 elif direction == EAST: xpositions[i] += 1 elif direction == WEST: xpositions[i] -= 1
# plot just every plot_step steps: if (step+1) / plot_step == 0:
plot(xpositions, ypositions, 'ko', axis=[xymin, xymax, xymin, xymax], title='/d particles after /d steps' / \
(np, step+1), hardcopy='tmp_0/„03d.eps' / (step+1)) return xpositions, ypositions
# main program:
import random as random_number random_number.seed(10) import sys from scitools.std import zeros, plot, sqrt np = int(sys.argv) # number of particles ns = int(sys.argv) # number of steps plot_step = int(sys.argv) # plot every plot_step steps x, y = random_walk_2D(np, ns, plot_step)
The program is found in the file walk2D.py. Figure 8.7 shows two snapshots of the distribution of 3000 particles after 40 and 400 steps. These plots were generated with command-line arguments 3000 400 20, the latter implying that we visualize the particles every 20 time steps only.
To get a feeling for the two-dimensional random walk you can try out only 30 particles for 400 steps and let each step be visualized (i.e., command-line arguments 30 400 l). The update of the movements is now fast.
The walk2D. py program dumps the plots to PostScript files with names of the form tmp_xxx.eps, where xxx is the step number. We can create a movie out of these individual files using the movie function (Chapter 4.3.7) or the program convert from the ImageMagick suite12:
convert -delay 50 -loop 1000 tmp_*.eps movie.gif
All the plots are now put after each other as frames in a movie, with a delay of 50 ms between each frame. The movie will run in a loop 1000 times. Alternatively, we can create the movie with the movie function from Easyviz, inside a program:
12 If you want to run this command from an IPython session, prefix convert with an exclamation mark: ! convert.
from scitools.std import movie movie('tmp_*.eps', encoder='convert', output_file='movie.gif')
The resulting movie file is named movie.gif, which can be viewed by the animate program (also from the ImageMagick program suite), just write animate movie.gif. Making and showing the movie are slow processes if a large number of steps are included in the movie - 100 steps or fewer are appropriate, but this depends on the power of your computer.
The walk2D. py program is quite slow. Now the visualization is much faster than the movement of the particles. Vectorization may speed up the walk2D. py program significantly. As in the one-dimensional phase, we draw all the movements at once and then invoke a loop over the steps to update the x and y coordinates. We draw ns x np numbers among 1, 2, 3, and 4. We then reshape the vector of random numbers to a two-dimensional array moves[i, j], where i counts the steps, j counts the particles. The if test on whether the current move is to the north, south, east, or west can be vectorized using the where function (see Chapter 4.4.1). For example, if the random numbers for all particles in the current step are accessible in an array this_move, we could update the x positions by xpositions += where(this_move == EAST, 1, 0) xpositions -= where(this_move == WEST, 1, 0)
provided EAST and WEST are constants, equal to 3 and 4, respectively. A similar construction can be used for the y moves. The complete program is listed below:
def random_walk_2D(np, ns, plot_step): xpositions = zeros(np) ypositions = zeros(np)
moves = random.random_integers(1, 4, size=ns*np) moves.shape = (ns, np)
# estimate max and min positions: xymax = 3*sqrt(ns); xymin = -xymax
NORTH = 1; SOUTH = 2; WEST = 3; EAST = 4 # constants for step in range(ns):
ypositions += where(this_move == NORTH, ypositions -= where(this_move == SOUTH, xpositions += where(this_move == EAST, xpositions -= where(this_move == WEST,
# just plot every plot_step steps: if (step+1) / plot_step == 0:
plot(xpositions, ypositions, 'ko', axis=[xymin, xymax, xymin, xymax],
(np, step+1), hardcopy='tmp_°/03d.eps' / (step+1)) return xpositions, ypositions
# main program:
from scitools.std import *
np = int(sys.argv) # number of particles ns = int(sys.argv) # number of steps plot_step = int(sys.argv) # plot each plot_step step x, y = random_walk_2D(np, ns, plot_step)
You will easily experience that this program, found in the file walk2Dv.py, runs significantly faster than the walk2D.py program.
Drawing Random Numbers. Random numbers can be scattered throughout an interval in various ways, specified by the distribution of the numbers. We have considered a uniform distribution (Chapter 8.1.2) and a normal (or Gaussian) distribution (Chapter 8.1.6). Table 8.1 shows the syntax for generating random numbers of these two distributions, using either the standard scalar random module in Python or the vectorized numpy.random module.
Table 8.1 Summary of important functions for drawing random numbers. N is the array length in vectorized drawing, while m and s represent the mean and standard deviation values of a normal distribution.
random numpy.random uniform numbers in [0,1) random()
uniform numbers in [a,b) uniform(a, integers in [a, b] randint(a,
Gaussian numbers, mean m, st.dev. s gauss(m, s)
shuffle list a (in-place) shuffle(a)
choose a random element in list a choice(a)
random(N) b) uniform(a, b, N) b) randint(a, b+1, N) random_integers(a, normal(m, s, N) seed(i) shuffle(a)
Typical Probability Computation. Many programs performing probability computations draw a large number N of random numbers and count how many times M a random number leads to some true condition (Monte Carlo simulation):
import random as random_number M = 0
r = random_number.randint(a, b) if condition: M += 1
print 'Probability estimate:', float(M)/N
For example, if we seek the probability that we get at least four eyes when throwing a dice, we choose the random number to be the number of eyes, i.e., an integer in the interval [1, 6] (a=1, b=6) and condition becomes r >= 4.
For large N we can speed up such programs by vectorization, i.e., drawing all random numbers at once in a big array and use operations on the array to find M. The similar vectorized version of the program above looks like from numpy import * r = random.uniform(a, b, N) M = sum(condition) # or
print 'Probability estimate:', float(M)/N
(Combinations of boolean expressions in the condition argument to where requires special constructs as outlined in Exercise 8.14.) Make sure you use sum from numpy, when operating on large arrays, and not the much slower built-in sum function in Python.
Statistical Measures. Given an array of random numbers, the following code computes the mean, variance, and standard deviation of the numbers and finally displays a plot of the histogram, which reflects how the numbers are statistically distributed:
from scitools.std import mean, var, std, compute_histogram m = mean(numbers) v = var(numbers) s = std(numbers)
x, y = compute_histogram(numbers, 50, piecewise_constant=True) plot(x, y)
8.8.2 Summarizing Example: Random Growth
Chapter 5.1.1 contains mathematical models for how an investment grows when there is an interest rate being added to the investment at certain intervals. The model can easily allow for a time-varying interest rate, but for forecasting the growth of an investment, it is difficult to predict the future interest rate. One commonly used method is to build a probabilistic model for the development of the interest rate, where the rate is chosen randomly at random times. This gives a random growth of the investment, but by simulating many random scenarios we can compute the mean growth and use the standard deviation as a measure of the uncertainty of the predictions.
Problem. Let p be the annual interest rate in a bank in percent. Suppose the interest is added to the investment q times per year. The new value of the investment, xn, is given by the previous value of the investment, xn-1, plus the p/q percent interest:
Normally, the interest is added daily (q — 360 and n counts days), but for efficiency in the computations later we shall assume that the interest is added monthly, so q — 12 and n counts months.
The basic assumption now is that p is random and varies with time. Suppose p increases with a random amount y from one month to the next:
A typical size of p adjustments is 0.5. However, the central bank does not adjust the interest every month. Instead this happens every M months on average. The probability of a Y — 0 can therefore be taken as 1/M. In a month where y — 0, we may say that y — m with probability 1/2 or y — — m with probability 1/2 if it is equally likely that the rate goes up as down (this is not a good assumption, but a more complicated evolvement of y is postponed now).
Solution. First we must develop the precise formulas to be implemented. The difference equations for xn and pn are in simple in the present case, but the details computing y must be worked out. In a program, we can draw two random numbers to estimate y: one for deciding if Y — 0 and the other for determining the sign of the change. Since the probability for y — 0 is 1/M, we can draw a number r1 among the integers 1,..., M and if r1 — 1 we continue with drawing a second number r2 among the integers 1 and 2. If r2 — 1 we set y — m, and if r2 — 2 we set y — —m. We must also assure that pn does not take on unreasonable values, so we choose pn < 1 and pn > 15
as cases where pn is not changed.
The mathematical model for the investment must track both xn and pn. Below we express with precise mathematics the equations for xn and pn and the computation of the random y quantity:
Xn = Xn-1 + 1P"100X™-1' i = 1, •••>N (8.9)
We remark that the evolution of pn is much like a random walk process (Chapter 8.6), the only differences is that the plus/minus steps are taken at some random points among the times 0,1, 2,..., N rather than at all times 0,1,2,..., N. The random walk for pn also has barriers at p = 1 and p = 15, but that is common in a standard random walk too.
Each time we calculate the xn sequence in the present application, we get a different development because of the random numbers involved. We say that one development of Xo,..., Xn is a path (or realization, but since the realization can be viewed as a curve xn or pn versus n in this case, it is common to use the word path). Our Monte Carlo simulation approach consists of computing a large number of paths, as well as the sum of the path and the sum of the paths squared. From the latter two sums we can compute the mean and standard deviation of the paths to see the average development of the investment and the uncertainty of this development. Since we are interested in complete paths, we need to store the complete sequence of xn for each path. We may also be interested in the statistics of the interest rate so we store the complete sequence pn too.
Programs should be built in pieces so that we can test each piece before testing the whole program. In the present case, a natural piece is a function that computes one path of xn and pn with N steps, given M, m, and the initial conditions x0 and p0. We can then test this function before moving on to calling the function a large number of times. An appropriate code may be def simulate_one_path(N, x0, p0, M, m): x = zeros(N+1) p = zeros(N+1)
# update interest rate p: r = random_number.randint(1, M) if r == 1:
# adjust gamma:
r = random_number.randint(1, 2) gamma = m if r == 1 else -m else:
gamma = 0 pn = p[n-1] + gamma p[n] = pn if 1 <= pn <= 15 else p[n-1] return x, p
Testing such a function is challenging because the result is different each time because of the random numbers. A first step in verifying the implementation is to turn off the randomness (m = 0) and check that the deterministic parts of the difference equations are correctly computed:
The output becomes
[ 1. 1.00833333 1.01673611 1.02520891]
These numbers can quickly be checked against a formula of the type (5.4) on page 237 in an interactive session:
>>> g(1, 1, 10) 1.0083333333333333 >>> g(1, 2, 10) 1.0167361111111111 >>> g(1, 3, 10) 1.0252089120370369
We can conclude that our function works well when there is no randomness. A next step is to carefully examine the code that computes gamma and compare with the mathematical formulas.
Simulating many paths and computing the average development of xn and pn is a matter of calling simulate_one_path repeatedly, use two arrays xm and pm to collect the sum of x and p, respectively, and finally obtain the average path by dividing xm and pm by the number of paths we have computed:
def simulate_n_paths(n, N, L, p0, M, m): xm = zeros(N+1) pm = zeros(N+1) for i in range(n):
x, p = simulate_one_path(N, L, p0, M, m) # accumulate paths: xm += x pm += p # compute average: xm /= float(n) pm /= float(n) return xm, pm
We can also compute the standard deviation of the paths using formulas (8.3) and (8.6), with Xj as either an x or a p array. It might happen that small round-off errors generate a small negative variance, which mathematically should have been slightly greater than zero. Taking the square root will then generate complex arrays and problems with plotting. To avoid this problem, we therefore replace all negative elements by zeros in the variance arrays before taking the square root. The new lines for computing the standard deviation arrays xs and ps are indicated below:
xs = zeros(N+1) # standard deviation of x ps = zeros(N+1) # standard deviation of p for i in range(n):
# accumulate paths:
# compute standard deviation:
xs = xs/float(n) - xm*xm # variance ps = ps/float(n) - pm*pm # variance
# remove small negative numbers (round off errors): xs[xs < 0] = 0
ps[ps < 0] = 0 xs = sqrt(xs) ps = sqrt(ps) return xm, xs, pm, ps
A remark regarding the efficiency of array operations is appropriate here. The statement xs += x**2 could equally well, from a mathematical point of view, be written as xs = xs + x**2. However, in this latter statement, two extra arrays are created (one for the squaring and one for the sum), while in the former only one array (x**2) is made. Since the paths can be long and we make many simulations, such optimizations can be important.
One may wonder whether x**2 is "smart" in the sense that squaring is detected and computed as x*x, not as a general (slow) power function. This is indeed the case for arrays, as we have investigated in the little test program smart_power.py in the random directory. This program applies time measurement methods from Appendix E.6.2.
Our simulate_n_paths function generates four arrays which are natural to visualize. Having a mean and a standard deviation curve, it is often common to plot the mean curve with one color or linetype and then two curves, corresponding to plus one and minus one standard deviation, with another less visible color. This gives an indication of the mean development and the uncertainty of the underlying process. We therefore make two plots: one with xm, xm+xs, and xm-xs, and one with pm, pm+ps, and pm-ps.
Both for debugging and curiosity it is handy to have some plots of a few actual paths. We may pick out 5 paths from the simulations and visualize these:
# show 5 random sample paths: if i / (n/5) == 0: figure(1)
plot(x, title='sample paths of investment')
plot(p, title='sample paths of interest rate') hold('on')
figure(1); hardcopy('tmp_sample_paths_investment.eps') figure(2); hardcopy('tmp_sample_paths_interestrate.eps')
Note the use of figure: we need to hold on both figures to add new plots and switch between the figures, both for plotting and making the final hardcopy.
After the visualization of sample paths we make the mean ± standard deviation plots by this code:
xm, xs, pm, ps = simulate_n_paths(n, N, x0, p0, M, m) figure(3)
months = range(len(xm)) # indices along the x axis plot(months, xm, 'r', months, xm-xs, 'y', months, xm+xs, 'y', title='Mean +/- 1 st.dev. of investment', hardcopy='tmp_mean_investment.eps') figure(4)
plot(months, pm, 'r', months, pm-ps, 'y', months, pm+ps, 'y', title='Mean +/- 1 st.dev. of annual interest rate', hardcopy='tmp_mean_interestrate.eps')
The complete program for simulating the investment development is found in the file growth_random.py.
Running the program with the input data x0 = 1 # initial investment p0 = 5 # initial interest rate
M = 3 # p changes (on average) every M months n = 1000 # number of simulations m = 0.5 # adjustment of p and initializing the seed of the random generator to 1, we get four plots, which are shown in Figure 8.8.
20 40 60 80
20 40 60 80
pie paths o
Fig. 8.8 Development of an investment with random jumps of the interest rate at random points of time: (a) mean value of investment ± one standard deviation; (b) mean value of the interest rate ± one standard deviation; (c) five paths of the investment development; (d) five paths of the interest rate development.
pie paths o
Was this article helpful?