Physics & Astronomy Modules

Random Walks

Imagine you stand on the pavement on your street, flip a coin and take a step left for heads or a step right for tails. Then repeat. Your movement might look something like the red circle above, which is moving according to electronic coin flips made by your computer.

We call this kind of path a random walk in mathematics, and though it might not seem like much, random walks prove to be of great use in modelling complex systems in diverse fields, from financial markets to animal behaviour. They also appear everywhere in computational physics, from the cosmological scale to the quantum, but perhaps the most familiar random walk in physics came in the form of a classical problem, so we’ll start there.

Brownian Motion

In 1827, the Scottish botanist Robert Brown looked through his microscope at pollen grains suspended in water and noted that they jittered around in a way he couldn’t explain.

It took until 1905 for a satisfactory answer to be provided, by Albert Einstein, who brought statistical methods to the problem, predicting that the random motion of the pollen grains was due to water being made up of molecules, which were continually bumping into the grains. First he derived a diffusion equation for the grains, then he showed how to link this to physically measurable quantities, starting with the mean squared displacement (MSD), which we’ll come back to.

The reality of atoms and molecules was not fully accepted at that time, and Einstein viewed this model of Brownian motion as a way to give evidence for, or to falsify, the atomic theory. He wrote in the paper:

If the movement discussed here can actually be observed (together with the laws relating to it that one would expect to find), then classical thermodynamics can no longer be looked upon as applicable with precision to bodies even of dimensions distinguishable in a microscope: an exact determination of actual atomic dimensions is even possible. On the other hand, had the prediction of this movement proved to be incorrect, a weighty argument would be provided against the molecular-kinetic conception of heat.

Experimental verification of the theory was made a few years later by Jean Perrin — atoms and molecules were here to stay.

Walking on a Lattice

To introduce modelling with random walks, we’ll go back to the coin toss example we started with. This is an example of a lattice walk, in this case a 1D lattice.

We define the walk as a sequence of independent integers \(a_i\), each with equal probability of being either -1 or 1. Then we define the series \( \{ S_n \} \), where \[ S_n = \sum_{i=0}^{n}{a_i}, \] as the random walk. In the simulation below, 100 walks are made with \( n = 20 \) steps. The final positions of the walkers are recorded in the histogram, which gives us a starting point for investigating the statistical properties of the lattice walk.
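The simulation itself is written in JavaScript, but the lattice walk is easy to sketch in a few lines of Python (an illustrative reimplementation, not the site's actual code):

```python
import random
from collections import Counter

def lattice_walk(n_steps, rng):
    """Final position S_n of a 1D lattice walk: sum of n_steps coin flips."""
    return sum(rng.choice([-1, 1]) for _ in range(n_steps))

rng = random.Random(42)  # seeded so the run is reproducible
finals = [lattice_walk(20, rng) for _ in range(100)]

# Tally final positions for the histogram. With 20 steps, only even
# positions between -20 and 20 are reachable.
hist = Counter(finals)
```

Running this a few times (with different seeds) reproduces the behaviour of the histogram in the simulation.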

A number of runs should show that despite each walker taking its own route, the ensemble of many walkers tends toward the same result: a normal distribution. I won’t derive that distribution here, but fundamentally its normality is due to the central limit theorem. You can gain an understanding of the central limit theorem by watching this video about bunny rabbits and dragons too.

All good, but for any physical system we’re trying to model (e.g. Brown’s jittery pollen grains) we need to relate the individual walkers to something we can physically observe. The obvious place to start is the question: how far do the walkers get after a certain time, or \( n \) steps?

Each step along the way in the one-dimensional random walk is either left or right by 1 unit. Of course it could be a walk of some other amount per step, so we will use \(a_i\) to represent the \(i^{th}\) step. After \(n\) steps, the sum of all of them is where the walker will be: \[ S_n = a_0 + a_1 + a_2 + ... + a_n. \] Average that sum, and we have the expectation value of the displacement of the walk with \(n\) steps. We would write the average of the sum \[ \langle S_n \rangle = \langle \sum_{i=0}^{n} {a_i} \rangle = \sum_{i=0}^{n}{\langle a_i \rangle} . \] The average over many trials of just one step left or right, \( {\langle a_i \rangle} \), tends to zero as the number of trials grows, so the right-hand side of this expression is zero. Because of that, the average over many trials of \(n\) steps is also zero, although a single trial of \(n\) steps can take the walker as far as \(n\) units left or right if each step is 1 unit. Since the steps are taken randomly left or right, the most probable displacement after many steps is zero: back where the walker started. So in this case the mean displacement is not a useful indicator of the process of getting there.
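We can check the vanishing mean numerically. This is a quick sketch (not part of the original simulations), averaging \(S_n\) over many trials:

```python
import random

rng = random.Random(1)  # seeded for reproducibility
n_steps, n_trials = 20, 100_000

total = 0
for _ in range(n_trials):
    total += sum(rng.choice([-1, 1]) for _ in range(n_steps))

# <S_n> tends to zero as the number of trials grows, even though a
# single walk can end up to n_steps units from the origin.
mean_displacement = total / n_trials
```

With 100,000 trials the mean displacement comes out very close to zero, as the derivation above predicts.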

Next we might look at the squared displacement. This proves more useful because both \( (-1)^2 = 1\) and \( (+1)^2 = 1 \), so the square of any \(i^{th}\) step will always average \( \langle a_i^2 \rangle = 1 , \) and the square of the displacement cannot be negative. We find the displacement after \(n\) steps, take its square, and then average that over many trials. First, the sum is \(a_0 + a_1 + a_2 + ... + a_n\). Multiply this by itself and there will be two kinds of terms: the positive products with squares such as \[ a_0 a_0 + a_1 a_1 + a_2 a_2 + ... \] and the cross products that may be negative or positive, such as \[a_0 a_1 + a_0 a_2 + ... + a_1 a_0 + a_1 a_2 + ... \] Since the cross-product terms averaged over many trials will be zero, we have \[ \langle S_n^2 \rangle = \sum_{i=0}^{n}{\langle a_i^2 \rangle} + \sum_{i=0}^{n} \sum_{j \ne i} \langle a_i a_j \rangle = n + 0 = n.\] This non-zero mean squared displacement gives us a more useful statistical quantity to work with. For a random walk in one dimension with \(n\) steps, the mean square displacement (some say "MSD") is \(n\). How far did the walker actually travel along its path? That's \(n\) units too, while on average its displacement from where it started will be zero because it can go either left or right. It simply follows a jittery path to get home again.
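The result \( \langle S_n^2 \rangle = n \) is also easy to verify numerically, as a sketch (the variable names here are my own, not from the simulations):

```python
import random

rng = random.Random(2)  # seeded for reproducibility
n_steps, n_trials = 20, 50_000

# Average the squared final displacement over many trials.
msd = sum(
    sum(rng.choice([-1, 1]) for _ in range(n_steps)) ** 2
    for _ in range(n_trials)
) / n_trials
# For a 1D lattice walk we expect <S_n^2> ≈ n_steps = 20.
```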

Normal Curve

When you run the simulation for 100 trials of 20 steps you will see a distribution of outcomes, some to the left, some to the right, averaging around zero displacement. It looks approximately smooth, a "bell-shaped" curve that spreads about \(\pm 5\) around the origin. In the limit of an infinite number of trials, the distribution of outcomes would be a normal distribution given by this expression \[ \frac{1}{\sigma \sqrt{2\pi}} \exp \left( - \frac{(x - \mu)^2}{2 \sigma^2} \right) . \]

The mean value of the displacement \(x\) is \(\mu\), which for the random walk in one dimension is zero. A curve with the function \(\exp( -(x/a)^2 )\) is named a Gaussian curve. This one is centered on zero and spreads left and right. For large \(x\) it goes to zero, and at \(x=a\) it has a value of \(1/e\). In a normal distribution, the width is determined by \(\sigma\), which is called the standard deviation. The standard deviation for the random walk in one dimension with a step of 1 unit is \[ \sigma = \sqrt{n} . \] We see that the number \(n\) of steps taken determines the width, and that the mean square displacement is just \(\sigma^2 = n\). The number of trials determines how closely the observed distribution will follow the normal curve. This approach of observed behaviour to the normal distribution centered on the average, in the limit of an infinite number of trials, is the central limit theorem again. We would expect that for 20 steps the distribution would be centered nearly at zero after 100 trials, and that it would fall to \(1/e \approx 0.368 \) of its peak value at \( \sqrt{2} \sigma = \sqrt{2} \sqrt{20} \approx 6.32 \) from the origin. The larger the number of trials, the more the distribution will approach this ideal behaviour.
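A sketch to check that the spread of final positions really is \( \sigma = \sqrt{n} \) (again illustrative Python, not the simulation's code):

```python
import math
import random

rng = random.Random(3)  # seeded for reproducibility
n_steps, n_trials = 20, 100_000

finals = [
    sum(rng.choice([-1, 1]) for _ in range(n_steps))
    for _ in range(n_trials)
]

mean = sum(finals) / n_trials
std = math.sqrt(sum((x - mean) ** 2 for x in finals) / n_trials)
# The sample standard deviation should approach sqrt(n_steps) = sqrt(20) ≈ 4.47,
# matching the ±5 spread seen in the simulation's histogram.
```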

Measurable Quantities

In his paper Einstein derived, from basic principles of kinetic theory, the mean squared displacement that might be expected for a particle subject to Brownian motion. Describing the motion in terms of a diffusion equation, he showed that the mean squared displacement \( \langle S(t)^2 \rangle \) of a particle at a time \( t \) is related linearly to the diffusion coefficient \( D \) by \[ \langle S(t)^2 \rangle = 2 d D t\] where \( d \) is the number of dimensions in the problem. The diffusion of particles is something we can observe and measure, so now we have a way to check the random walks of a model against reality.

In the next simulation, we’ll move up to a 2D lattice, where each walker can move on a plane: up, down, left or right with equal probability. We’ll set off 50 walkers, and keep track of the MSD in a graph as we go.

The relation above tells us we should expect the MSD graph to tend to a straight line as we increase the number of walkers. Then, once the last walker has walked, we’ll take a line of best fit through the data, and (by the relation above, with \(d = 2\)) the slope of that line gives us the diffusion coefficient, \( D = \mathrm{slope} / 4 \).


I hope I’ve given an inkling of how detailed physical models can be based on something as simple as taking random steps. By making a good choice of random walks, we can sample huge domains in a systematic way that gives us useful results, without having to check every possible outcome (for example, every way a particle might go). We can then link the theory of microscopic behaviour to macroscopic observables — things we can measure in the lab.

One example, Einstein’s model of Brownian motion and Perrin’s measurement, provided crucial evidence for the existence of atoms at the beginning of the last century, something we take for granted now. But as I mentioned, random walks are everywhere in physics, playing an important role in what are known as Monte Carlo simulations. I’ll write some more about the use of Monte Carlo methods in quantum physics in future notes.

References & Further Reading

The content of this site is the work of Thomas P. Ogden, mirrored here with permission.

Werner Krauth. Statistical Mechanics: Algorithms and Computations, Oxford University Press, November 2006.

Jos Thijssen. Computational Physics. Cambridge University Press, second edition, March 2007.

These simulations are written in JavaScript and D3.js.

When writing this note, I found this blog post by Jan Wrobel, who has done something similar to the first simulation. I used his code to improve my walk loops.