Using the Monte-Carlo simulation to estimate the value of π

First of all, let's briefly talk about what a Monte-Carlo simulation is.
Simply put, it is a computational method that uses a large number of random samples to obtain a result. It is used in a wide range of disciplines like mathematics, computer graphics, and physical sciences. In this blog, we will see how we can use this method to estimate the value of the mathematical constant \( \pi \).

Concept

How can a large number of random samples be used to estimate \( \pi \)?
Suppose we have a circle inscribed in a square. If we throw a pile of sand randomly onto the square, some grains will end up inside the circle and others will end up outside of it. The following animation illustrates this concept: the blue dots represent the grains inside the circle and the red ones represent those outside.
If you wait a few moments, you will start to see a circle filled with blue dots (a blue disk), with the areas outside of it filled with red dots.

We can compute the proportion of grains that ended up inside the circle; this value approximates the ratio of the circle's area to the square's area. Moreover, the more grains of sand we throw onto the square, the more accurate the estimation will be.
But why is this true? The Monte-Carlo simulation relies on the "Law of Large Numbers" (LLN). This is not the focus of this blog; if you want to know more, you can check the Wikipedia article Law_of_large_numbers.
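To get a feel for the Law of Large Numbers before applying it to \( \pi \), here is a small, self-contained sketch (unrelated to the estimator itself): the running mean of fair coin flips drifts toward the true probability 0.5 as the number of flips grows.

```python
import numpy as np

# Flip a fair coin many times: 0 = tails, 1 = heads.
rng = np.random.default_rng(0)
flips = rng.integers(0, 2, size=100_000)

# The running mean gets closer to the true probability 0.5
# as we include more flips.
for n in (100, 10_000, 100_000):
    print(n, flips[:n].mean())
```

The exact numbers depend on the seed, but the trend is what matters: the estimate tightens around 0.5, just as our sand-grain proportion will tighten around \( \pi/4 \).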
First, let's figure out what the proportion of the circle in relation to the square is.
The area of the circle is \(\pi r^2\). We will consider a unit square, which has an area of 1; a circle inscribed in it has radius \(r = 1/2\), so its area is \(\pi/4\).
Therefore the proportion we're looking for is \(\dfrac{\pi / 4}{1} = \dfrac{\pi}{4}\).
The second value we're looking for is the proportion of grains inside the circle. Denote by \(success\) the number of these grains and by \(trials\) the total number of grains thrown; the proportion is then \(success/trials\).
Now, let's write the equation $$ \dfrac{\pi}{4} = \dfrac{success}{trials} \Leftrightarrow \pi = 4\cdot \dfrac{success}{trials} $$ There you go! All we have to do now is simulate a lot of "trials" and count how many are "successes", i.e. inside the circle.
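As a quick sanity check of the rearranged formula, here is a one-liner with made-up counts (the numbers 785 and 1000 are purely hypothetical, not simulation results):

```python
# Hypothetical counts, purely to exercise pi ≈ 4 * success / trials:
success, trials = 785, 1000
estimate = 4 * success / trials
print(estimate)  # 3.14
```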

Code

We will generate many pairs of random numbers representing the coordinates of the points (grains of sand), then check whether each point falls within the disk, whose center is \( (1/2, 1/2) \).
Each point \( (x,y) \) is a bivariate uniform random variable on the unit square, \( (x,y) \in (0,1)^2 \), and the disk is defined by: $$ \Big\{(x,y) \in \mathbb{R}^2 \;;\; \Big(x - \frac{1}{2}\Big)^2 + \Big(y - \frac{1}{2}\Big)^2 \leq \Big(\frac{1}{2}\Big)^2 \Big\}$$
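The membership condition above translates directly into code. A minimal sketch of the test for a single point (the helper name in_disk is ours, not part of the final function):

```python
def in_disk(x, y):
    """Return True if (x, y) lies inside the disk of radius 1/2
    centred at (1/2, 1/2)."""
    return (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25

print(in_disk(0.5, 0.5))    # centre of the square -> True
print(in_disk(0.99, 0.99))  # near a corner -> False
```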

            
import numpy as np

def monte_carlo_pi(trials):
    # Generate `trials` random points (x, y) uniformly on the unit square.
    b_uniform = ((np.random.uniform(0, 1), np.random.uniform(0, 1)) for _ in range(trials))
    # Count the points that fall inside the disk of radius 1/2 centred at (1/2, 1/2).
    success = sum((x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25 for x, y in b_uniform)
    return 4 * success / trials
            
            

The code above defines a function "monte_carlo_pi" that takes the number of random trials to generate as an argument.
"b_uniform" is a generator that yields "trials" random points, all within the unit square. The "success" variable is simply the number of points situated inside the circle. Finally, the estimated value of \( \pi \) is returned.

Results

We expect that the larger the number of trials, the better the estimation of \( \pi \) will be.
Let's see the result when using only 100 trials.
We got 3.0 as an estimation, which is quite far from the familiar value (3.141592654...), so let's increase the number of trials to 100,000.
Now we're at 3.13976, which is much better. Finally, let's try 1,000,000 trials.
The estimation got even closer: we're at 3.144164.

Conclusion

The Monte-Carlo method gave a decent approximation of \( \pi \) with 1,000,000 trials, so why not use even more? How about 100 million? What about 100 billion?
The theory states that the more trials we make, the better the approximation, but in practice computers impose limits, most notably computing time.
For instance, 100 million trials gave an estimation of 3.14144576, a better result, but it took eight and a half minutes to complete.
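One common first step toward better performance is to vectorize the sampling with NumPy, so the loop runs in compiled code rather than in Python. A sketch under that idea (the function name and seed parameter are our own additions, not part of the code above):

```python
import numpy as np

def monte_carlo_pi_vectorized(trials, seed=None):
    # Draw all coordinates at once instead of one Python tuple per trial.
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=trials)
    y = rng.uniform(0.0, 1.0, size=trials)
    # Boolean array marking the points inside the disk; its mean is the
    # success proportion, so 4 * mean estimates pi.
    inside = (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25
    return 4 * inside.mean()

print(monte_carlo_pi_vectorized(1_000_000, seed=0))
```

On typical hardware this runs orders of magnitude faster than the per-trial generator version, since the work happens inside NumPy rather than the Python interpreter.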
There are many ways to enhance the performance of our code, including using the Numba library and parallelization; these will be covered in another blog.