©2009 2010 2011 2012 Jennifer Harlow, Dominic Lee and Raazesh Sainudiin.
Creative Commons AttributionNoncommercialShare Alike 3.0
In the last worksheet, we started to look at how we can produce realisations from the most elementary $Uniform(0,1)$ random variable.
i.e., how can we produce samples $(x_1, x_2, \ldots, x_n)$ from $X_1, X_2, \ldots, X_n$ $\overset{IID}{\thicksim}$ $Uniform(0,1)$?
What is Sage doing when we ask for random()?

We looked at how Modular arithmetic and number theory gives us pseudorandom number generators.
We used linear congruential generators (LCG) as simple pseudorandom number generators.
Remember that "pseudorandom" means that the numbers are not really random. We saw that some linear congruential generators (LCG) have much shorter, more predictable, patterns than others and we learned what makes a good LCG.
We introduced the pseudorandom number generator (PRNG) called the Mersenne Twister that we will use for simulation purposes in this course. It is based on more sophisticated theory than that of LCG but the basic principles of recurrence relations are the same.
Recall that the $Uniform(0,1)$ random variable is the fundamental model as we can transform it to any other random variable, random vector or random structure. The PDF $f$ and DF $F$ of $X \sim Uniform(0,1)$ are:
$f(x) = \begin{cases} 0 & \text{if} \ x \notin [0,1] \\ 1 & \text{if} \ x \in [0,1] \end{cases}$
$F(x) = \begin{cases} 0 & \text{if} \ x < 0 \\ 1 & \text{if} \ x > 1 \\ x & \text{if} \ x \in [0,1] \end{cases}$
We use the Mersenne twister pseudorandom number generator to mimic independent and identically distributed draws from the $uniform(0,1)$ RV.
In Sage, we use the python random module to generate pseudorandom numbers for us. (We have already used it: remember randint?)
random() will give us one simulation from the $Uniform(0,1)$ RV:

If we want a whole simulated sample we can use a list comprehension. We will be using this technique frequently so make sure you understand what is going on. "for i in range(3)" is acting like a counter to give us 3 simulated values in the list we are making


If we do this again, we will get a different sample:

Often is it useful to be able to replicate the same random sample. For example, if we were writing some code to do some simulations using samples from a PRNG, and we "improved" the way that we were doing it, how would we want to test our improvement? If we could replicate the same samples then we could show that our new code was equivalent to our old code, just more efficient.
Remember when we were using the LCGs, and we could set the seed $x_0$? More sophisticated PRNGs like the Mersenne Twister also have a seed. By setting this seed to a specified value we can make sure that we can replicate samples.



Now we can replicate the same sample again by setting the seed to the same value:




We can compare some samples visually by plotting them:

You try
Optionally, try looking at the more advanced documentation.



What can we do with samples from a $Uniform(0,1)$ RV? Why bother?
We can use them to sample or simulate from other, more complex, random variables.
The $Bernoulli(\theta)$ RV $X$ with PMF $f(x;\theta)$ and DF $F(x;\theta)$ parameterised by some real $\theta\in [0,1]$ is a discrete random variable with only two possible outcomes.
$f(x;\theta)= \theta^x (1\theta)^{1x} \mathbf{1}_{\{0,1\}}(x) =
\begin{cases}
\theta & \text{if} \ x=1,\\
1\theta & \text{if} \ x=0,\\
0 & \text{otherwise}
\end{cases}$
$F(x;\theta) =
\begin{cases}
1 & \text{if} \ 1 \leq x,\\
1\theta & \text{if} \ 0 \leq x < 1,\\
0 & \text{otherwise}
\end{cases}$
Here are some functions for the PMF and DF for a $Bernoulli$:

We can see the effect of varying $\theta$ interactively:
Click to the left again to hide and once more to show the dynamic interactive window 


Don't worry about how these plots are done: you are not expected to be able to do this. Just use them to see the effect of varying $\theta$.
We can simulate a sample from a $Bernoulli$ distribution by transforming input from a $Uniform(0,1)$ distribution using the floor() function in Sage. In maths, $\lfloor x \rfloor$, the 'floor of $x$' is the largest integer that is smaller than or equal to $x$. For example, $\lfloor 3.8 \rfloor = 3$.

Using floor, we can do inversion sampling from the $Bernoulli(\theta)$ RV using the the $Uniform(0,1)$ random variable that we said is the fundamental model.
We will introduce inversion sampling more formally later. In general, inversion sampling means using the inverse of the CDF $F$, $F^{[1]}$, to transform input from a $Uniform(0,1)$ distribution.
To simulate from the $Bernoulli(\theta)$, we can use the following algorithm:
Input:
Output:
Steps:
We can illustrate this with Sage:

To make a number of simulations, we can use list comprehensions again:

To make modular reusable code we can package up what we have done as functions.
The function bernoulliFInverse(u, theta) codes the inverse of the CDF of a bernoulli distribution parameterised by theta. The function bernoulliSample(n, theta) uses bernoulliFInverse(...) in a list comprehension to simulate n samples from a bernoulli distribution parameterised by theta.

Note that we are using a list comprehension and the builtin SAGE random() function to make a list of pseudorandom simulations from the $Uniform(0,1)$. The length of the list is determined by the value of n. Inside the body of the function we assign this list to a variable named us (i.e., u plural). We then use another list comprehension to make our simulated sample. This list comprehension works by calling our function bernoulliFInverse(...) and passing in values for theta together with each u in us in turn.
Let's try a small number of samples:

Now lets explore the effect of interactively varying n and $\theta$:
Click to the left again to hide and once more to show the dynamic interactive window 
You can vary $\theta$ and $n$ on the interactive plot. You should be able to see that as $n$ increases, the empirical plots should get closer to the theoretical $f$ and $F$.
You try
Check that you understand what floor is doing. We have put some extra print statements into our demonstration of floor so that you can see what is going on in each step. Try evaluating this cell several times so that you see what happens with different values of u.

In the cell below we use floor to get 1's and 0's from the pseudorandom u's given by random(). It is effectively doing exactly the same thing as the functions above that we use to simulate a specified number of $Bernoulli(\theta)$ RVs, but the why that it is written may be easier to understand. If floor is doing what we want it to, then when n is sufficiently large, we'd expect our proportion of 1s to be close to theta. Try changing the value assigned to the variable theta and reevaluting the cell to check this.


The $de~Moivre(\theta_1,\theta_2,\ldots,\theta_k)$ RV is the natural generalisation of the $Bernoulli (\theta)$ RV to more than two outcomes. Take a die (i.e. one of a pair of dice): there are 6 possible outcomes from tossing a die if the die is a normal sixsided one (the outcome is which face is the on the top). To start with we can allow the possibility that the different faces could be loaded so that they have different probabilities of being the face on the top if we throw the die. In this case, k=6 and the parameters $\theta_1$, $\theta_2$, ...$\theta_6$ specify how the die is loaded, and the number on the uppermost face if the die is tossed is a $de\,Moivre$ random variable parameterised by $\theta_1,\theta_2,\ldots,\theta_6$.
If $\theta_1=\theta_2=\ldots=\theta_6= \frac{1}{6}$ then we have a fair die.
Here are some functions for the equiprobable $de\, Moivre$ PMF and CDF where we code the possible outcomes as the numbers on the faces of a ksided die, i.e, 1,2,...k.

Now we can use the interactive plots to look at the PMF and CDF for different values of k.
Click to the left again to hide and once more to show the dynamic interactive window 
Try changing the value of k.
We use floor ($\lfloor \, \rfloor$) again for simulating from the equiprobable $de \, Moivre(k)$ RV, but because we are defining our outcomes as 1, 2, ... k, we just add 1 to the result.

To simulate from the equiprobable $de\,Moivre(k)$, we can use the following algorithm:
Input:
Output:
Steps:
We can illustrate this with Sage:

A small sample:

You should understand the deMoivreFInverse and deMoivreSample functions and be able to write something like them if you were asked to.
You are not expected to be to make the interactive plots below.
Now let's do some interactive sampling where you can vary $k$ and the sample size $n$:
Click to the left again to hide and once more to show the dynamic interactive window 
Try changing $n$ and/or $k$. With $k = 40$ for example, you could be simulating the number on the first ball for $n$ Lotto draws.

You try
A useful counterpart to the floor of a number is the ceiling, denoted $\lceil \, \rceil$. In maths, $\lceil x \rceil$, the 'ceiling of $x$' is the smallest integer that is larger than or equal to $x$. For example, $\lceil 3.8 \rceil = 4$. We can use the ceil function to do this in Sage:

Try using ceil to check that you understand what it is doing. What would ceil(0) be?


When we simulated from the discrete RVs above, the $Bernoulli(\theta)$ and the equiprobable $de\,Moivre(k)$, we transformed some $u \thicksim Uniform(0,1)$ into some value for the RV.
Now we will look at the formal idea of an inversion sampler for continuous random variables. Inversion sampling for continuous random variables is a way to simulate values for a continuous random variable $X$ using $u \thicksim Uniform(0,1)$.
The idea of the inversion sampler is to treat $u \thicksim Uniform(0,1)$ as some value taken by the CDF $F$ and find the value $x$ at which $F(X \le x) = u$.
To find x where $F(X \le x) = u$ we need to use the inverse of $F$, $F^{[1]}$. This is why it is called an inversion sampler.
Formalising this,
Let $F(x) := \int_{ \infty}^{x} f(y) \,d y : \mathbb{R} \rightarrow [0,1]$ be a continuous DF with density $f$, and let its inverse $F^{[1]} $ be:
\[ F^{[1]}(u) := \inf \{ x : F(x) = u \} : [0,1] \rightarrow \mathbb{R} \]
Then, $F^{[1]}(U)$ has the distribution function $F$, provided $U \thicksim Uniform(0,1)$ ($U$ is a $Uniform(0,1)$ RV).
Note:
The infimum of a set A of real numbers, denoted by $\inf(A)$, is the greatest lower bound of every element of $A$.
The ``oneline proof'' of the proposition is due to the following equalities: \[P(F^{[1]}(U) \leq x) = P(\inf \{ y : F(y) = U)\} \leq x ) = P(U \leq F(x)) = F(x), \quad for~all~x \in \mathbb{R} . \]
Input:
Output:
Algorithm steps:
We have already met the$Uniform(\theta_1, \theta_2)$ RV.
Given two real parameters $\theta_1,\theta_2 \in \mathbb{R}$, such that $\theta_1 < \theta_2$, the PDF of the $Uniform(\theta_1,\theta_2)$ RV $X$ is:
\[f(x;\theta_1,\theta_2) =
\begin{cases}
\frac{1}{\theta_2  \theta_1} & \text{if }$\theta_1 \leq x \leq \theta_2$\text{,}\\
0 & \text{otherwise}
\end{cases}
\]
and its DF given by $F(x;\theta_1,\theta_2) = \int_{ \infty}^x f(y; \theta_1,\theta_2) \, dy$ is:
\[
F(x; \theta_1,\theta_2) =
\begin{cases}
0 & \text{if }x < \theta_1 \\
\frac{x\theta_1}{\theta_2\theta_1} & \text{if}~\theta_1 \leq x \leq \theta_2,\\
1 & \text{if} x > \theta_2
\end{cases}
\]
For example, here are the PDF, CDF and inverse CDF for the $Uniform(1,1)$:
As usual, we can make some Sage functions for the PDF and CDF:

Using these functions in an interactive plot, we can see the effect of changing the distribution parameters $\theta_1$ and $\theta_2$.
Click to the left again to hide and once more to show the dynamic interactive window 
You should be able to write simple functions like uniformPDF and uniformCDF yourself, but you are not expected to be able to make the interactive plots.

We can simulate from the $Uniform(\theta_1,\theta_2)$ using the inversion sampler, provided that we can get an expression for $F^{[1]}$ that can be implemented as a procedure.
We can get this by solving for $x$ in terms of $u=F(x;\theta_1,\theta_2)$:
\[
u = \frac{x\theta_1}{\theta_2\theta_1} \quad \iff \quad x = (\theta_2\theta_1)u+\theta_1 \quad \iff \quad F^{[1]}(u;\theta_1,\theta_2) = \theta_1+(\theta_2\theta_1)u
\]
Input:
Output:
Algorithm steps:
We can illustrate this with Sage by writing a function to calculate the inverse of the CDF of a uniform distribution parameterised by theta1 and theta2. Given a value between 0 and 1 for the parameter u, it returns the height of the inverse CDF at this point, i.e. the value in the range theta1 to theta2 where the CDF evaluates to u.

This function transforms a single $u$ into a single simulated value from the $Uniform(\theta_1, \theta_2)$, for example:

Then we can use this function inside another function to generate a number of samples:

The basic strategy is the same as for simulating $Bernoulli$ and $de \, Moirve$ samples: we are using a list comprehension and the builtin SAGE random() function to make a list of pseudorandom simulations from the $Uniform(0,1)$. The length of the list is determined by the value of n. Inside the body of the function we assign this list to a variable named us (i.e., u plural). We then use another list comprehension to make our simulated sample. This list comprehension works by calling our function uniformFInverse(...) and passing in values for theta1 and theta2 together with each u in us in turn.
You should be able to write simple functions like uniformFinverse and uniformSample yourself.
Try this for a small sample:

Much more fun, we can make an interactive plot which uses the uniformSample(...) function to generate and plot while you choose the parameters and number to generate (you are not expected to be able to make interactive plots like this):
Click to the left again to hide and once more to show the dynamic interactive window 
We can get a better idea of the distribution of our sample using a histogram (the minimum sample size has been set to 50 here because the automatic histogram generation does not do a very good job with small samples).
Click to the left again to hide and once more to show the dynamic interactive window 

For a given $\lambda$ > 0, an $Exponential(\lambda)$ Random Variable has the following PDF $f$ and DF $F$:
\[f(x;\lambda) =\begin{cases}\lambda e^{\lambda x} & \text{if }$x \ge 0$\text{,}\\ 0 & \text{otherwise}\end{cases}\] \[F(x;\lambda) =\begin{cases}1  e^{\lambda x} & \text{if }$x \ge 0$\text{,}\\ 0 & \text{otherwise}\end{cases}\]
An exponential distribution is useful because is can often be used to model interarrival times or making interevent measurements (if you are familiar with the $Poisson$ distribution, a discrete distribution, you may have also met the $Exponential$ distribution as the time between $Poisson$ events). Here are some examples of random variables which are sometimes modelled with an exponential distribution:
In Sage, the we can use exp(x) to calculate $e^x$, for example:

We can code some functions for the PDF and DF of an $Exponential$ parameterised by lam like this. [Note that we cannot use the name 'lambda' for the parameter because in SAGE (and Python), the term 'lambda' has a special meaning.]

You should be able to write simple functions like exponentialPDF and exponentialCDF yourself, but you are not expected to be able to make the interactive plots.
You can see the shapes of the PDF and CDF for different values of $\lambda$ using the interactive plot below.
Click to the left again to hide and once more to show the dynamic interactive window 

We are going to write some functions to help us to do inversion sampling from the $Exponential(\lambda)$ RV.
As before, we need an expression for $F^{[1]}$ that can be implemented as a procedure.
We can get this by solving for $x$ in terms of $u=F(x;\lambda)$
You try in class
Show that
\[F^{[1]}(u;\lambda) =\frac{1}{\lambda} \ln(1u)\]
$\ln = \log_e$ is the natural logarthim.
(end of You try)
Input:
Output:
Algorithm steps:
The function exponentialFInverse(u, lam) codes the inverse of the CDF of an exponential distribution parameterised by lam. Given a value between 0 and 1 for the parameter u, it returns the height of the inverse CDF of the exponential distribution at this point, i.e. the value where the CDF evaluates to u. The function exponentialSample(n, lam) uses exponentialFInverse(...) to simulate n samples from an exponential distribution parameterised by lam.

We can have a look at a small sample:

You should be able to write simple functions like exponentialFinverse and exponentialSample yourself.
The best way to visualise the results is to use a histogram. With this interactive plot you can explore the effect of varying lambda and n:
Click to the left again to hide and once more to show the dynamic interactive window 
A standard $Cauchy$ Random Variable has the following PDF $f$ and DF $F$:
\[f(x) =\frac{1}{\pi(1+x^2)}\text{,}\,\, \infty < x < \infty\] \[F(x) = \frac{1}{\pi}tan^{1}(x) + 0.5\]
The $Cauchy$ distribution is an interesting distribution because the expectation does not exist:
\[
\int \leftx\right\,dF(x) = \frac{2}{\pi} \int_0^{\infty} \frac{x}{1+x^2}\,dx = \left(x \tan^{1}(x) \right]_0^{\infty}  \int_0^{\infty} \tan^{1}(x)\, dx = \infty \ .
\]
In Sage, we can use the arctan function for $tan^{1}$, and pi for $\pi$ and code some functions for the PDF and DF of the standard Cauchy like this.

You can see the shapes of the PDF and CDF using the plot below. Note from the PDF $f$ above is defined for $\infty < x < \infty$. This means we should set some arbitrary limits on the minimum and maximum values to use for the xaxis on the plots. You can change these limits interactively.
Click to the left again to hide and once more to show the dynamic interactive window 
You are expected to be able to write simple functions like cauchyPDF and cauchyCDF yourself, but you are not expected to be able to make the interactive plots.
You could try imagining a standard $Cauchy$ RVs like this: Place a double light sabre (i.e., one that can its lazer beam from both ends) on a cartesian axis so that it is centred on (0, 1). Randomly spin it (so that its spin angle to the xaxis is $\theta \thicksim Uniform (0, 2\pi)$). The ycoordinate of the point of intersection with the yaxis is a standard Cauchy RV. You can see that we are equally likely to get positive and negative values (the density function of the standard $Cauchy$ RV is symmetrical about 0) and whenever the spin angle is close to $\frac{\pi}{4}$ ($90^{\circ}$) or $\frac{3\pi}{2}$ ($270^{\circ}$), the intersections will be a long way out up or down the yaxis, i.e. very negative or very positive values. If the light sabre is exactly parallel to the yaxis there will be no intersection: a $Cauchy$ RV $X$ can take values $\infty < x < \infty$
We can perform inversion sampling on the $Cauchy$ RV by transforming a $Uniform(0,1)$ random variable into a $Cauchy$ random variable using the inverse CDF.
We can get this by replacing $F(x)$ by $u$ in the expression for $F(x)$:
\[\frac{1}{\pi}tan^{1}(x) + 0.5 = u\]
and solving for $x$:
\[\begin{array}{lcl} \frac{1}{\pi}tan^{1}(x) + 0.5 = u & \iff & \frac{1}{\pi} tan^{1}(x) = u  \frac{1}{2}\\ & \iff & tan^{1}(x) = (u  \frac{1}{2})\pi\\ & \iff & tan(tan^{1}(x)) = tan((u  \frac{1}{2})\pi)\\ & \iff & x = tan((u  \frac{1}{2})\pi) \end{array}\]
Input:
Output:
Algorithm steps:
The function cauchyFInverse(u) codes the inverse of the CDF of the standard Cauchy distribution. Given a value between 0 and 1 for the parameter u, it returns the height of the inverse CDF of the standard $Cauchy$ at this point, i.e. the value where the CDF evaluates to u. The function cauchySample(n) uses cauchyFInverse(...) to simulate n samples from a standard Cauchy distribution.

And we can visualise these simulated samples with an interactive plot:
Click to the left again to hide and once more to show the dynamic interactive window 
Notice how we can get some very extreme values This is because of the 'thick tails' of the density function of the $Cauchy$ RV. Think about this in relation to the double light sabre visualisation. We can see effect of the extreme values with a histogram visualisation as well. The interactive plot below will only use values between lower and upper in the histogram. Try increasing the sample size to something like 1000 and then gradually widening the limits:
Click to the left again to hide and once more to show the dynamic interactive window 
You try
When we introduced the $Cauchy$ distribution, we noted that the expectation of the $Cauchy$ RV does not exist. This means that attempts to estimate the mean of a $Cauchy$ RV by looking at a sample mean will not be successful: as you take larger and larger samples, the effect of the extreme values will still cause the sample mean to swing around wildly (we will cover estimation properly soon). You are going to investigate the sample mean of simulated $Cauchy$ samples of steadily increasing size and show how unstable this is. A convenient way of doing this is to look at a running mean. We will start by working through the process of calculating some running means for the $Uniform(0,10)$, which do stabilise. You will then do the same thing for the $Cauchy$ and be able to see the instability.
We will be using the pylab.cumsum function, so we make sure that we have it available. We then generate a sample from the $Uniform(0,10)$

We are going to treat this sample as though it is actually 10 samples of increasing size:
We know that a sample mean is the sum of the elements in the sample divided by the number of elements in the sample $n$:
\[\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i\]
We can get the sum of the elements in each of our 10 samples with the cumulative sum of uSample.
We use cumsum to get the cumulative sum. This will be a pylab.array type, so we use the list function to turn it back into a list:

What we have now is effectively a list $\left[\displaystyle\sum_{i=1}^1x_i, \sum_{i=1}^2x_i, \sum_{i=1}^3x_i, \ldots, \sum_{i=1}^{10}x_i\right]$
So all we have to do is divide each element in csUSample by the number of elements that were summed to make it, and we have a list of running means $\left[\frac{1}{1}\displaystyle\sum_{i=1}^1x_i, \frac{1}{2}\sum_{i=1}^2x_i, \frac{1}{3}\sum_{i=1}^3x_i, \ldots, \frac{1}{10}\sum_{i=1}^{10}x_i\right]$
We can get the running sample sizes using the range function:

And we can do the division with list comprehension:

We could pull all of this together into a function which produced a list of running means for sample sizes 1 to $n$

Have a look at the running means of 10 incrementallysized samples:

Recall that the expectation $E_{(\theta_1, \theta_2)}(X)$ of a $X \thicksim Uniform(\theta_1, \theta_2) = \frac{(\theta_1 +\theta_2)}{2}$
In our simulations we are using $\theta_1 = 0$, $\theta_2 = 10$, so if $X \thicksim Uniform(0,10)$, $E(X) = 5$
To show that the running means of different simulations from a $Uniform$ distribution settle down to be close to the expectation, we can plot say 5 different groups of running means for sample sizes $1, \ldots, 1000$. We will use a line plot rather than plotting individual points.

Your task is to now do the same thing for some standard Cauchy running means.
To start with, do not put everything into a function, just put statements into the cell(s) below to:
Add more cells as you need them.



When you are happy that you are doing the right things, write a function, parameterised by the number of running means to do, that returns a list of running means. Try to make your own function rather than copying and changing the one we used for the $Uniform$: you will learn more by trying to do it yourself. Please call your function cauchyRunningMeans, so that (if you have done everything else right), you'll be able to use some code we will supply you with to plot the results.

Try checking your function by using it to create a small list of running means. Check that the function does not report an error and gives you the kind of list you expect.

When you think that your function is working correctly, try evaluating the cell below: this will put the plot of 5 groups of $Uniform(0,10)$ running means beside a plot of 5 groups of standard $Cauchy$ running means produced by your function (as usual, you are not expected to be able to produce plots like this one).

Remember that we know how to set the seed of the PRNG used by random() with set_random_seed? If we wanted our sampling functions to give repeatable samples, we could also pass the functions the seed to use. Try making a new version of uniformSample which has a parameter for a value to use as the random number generator seed. Call your new version uniformSampleSeeded to distinguish it from the original one.

Try out your new uniformSampleSeeded function: if you generate two samples using the same seed they should be exactly the same. You could try using a large sample and checking on sample statistics such as the mean, min, max, variance etc, rather than comparing small samples by eye.

You can also give parameters default values in Sage. Using a default value means that if no value is passed to the function for that parameter, the default value is used. Here is an example with a very simple function:

Please note that default parameter values for functions may come up in assessments. This is your chance to become familiar with them.
Note that parameters with default values need to come after parameters without default values when we define the function.
Now you can try the function  evaluate the following cells to see what you get:






Try making yet another version of the uniform sampler which takes a value to be used as a random number generator seed, but defaults to None if no value is supplied for that parameter. None is a special Python type.

Using set_random_seed(None) will mean that the random seed is actually reset to a new ('random') value. You can see this by testing what happens when you do this twice in succession and then check what seed is being used with initial_seed:


Do another version of the uniformSampleSeeded function with a default value for the seed of None.

Check your function again by testing with both when you supply a value for the seed and when you don't.





Optional You try
Only look at this if you have enough time and are interested in doing a little bit more. It is not vital to do this in order to pass the course.
We could use the samplers we have made to do a very simple simulation. Suppose the interarrival times, in minutes, of Orbiter buses at an Orbiter stop follows an $Exponential(\lambda = 0.1)$ distribution. Also suppose that this is quite a popular bus stop, and the arrival of people is very predictable: one new person will arrive in each whole minute. This means that the longer another bus takes in coming, the more people arrive to join the queue. Also suppose that the number of free seats available on any bus follows a $de\, Moivre(k=40)$ distribution, i.e, there are equally like to to be 1, or 2, or 3 ... or 40 spare seats. If there are more spare seats than people in the queue, everyone can get onto the bus and nobody is left waiting, but if there are not enough spare seats some people will be left waiting for the next bus. As they wait, more people arrive to join the queue....
This is not very realistic  we would want a better model for how many people arrive at the stop at least, and for the number of spare seats there will be on the bus. However, we are just using this as a simple example that you can do using the RVs you know how to simulate samples from.
Try to code this example yourself, using our suggested steps. We have put our version the code into a cell below, but you will get more out of this example by trying to do it yourself first.
Suggested steps:


Here is our code to do this simulation. Yours may be different  maybe it will be better!


We could do a visualisation of this, showing the number of people able to board the bus and the number of people left by the height of lines on the plot

You could try the effect on your simulation of changing the $Exponential$ parameter $\lambda$, or some of the other factors involved.










