Weak Law of Large Numbers
Suppose your classroom consists of \(300\) students and you want to know the average height of those \(300\) students. Now say you measure the heights of \(50\) students, and suppose that the average height of those \(50\) students is somewhat near the average height of all \(300\) students. Here all \(300\) students are the population and those \(50\) students are the sample, so the population size is \(300\) and the sample size \((n)\) is \(50\).
So now we have \(50\) observations \(X_1,X_2,\cdots,X_{50}\). These observations are random, the result of some unknown random process, so we call them random variables. These random variables come from a common random process, therefore they are identically distributed, and all of them are independent of each other, so we call them I.I.D. (Independent and Identically Distributed) random variables.
So the average height of those \(50\) students (the sample mean) is \(\displaystyle\overline{X}_{50}=\frac{X_1+X_2+\cdots+X_{50}}{50}\). What we actually strive for is the average over the total population (the average height of all \(300\) students): the true mean. Let's say that the true mean is \(\mu\left(=\mathbb{E}[X_i]\right)\).
Note
- The true mean \((\mu)\) is over the entire population. The true mean \((\mu)\) is not random, it's a number.
- The sample mean \((\overline{X}_n)\) is over the values observed during an experiment. The sample mean \((\overline{X}_n)\) is a random variable because \(X_1,\cdots,X_n\) are random.
So according to the Weak Law of Large Numbers, as we increase our sample size \((n)\), our sample mean goes toward the true mean in probability (this is what we referred to as Truth in our central dogma).
\[\overline{X}_n:=\frac{1}{n}\sum _{i=1}^ n X_ i \xrightarrow [n\to \infty ] {\mathbb{P}} \mu\]
- \(:=\quad\) this symbol means "by definition"
- \(\mathbb{P}\quad\) this symbol means "in probability"
Explanation
Let \(X_1,X_2,\dots,X_n\) be I.I.D. random variables with finite mean \(\mu\) and variance \(\sigma^2\), and let the sample mean be \(\displaystyle\overline{X}_n=\frac{X_1+\cdots+X_n}{n}\).
First, \(\mathbb{E}[\overline{X}_n]=\mu\):
\[\mathbb{E}[\overline{X}_n]=\mathbb{E}\left[\frac{X_1+\cdots+X_n}{n}\right]=\frac{\mathbb{E}[X_1]+\cdots+\mathbb{E}[X_n]}{n}=\frac{n\mu}{n}=\mu\]
Next, \(\displaystyle\text{Var}[\overline{X}_n]=\frac{\sigma^2}{n}\):
\[\text{Var}[\overline{X}_n]=\text{Var}\left[\frac{X_1+\cdots+X_n}{n}\right]=\frac{\text{Var}[X_1]+\cdots+\text{Var}[X_n]}{n^2}=\frac{n\sigma^2}{n^2}=\frac{\sigma^2}{n}\]
(the variance of the sum splits into a sum of variances because the \(X_i\) are independent).
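As a quick numeric sanity check, here is a minimal simulation sketch. All concrete values are assumptions chosen just for illustration: normally distributed heights with \(\mu=170\) cm and \(\sigma=10\) cm, and \(n=50\) as in our classroom example:

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma, n = 170.0, 10.0, 50   # hypothetical height distribution (cm)
trials = 100_000                 # repeat the n-student experiment many times

# Each row is one experiment: n i.i.d. observations X_1, ..., X_n
X = rng.normal(mu, sigma, size=(trials, n))
sample_means = X.mean(axis=1)    # one sample mean per experiment

print(sample_means.mean())   # ≈ 170.0          (E[X̄n] = μ)
print(sample_means.var())    # ≈ 100/50 = 2.0   (Var[X̄n] = σ²/n)
```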
We want to show that
\[\mathbb{P}\left(|\overline{X}_n - \mu| \geq \epsilon\right) \xrightarrow [n\to \infty ] {} 0\quad\forall\,\epsilon\gt 0\]
By Chebyshev's inequality,
\[\mathbb{P}\left(|\overline{X}_n - \mu| \geq \epsilon\right) \leq \frac{\text{Var}(\overline{X}_n)}{\epsilon^2}=\frac{\sigma^2}{n\epsilon^2}\xrightarrow [n\to \infty ] {} 0\quad\forall\,\epsilon\gt 0\]
So for any \(\epsilon\gt0\),
\[\mathbb{P}\left(|\overline{X}_n - \mu|\geq \epsilon\right) \xrightarrow [n\to \infty ]{} 0\]
This is convergence in probability. Take a very small number, like \(0.00001\). Convergence in probability says that if \(n\) is large enough, then it's highly unlikely for \(\overline{X}_n\) to be more than \(0.00001\) units away from \(\mu\). Or, to say it the other way: if \(n\) is large, then it's extremely likely that \(\overline{X}_n\) is extremely close to \(\mu\).
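Here is a small sketch comparing the empirical probability \(\mathbb{P}(|\overline{X}_n-\mu|\geq\epsilon)\) with the Chebyshev bound \(\sigma^2/(n\epsilon^2)\) as \(n\) grows. It reuses the same hypothetical height distribution as above, with an arbitrarily chosen \(\epsilon=1\) cm:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, eps = 170.0, 10.0, 1.0   # hypothetical heights, arbitrary epsilon
trials = 5_000                      # experiments per sample size

for n in (50, 500, 5000):
    # X̄n for `trials` independent experiments of size n
    sample_means = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)
    empirical = np.mean(np.abs(sample_means - mu) >= eps)
    bound = sigma**2 / (n * eps**2)   # Chebyshev upper bound σ²/(nε²)
    print(f"n={n:5d}  empirical ≈ {empirical:.4f}  Chebyshev bound = {bound:.4f}")
```

Both columns shrink toward \(0\). The bound is loose (for \(n=50\) it exceeds \(1\), so it says nothing there), but it is enough to prove the limit.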
Interpretation
For any \(\epsilon\gt0\) (a constant), the probability that the sample mean \((\overline{X}_n)\) falls away from the true mean \((\mu)\) by more than \(\epsilon\) goes to \(0\) as our sample size \((n)\to\infty\).
In our example above we have a population of \(300\) students; among those \(300\) students we randomly select \(50\) students and measure their heights \(X_1,\cdots,X_{50}\). If the true mean of all \(300\) students is \(\mu(=\mathbb{E}[X_i])\), then we can say that:
- The height of the \(i^{th}\) student is \(X_i = \mu + W_i\), where \(W_i\) is the measurement noise for the \(i^{th}\) student, and the Weak Law of Large Numbers tells us that as \(n\to\infty\), the average noise goes to \(0\) in probability (see the sketch below).

So our sample mean \((\overline{X}_n)\) is unlikely to be far from the true mean \((\mu)\).
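A minimal sketch of this noise-averaging view, assuming (purely for illustration) Gaussian measurement noise with standard deviation \(10\) cm around a hypothetical true mean of \(170\) cm:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 170.0   # hypothetical true mean height (cm)

for n in (50, 5_000, 500_000):
    W = rng.normal(0.0, 10.0, size=n)   # measurement noise W_i
    X = mu + W                          # X_i = μ + W_i
    print(f"n={n:7d}  average noise = {W.mean():+.4f}  X̄n = {X.mean():.4f}")
```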
So according to the Weak Law of Large Numbers, if we increase the number of students in our sample from \(n=50\) to something like \(n=100\), then we should get a better estimate of the true mean. There is also a Strong Law of Large Numbers.
Strong Law of Large Numbers:
\[\overline{X}_n:=\frac{1}{n}\sum _{i=1}^ n X_ i \xrightarrow [n\to \infty ] {\text{a.s.}} \mu\]
- \(:=\quad\) this symbol means "by definition"
- \(\text{a.s.}\quad\) it means "almost surely" (with probability \(1\))

Note: \(\text{a.s.}\) convergence implies convergence in \(\mathbb{P}\), so the Strong Law implies the Weak Law.
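Almost-sure convergence is a statement about a single realized sequence of observations, so a natural sketch is to follow the running mean along one long run of fair-coin flips (Heads coded as \(1\); the sequence length is an arbitrary stand-in for \(\infty\)):

```python
import numpy as np

rng = np.random.default_rng(0)

# One single long run of fair-coin flips (1 = Heads, 0 = Tails)
flips = rng.integers(0, 2, size=1_000_000)

# Running sample mean X̄n along this one realized path
running_mean = np.cumsum(flips) / np.arange(1, flips.size + 1)

for n in (10, 100, 10_000, 1_000_000):
    print(f"n={n:8d}  X̄n = {running_mean[n - 1]:.5f}")   # settles at 0.5
```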
OK, now we know that the Law of Large Numbers says: if we have a large enough sample size, then our estimator \(\overline{X}_n\) and the real parameter \(\mu\) are close, \(\overline{X}_n \xrightarrow [n\to \infty] {} \mu\). But how close? We don't know! We don't know how fast (at what rate) \(\overline{X}_n\) approaches \(\mu\). We can think of it as:
\[ \left|\overline{X}_n -\mu \right| \propto \frac{1}{f(n)} \]
where \(f(n)\) is an increasing function w.r.t. \(n\). As \(f(n)\) increases, \(\left|\overline{X}_n -\mu\right|\) decreases, so we want a function \(f(n)\) that increases rapidly w.r.t. \(n\). For example, \(\log(\log(n))\) increases very slowly, so functions like this are not useful. So what is the rate at which \(\overline{X}_n\) approaches \(\mu\)? The answer is hidden in the Central Limit Theorem.
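As a preview, here is a rough simulated (not derived) sketch of that rate for a fair coin: the typical error \(\mathbb{E}|\overline{X}_n-\mu|\) shrinks, while \(\sqrt{n}\cdot\mathbb{E}|\overline{X}_n-\mu|\) hovers around a constant, suggesting \(f(n)\approx\sqrt{n}\), which is exactly what the Central Limit Theorem makes precise:

```python
import numpy as np

rng = np.random.default_rng(0)
trials, p = 100_000, 0.5   # fair coin, so μ = 0.5

for n in (100, 10_000, 1_000_000):
    # total Heads in n flips ~ Binomial(n, p); dividing by n gives X̄n
    sample_means = rng.binomial(n, p, size=trials) / n
    err = np.abs(sample_means - p).mean()
    # err shrinks with n, but sqrt(n) * err stabilizes => rate ≈ 1/sqrt(n)
    print(f"n={n:9d}  E|X̄n−μ| ≈ {err:.6f}  √n·E|X̄n−μ| ≈ {np.sqrt(n) * err:.4f}")
```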
Gambler's Fallacy
The Gambler's Fallacy, also known as the Monte Carlo Fallacy, is a rather popular mistaken belief that:

"If an independent event is occurring more frequently (than it normally does), then it's less likely to occur in the future."

Note that this statement is not true; it's a mistaken belief.

Example:
Say you start flipping a fair coin \((p=0.5)\), and you observe that the first \(20\) tosses are \(\text{Heads}\). Then some might say:

"According to the Law of Large Numbers the average proportion of \(\text{Heads}\) shall be \(50\%\), and we got \(20\) \(\text{Heads}\) in a row, so there are high chances for our next toss to be \(\text{Tails}\)."

But the above statement is incorrect.
Even if you got \(1000\) \(\text{Heads}\) in a row, the probability of the next toss being \(\text{Tails}\) is still \(50\%\). But why exactly is the above statement false?
Because the Law of Large Numbers says that as \(n\to\infty\), our sample mean \(\to\) true mean. So even if we got \(1000\) \(\text{Heads}\) in a row, there are still infinitely many tosses left to bring our sample mean to the true mean; the first \(1000\) tosses have vanishing weight in the limit. Now let's see a small simulation.
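A minimal simulation sketch in Python (the setup is an assumption for illustration: \(1{,}000{,}000\) short sequences of fair-coin flips; we keep only those that begin with a streak of \(5\) \(\text{Heads}\) and check the very next toss):

```python
import numpy as np

rng = np.random.default_rng(0)
trials, streak = 1_000_000, 5

# Many independent sequences of streak+1 fair-coin flips (1 = Heads)
flips = rng.integers(0, 2, size=(trials, streak + 1))

# Keep only the sequences whose first 5 tosses were all Heads...
lucky = flips[flips[:, :streak].all(axis=1)]

# ...and look at the 6th toss: the streak changes nothing
print(f"{len(lucky)} streaks of {streak} Heads found")
print(f"P(Heads on the next toss) ≈ {lucky[:, streak].mean():.4f}")   # ≈ 0.5
```

The conditional frequency stays at about \(0.5\): a streak carries no information about the next independent toss.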
Recommended Watching
Chebyshev's Inequality? (by Prof. John Tsitsiklis)
Chebyshev's Inequality? (by Sir Ben Lambert)
The Weak Law of Large Numbers (by Prof. John Tsitsiklis)
Law of Large Numbers (by Sir Jeremy Jones)
The Gambler's Fallacy (by Sir Kevin deLaplante)
Also check out his Probability Fallacies playlist