Skip to content

Strong Law of Large numbers

The Strong Law of Large Numbers (SLLN) is a fundamental result in probability theory that describes the long-term behavior of the average of a sequence of random variables.
The SLLN states that, under certain conditions, the sample average of a sequence of independent and identically distributed (i.i.d.) random variables converges almost surely to the expected value (mean) of those variables as the sample size goes to infinity.
Mathematically, let X1,X2,X_1, X_2, \ldots be a sequence of i.i.d. random variables with a finite first moment, i.e., EX1<\mathbb{E} |X_1| < \infty. The SLLN asserts that the sample average of these variables converges almost surely to the expected value:
X1+X2++Xnna. s.EX1  , as n.\begin{equation*}\frac{X_1 + X_2 + \dots + X_n}{n} \xrightarrow{\text{a. s.}} \mathbb{E} X_1 \;, \text{ as n} \rightarrow \infty.\end{equation*}
This means that as the number of observations nn increases, the sample average will almost surely converge to the true mean EX1\mathbb{E} X_1.
Fair coin flip experiment
Consider a fair coin flip experiment where we assign:
Xi={1,if heads,0,if tails.\begin{equation*}X_i= \begin{cases}1, & \text{if heads}, \\ 0, & \text{if tails} .\end{cases}\end{equation*}
The expected value is clearly E[Xi]=12\mathbb{E}[X_i] = \frac{1}{2}. But what happens when we perform this experiment repeatedly?
In our animation, we observe:
  • The horizontal line at y=0.5y = 0.5 represents E[Xi]\mathbb{E}[X_i]
  • The blue path shows Snn\frac{S_n}{n} as nn increases
  • Early fluctuations are larger due to the 1n\frac{1}{n} scaling
  • As nn \to \infty, the path stabilizes near E[Xi]=0.5\mathbb{E}[X_i] = 0.5
Calculating π\pi with Monte-Carlo
One of the most important applications of the SLLN is in the Monte Carlo method, which is used to numerically estimate quantities that are difficult or impossible to calculate analytically. The accuracy of Monte Carlo method depends on the convergence of the sample averages to the true values, which is guaranteed by the SLLN.
Let's define a random variable XX based on randomly chosen points (U1,U2)(U_1, U_2) in a 6×66 \times 6 square:
X={1,if U12+U2290,otherwise\begin{equation*}X = \begin{cases} 1, & \text{if } U_1^2 + U_2^2 \leq 9 \\ 0, & \text{otherwise} \end{cases}\end{equation*}
where U1,U2Uniform(3,3)U_1, U_2 \sim \text{Uniform}(-3, 3)
The expected value of XX is:
E[X]=P(U12+U229)=Area of CircleArea of Square=πr2(2r)2=π4\begin{equation*}\mathbb{E}[X] = \mathbb{P}(U_1^2 + U_2^2 \leq 9) = \frac{\text{Area of Circle}}{\text{Area of Square}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4}\end{equation*}
Let X1,,XnX_1, \ldots, X_n be i.i.d. copies of XX. By the Strong Law of Large Numbers:
1ni=1nXia.s.E[X]=π441ni=1nXia.s.π\begin{equation*}\frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{a.s.} \mathbb{E}[X] = \frac{\pi}{4} \quad \Longrightarrow \quad 4 \cdot \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{a.s.} \pi\end{equation*}

The foolowing example is Theorem 2.3.5, page 59 from [Durrett2019]

One of the earliest results in this direction is the following theorem, which can be easily proved with Chebyshev inequality and Borel-Cantelli lemma.
CANTELLI
Let X1,X2,X_1, X_2, \ldots be i.i.d. with EX14<\mathbb{E}X_1^4 < \infty. If Sn=X1++XnS_n = X_1 + \cdots + X_n, then
EX14<Snna.s.EX1\begin{equation*}\mathbb{E}X_1^4 < \infty \quad \Longrightarrow \quad \frac{S_n}{n} \xrightarrow{\text{a.s.}} \mathbb{E}X_1\end{equation*}
\quad Proof. By letting Xi=XiEX1X_i' = X_i - \mathbb{E}X_1, we can suppose without loss of generality that EX1=0\mathbb{E}X_1 = 0. Now
ESn4=E(i=1nXi)4=E1i,j,k,lnXiXjXkXl\begin{equation*}\mathbb{E}S_n^4 = \mathbb{E}\left( \sum_{i=1}^n X_i \right)^4 = \mathbb{E} \sum_{1 \leq i,j,k,l \leq n} X_i X_j X_k X_l\end{equation*}
Terms in the sum of the form E(Xi3Xj)\mathbb{E}(X_i^3 X_j), E(Xi2XjXk)\mathbb{E}(X_i^2 X_j X_k), and E(XiXjXkXl)\mathbb{E}(X_i X_j X_k X_l) are 00 (if i,j,k,li,j,k,l are distinct) since the expectation of the product is the product of the expectations because of the independence, and in each case one of the terms has expectation 0. The only terms that do not vanish are those of the form EXi4\mathbb{E}X_i'^4 and EXi2Xj2=(EXi2)2\mathbb{E}X_i'^2 X_j'^2 = (\mathbb{E}X_i'^2)^2. There are nn and 3n(n1)3n(n - 1) of these terms, respectively. In the second case, we can pick the two indices in n(n1)/2n(n - 1)/2 ways, and with the indices fixed, the term can arise in a total of six ways: {1,2},{1,3},{1,4},{2,3},{2,4}\{ 1, 2\}, \{ 1, 3\}, \{ 1, 4\}, \{ 2, 3\}, \{ 2, 4\} and {3,4}\{ 3, 4\}. The last observation implies
ESn4=nEXi4+3(n2n)(EXi2)2Cn2\begin{equation*}\mathbb{E}S_n^4 = n\mathbb{E}X_i^4 + 3(n^2 - n)(\mathbb{E}X_i^2)^2 \leq Cn^2\end{equation*}
where C<C < \infty. Chebyshev’s inequality gives us
P(Snn>ϵ)ChebyshevE(Sn4)(nϵ)4Cn2ϵ4Borel-Cantellin0,nn0  Snnϵ\begin{equation*}\mathbb{P}\left(\frac{|S_n|}{n} > \epsilon\right) \overset{\text{Chebyshev}}{\leq} \frac{\mathbb{E}(S_n^4)}{(n\epsilon)^4} \leq \frac{C}{n^2 \epsilon^4}\quad \overset{\text{Borel-Cantelli}}{\Longrightarrow} \quad\exists n_0, \, \forall n \geq n_0 \; \frac{|S_n|}{n} \leq \epsilon\end{equation*}
Since ϵ\epsilon is arbitrary, the proof is complete. \Box
In this section, we introduce the Toeplitz and Kronecker Lemmas, which play crucial roles in proving the Strong Law of Large Numbers with finite absolute first monent.
The Toeplitz Lemma provides conditions under which the averages of a sequence {xn}n1\{x_n\}_{n\geq 1} converge to a limit.
TOEPLITZ
Let {an}\{a_n\} be a sequence of nonnegative numbers, bn=i=1naib_n = \sum_{i=1}^n a_i, b1=a1>0b_1 = a_1 > 0, and bnb_n \nearrow \infty, nn \to \infty. Let {xn}n1\{x_n\}_{n\geq 1} be a sequence of numbers converging to xx. Then
xnx1bnj=1najxjx\begin{equation*}x_n \to x \quad \Longrightarrow \quad \frac{1}{b_n} \sum_{j=1}^n a_j x_j \to x\end{equation*}
In particular, if an=1,  n1a_n = 1, \; n \geq 1 and bn=nb_n = n then
an=1,n1 and bn=nx1++xnnx.\begin{equation*}a_n = 1, n \geq 1 \text{ and } b_n = n \quad \Longrightarrow \quad \frac{x_1 + \ldots + x_n}{n} \to x.\end{equation*}
\quad Proof. Let ε>0\varepsilon > 0, and let n0=n0(ε)n_0 = n_0(\varepsilon) be such that xnxε/2\left|x_n - x\right| \leq \varepsilon/2 for all nn0n \geq n_0. Choose n1>n0n_1 > n_0 such that
1bn1j=1n0xjx<ε2.\begin{equation*}\frac{1}{b_{n_1}} \sum_{j=1}^{n_0} \left| x_j - x \right| < \frac{\varepsilon}{2}.\end{equation*}
Then, for any n>n1n > n_1,
1bnj=1najxjx=1bnj=1najxjbnxbn=1bnj=1najxjj=1najxbn=1bnj=1najxjj=1najx\begin{equation*}\left|\frac{1}{b_n} \sum_{j=1}^n a_j x_j - x\right| = \left|\frac{1}{b_n} \sum_{j=1}^n a_j x_j - \frac{b_n x}{b_n} \right| = \left|\frac{1}{b_n} \sum_{j=1}^n a_j x_j - \frac{\sum_{j=1}^n a_j x}{b_n} \right| =\frac{1}{b_n} \left| \sum_{j=1}^n a_j x_j - \sum_{j=1}^n a_j x\right|\end{equation*}
Then using the triangle inequality and splitting sum ut to n0n_0
1bnj=1najxjx=1bnj=1n0ajxjx+1bnj=n0+1najxjxbn1bn1bn1j=1n0ajxjxε2+1bnj=n0+1najxjxε2ε2+bnbn0bnε2ε.\begin{align*}& \leq \frac{1}{b_n} \sum_{j=1}^n a_j \left|x_j - x\right| = \frac{1}{b_n} \sum_{j=1}^{n_0} a_j \left|x_j - x\right| + \frac{1}{b_n} \sum_{j=n_0 + 1}^n a_j \left|x_j - x\right| \leq \\&\stackrel{b_{n_1} \leq b_n }{\leq} \underbrace{\frac{1}{b_{n_1}} \sum_{j=1}^{n_0} a_j \left|x_j - x\right|}_{\leq \frac{\varepsilon}{2}} + \frac{1}{b_n} \sum_{j=n_0 + 1}^n a_j \underbrace{\left|x_j - x\right|}_{\leq \frac{\varepsilon}{2}} \leq \frac{\varepsilon}{2} + \frac{b_n - b_{n_0}}{b_n} \frac{\varepsilon}{2} \leq \varepsilon.\end{align*}
SERIES AND AVERAGES, KRONECKER
Let {bn}\{b_n\} be a sequence of positive increasing numbers, bnb_n \nearrow \infty as nn \to \infty, and let {xn}\{x_n\} be a sequence of numbers such that xn\sum x_n converges. Then
n=1xn converges 1bnj=1nbjxj0,n.\begin{equation*}\sum_{n=1}^{\infty} x_n \text{ converges }\Longrightarrow\frac{1}{b_n} \sum_{j=1}^n b_j x_j \to 0, \quad n \to \infty.\end{equation*}
\quad Proof. Let b0=0b_0 = 0, S0=0S_0 = 0, Sn=j=1nxjS_n = \sum_{j=1}^n x_j. Then from xj=SjSj1x_j = S_j - S_{j-1}
j=1nbjxj=j=1nbj(SjSj1)=bnSnb0S0j=1nSj1(bjbj1)aj1bnj=1nbjxj=Sn1bnj=1nSj1aj,\begin{equation*}\sum_{j=1}^n b_j x_j = \sum_{j=1}^n b_j (S_j - S_{j-1}) = b_n S_n - b_0 S_0 - \sum_{j=1}^n S_{j-1} \underbrace{(b_j - b_{j-1})}_{a_j}\quad \Longrightarrow \quad\frac{1}{b_n} \sum_{j=1}^n b_j x_j = S_n-\frac{1}{b_n} \sum_{j=1}^n S_{j-1} a_j,\end{equation*}
since, if SnxS_n \rightarrow x, then, by Toeplitz's lemma,
1bnj=1nSj1ajx1bnj=1nbjxj=Snx1bnj=1nSj1ajx0\begin{equation*}\frac{1}{b_n} \sum_{j=1}^n S_{j-1} a_j \rightarrow x\quad \Longrightarrow \quad \frac{1}{b_n} \sum_{j=1}^n b_j x_j =\underbrace{S_n}_{\to x}-\underbrace{\frac{1}{b_n} \sum_{j=1}^n S_{j-1} a_j}_{\to x} \rightarrow 0 \qquad \Box\end{equation*}
REMARK
If bn=n b_n = n , xn=yn/n x_n = y_n / n and (yn/n) \sum (y_n / n) converges, then
y1++ynn0,n.\begin{equation*}\frac{y_1 + \ldots + y_n}{n} \to 0, \quad n \to \infty.\end{equation*}

The foolowing proof is taken from [Shiryaev2] pages 16-17

Here we present the strongest version of the SLLN, which was proved by Kolmogorov and require the finitnes of the expectation of the absolute value.
STRONG LAW OF LARGE NUMBERS, KOLMOGOROV
Let X1,X2,X_1, X_2, \ldots be i.i.d. with EX1<\mathbb{E} | X_1 | < \infty. If Sn=X1++XnS_n = X_1 + \cdots + X_n, then
EX1<Snna.s.EX1\begin{equation*}\mathbb{E}| X_1 | < \infty \quad \Longrightarrow \quad \frac{S_n}{n} \xrightarrow{\text{a.s.}} \mathbb{E}X_1\end{equation*}
We break down the proof of the theorem in several steps. First, we consider the truncated sequence {Xn}\{X_n^{\prime}\}, noting that the convergence of Sn/nS_n/n is equivalent to the convergence of Sn/nS_n^{\prime}/n. Next, we apply Kronecker's Lemma and Kolmogorov's Two-Series Theorem to derive the final result.
\quad Proof. Let's suppose without a loss of generality EX1\mathbb{E}X_1 and define the truncate random variable XnX_n^{\prime}
Xn={Xn,Xnn,0,Xn>n.\begin{equation*}X_n^{\prime}=\begin{cases}X_n, & |X_n| \leq n, \\0, & |X_n| > n.\end{cases}\end{equation*}
Then
n=1P{XnXn}=n=1P{X1>n}0P{X1>t}dt=defEX<.\begin{equation*}\sum_{n=1}^{\infty} \mathbb{P}\left\{X_n^{\prime} \neq X_n\right\} =\sum_{n=1}^{\infty} \mathbb{P}\left\{|X_1|>n\right\} \leq \int_0^{\infty} \mathbb{P}\left\{|X_1|>t\right\} d t \stackrel{\text{def}}{=} \mathbb{E}|X|<\infty .\end{equation*}
Hence, the Borel-Cantelli lemma yields P{XnXn\mathbb{P}\left\{X_n^{\prime} \neq X_n\right. i.o. }=0\}=0, and so Xn=XnX_n^{\prime}=X_n for all but finitely many nNn \in \mathrm{N} a.s.
Because of Xn=XnX_n^{\prime}=X_n for all but finitely many nNn \in \mathrm{N} a.s., we have
X1+X2+Xnna.s.0X1+X2+Xnna.s.0\begin{equation*}\frac{X_1 + X_2 \cdots + X_n}{n} \xrightarrow{\text{a.s.}} 0\quad \Longleftrightarrow \quad\frac{X_1^{\prime} + X_2^{\prime} \cdots + X_n^{\prime}}{n} \xrightarrow{\text{a.s.}} 0\end{equation*}
Note that in general X^n0 \hat{X}_n \neq 0 , but
EX^n=EX^n1{X^n<n}=EX^11{X^1<n}EX1=0.\begin{equation*}\mathbb{E}\hat{X}_n = \mathbb{E}\hat{X}_n \mathbf{1}_{\{|\hat{X}_n| < n\}} = \mathbb{E}\hat{X}_1 \mathbf{1}_{\{|\hat{X}_1| < n\}} \to \mathbb{E}X_1 = 0.\end{equation*}
Hence, by Toeplitz Lemma, for an=1,  n1a_n = 1, \; n \geq 1 and bn=nb_n = n,
an=1,bn=n and EX^nn0Toeplitz1nk=1nEX^kn0.\begin{equation*}a_n = 1, b_n = n \text{ and } \mathbb{E}\hat{X}_n \xrightarrow{n\to\infty} 0\quad \overset{\text{Toeplitz}}{\Longrightarrow} \quad\frac{1}{n} \sum_{k=1}^n \mathbb{E}\hat{X}_k \xrightarrow{n\to\infty} 0.\end{equation*}
By Kronecker Lemma if
n=1Xnn converges a.s. ,bn=nKroneckerX1+X2+Xnna.s.0\begin{equation*}\sum_{n=1}^{\infty} \frac{X_n^{\prime}}{n} \text{ converges a.s. }, b_n = n \quad \overset{\text{Kronecker}}{\Longrightarrow} \quad \frac{X_1^{\prime} + X_2^{\prime} \cdots + X_n^{\prime}}{n} \xrightarrow{\text{a.s.}} 0\end{equation*}
Define the centered truncated variable X^n:=XnEXn\hat{X}_n := X_n^{\prime} - \mathbb{E}X_n^{\prime}. It therefore remains to show that
n=1Var(X^n)n2<.\begin{equation*}\sum_{n=1}^{\infty} \frac{\operatorname{Var}(\hat{X}_n)}{n^2} < \infty.\end{equation*}n=1Var(X^n)n2<Kolmogorov and Khinchin Theoremn=1X^nn converges a.s.\begin{equation*}\sum_{n=1}^{\infty} \frac{\operatorname{Var}(\hat{X}_n)}{n^2} < \infty\overset{\text{Kolmogorov and Khinchin Theorem}}{\Longrightarrow}\sum_{n=1}^{\infty}\frac{\hat{X}_n}{n} \text{ converges a.s.}\end{equation*}
We can estimate, where we use that {Xn}n1\{ X_n \}_{n\geq1} are i.i.d and XnnX_n^{\prime} \leq n
n=1Var(X^n)n2Var(X^n)E((Xn)2)n=1E(Xn2)n2=n=11n2E(Xn21{Xnn})=i.i.d.n=11n2k=1nE(X121{k1X1<k})=\begin{equation*}\sum_{n=1}^{\infty} \frac{\operatorname{Var}(\hat{X}_n)}{n^2}\overset{\operatorname{Var}(\hat{X}_n)\le \mathbb{E}((X_n')^2)}{\leq}\sum_{n=1}^{\infty} \frac{\mathbb{E}({X_n^{\prime}}^2)}{n^2} = \sum_{n=1}^{\infty} \frac{1}{n^2} \mathbb{E}\left(X_n^2 \mathbf{1}_{\{|X_n| \leq n\}}\right) \overset{\text{i.i.d.}}{=}\sum_{n=1}^{\infty} \frac{1}{n^2} \sum_{k=1}^n \mathbb{E}\left(X^2_{1} \mathbf{1}_{\{k-1 \leq |X_1| < k\}}\right) =\end{equation*}
We now interchange the order of summation, this is allowed because all terms are nonnegative:
=k=1E(X121{k1X1<k})n=k1n2n=k1n22k2k=11kE(X121{k1X1<k})X12X1k on {k1X1<k}2k=1E(X11{k1X1<k})=2EX1<.\begin{align*}= \sum_{k=1}^{\infty} \mathbb{E}\left(X^2_{1} \mathbf{1}_{\{k-1 \leq |X_1| < k\}}\right) \cdot \sum_{n=k}^{\infty} \frac{1}{n^2}&\overset{\sum_{n=k}^{\infty} \frac{1}{n^2} \leq \frac{2}{k}}{\leq}2 \sum_{k=1}^{\infty} \frac{1}{k} \mathbb{E}\left(X^2_{1} \mathbf{1}_{\{k-1 \leq |X_1| < k\}}\right) \leq \\&\overset{X_1^2\le |X_1|\cdot k \text{ on } \{k-1 \leq |X_1| < k\}}{\leq}2 \sum_{k=1}^{\infty} \mathbb{E}\left(|X_1| \mathbf{1}_{\{k-1 \leq |X_1| < k\}}\right)= 2\mathbb{E} |X_1| < \infty.\end{align*}