Borel-Cantelli Lemma

Contents

What is limit of the events

Proof of Borel-Cantelli Lemma

The Borel-Cantelli Lemma is a fundamental result in probability theory which provides criteria to determine whether an infinite sequence of events will occur infinitely often or only finitely often.

Theorem (Borel-Cantelli Lemma)
If

\sum_{n=1}^{\infty} P(A_n) < \infty

then

P\{ A_n \, \text{i.o.}\} = 0

.

Motivation

The Borel-Cantelli lemma provides powerful insights into the behavior of infinite sequences of events. Let's explore this through three illuminating examples of coin tossing experiments.

Symmetric coin

Consider an infinite sequence of independent tosses of a fair coin, represented by random variables

X_1, X_2, \ldots

, where each toss has probability

P(\text{Head}) = \frac{1}{2}

. Will we see infinitely many heads, or will the sequence eventually contain only tails?

Example of Symmetric coin

While our intuition suggests we should see infinitely many heads, the Borel-Cantelli lemma provides mathematical certainty. Consider:

\begin{equation*}\sum_{k=1}^{\infty} P \left( X_k = \text{Head} \right) = \sum_{k=1}^{\infty} \frac{1}{2} = \infty\end{equation*}

Since the tosses are independent and this series diverges, the lemma confirms our intuition: with probability 1 (almost surely), we will observe infinitely many heads.

Biased coin 1

Now consider a more intriguing case where the probability of heads decreases with each toss:

P(X_n = \text{Head}) = \frac{1}{n}

. Despite this decreasing probability, the Borel-Cantelli lemma reveals that:

\begin{equation*}\sum_{k=1}^{\infty} P \left( X_k = \text{Head} \right) = \sum_{k=1}^{\infty} \frac{1}{k} = \infty\end{equation*}

This is the harmonic series, which diverges. Therefore, because the tosses are independent, we will still see infinitely many heads almost surely, even though the probability of heads becomes arbitrarily small!

Example of Biased coin 1

Biased coin 2

Finally, consider an even more extreme bias where

P(X_n = \text{Head}) = \frac{1}{n^2}

. Here, the Borel-Cantelli lemma reveals a fundamentally different behavior:

Example of Biased coin 2

\begin{equation*}\sum_{k=1}^{\infty} P \left( X_k = \text{Head} \right) = \sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6} < \infty\end{equation*}

Since this series converges, the Borel-Cantelli lemma tells us that, almost surely, there exists some finite

N

after which we will never see another head. The probability of heads decreases so rapidly that only finitely many heads will occur.
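The three coin experiments above can be explored with a small simulation. This is only a sketch: the function name `simulate_heads`, the horizon of 100,000 tosses, and the fixed seed are choices made for this illustration, and a finite run can only suggest, not prove, the almost-sure behavior described by the lemma.

```python
import random

def simulate_heads(prob, n_tosses, seed=0):
    """Toss independent coins n = 1..n_tosses, where toss n lands heads
    with probability prob(n); return the indices of the tosses that were heads."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    return [n for n in range(1, n_tosses + 1) if rng.random() < prob(n)]

# The three head probabilities from the examples above.
coins = {
    "fair (1/2)": lambda n: 0.5,
    "biased 1 (1/n)": lambda n: 1.0 / n,
    "biased 2 (1/n^2)": lambda n: 1.0 / n**2,
}

for name, prob in coins.items():
    heads = simulate_heads(prob, n_tosses=100_000)
    last = heads[-1] if heads else None
    print(f"{name}: {len(heads)} heads, last head at toss {last}")
```

In a typical run one would expect the number of heads to keep growing with the horizon for the first two coins (roughly like n/2 and log n), while for the third coin the heads stop appearing early.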

What is limit of the events

Before discussing the two Borel-Cantelli Lemmas, we first need to define some technical terms related to an infinite sequence of events. Consider the sequence of events

A_1, A_2, A_3, \dots

. We say that

A_n

happens infinitely often (i.o.) if

\begin{align*}\{ A_n \, \text{i.o.} \} &= \{ \omega : \forall m \in \mathbb{N}, \, \, \exists n(\omega) \geq m \,\, \text{such that} \, \, \omega \in A_{n(\omega)} \} \\&= \bigcap_{m=1}^{\infty} \bigcup_{n \geq m} A_n = \limsup_{n \rightarrow \infty} A_n = \lim_{m \rightarrow \infty} \bigcup_{n \geq m} A_n\end{align*}

and we say that

A_n

happens ultimately (ult.), that is, for all but finitely many

n

, if

\begin{align*}\{ A_n \, \text{ult.} \} &= \{ \omega : \exists m(\omega) \in \mathbb{N}, \,\, \forall n \geq m(\omega), \, \, \omega \in A_{n} \} \\&= \bigcup_{m=1}^{\infty} \bigcap_{n \geq m} A_n = \liminf_{n \rightarrow \infty} A_n = \lim_{m \rightarrow \infty} \bigcap_{n \geq m} A_n\end{align*}

Example
The concept of the limit of an infinite sequence of events can be intricate. To better understand it, consider a simple example that illustrates the definitions of the limit superior (

\limsup

) and the limit inferior (

\liminf

). Suppose we focus on two subsets of the interval

(0,1]

, namely

A_1 = (0, \frac{1}{2}]

and

A_2 = (\frac{1}{2}, 1]

. We define the whole sequence

A_n

, as follows:

\begin{equation*}A_n = \begin{cases} (0, \frac{1}{2}] , & n \ \text{odd} \\ (\frac{1}{2}, 1], & n \ \text{even} .\end{cases}\end{equation*}

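With this sequence, we can compute the upper limit

\limsup_{n \rightarrow \infty} A_n

, using the fact that every tail of the sequence contains both an odd and an even index:

\begin{equation*}\{ A_n \, \text{i.o.} \} = \bigcap_{m=1}^{\infty} \bigcup_{n \geq m} A_n = \bigcap_{m=1}^{\infty} (0, 1] = (0, 1]\end{equation*}

and the lower limit

\liminf_{n \rightarrow \infty} A_n

, using the fact that consecutive sets are disjoint:

\begin{equation*}\{ A_n \, \text{ult.} \} = \bigcup_{m=1}^{\infty} \bigcap_{n \geq m} A_n = \bigcup_{m=1}^{\infty} \emptyset = \emptyset .\end{equation*}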
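The alternating example can also be checked pointwise with a short script. This is a sketch under a stated assumption: the infinite unions and intersections are replaced by finite ones up to a hypothetical cutoff `horizon`, which is harmless here only because the sequence is periodic. The helper names are illustrative.

```python
from fractions import Fraction

def in_A(n, x):
    """Membership in A_n: (0, 1/2] for odd n, (1/2, 1] for even n."""
    half = Fraction(1, 2)
    return (0 < x <= half) if n % 2 == 1 else (half < x <= 1)

def in_tail_union(x, m, horizon):
    """x lies in the union of A_n for m <= n <= horizon (finite proxy for n >= m)."""
    return any(in_A(n, x) for n in range(m, horizon + 1))

def in_tail_intersection(x, m, horizon):
    """x lies in the intersection of A_n for m <= n <= horizon."""
    return all(in_A(n, x) for n in range(m, horizon + 1))

def in_limsup(x, horizon=50):
    # m stops at horizon - 1 so that every tail still contains two consecutive sets.
    return all(in_tail_union(x, m, horizon) for m in range(1, horizon))

def in_liminf(x, horizon=50):
    return any(in_tail_intersection(x, m, horizon) for m in range(1, horizon))

for x in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print(x, in_limsup(x), in_liminf(x))
```

Every test point of the interval is reported as belonging to the upper limit and to none of the tail intersections, in agreement with the two formulas above.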

Proof of Borel-Cantelli Lemma

Theorem (The first part of Borel-Cantelli Lemma)
If

\sum_{n=1}^{\infty} P(A_n) < \infty

then

P\{ A_n \, \text{i.o.}\} = 0

.

Proof. If we denote

G_m = \bigcup_{n \geq m} A_n

then we can see that

G_{m+1} \subset G_m

and

G_m \downarrow \limsup_{n \rightarrow \infty} A_n

. So, using the continuity of

P

from above,

\begin{equation*}P \{ A_n \, \text{i.o.} \} = P \left( \bigcap_{m=1}^{\infty} G_m \right) = \lim_{m \rightarrow \infty} P ( G_m ) = \lim_{m \rightarrow \infty} P \left( \bigcup_{n \geq m} A_n \right) \leq \lim_{m \rightarrow \infty} \sum_{n \geq m} P (A_n)\end{equation*}

where the last inequality is due to the sub-additivity of

P

. When

\sum_{n=1}^{\infty} P(A_n) < \infty

, the right-hand side is the limit of the tails of a convergent series, which is

0

, and we get

P\{ A_n \, \text{i.o.}\} = 0

.

\Box
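The key step of the proof is that the tail sums of a convergent series vanish. The sketch below illustrates this for the probabilities 1/n^2 from the second biased coin. The function name `tail_sum` is hypothetical, and the computation relies on the known value pi^2/6 of the full series.

```python
import math

def tail_sum(m):
    """sum_{n >= m} 1/n^2, computed as pi^2/6 minus the first m - 1 terms."""
    head = math.fsum(1.0 / n**2 for n in range(1, m))
    return math.pi**2 / 6 - head

# The tail sum bounds P(at least one head at toss m or later) from above.
for m in (1, 10, 100, 1000):
    print(m, tail_sum(m))
```

By comparison with an integral, the tail sum lies between 1/m and 1/(m - 1) for m >= 2, so the bound on the probability of any head after toss m decays like 1/m.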

The typical application of the first part of the Borel-Cantelli Lemma is to consider events

A_n

with

\sum_{n=1}^{\infty} P(A_n) < \infty

. Then, from the statement

P\{ A_n \, \text{i.o.} \} = 0

, we get by taking the complement

P( \{ A_n \, \text{i.o.} \}^c ) = 1

. The last probability can be rewritten as

P \{ \omega : \exists m(\omega) \in \mathbb{N}, \,\, \forall n \geq m(\omega), \, \, \omega \notin A_{n} \} = 1

, which means that, almost surely, there exists

m

such that none of the events

A_n

with

n \geq m

occur.

For example, if the probability of a head at the

n

-th coin toss equals

p_n = \frac{1}{n^2}

, then the series converges, and almost surely, starting from some

m

, we observe only tails.

Theorem (The second part of Borel-Cantelli Lemma)
If the events

A_n

are independent and

\sum_{n=1}^{\infty} P(A_n) = \infty

, then

P\{ A_n \, \text{i.o.}\} = 1

.

Figure: plot of $1 - x$ and $\exp(-x)$.


Proof. If

A_1, A_2, \ldots

are independent, so are the complements

A_1^c, A_2^c, \ldots

. Hence for

N \geq n

we have, using

1-x \leq \exp(-x)

,

\begin{align*}P \left( \bigcap^N_{k=n} A_k^c \right) &= \prod^N_{k=n} P(A_k^c) = \prod^N_{k=n} (1 - P(A_k)) \leq \prod^N_{k=n} \exp(-P(A_k)) \\&= \exp \left( - \sum^N_{k=n} P(A_k) \right) \rightarrow 0 \quad \text{as} \quad N \rightarrow \infty ,\end{align*}

since the series of probabilities diverges.

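Consequently, letting

N \rightarrow \infty

and using the continuity of

P

, the intersection

\bigcap^{\infty}_{k=n} A_k^c

has probability

0

, so that

\begin{equation*}P \left( \bigcup^{\infty}_{k=n} A_k \right) = 1\end{equation*}

for all

n

, and since

\bigcup^{\infty}_{k=n} A_k \downarrow \limsup A_n

it follows that

P(A_n \, \text{i.o.}) = 1

.

\Box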
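The product bound used in the proof can be checked numerically. The sketch below takes independent events with P(A_k) = 1/k, as in the first biased coin. The function names are illustrative, and `math.prod` assumes Python 3.8 or later. For this choice the product telescopes to (n - 1)/N, which gives an exact value to compare against.

```python
import math

def prob_no_event(p, n, N):
    """P(none of A_n, ..., A_N occurs) for independent events with P(A_k) = p(k)."""
    return math.prod(1.0 - p(k) for k in range(n, N + 1))

def exp_bound(p, n, N):
    """Upper bound exp(-sum_{k=n}^{N} P(A_k)) obtained from 1 - x <= exp(-x)."""
    return math.exp(-math.fsum(p(k) for k in range(n, N + 1)))

harmonic = lambda k: 1.0 / k  # head probability of the first biased coin

for N in (10, 100, 1000, 10_000):
    print(N, prob_no_event(harmonic, 2, N), exp_bound(harmonic, 2, N))
```

Both columns tend to 0 as N grows, so the probability of seeing no head at all after toss n vanishes, which is the statement used to conclude the proof.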

Applications

Proof of Strong Law of Large Numbers

The following example is Theorem 2.3.5, page 59, from [Durrett2019].

Theorem
Let

X_1, X_2, \ldots

be i.i.d. with

\mathbb{E}X_i = \mu

and

\mathbb{E}X_i^4 < \infty

. If

S_n = X_1 + \cdots + X_n

, then

S_n/n \to \mu

a.s.


Proof. By letting

X_i' = X_i - \mu

, we can suppose without loss of generality that

\mu = 0

. Now

\begin{equation*}\mathbb{E}S_n^4 = \mathbb{E}\left( \sum_{i=1}^n X_i \right)^4 = \mathbb{E} \sum_{1 \leq i,j,k,l \leq n} X_i X_j X_k X_l\end{equation*}

Terms in the sum of the form

\mathbb{E}(X_i^3 X_j)

,

\mathbb{E}(X_i^2 X_j X_k)

, and

\mathbb{E}(X_i X_j X_k X_l)

are

0

(if

i,j,k,l

are distinct), since by independence the expectation of the product is the product of the expectations, and in each case one of the factors has expectation 0. The only terms that do not vanish are those of the form

\mathbb{E}X_i^4

and

\mathbb{E}X_i^2 X_j^2 = (\mathbb{E}X_i^2)^2

. There are

n

and

3n(n - 1)

of these terms, respectively. In the second case, we can pick the two indices in

n(n - 1)/2

ways, and with the indices fixed, the term can arise in a total of six ways, according to which two of the four positions in the product are occupied by the smaller index:

\{ 1, 2\}, \{ 1, 3\}, \{ 1, 4\}, \{ 2, 3\}, \{ 2, 4\}

and

\{ 3, 4\}

. The last observation implies

\begin{equation*}\mathbb{E}S_n^4 = n\mathbb{E}X_1^4 + 3(n^2 - n)(\mathbb{E}X_1^2)^2 \leq Cn^2\end{equation*}

where

C < \infty

. Chebyshev’s inequality gives us

\begin{equation*}P \left( \frac{|S_n|}{n} > \epsilon \right) \leq \frac{\mathbb{E}(S_n^4)}{(n\epsilon)^4} \leq \frac{C}{n^2 \epsilon^4}\end{equation*}

Summing over

n

and using the Borel-Cantelli lemma gives

P(|S_n| > n\epsilon \text{ i.o.}) = 0

, that is, almost surely

\exists n_0, \, \forall n \geq n_0 \: \frac{|S_n|}{n} \leq \epsilon

. Since

\epsilon

is arbitrary, the proof is complete.

\Box
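The counting argument for the fourth moment can be verified exactly in a simple special case. The sketch below assumes X_i = +1 or -1 with probability 1/2 each (a choice made for this illustration, not part of the theorem), so that the mean is 0 and the second and fourth moments are both 1. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def fourth_moment_of_sum(n):
    """Exact E[S_n^4] for S_n = X_1 + ... + X_n with X_i = +/-1 equally likely,
    obtained by enumerating all 2^n sign patterns."""
    total = sum(sum(signs) ** 4 for signs in product((-1, 1), repeat=n))
    return Fraction(total, 2**n)

def counting_formula(n, m4=1, m2=1):
    """n * E[X^4] + 3 * n * (n - 1) * (E[X^2])^2 from the proof above."""
    return n * m4 + 3 * n * (n - 1) * m2**2

for n in range(1, 9):
    print(n, fourth_moment_of_sum(n), counting_formula(n))
```

For this distribution the formula reduces to 3n^2 - 2n, which is at most 3n^2, so the constant C = 3 works in the bound used before applying Chebyshev's inequality.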

References

[Durrett2019]
Rick Durrett. Probability: Theory and Examples. Fifth edition. 2019.