Why is Linearity of Expectation so important?

By Abigail Rogers
$\begingroup$

I understand what linearity of expectation is. In short:

$E[\sum X_i] = \sum E[X_i] $

However, I don't quite see its significance. That is, in what scenario does it make a difference whether we compute $E[\sum X_i]$ or $\sum E[X_i]$?

$\endgroup$

3 Answers

$\begingroup$

I'll try to give you an example that shows why it is very useful. Suppose that the random variable $X$ follows the binomial distribution with parameters $n$ and $p$. Then the expected value of $X$ is given by $$ \operatorname EX=\sum_{k=0}^nk\binom{n}{k}p^k(1-p)^{n-k}. $$ But we know that $X=\sum_{j=1}^nY_j$, where $Y_1,\ldots,Y_n$ are independent and identically distributed Bernoulli random variables with parameter $p$, so $\operatorname EY_j=p$ for each $j$. Hence, $$ \operatorname EX=\operatorname E\biggl[\sum_{j=1}^nY_j\biggr]=\sum_{j=1}^n\operatorname EY_j=np. $$ So it's much easier to calculate the value of $\operatorname EX$ if we use the fact that $X=\sum_{j=1}^nY_j$.
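As a quick numerical check of the two routes, here is a minimal Python sketch. The function names and the values of `n` and `p` are illustrative assumptions, not part of the original answer. It evaluates the sum from the definition and compares it with $np$.

```python
from math import comb


def binomial_mean_direct(n: int, p: float) -> float:
    """Expected value from the definition: sum over k of k * P(X = k)."""
    return sum(k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))


def binomial_mean_linear(n: int, p: float) -> float:
    """Expected value via linearity: n Bernoulli(p) terms, each with mean p."""
    return n * p


# Illustrative parameters (assumed for this example).
n, p = 20, 0.3
print(binomial_mean_direct(n, p), binomial_mean_linear(n, p))
```

The two printed values should agree up to floating-point rounding, but the second one needs no summation at all.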

$\endgroup$ $\begingroup$

It is very convenient that (more general) measures can be viewed as linear functionals on suitable functions, in the sense that $\mu(f+g)=\mu(f)+\mu(g)$ and $\mu(cf)=c\mu(f)$, where $\mu(f)$ denotes the integral of $f$ with respect to the measure $\mu$.

One example in the smaller context of expectation: it makes it easy to find the expectation of the binomial distribution. If $X$ is binomially distributed with parameters $n$ and $p$, then we can write $$X=X_1+\cdots+X_n$$ where the $X_i$ are iid and Bernoulli-$p$ distributed. That means that $X_i=1$ with probability $p$ and $X_i=0$ with probability $1-p$. It is easy to find that $\mathbb EX_i=p$, and with the linearity of expectation we find: $$\mathbb EX=\mathbb EX_1+\cdots+\mathbb EX_n=p+\cdots+p=np.$$

That is much more efficient and elegant than applying $$\mathbb EX=\sum_{k=1}^nk\binom{n}{k}p^k(1-p)^{n-k}$$
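
For comparison, the direct sum can also be evaluated by hand. A sketch of one standard route uses the identity $k\binom{n}{k}=n\binom{n-1}{k-1}$ and the substitution $j=k-1$: $$\sum_{k=1}^nk\binom{n}{k}p^k(1-p)^{n-k}=np\sum_{j=0}^{n-1}\binom{n-1}{j}p^{j}(1-p)^{n-1-j}=np\,\bigl(p+(1-p)\bigr)^{n-1}=np.$$ It gives the same answer, but it needs a combinatorial identity and the binomial theorem, whereas linearity needs neither.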

$\endgroup$ $\begingroup$

As already said, "significance" is fairly subjective. But if you want an application of that property, here is one:

Let $X_1, X_2, \ldots, X_n$ be random variables representing losses, with means $\mu_1, \mu_2, \ldots, \mu_n$. We want to pool the losses together, so we set $S_n = \sum_{i=1}^n X_i$ and look for the expected pooled loss. By linearity, $E[S_n] = \sum_{i=1}^n \mu_i$. More concretely, say we know that the $X_i$'s are independent and all distributed $\sim$Exp($\lambda$), so that each has mean $1/\lambda$. Then $E[S_n] = \sum_{i=1}^n \mu_i = \sum_{i=1}^n 1/\lambda = \frac{n}{\lambda}$.

Note that without this property, you would have to find the density of $S_n$ as the convolution of the densities of the $X_i$'s and then integrate, i.e. $E[S_n] = \int_{-\infty}^\infty s\cdot f_{S}(s)\,ds$, where $f_S(s)$ is the density of $S_n = X_1+X_2+\cdots+X_n$. That is not always straightforward or easy to compute.
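
To see the pooled-loss result numerically, here is a small Monte Carlo sketch in Python. The function name, the seed, and the values of `n`, `lam`, and the trial count are assumptions chosen for illustration. Each Exp($\lambda$) loss has mean $1/\lambda$, so the sum of $n$ of them should average close to $n/\lambda$. The `shared=True` branch is an added illustration that the same mean appears even when the losses are fully dependent, since linearity of expectation does not require independence.

```python
import random


def mean_pooled_loss(n, lam, trials, rng, shared=False):
    """Monte Carlo estimate of E[S_n] for n Exp(lam) losses.

    shared=False: the n losses are drawn independently.
    shared=True:  every loss equals the same single draw (fully dependent).
    """
    total = 0.0
    for _ in range(trials):
        if shared:
            total += n * rng.expovariate(lam)
        else:
            total += sum(rng.expovariate(lam) for _ in range(n))
    return total / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
n, lam = 5, 2.0         # illustrative parameters (assumed)
independent = mean_pooled_loss(n, lam, 200_000, rng)
dependent = mean_pooled_loss(n, lam, 200_000, rng, shared=True)
print(independent, dependent, n / lam)
```

Both estimates should land near $n/\lambda$, up to sampling noise. The distribution of $S_n$ differs between the two cases, but its expected value does not.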

$\endgroup$
