The mean of a sum is the sum of the means

By John Peck
$\begingroup$

Transcription:

The mean has good mathematical properties. The mean of a sum is the sum of the means. For example, if $y$ is total income, $u$ is "earned income" (wages and salaries), $v$ is "unearned income" (interest, dividends, rents), and $w$ is "other income" (social security benefits and pensions, etc.). Clearly, a person's total income is the sum of the incomes he or she receives from each source $y_i = u_i + v_i + w_i$. Then $$ \overline{y} = \overline{u} + \overline{v} + \overline{w}. $$ So it doesn't matter if we take the means from each income source and then add them together to find the mean total income, or add each individual's incomes from all sources to get his/her total income and then take the mean of that. We get the same value either way.

I've been trying to prove this, but it doesn't make sense to me.

For example: $$ \frac{3 + 4 + 2}{3} = 3 $$ $$ \frac{6 + 14}{2} = 10 $$ $$ 3 + 10 \neq \frac{9 + 20}{2} $$

Here $ 3 + 10 $ is the sum of the means, and $ \frac{9 + 20}{2} $ is the mean of the sums, which are $3+4+2=9$ and $6+14=20$.

$\endgroup$

3 Answers

$\begingroup$

Suppose you have observations on income from work $w_i$ and income from benefits $b_i$ for a population of $n$ people.

The text is simply saying that

$$\frac{1}{n}\sum_{i=1}^nw_i+\frac{1}{n}\sum_{i=1}^nb_i=\frac{1}{n}\sum_{i=1}^n(w_i+b_i)$$

where the left-hand side is the sum of the mean work income and the mean benefit income, while the right-hand side is the mean of the sum of work income and benefit income.
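One way to see why the two sides agree, as a short sketch that uses only the facts that a finite sum can be regrouped and that the constant $\frac{1}{n}$ factors out:

$$\frac{1}{n}\sum_{i=1}^n(w_i+b_i)=\frac{1}{n}\left(\sum_{i=1}^n w_i+\sum_{i=1}^n b_i\right)=\frac{1}{n}\sum_{i=1}^n w_i+\frac{1}{n}\sum_{i=1}^n b_i.$$

The first step pairs $w_i$ with $b_i$ for the same person $i$, which is why both lists must run over the same $n$ people.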

$\endgroup$ $\begingroup$

In your example, you have $u_1, u_2, u_3$ and $v_1, v_2$, and you have correctly shown that $$ \text{mean}(u_1,u_2,u_3) + \text{mean}(v_1,v_2) $$ is not necessarily equal to $$ \text{mean}(u_1 + u_2 + u_3, v_1 + v_2), $$ so in that sense you are exactly correct.

However, this is not what the statement was intended to express. What is intended is that if you have two (or more) lists with the same number of elements, and you take the mean of each list and sum them, that will be the same as summing the corresponding elements and then taking the mean. So if we have lists $u_1, u_2, u_3$ and $v_1, v_2, v_3$, it is saying that $$ \text{mean}(u_1 + u_2 + u_3, v_1 + v_2 + v_3) = \text{mean}(u_1,v_1) + \text{mean}(u_2,v_2) + \text{mean}(u_3,v_3). $$ Notice how in the phrase "sum of the means", the individual means must use elements of the same index: we take the mean of $u_1, v_1$ and the mean of $u_2, v_2$, for example, rather than the mean of $u_1, v_1, v_2$ or $u_1, u_2$ or anything else.
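As a quick numeric check of this distinction, here is a minimal Python sketch. The numbers are those from the question, except that the third entry of `v` (10) is a made-up addition so that both lists have three elements. `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction
from statistics import mean

# Lists from the question; the last entry of v is a hypothetical addition
# so that u and v have the same length.
u = [Fraction(3), Fraction(4), Fraction(2)]
v = [Fraction(6), Fraction(14), Fraction(10)]

# Sum of the means of the two lists.
sum_of_means = mean(u) + mean(v)

# Mean of the element-wise sums u_i + v_i.
mean_of_sums = mean([ui + vi for ui, vi in zip(u, v)])

# The identity as written above: mean of the two list totals
# versus the sum of the pairwise means mean(u_i, v_i).
mean_of_totals = mean([sum(u), sum(v)])
sum_of_pair_means = sum(mean([ui, vi]) for ui, vi in zip(u, v))

# The question's original attempt: lists of unequal length.
v_short = v[:2]
unequal_lhs = mean(u) + mean(v_short)
unequal_rhs = mean([sum(u), sum(v_short)])

print(sum_of_means == mean_of_sums)         # True
print(mean_of_totals == sum_of_pair_means)  # True
print(unequal_lhs == unequal_rhs)           # False
```

With equal-length lists both forms of the identity hold; with the two-element `v_short` the comparison fails, which matches the counterexample in the question.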

$\endgroup$ $\begingroup$

The statement is not precise enough. You didn't say at what mathematical level it is intended to be. The most general formulation is, if you have a number of random variables defined on the same measure space and they all have finite averages (expected values), then their sum also has finite average and the "average of the sum equals the sum of the averages" holds. In your counter-example the two "random variables" are defined over different sets, let alone different measure spaces, so the claim is not intended to cover that case.
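To make the general formulation concrete, here is a small Python sketch on a finite probability space. The space (one roll of a fair die) and the two variables `X` and `Y` are illustrative assumptions, not taken from the original text. `Y` is deliberately dependent on `X`, since the identity does not require independence, only that both variables live on the same space.

```python
from fractions import Fraction

# Hypothetical finite probability space: one roll of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
prob = {w: Fraction(1, 6) for w in outcomes}

# Two random variables on the SAME space; Y depends on X by construction.
def X(w):
    return w

def Y(w):
    return w * w

def expectation(f):
    """Expected value of f as a probability-weighted sum over all outcomes."""
    return sum(prob[w] * f(w) for w in outcomes)

# Average of the sum versus sum of the averages.
e_sum = expectation(lambda w: X(w) + Y(w))
sum_of_e = expectation(X) + expectation(Y)

print(e_sum, sum_of_e)  # 56/3 56/3
```

Both quantities are sums over the same set of outcomes with the same weights, so they can be regrouped term by term. In the question's counterexample there is no common set of outcomes to sum over.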

I agree that mathematical texts should be more careful in their formulations; otherwise they are bound to cause confusion. Good question!

$\endgroup$
