# When Averages Attack: For the Love of Means

What is not to love about Averages?
(Courtesy: ImageChef.com)

Americans love the average. Ask the average American on the street what they know of statistics, and they will probably say something about an average (the arithmetic mean). An average describes the central tendency of some data whose whole distribution of values we find it easier not to remember. Yet averages have a dark side, beyond sunny days at the baseball diamond figuring your favorite batter’s batting average. Let us look at statistics… when averages attack!

## Why We Love Averages

An average is a summary of some data’s central tendency that is easily remembered, spoken of, and calculated. We also trust it, in part because we learned that it describes the center of the data: an average is something we can reason about intuitively.

### Easily Remembered

Pete Rose, Philadelphia Phillies (1981 baseball card)

Remember one value instead of tens, hundreds, or thousands. In 1979, as a first baseman for the Philadelphia Phillies (my favorite baseball team), Pete Rose had 208 hits (H) in 628 at-bats (AB). You can choose to recall this data as a sequence of 1s and 0s; six hundred and twenty-eight of them, in fact. With the help of www.baseball-reference.com, you can find that it starts:

`0, 0, 1, 1, 1, 0, 0, 0, 0, 0, ...`

Many of us conveniently choose instead to remember one value that summarizes this data, the batting average (presumptuously abbreviated AVG). For Pete Rose that season it was a remarkable .331, refuting the notion that at 38 years old any individual’s performance must wane.

If you can remember this sequence of values, then you do gain something that you would not have from the AVG alone. These first 10 AB precisely describe Pete Rose’s streaky performance during the opening series that season at Busch Stadium against the St. Louis Cardinals, a team that tried to sign Pete Rose that off-season.
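As a rough sketch of how the 0/1 sequence collapses into a single number, the snippet below treats the batting average as the mean of hit/no-hit outcomes. The function name `batting_average` is purely illustrative; the figures are the ones quoted above.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Return hits per at-bat, i.e. the mean of a 0/1 sequence of outcomes."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return hits / at_bats


# First ten at-bats quoted in the text (1 = hit, 0 = no hit).
first_ten = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
early_avg = batting_average(sum(first_ten), len(first_ten))

# Full 1979 season totals quoted in the text: 208 hits in 628 at-bats.
season_avg = batting_average(208, 628)

print(f"first ten at-bats: {early_avg:.3f}")   # 0.300
print(f"full season:       {season_avg:.3f}")  # 0.331
```

The single season figure is far easier to carry around, but it says nothing about the order of the hits, which is exactly what the raw sequence preserves.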

### Spoken Of

With so many Americans facing the challenge of obesity, it is no wonder that speaking of our weights is a sensitive subject. As we meet with our physicians for a (hopefully) annual check-up, we are all weighed and our body weights recorded. It is not terribly uncommon for our weight to change modestly between these weigh-ins at the doctor’s office, producing a time series of data entries in our medical records like the following:

`240, 250, 245, 250, 260, 255, ...`

Now unexpectedly you are filling out some forms for a trip across the country because the airline has instituted surcharges for overweight passengers. Wouldn’t you know it, you forgot to pack your bathroom scale in your luggage! You want to give an accurate weight on the form, although you can’t measure it at the moment (it could very likely have changed) and don’t have immediate access to your medical records. What do you say?

What you do know is that over the past six years your average weight has been 250 pounds. Sometimes it has been ten pounds more, sometimes ten pounds less, and at this very moment you couldn’t know without a measurement that it’s actually 253 pounds, 3 ounces. Should there be an inquiry, at least you can supply the medical history that backs up your use of this average value summarizing your weight’s central tendency over the past six years.

### Calculated

Averages are easily computed using only the basic arithmetic we learned as children in school, which is one reason statisticians call them the “arithmetic mean.”
The simplest and most straightforward procedure for calculating an arithmetic mean is:

1. Add up each individual data value (datum) to produce a sum.
2. Divide this sum by the count of data values.

In the foregoing example of our average weight,

```
240 + 250 + 245 + 250 + 260 + 255     1500
---------------------------------  =  ----  =  250
                6                       6
```
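The two-step procedure can be sketched in a few lines of Python. The helper name `arithmetic_mean` and the variable `weights_lb` are illustrative choices, not anything from the original; the data are the six weigh-ins above.

```python
def arithmetic_mean(values):
    """Add up the values, then divide by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("cannot average an empty collection")
    total = sum(values)          # step 1: add up each datum
    return total / len(values)   # step 2: divide by the count


# The six recorded weights from the example, in pounds.
weights_lb = [240, 250, 245, 250, 260, 255]
mean_weight = arithmetic_mean(weights_lb)

print(mean_weight)  # 250.0
```

Python's standard library also provides `statistics.mean`, which performs the same calculation.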

Why do we calculate it this way? The linear nature of the arithmetic mean gives us this method of calculation, while reinforcing our subconscious notions of how averages work. For instance, since the average tells us where we believe the “middle” of the data is, the deviations above and below the average should cancel out to zero. Revisiting our weight calculation:

```
(240 - 250) + (250 - 250) + (245 - 250) + (250 - 250) + (260 - 250) + (255 - 250)
= (-10) + 0 + (-5) + 0 + 10 + 5
= (-15) + 15
= 0
```
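The cancellation is easy to check mechanically. This is only a sketch using the same six weights; the variable names are illustrative.

```python
# The six recorded weights from the example, in pounds.
weights_lb = [240, 250, 245, 250, 260, 255]
mean_weight = sum(weights_lb) / len(weights_lb)

# Deviation of each weigh-in from the mean.
deviations = [w - mean_weight for w in weights_lb]

print(deviations)       # [-10.0, 0.0, -5.0, 0.0, 10.0, 5.0]
print(sum(deviations))  # 0.0
```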

If we didn’t know the average beforehand (let’s call it x), this reasoning would lead us to use simple algebra to come up with the calculation:

```
(240 - x) + (250 - x) + (245 - x) + (250 - x) + (260 - x) + (255 - x) = 0
1500 - 6x = 0
1500 = 6x
250 = x
```

Can you see how this always yields the same arithmetic procedure we learned in school? That flowed from the premise we intuitively take for granted: that the average tells us about the center of our data, and any deviations from that center should cancel each other out in the end.
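For readers who prefer symbols, the same argument can be written for any n data values. The notation (x_1 through x_n for the data, x-bar for the average) is introduced here for illustration and does not appear in the original.

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x}) &= 0 \\
\sum_{i=1}^{n} x_i - n\,\bar{x} &= 0 \\
\bar{x} &= \frac{1}{n} \sum_{i=1}^{n} x_i
\end{align*}
```

Requiring the deviations to cancel is therefore equivalent to the add-then-divide procedure.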

Good morning, I am with the U.S. Census Bureau …
(Courtesy: The Wrong Pong, book by Steven Butler)

Let’s be clear: by “averages attack,” I don’t mean cases where the average includes some fraction open to a grisly misinterpretation. We’ve all heard statistics tossed around such as this one: in 2016 the U.S. Census found that the average number of children per married family is 1.89. We’re all likewise aware in interpreting this average that it does NOT mean families are suffering at the hands of some ferocious troll that comes into children’s bedrooms to steal one-ninth of a child for its breakfast snack. Instead, we’re comfortable reasoning that these nominal categories exist in the data:

• no-child families,
• one-child families,
• two-child families, and
• more-child families;

and that the “midpoint” of this distribution lands somewhere closer to the two-child families than the one-child families. This is called interpolation. It’s relatively common that an interpolated value lands between nominal categories (or discrete data), and we frequently map these onto a continuous data value when it makes sense to do so.

When you are dealing with linear relationships between categories of your data (families with one or fewer children, and families with two or more children), interpolation serves our understanding well. But you need to ask yourself whether this presumption of linearity always holds true.

You’re probably thinking, “Of course it does! It’s math, it’s algebra even, you proved it in the previous section, didn’t you?” Not at all. What I demonstrated algebraically was that if you accept as an axiom that a linear relationship exists between all of the data, and incidentally that the average sits at the midpoint of the range of data (the difference: high – low), then the deviations from this average net to zero.

## Weighted Averages: A Curveball For Samantha

Sometimes we want to calculate an average where some values (often the most recent) should be treated as having a greater impact. A ready example of this is found in how university students have their academic grades determined. Take for example the schedule of exams found in a typical university physics course syllabus:

| Date | Assessment | Weight (% of final grade) |
| --- | --- | --- |
| 22 February 2017 | Exam 1 – Waves | 20% |
| 29 March 2017 | Exam 2 – E & M | 20% |
| 01 May 2017 | Lab Notebook | 25% |
| 03 May 2017 | Cumulative Final | 35% |

Suppose the university’s starting softball shortstop, Samantha, is taking this General Physics II course, and she receives the following grades on her first two exams, lab notebook, and cumulative final:

`83, 88, 85, 70`

Your intuition probably expects that Samantha received a B in General Physics II, with an average somewhere in the lower to mid-80s. But look closely at the weighting each grade receives. You must take these into account when computing the weighted arithmetic mean, like so:

```
83 × 0.20 + 88 × 0.20 + 85 × 0.25 + 70 × 0.35
= 16.60 + 17.60 + 21.25 + 24.50
= 79.95
```

What a shame! Samantha falls just shy of the 80-point threshold she needs to earn a B. Instead, she earned a C. Unless she has a particularly forgiving professor, those are the breaks.
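Here is a minimal sketch of the weighted calculation next to the unweighted one our intuition tends to reach for. The function name `weighted_mean` is illustrative, and the weights are written as whole percentages (an assumption made here to keep the arithmetic tidy) rather than as decimals.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the total weight."""
    values, weights = list(values), list(weights)
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


grades = [83, 88, 85, 70]        # exam 1, exam 2, lab notebook, final
weights_pct = [20, 20, 25, 35]   # percent of final grade, from the syllabus

final_grade = weighted_mean(grades, weights_pct)
unweighted = sum(grades) / len(grades)

print(final_grade)  # 79.95
print(unweighted)   # 81.5
```

The unweighted mean of 81.5 is the "lower 80s" figure intuition suggests; the heavily weighted final exam is what pulls the actual grade below 80.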

### Diagnosing What Went Wrong

In this example, it was NOT as simple as merely looking at the deviations from the midpoint, although that may be what our brains told us subconsciously:

```
(83 - 79.95) + (88 - 79.95) + (85 - 79.95) + (70 - 79.95)
= 3.05 + 8.05 + 5.05 + (-9.95)
= 16.15 + (-9.95)
= 6.20
```

The good news is that the algebra can be adapted to work as before in this case, where the weights are simple constant factors. Do you see what adjustment must be made? Of course: we must multiply each deviation by its respective weighting factor.

```
(83 - 79.95)(0.20) + (88 - 79.95)(0.20) + (85 - 79.95)(0.25) + (70 - 79.95)(0.35)
= 3.05(0.20) + 8.05(0.20) + 5.05(0.25) + (-9.95)(0.35)
= 0.61 + 1.61 + 1.2625 + (-3.4825)
= 3.4825 + (-3.4825)
= 0
```

In doing so, the sum of all deviations from the weighted arithmetic mean again nets to zero. Reflect carefully on how your mind processed the list of Samantha’s grades at first glance, and contrast that with how you arrive at the correct answer when working through the arithmetic of multiplying each grade by its weight factor.
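Both deviation sums can be verified in a short sketch. It uses `fractions.Fraction` from the standard library so that the result comes out as an exact zero rather than a tiny floating-point remainder; that choice, like the variable names, is an assumption made for this illustration.

```python
from fractions import Fraction

grades = [83, 88, 85, 70]
# Weights from the syllabus, as exact fractions that sum to 1.
weights = [Fraction(20, 100), Fraction(20, 100),
           Fraction(25, 100), Fraction(35, 100)]

# Weighted mean (no division needed because the weights sum to 1).
weighted_avg = sum(g * w for g, w in zip(grades, weights))

# Plain deviations from the weighted mean do not cancel...
plain_sum = sum(g - weighted_avg for g in grades)

# ...but deviations scaled by their weights do.
weighted_sum = sum((g - weighted_avg) * w for g, w in zip(grades, weights))

print(float(weighted_avg))  # 79.95
print(float(plain_sum))     # 6.2
print(weighted_sum)         # 0
```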