By Richard Bleil (and his new editor)
Let’s talk about Sadistics, shall we? (Editor’s note: Bleil, do you have to start so soon? Take this out; you know you mean “statistics”. I’m in no mood for your puns.)
The term “average” really has no meaning, mathematically, or if it does, it’s a generic term for three more specific terms: the “mean”, the “median” and the “mode”. I thought it might be informative to walk through these. (Editor’s note: and a good cure for insomnia, no doubt.)
When one says “average”, typically they mean the “mean”, wherein “mean” is more meaningful. (Editor’s note: Bleil…that’s twice now. I’m serious; knock it off.) The mean is the sum of all data points, divided by the number of data points. It’s really the “central tendency”, meaning that it is close to what you can expect from a similar data set. If you were to roll a standard six-sided die many, many times and record the value of each roll, you should get a mean value very close to 3.5. You might notice that this is not even a possible roll; but you are as likely to roll three, for example, as four, so the mean is somewhere in the middle.
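For anyone who would rather check the die than trust me, here is a minimal Python sketch of that claim. The seed and the number of rolls are arbitrary choices of mine, and the simulated value will only come close to 3.5, never reliably land on it.

```python
from fractions import Fraction
import random

faces = [1, 2, 3, 4, 5, 6]

# Exact mean of a fair die: every face has probability 1/6.
exact_mean = sum(Fraction(face, 6) for face in faces)

# Simulated mean: roll many times, add up the rolls, divide by the count.
rng = random.Random(0)  # fixed seed so the run is repeatable (arbitrary choice)
rolls = [rng.choice(faces) for _ in range(100_000)]
simulated_mean = sum(rolls) / len(rolls)

print(float(exact_mean))  # 3.5
print(simulated_mean)     # close to 3.5, but not exactly 3.5
```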
The “median” is the central point of the data set. Note that in our six-sided die example, the median is also 3.5. To find the median, take all of the data points and arrange them from lowest to highest or highest to lowest, and the median is exactly the middle point. So if there were five points, it would be the third, the one right in the middle of the two extremes. In our example, there are six possible values: 1, 2, 3, 4, 5, 6. Since this is an even number, the median is halfway between the two in the middle, that is, halfway between 3 and 4. So the median is also 3.5.
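The sort-and-pick-the-middle recipe can be sketched in a few lines of Python. The function name `median` and the five-point data set are made up for illustration; Python's standard `statistics.median` does the same job.

```python
def median(values):
    """Return the middle value of the data, averaging the two middle
    values when the number of points is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

five_points = [9, 2, 7, 4, 5]    # hypothetical data, odd count
die_faces = [1, 2, 3, 4, 5, 6]   # the six faces from the example

print(median(five_points))  # 5
print(median(die_faces))    # 3.5
```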
And the ode to mode. (Editor’s note: Oh, God, they’re getting worse!) The “mode”, then, is the most frequently occurring value in the data set. There is no mode in our die; each number is as likely as any other. But if most people eat three pieces of pizza (Editor’s note: Mmmmm…piiiiiizzaaaaaaahhhh!), that becomes the mode. Fewest people might eat one piece, followed by two, and three is most, but some people might eat four, and there’s always that one guy who eats nine. Nine pieces of pizza. It’s ridiculous. I mean, seriously, what’s that all about? (Editor’s note: Bleil, please get back on the subject at hand.) Fine, then, let’s move on.
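Here is a small Python sketch of the pizza party. The head counts are invented purely to match the shape described above (few people at one piece, most at three, one guy at nine), and the helper name `modes` is my own.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, in sorted order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)

# Hypothetical party: 2 people eat one piece, 5 eat two, 9 eat three,
# 3 eat four, and that one guy eats nine.
pizza = [1] * 2 + [2] * 5 + [3] * 9 + [4] * 3 + [9]

print(modes(pizza))               # [3]
print(modes([1, 2, 3, 4, 5, 6]))  # [1, 2, 3, 4, 5, 6]: every face ties, so no single mode
```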
There are really two main types of truly random distributions. One is a flat distribution, like rolling dice. There is no reason to expect one value to be more heavily weighted than the other. The other natural type of distribution is referred to as a “Gaussian distribution”. This is the so-called “bell curve” that dingalings often talk about. (Editor’s note: Bleil…seriously? A “dingaling” joke???) For example, according to one source, the average weight of an American man is 197.9 lbs. This seems a bit high to me, but assuming that it is correct, we all know that some men will be heavier, and others will be lighter, but the weights are probably spread fairly evenly on either side of that value.
In a truly unbiased and true distribution, the three “averages”, the mean, the median and the mode, are very close to the same number. However, if there is an influence on the distribution, these three numbers can be quite different. For example, a “weighted die” might show a distribution with all possible numbers, but the mode and the median will be quite different (the median of the six faces is still 3.5, in our example, but the mode will be the weighted value since it will come up most frequently, and the mean will be somewhere between the two). Much like my love-life is skewed towards zero. (Editor’s note: there’s a good reason for that, Bleil. Wanna take a guess as to what it might be?)
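To put numbers on a weighted die, here is a rough Python sketch. The loading is entirely hypothetical: I assume the six comes up twice as often as any other face, which I model by listing it twice. The sketch also prints the median of the rolls themselves, which shifts along with the mean.

```python
import statistics

faces = [1, 2, 3, 4, 5, 6]

# Hypothetical loading: the six is twice as likely as any other face,
# modeled by listing it twice among seven equally likely outcomes.
outcomes = faces + [6]

mean = statistics.mean(outcomes)
mode = statistics.mode(outcomes)
median_of_faces = statistics.median(faces)
median_of_rolls = statistics.median(outcomes)

print(round(mean, 3))   # 3.857
print(mode)             # 6
print(median_of_faces)  # 3.5
print(median_of_rolls)  # 4
```

With this particular loading the mean (about 3.86) sits between the midpoint of the faces (3.5) and the mode (6).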
As an example, I’ve been trying to find the average salary for Americans. (Editor’s note: Bleil, do you mean “mean”, “median” or “mode”?) It’s actually an incredibly elusive piece of information to find. The closest I’ve found is from the US Census Bureau (Editor’s note: as if we can trust THEM!), which says that the “median household income is $56,515”. This means that half of the households in the US make less than this, and half make more.
It’s hard to find, but it looks to me like the mode for annual income is roughly $18,000. Roughly 6% of US households make about this amount. Roughly 20%, or one in five, make $18,000 or less, and notice that this is household income, not individual. Currently, the federal minimum wage is $7.25 an hour, which amounts to around $15,000 a year for full-time work. So why wouldn’t the government want to boast that the most common household income is only about $3,000 over minimum wage? (Editor’s note: gee, I wonder!)
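The minimum-wage arithmetic is easy to check in Python. The 40-hour week and 52 paid weeks are my assumptions about what “full time” means; the $7.25 and $18,000 figures are the ones quoted above.

```python
hourly_wage = 7.25     # federal minimum wage, dollars per hour
hours_per_week = 40    # assumption: a full-time week
weeks_per_year = 52    # assumption: paid every week of the year

annual_minimum = hourly_wage * hours_per_week * weeks_per_year
mode_income = 18_000   # rough mode of household income, from the text

print(annual_minimum)                # 15080.0
print(mode_income - annual_minimum)  # 2920.0
```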
The total US income is around $13 trillion. With roughly 125 million households, that means that the mean household income is about $104,000. So, why wouldn’t the US want to report that the mean household income is $104,000?
Look at the disparate values of the mean ($104,000), median ($56,500) and mode ($18,000) incomes of US households. The most typical income is less than half of the median income, and the median income is only a little more than half of the mean. So is it possible that the government is telling a lie by reporting the median value? (Editor’s note: OUR government, dishonest? Naaaawwwww!) In a way, yes. This disparity between mean, median and mode probably reflects the current concern about income disparity. What’s truly interesting (Editor’s note: subjective!) is how with statistics you can be very honest, and misleading at the same time. Reporting the median means that half of the population will be happy knowing they are making as much as or more than half of the households in the nation, and the number increases when you add the people happy that they are near that value. How happy would people be if 80% of them were suddenly aware that they are below the mean income? Or how would people feel if they realized that the most common household income barely beats the federal minimum wage? Choosing the value to report depends on the goal of the reporter. Read these numbers with care.
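If you would like to read the numbers with care yourself, here is a short Python sketch that rebuilds the mean from the totals and compares the three “averages”. Every figure is one quoted in this post; none of them has been independently checked here.

```python
total_income = 13_000_000_000_000  # about $13 trillion, from the text
households = 125_000_000           # about 125 million households, from the text

mean_income = total_income / households
median_income = 56_515             # Census Bureau figure quoted in the text
mode_income = 18_000               # rough estimate from the text

print(mean_income)                            # 104000.0
print(round(mode_income / median_income, 2))  # 0.32
print(round(median_income / mean_income, 2))  # 0.54
```

So the mode is about a third of the median, and the median is a bit over half of the mean.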