The average statistic often deceives readers when the mean is severely affected by outliers in the data. One of the most repeated criticisms of data analysts is the careless use of averages, which frequently leads to dubious generalizations. Social scientists who refuse to use statistics in their analysis commonly attack this analytical tool by saying: "OK, so if you eat a chicken and I eat nothing, on average we have each had half a chicken." Nobody would dispute that such a conclusion is wrong and deceiving. However, that reasoning uses just half of the procedure statisticians and econometricians follow to determine whether a conclusion is statistically valid. Therefore, although it is evident that neither of the subjects in the example ate half a chicken, it is also true that the analysis is only half done.

## Outliers heavily affect the Mean statistic:

There is no question that every statistic has a limited interpretation. In the case of the mean (arithmetic average), outliers heavily affect the statistic and, very often, the analysis built on it. However, that does not mean arithmetic averages cannot support sound conclusions. Take, for instance, Real Earnings, a data series in labor economics that can easily mislead. Data on Real Earnings “are the estimated arithmetic averages (Means) of the hourly and weekly earnings of all jobs in the private non-farm sector in the economy”. Real Earnings are derived by the US Bureau of Labor Statistics from the Current Employment Statistics (CES) survey. An unwary reader might quickly ask whether Real Earnings are the average of the hourly earnings of all Americans working in the private non-farm sector, and an analyst might just as quickly respond that they are. Then, most of the time, the follow-up question reads as follows: does Real Earnings mean that, as a “typical” worker in the United States, I would make that average? The answer is no, it does not. This is precisely where statistical analysis starts to work.

### A Few Examples:

First. Various factors determine how much money workers make per hour. Educational attainment is perhaps the greatest determinant of earnings in the American economy. One can also think of geography as a factor in hourly income; even taxes could affect how much money a worker makes; age clearly influences income; and so on. Intuitively, it is possible to see that many exogenous factors may influence the variability of earnings and income.

Second. For the sake of discussion, let us say that neither education nor taxes affect the hourly income of workers. Even then, it is naïve to believe that a number of observations as low as the two people in the chicken joke could support any type of analysis, whether qualitative or quantitative. In both kinds of analysis, the number of observations matters a lot. In quantitative research, a common rule of thumb puts the minimum number of observations at around 30. Hence, sample size is crucial not only for debunking the joke cited above, but also for reaching valuable conclusions in both qualitative and quantitative social science research.
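The effect of sample size on the chicken joke can be sketched in a few lines. This is a minimal illustration, not a full inference procedure: it reports the mean together with its standard error (the sample standard deviation divided by the square root of the number of observations). The function name `mean_and_standard_error` and the 30-diner sample are hypothetical, introduced only for this example.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return the sample mean and the standard error of that mean."""
    n = len(values)
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(values)
    return mean, sd / math.sqrt(n)


# The joke: one person eats a whole chicken, the other eats nothing.
two_diners = [1, 0]

# Hypothetical larger sample: 15 diners eat a chicken, 15 eat nothing.
thirty_diners = [1] * 15 + [0] * 15

for label, sample in [("2 diners", two_diners), ("30 diners", thirty_diners)]:
    mean, se = mean_and_standard_error(sample)
    print(f"{label}: mean={mean:.2f}, standard error={se:.2f}")
```

With two observations the standard error is as large as the mean itself, so the "half a chicken" figure carries almost no information. With 30 observations the standard error shrinks considerably, although the mean still describes no individual diner.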

Real Earnings does not suffer from the small-sample pitfall, but it surely suffers from the first one, the many factors that drive earnings, and that certainly bounds the set of conclusions analysts can draw. As an average, Real Earnings has a numerator and a denominator, and the series covers all private nonfarm jobs. All types of jobs are included, regardless of age, educational attainment, location, taxes, and so on. In other words, the salaries of company CEOs may pull the statistic up. Conversely, minimum wage earners could drag the average down.
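A small sketch shows how a single very high earner can move the mean. The hourly figures below are made up for illustration and are not CES data; the median is printed alongside the mean for comparison.

```python
import statistics

# Hypothetical hourly earnings in dollars (illustrative, not CES data):
# nine ordinary jobs plus one CEO as the last entry.
hourly_earnings = [7.25, 12, 15, 18, 20, 22, 25, 30, 35, 2000]

mean_all = statistics.mean(hourly_earnings)
median_all = statistics.median(hourly_earnings)
# Drop the last entry (the CEO) to see how much one outlier moves the mean.
mean_without_ceo = statistics.mean(hourly_earnings[:-1])

print(f"mean, all jobs:        {mean_all:.2f}")
print(f"median, all jobs:      {median_all:.2f}")
print(f"mean, CEO excluded:    {mean_without_ceo:.2f}")
```

In this made-up sample, one observation out of ten raises the mean roughly tenfold, while the median stays close to what the other nine workers actually earn.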

### The Median statistic would do a better job sometimes:

At this point, it is clear that for some social science analyses other types of statistics may be more suitable. For instance, the median would help analysts better understand income. So why should one consider an average such as Real Earnings at all? The answer is that averages can be really useful as long as the analyst is thorough about the caveats: what the average really tells and, more importantly, what it does not tell.

Hence, changes in Real Earnings shed light on changes in the proportion of workers in high-wage and low-wage industries or occupations. High-wage salaries will tend, as in the CEO example above, to pull up the average without a substantial change in the number of hours worked. Conversely, as in the example of the minimum wage earners above, low-wage industries or occupations will tend to lower the statistic. Furthermore, when paired with other data, Real Earnings can be useful for noticing improvements in the use of technology. If the number of hours worked remains stagnant but both earnings and employment levels increase, the net effect might stem from improvements in technology, which in turn increase productivity. In other words, workers may work smarter rather than harder and longer. Lastly, Real Earnings averages can also inform analysts about the amount of overtime work.
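The composition effect can be illustrated with a two-group weighted average. This is a simplified sketch under stated assumptions: only two groups of workers, wages held fixed, and made-up shares and wage levels. The function `average_wage` is a hypothetical helper, not part of how the official series is computed.

```python
def average_wage(share_high, wage_high, wage_low):
    """Average hourly wage across two groups, weighted by employment share."""
    return share_high * wage_high + (1 - share_high) * wage_low


# Hypothetical hourly wages, identical in both periods.
WAGE_HIGH = 50.0
WAGE_LOW = 15.0

# Only the share of workers in the high-wage group changes.
before = average_wage(0.20, WAGE_HIGH, WAGE_LOW)
after = average_wage(0.30, WAGE_HIGH, WAGE_LOW)

print(f"average before: {before:.2f}")
print(f"average after:  {after:.2f}")
print(f"change:         {after - before:.2f}")
```

No individual wage changes between the two periods, yet the average rises because a larger share of workers sits in the high-wage group. A rising average can therefore signal a shift in the mix of jobs rather than a raise for anyone.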

So, arithmetic means such as Real Earnings can be thought-provoking. However, considerable caution is needed whenever economic assertions are drawn from them.
