Recently I was finishing off an online interactive dashboard for a client. In various reports in the dashboard, I wanted to use a statistical metric called the median in addition to the average. The median of a dataset is the value that splits the data into two equal-sized groups: half the values are higher and half are lower. In fact, the median is often what people have in mind when they use the word average.
The problem is that the arithmetic average is often problematic in the world of fundraising, because the result tends not to match our intuitive expectation of a ‘number in the middle’. Just think about it for a moment: in fundraising, the majority of people give a little money, whereas a small handful of people are SUPER generous. Those few large gifts pull the arithmetic average upward, so it ends up well above the ‘number in the middle’. That number represents something alright, but it’s not going to give you a reasonable expectation of what people are donating. For example, let’s have a look at a graph I drafted up of the lifetime donor value of donors to a charity, based on how many months it’s been since those donors were acquired:
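To make the definition concrete, here is a minimal sketch in Python using the standard library. The gift amounts are made up purely for illustration and are not the client's data; the variable names are my own.

```python
from statistics import mean, median

# Hypothetical gift amounts in dollars (invented for illustration only):
# seven modest gifts and one very generous one.
gifts = [10, 15, 20, 25, 30, 40, 50, 5000]

avg = mean(gifts)    # arithmetic average
mid = median(gifts)  # value that splits the data in half

# Count how many gifts fall on each side of each summary metric.
below_median = sum(g < mid for g in gifts)
above_median = sum(g > mid for g in gifts)
below_avg = sum(g < avg for g in gifts)

print(f"average = {avg}, median = {mid}")
print(f"gifts below median: {below_median}, above median: {above_median}")
print(f"gifts below average: {below_avg} of {len(gifts)}")
```

In this toy list the median leaves four gifts on each side, while the single large gift drags the average above seven of the eight gifts.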
On this graph you can see what are termed ‘violins’ superimposed on clouds of dots. Both visual aspects are meant to give you an idea of how many data points/donors there are at each combination of months since acquisition and lifetime donor value. For example, you can see that in the bottom-most violin, most data points/donors have a lifetime donor value at or below $100. The next thing to explain is the coloured X’s. The green X marks the median within each violin, whereas the red X marks the average. Which X looks like the number in the middle? Hopefully you’ve noticed that it is indeed the green X! The red X’s sit quite a bit higher. If you’re looking for a way to adequately set your expectations as to how much donors are worth based on their lifetime on file, the average is very clearly an overestimate!
Just so that we’re crystal clear on how biased you can end up being by using the average, let’s forget about the graph for a moment and just look at the average vs. median numbers themselves in a table:
For each level of the ‘months since acquisition’ variable, you can easily see how much bigger the average is than the median. What’s more, I’ve included the percentile at which each summary metric falls. That means we can see what percentage of donors have a lifetime donor value lower than the average, and lower than the median, at each level of the ‘months since’ variable. Remember, for a summary metric to make sense as an average, we’re looking for a ‘number in the middle’. That means that about half the donors should be lower than our summary metric. Instead, what we see is that the arithmetic average is higher than somewhere around 80% of donor records at each level of the ‘months since’ variable. Those averages just aren’t very average! In contrast, the median does a fantastic job at being the number in the middle.
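If you want to run the same check on your own file, the percentile of a summary metric is simply the share of donors whose value falls below it. Here is a rough sketch of that calculation in Python. The data is synthetic: I am assuming a right-skewed (log-normal) distribution as a stand-in for lifetime donor value, so the numbers it prints will not match the table above, and the helper name share_below is my own.

```python
import random
from statistics import mean, median


def share_below(values, threshold):
    """Return the fraction of values strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)


# Synthetic, right-skewed "lifetime donor values" (NOT real donor data).
# The log-normal shape and its parameters are assumptions for illustration.
rng = random.Random(42)
values = [rng.lognormvariate(4.0, 1.5) for _ in range(10_000)]

pct_median = share_below(values, median(values))
pct_mean = share_below(values, mean(values))

print(f"share of donors below the median:  {pct_median:.0%}")
print(f"share of donors below the average: {pct_mean:.0%}")
```

To reproduce the per-group comparison, you would apply share_below separately to the donors in each ‘months since acquisition’ group.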
In conclusion, while the average is certainly an industry standard, it just doesn’t make sense if what you’re after is a number that sets your expectations about the typical donor. In giving data like this, it gets pulled upward by the most generous donors! Remember the median.