During the Second World War, America faced a serious problem: the Nazis were too good at shooting down American warplanes. One air force captain had a plan: inspect every plane that returned to base for damage, and record it. Soon a pattern emerged: almost all the bullet holes were on the wings and around the cockpit. The seemingly logical conclusion was to reinforce those areas with heavier metal in the hope that fewer planes would be downed.
Fortunately, the evidence was also shown to Abraham Wald, a leading American statistician based in New York, who immediately exposed the error. The patterns came only from planes that had returned to base. The planes that had been downed had been neither inspected nor recorded. Wald’s suggestion was thus exactly the opposite: strengthen those parts of the plane that had no bullet holes, because the planes that were shot in those places never returned to base.
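Wald’s reasoning is easy to check with a toy simulation. The sketch below is only an illustration under made-up assumptions: the four sections, the chance that a single hit in each one downs the plane, and the names `LETHALITY`, `simulate` and `shares` are all hypothetical, not historical data. Every plane takes a few hits spread evenly across the sections, and we then compare where the holes are on all planes with where they are on the planes that made it home.

```python
import random

# Hypothetical sections and the ASSUMED chance that one hit there downs the plane.
# These figures are invented for illustration; they are not wartime data.
LETHALITY = {"wings": 0.05, "cockpit": 0.10, "engine": 0.60, "fuel system": 0.70}


def simulate(n_planes=10_000, hits_per_plane=3, seed=1):
    """Return hit counts per section for all planes and for returning planes only."""
    rng = random.Random(seed)
    sections = list(LETHALITY)
    all_hits = dict.fromkeys(sections, 0)
    survivor_hits = dict.fromkeys(sections, 0)
    for _ in range(n_planes):
        # Hits land on each section with equal probability.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every single hit.
        survived = all(rng.random() > LETHALITY[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                survivor_hits[s] += 1
    return all_hits, survivor_hits


def shares(counts):
    """Convert raw hit counts into each section's share of the total."""
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    for section in LETHALITY:
        print(f"{section:12s} all planes: {shares(all_hits)[section]:.0%}   "
              f"returned planes: {shares(survivor_hits)[section]:.0%}")
```

Although the hits are spread evenly across all planes, the returned planes show most of their holes on the wings and around the cockpit and very few on the engine and fuel system. Someone who sees only the returned planes would reinforce exactly the wrong places.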
Wald and his wife would tragically die in a plane crash a few years later, but his work on survivorship bias has lived on – and has now become a kind of internet meme. There are many examples of how we fail to acknowledge survivorship bias. Think about the student who argues that it is unimportant to complete his degree because some of the richest men on earth – Bill Gates and Mark Zuckerberg, for instance – also never completed theirs. (This is an especially popular argument during final-year exams.) But such a student would benefit from a stats class, because this is just survivorship bias: there were thousands of other students who never completed their degrees and who have very little wealth to show for it. The same applies to books by successful businessmen who provide advice about the ‘road to wealth’ (we never get to hear from the failed entrepreneurs who tried that same advice) or to people who complain that ‘music in the 1960s or the 1990s is so much better than today’ (but who forget all the rubbish music that was also produced in the 1960s and 1990s).
We make these mistakes every day. Covid provides a vivid reminder. If I think back to the first lockdown, I remember how frequently I checked the daily statistics of cases and deaths. I am pretty confident I care less about these numbers now than I did then. But this, too, is a case of survivorship bias: I forget about all the other things I did during those early months, and how quickly my interest in those statistics disappeared. One way to test this is to keep a diary. This is exactly what I asked my students to do in March last year. An anthology of their anonymous entries (‘Life under Lockdown’, co-edited with Laura Richardson) will appear with African Sun Media early next year.
It is fascinating to go back and read their entries. Here is one on 21 March 2020, a few days before South Africa locked down: ‘We hear so many statistics everyday that it’s hard to think in terms of actual people dying. I think there is a huge danger of people becoming desensitised to the effects of the virus.’
This entry captures an important truth. Humans yearn for stories rather than dry statistics. Reporting thousands of deaths can attract less attention than the story of a single deceased child. That is because evolution has primed humans to attach emotion to stories rather than to numbers. Precisely because they carry less emotion, numbers can sound more objective, factual and scientific.
And that is the danger of statistics, because statistics can easily be twisted and turned to suit a certain agenda. The South African local elections provide a helpful example. Almost every political party (except the ANC, whose losses were so massive that even the number-spinners couldn’t help) claimed success. The Democratic Alliance announced that their share of votes increased from 2019 to 2021. That is one interpretation, of course. Another would be that their share of votes declined from 2016, the year of the previous local government elections. Both stories are true, so the choice of fact will depend on the purpose of the statistical storyteller. ‘Lies, damned lies and statistics’, it is said, but perhaps it should rather be: ‘Your story, my story and statistics’.
The fact that stats can be tweaked doesn’t mean we should do away with them. On the contrary. The advantage of numbers is that they are transparent: we can test them against other evidence, question the methods and, just like in science, preserve those numbers on which we can find consensus. Government policies on poverty alleviation or service delivery, Reserve Bank decisions on interest rate adjustments and entrepreneurs’ choices about where to invest all require accurate, timely statistics that are interpreted intelligently. This is just one reason why statisticians are almost never unemployed, and why I am a big proponent of teaching statistics at school.
The quality of the interpretations statisticians can make is only as good as the quality of the numbers. Rubbish in, rubbish out. Stats SA, the entity responsible for collecting South Africa’s demographic and economic data, has seen its budget slashed substantially. Next year is a census year. South Africa’s previous census, in 2011, has already been questioned for poor data quality. All indications are that the 2022 one will be of even worse quality. This matters: if we do not know what is actually happening, we have no way to fix what is wrong.
The silver lining is that we are not as reliant on the state for these numbers as we used to be. We live in a world of data abundance. Economists now have access to a wealth of sources, from daily supermarket sales to social media posts. Statistics matter more and more, from my watch tracking the quality of my sleep to the Proteas exiting the T20 World Cup because of a poor net run rate. One thing remains true: the more data, the more ways it can be managed and manipulated. All the more reason for cultivating the statistical skills of the next generation.
*An edited version of this was first published in Rapport on 14 November (in Afrikaans).