COVID-19 Narratives By the Numbers

"There are three kinds of lies: lies, damned lies, and statistics."
- Mark Twain

The best way to think about data is as a tool. On its own and without context, data can't tell you very much. If someone tells you that they make $60k a year, how much does that really tell you about their standard of living or how well they are able to support a family? That depends on context, first and foremost where they live. $60k in San Francisco, for example, gets you a lot less than $60k in Boise, ID.

The same, of course, is true when discussing COVID-19 and its impacts. 100 fatalities in a town of 30k people means something entirely different than in a city of 1 million, and the two circumstances would demand very different policy responses. The same holds for testing. 1k positive test results tell us something very different when the positivity rate is 50% rather than 4%, or when half of those cases are asymptomatic rather than only 5%, and each scenario has wildly divergent implications for an area's hospital system. The list of confounding variables goes on and on: patient age, demographics, comorbidities, type of tests, testing levels, date of first recorded cases, population density, culture, etc.

What is the purpose of this site?

The goal of this site is to make the case that data on its own tells us very little. The site takes, in most cases, the exact same data sets on the impacts of COVID-19 on regions in the U.S. and around the world and paints completely different pictures with them, either by changing the time frame or by using different "lenses" through which to visualize the data. In doing so, it shows how easily real data can be massaged to further a particular narrative.

Cargo Cult Science

When someone says that they only trust in science or that they're listening to the "experts", more often than not it's a way to shut down debate. Who can argue with science, after all? Unfortunately, the truth is that science is only as effective as the dissent that is allowed to be brought to it. Data on its own isn't the same thing as "doing science".

While data itself doesn't lie, people can use real data in service of one.

This is the general idea of "Cargo Cult Science", a term introduced by Dr. Richard Feynman in 1974: going through the motions of "science" as a performative exercise, in a way that ultimately undermines the search for knowledge that science is meant to represent. This project is meant to highlight how seductive this tendency can be by showing some examples of how it has been happening with our current narratives, even for something as important as a global pandemic.

Learn more about this idea of "Cargo Cult Scientists" and what Dr. Feynman had to say about it here.

How to use this site

(See the FAQ page for any other questions not answered here)

The Narratives

At the top of every page is a list of buttons (like the ones below this paragraph) that list out a series of narratives commonly heard about COVID-19. Each "Narrative Page" contains a set of visualizations using publicly available data to make the case for that narrative.

For example, if you want to make the case that New York State "beat" COVID-19, you can view a graph that shows fatality and case counts from late May onward. Conversely, if you want to see how the US outperformed countries in Europe, there is a set of graphs that show data adjusted for population.

The Explanations

Each graph has an accompanying description that details what can potentially be learned from the visualization as well as any explanations regarding special calculations or data abstractions being done. These explanations can be toggled on and off via the main menu accessible from the header navigation.

The Filters + Comparisons

Many visualizations offer filters so that you can compare regions in a more isolated way. It's useful to have many regions available for comparison, but showing them all at once can make a graph too crowded. Some graphs also have switches that let you alternate between absolute and relative views (e.g. cases per 100k). Most narratives have a corresponding opposing narrative that you can also look to for comparison. Where relevant, it is linked in the graph explanation.

This is not meant to be a political statement

While there are some conclusions that can be clearly drawn when all the narratives are laid out side-by-side, the point is not to cast blame on any set of politicians. Taking any of these narratives on their own is precisely how partisans play political games, getting their followers scared or riled up.

Rather, the story we see is more about how little we knew when the virus first hit, and how the account of what happened, what worked, and what didn't is a more complicated one. One major difference between the first and second halves of 2020, though, is that at the beginning of the year we had neither access to this much data nor the ability to see the types of patterns that are becoming much clearer in hindsight.

My hope is that we stop letting demagogues control public fear and perception by telling us to trust them because they "believe in science", and instead look for ourselves at the story the data is telling us. If we do, we can make better, more informed decisions as a society and, hopefully, get smarter about holding our political leaders and the leaders of the scientific community accountable.

Data Sources

All of the data you see on this site was sourced from the following websites and API services. Not all of it has necessarily been used in a visualization yet, but it was helpful nonetheless and could find its way into future visualizations. Some pre-processing is done for certain fields. All of the data handling code, as well as the saved versions of the raw data files themselves, can be viewed and reviewed on GitHub.

Check the FAQ for any other questions.

State Level Data: COVID Tracking Project and Johns Hopkins University

Populations (US): DataUSA.io

Employment/Labor Statistics (US): Bureau of Labor Statistics

Labor and GDP Statistics (EU): Eurostat

Policy Tracking Data (US and Global): Oxford Covid-19 Government Response Tracker

Behavior/Survey Data: Imperial College London YouGov Covid 19 Behaviour Tracker Data Hub

Rt (transmission rate) Data: Rt.live

Country-level Data: Our World In Data (OWID)

Polimath has a great list of data sources on his Substack and his own data visualization tool on GitHub.

Acknowledgements

While the data was all sourced via the channels listed above, I was greatly influenced in my thinking and in my data/trend discovery by the following individuals. I found it interesting, as you might too, that the primary news sources, the self-proclaimed "most trusted names", seemed to be the least scrutinized and the most shallow, while the sources listed below that I started to turn to were almost all pseudonymous and on Twitter (not in official publications). All of them provided more details and more background, and opened themselves up to more scrutiny, than what I found in the most common "mainstream" narratives.

I'm grateful to them for taking the time to put this information out there and allowing others to come to their own conclusions with the most information possible.
