Lies, Damned Lies, and Statistics

tloojs
Jan 17, 2021

The following essay is meant to demonstrate why we say that “Law students teach themselves the law; law school teaches them to think like lawyers.” When we say that we are trained to “think like lawyers,” we mean that we learn how to approach evidence and the law to apply it logically and without bias. If we learn this lesson well, it can be applied to many situations outside the law, such as conspiracy theories and arguments on public policy.

“There are three kinds of lies: lies, damned lies, and statistics.” The quote was attributed to Disraeli by Mark Twain in Chapters from My Autobiography, published in the North American Review in 1907. There is scant evidence that the Victorian Era Prime Minister said it, or that if he did, that it was original to him. It has been suggested that its origin is through the corruption of a similar legal aphorism that is rendered as describing the three classes of witnesses: Liars, Damned Liars, and Experts.

Whatever the origin, both phrases have an element of truth – but also do a disservice to statisticians and expert witnesses. There are unquestionably instances in which an expert testifies according to the wishes of whoever has paid her fee, but this does not necessarily mean that the testimony is not accurate – and that is why we can have a “battle of the experts” in a trial.

Likewise, statistics are not so much true or false as they are accurately portrayed or mis-portrayed. One could argue that statistics which are “not true” simply are not statistics. That is, if the numbers are simply made up, they do not represent the “status” of anything but the lie.

When judging statistics (or expert witnesses), the essential question is what is the underlying basis of the proposition being put forward? If I told you that in a recent survey of American adults 100% of the respondents said they never ate fried chicken, you would likely say that the survey was unscientific and plainly wrong, but if I told you the same statistic about a survey of committed vegans, you probably wouldn’t question its veracity (or at least assume that those who occasionally visited KFC were too embarrassed to admit it).

The point is that like most questions raised in the law, context is key. So, let’s consider the context of two arguments based in “statistics” that have plagued the media (including, perhaps especially, social media) of late.

First is the claim that there must be evidence of election fraud because “everyone I know voted for Trump.” When asked to explain the rationale by which this anecdotal evidence supports the existence of election fraud, the speaker will generally indicate that the narrowness of the election outcome in his state is “statistically impossible” given how few people he knew supported Biden, how few people were at Biden rallies, the lack of Biden yard signs in his neighborhood, etc.

The explanation is itself a refutation of the accuracy of the statistic, as the speaker begins by admitting that, in fact, he knows or suspects that some of his acquaintances did not vote for Trump. If questioned more deeply, the proponent will candidly admit either that he does not know what the ratio of Trump to Biden voters was in his precinct or, if he does, that it was probably a less nearly unanimous margin than his statement implies. Election results consistently show that the nation is much more “purple” in composition than it is “red and blue.”

If his neighborhood leans to the maroon rather than the violet shade of the electorate, the absence of Biden yard signs may be explained by the wariness of the Biden supporters to place yard signs advertising their “aberrant” behavior. “Perhaps,” he says, but “what about the rallies?” If pressed further, the speaker may concede that he did not personally attend a Biden rally, that, yes, he understood that Biden did not conduct as many rallies as Trump and, yes, these rallies were often small because unlike Trump’s campaign, the Biden campaign insisted on the (in the speaker’s view, unnecessary) social distancing restrictions.

At this point the must-have-been-fraudster pivots to his alternate statistic on rallies: Trump’s supporters were always more enthusiastic. Here, the lawyerly thought process may have some difficulty addressing this point because, if one is honest, it is objectively true. The answer that Trump rallies are more boisterous by design because of the candidate’s style does not counteract that truth, inasmuch as it would support the argument that Trump is the more energetic and inspiring leader, and therefore “should” have won.

Rather than avoid the issue as a “tangent,” however, we can address it from another angle. What, we ask, inspired the enthusiasm? What did the audience respond to? The must-have-been-fraudster then will use phrases like “Reopen Now,” “America First,” and “Lock Them Up!” The response to this is likely to barely dent the speaker’s belief, but we can nonetheless point out that these slogans, or ones much like them, were emblematic of Trump’s first campaign – but the rallies then were larger and even more enthusiastic.

Is it possible that some voters viewed the campaign as a “retread,” lacking in new ideas? In fact, the GOP failed to adopt a platform for the election, essentially adopting the one from 2016 – including the pledge to accomplish healthcare reform within the first 100 days – something that had yet to be proposed, let alone achieved. Is it possible that the enthusiasm for “four more years” of the same was not quite so strong because the country was facing new challenges? If the must-have-been-fraudster is honest, he might just admit this possibility, and that a decrease in enthusiasm usually translates to a decrease in turnout.

Finally, if the must-have-been-fraudster is pressed further, he may even concede that it is possible that in other areas of his state, say those with urban and minority populations, the President might have been not quite so popular as in his particular bailiwick – though it still baffles him that this could be the case. When the “statistic” is based on nothing more than personal observations and a limited perspective of the evidence, bafflement is likewise the inevitable reaction of the recipient of the “expert’s” pronouncement. In court, it can usually result in the expert not being deemed qualified; in the court of public opinion this ruling is more often influenced by the recipient’s own perspective.

The second recent use – or more properly, misuse – of statistics relates to the commonly heard claim that the COVID-19 virus is “just the flu” or “just a cold.” The first problem with responding to these assertions is that they contain a grain or two of truth. Coronaviruses and influenza viruses are, in fact, quite similar (more along the lines of cousins, rather than “kissing cousins,” but related nonetheless). Likewise, “colds” can be caused by all sorts of pathogens including variant strains of the same type of viruses that cause both COVID-19 and the various illnesses that we call “the flu.”

So, before we look at the statistics, let’s break this down. The “flu” is any disease that is caused by an influenza virus, of which there are many strains. Every year, national and international organizations forecast what strains of influenza are likely to present the most serious threat to public health (not necessarily the most widespread) and prepare vaccines to combat the 3 or 4 most serious. Provided that there is widespread vaccination within a given population, this should reduce both the incidence and the morbidity of the targeted viruses, but it does not ensure that these viruses will not be present, nor does it negate the certainty that other strains will fill the void left by their reduced incidence.

[Editor’s note: Like “jurisdiction,” morbidity and its related term mortality are words of “many, too many meanings.” So, for the sake of clarity, I will specify that herein I use the term “morbidity” to refer to the likelihood that someone who contracts a disease will die as a result and “mortality” to refer to the overall impact of deaths from a disease on a given population, e.g., if in a population of 100, 10 contract the disease and one dies, the morbidity is 10% and the mortality is 1%; if the population is 1000 and 10 contract the disease and one dies, the morbidity is still 10% but the mortality is 0.1%.]
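The two definitions in the note can be sketched as a pair of ratios. This is only an illustration of the note's own example; the function names are hypothetical, and what the note calls “morbidity” is closer to what epidemiologists usually call the case fatality rate.

```python
def morbidity(deaths: int, cases: int) -> float:
    """Deaths divided by cases: the note's sense of 'morbidity'."""
    return deaths / cases


def mortality(deaths: int, population: int) -> float:
    """Deaths divided by the whole population: the note's sense of 'mortality'."""
    return deaths / population


# The note's example: 10 people contract the disease and 1 dies,
# first in a population of 100, then in a population of 1000.
for population in (100, 1000):
    print(
        f"population={population}: "
        f"morbidity={morbidity(1, 10):.1%}, "
        f"mortality={mortality(1, population):.1%}"
    )
```

The morbidity figure stays the same in both populations because its denominator is the number of cases; only the mortality figure moves when the population changes.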

COVID, or more specifically COVID-19, is a disease caused by a specific strain of the coronavirus – contrary to the assertion made by many including Kellyanne Conway, the “19” does not mean that it is the nineteenth such strain discovered, but rather that it was identified in 2019. Coronaviruses are not as common as influenza viruses. This is the result of many factors, including the fact that influenza has been in the human population much longer.

This is neither the time nor place for a lesson in virology, but in simple terms most viruses in nature cannot survive in human hosts (and likewise, most viruses that affect humans cannot survive in non-human hosts). Once a virus can infect a species, however, it tends to mutate more readily to adapt to that host. The longer a virus exists in a population, the more varied it becomes – and the more the host adapts to combat it.

Thus, COVID and the flu are not the same, even if they are cousins. But what about “colds”? That’s easier because “colds” don’t exist. Well, of course they exist, but not as a specific type of disease. A “cold” refers to the symptoms of a disease, not the disease itself. What we commonly refer to as a “cold” is typically caused by a “rhinovirus,” the strains of which are even more common in the human population than influenza viruses. Colds may also be caused by parainfluenza viruses (which, despite the name are not closely related to influenza viruses – if COVID and influenza are “cousins,” parainfluenza and influenza are fourth cousins, twice removed), respiratory syncytial viruses (usually called RSVs), or coronaviruses.

“Aha,” declares the “same as” proponent, “there’s the proof. COVID-19 is ‘just a cold!’” Well, yes and no, but much more no than yes. Coronaviruses do indeed cause illnesses that have “cold-like” symptoms – recall that “cold” refers to symptoms, not a specific disease. For that matter, many people with a mild case of the flu often say, “It’s just a cold,” although some virologists would say, “No, it’s a mild case of the flu.” I suppose if the same virologists were to culture a “cold” caused by an RSV or a coronavirus they would insist “No, it’s a respiratory syncytial disease” or “it’s COVID.” COVID is simply short for “coronavirus disease,” which is why virologists insist on using “COVID-19” to refer to the pandemic (and the highly contagious variant first identified in Britain may eventually be said to cause COVID-20 if it is determined to be a “new” virus).

Medical science is brave enough to admit that between 20% and 30% of “colds” are caused by pathogens we have not yet been able to identify. In short, comparing any specific disease to a “cold” does not result in a meaningful analogy. Saying “COVID-19 is just a cold” is as meaningful as saying “both the Eiffel Tower and basketball players are tall.” Both statements are true in a broad sense, but they do not create a complete equivalence between the two.

Which brings us to the statistics used to support the claim that COVID-19 is no more serious a public health problem than “the flu” or “colds.” Most often the argument uses prevalence and consequence to show that COVID-19 does not present a significant threat to public health warranting the measures taken by some state and local governments to combat it.

Prevalence arguments compare the number of estimated cases of influenza to the confirmed diagnosed cases of COVID-19. The first issue is, of course, that the data for flu are estimates whereas the COVID numbers are confirmed. “But that’s fair,” say the proponents, “because most flu diagnoses are not based on lab tests.” This is true, but that is because unlike COVID, flu symptoms are specific and readily treatable with over-the-counter remedies, so sufferers are less likely to seek treatment, and when they do, physicians are less likely to require a lab test, so let’s concede this point.

Early in the pandemic, the argument was that flu was actually more prevalent than COVID. Again, this was true – after a fashion. The reason flu was more prevalent in the early months of the pandemic was that the 2019-2020 flu season was at its height while COVID was just taking hold. It’s the same reason that there are more cars speeding on the interstate than on the entrance ramps: there are more cars in the travel lanes and they’ve been on the road longer, while the ones on the entrance ramps are just getting started.

Initial numbers for the 2019-2020 flu season (May to April) are now available. To be fair to the proponents of the “same as” argument, I will refer only to the highest possible numbers in the wide range suggested by the data owing to the lack of lab-confirmed diagnoses. The data show that there were 56,000,000 cases of flu in the US over the course of the year. The first case of COVID was confirmed in the US in February 2020, but it is all but certain that there were cases before then, so let’s use today’s (1/16/2021) totals (rounded down) for COVID, shall we? Those numbers show that there have been just 24,000,000 confirmed cases of COVID – and, unlike flu, there is no agreed metric for measuring undiagnosed cases, with estimates ranging from “almost none” to “3 times the confirmed.” Again, to be fair to the same-asers, we will assume it’s “almost none.”

“There you have it,” proclaim the same-asers, “the flu is more prevalent, ipso facto COVID is not a serious threat to public health!” Well, yes and no, but mostly no. Prevalence by itself is not the proper way to judge the seriousness of a public health threat. Roughly the same number of people suffer seasonal allergies as have the flu, but deaths due to hay fever are rare. By contrast, deaths attributed to flu, 62,000 in the 2019-2020 flu season, are fairly significant – flu was the eighth leading cause of death in the US in the most recent estimate and is consistently in the top ten – and flu has been the most common cause of death from a viral disease for many years. But COVID deaths are now over 400,000, making COVID the third leading cause of death for 2020 and, likewise, the leading cause of death by viral infection – a profoundly serious public health threat. Q.E.D.

Still, the same-asers are not done. “You want to talk deaths? Let’s consider the mortality rate for flu, which was 1,589 per one million last year and it’s only 1,214 per one million for COVID! Gotcha!”

And here is where the editor’s note above comes into play. This argument is based on the same-asers not recognizing that the term “mortality” can be used in different ways. So, they see a report that pegs flu mortality at 1,589 and another stating that COVID mortality is 1,214, and it seems clear that flu is the more serious threat to public safety. Both statistics are correct – but they do not refer to the same thing. This is where the legal rule to “read on” is helpful – keep reading to make sure you understand the context of what is being said.

The flu mortality rate cited by the same-aser is per one million cases, while the COVID mortality rate is per one million people. This is why I chose to explain that morbidity – death caused by having the illness – is the more accurate term for describing that statistic. If flu morbidity is 1,589, you have to compare it to COVID morbidity, which is 16,666. Flu mortality, by contrast, is at best 187 per million to COVID’s 1,214. And recall that the flu numbers are based on the highest possible figure within a range of 72 to 187.
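The difference between the two denominators can be checked with a few lines of arithmetic. The sketch below uses the case and death counts stated in this essay; the US population figure of 330 million is an assumption added for illustration, and the names are hypothetical.

```python
# Figures stated in the essay (flu numbers are the high end of the range).
FLU_CASES = 56_000_000
FLU_DEATHS = 62_000
COVID_CASES = 24_000_000
COVID_DEATHS = 400_000

# ASSUMPTION: a round US population figure, not taken from the essay.
US_POPULATION = 330_000_000


def per_million(numerator: float, denominator: float) -> float:
    """Express numerator/denominator as a rate per one million."""
    return numerator / denominator * 1_000_000


rates = {
    "COVID deaths per million cases": per_million(COVID_DEATHS, COVID_CASES),
    "COVID deaths per million people": per_million(COVID_DEATHS, US_POPULATION),
    "flu deaths per million cases": per_million(FLU_DEATHS, FLU_CASES),
    "flu deaths per million people": per_million(FLU_DEATHS, US_POPULATION),
}

for label, value in rates.items():
    print(f"{label}: {value:,.0f}")
```

Under these assumptions the COVID figures come out at roughly 16,667 per million cases and about 1,212 per million people, and the flu figure per million people at about 188, in line with the numbers above. Dividing the two flu figures stated in this essay gives a somewhat lower per-case number than the 1,589 cited, which suggests that figure rests on a smaller case estimate; either way, like must be compared with like.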

And, as a final point, here is where statistics, far from being beyond damned lies, can prove their usefulness. Suppose we want to try to come up with a more accurate understanding of how much flu there really was in the US in 2019-2020. One tool we can use is comparative analysis. In the above example, we can see that based on our assumptions the ratio of COVID to flu morbidity is about 10.5 to 1. Because deaths related to COVID and flu are far more certain than estimates of the prevalence of these diseases in the general population, but COVID prevalence is the more accurate of the two, we can assume that this ratio is more accurate as well. We can then apply that ratio to the uncertain number of flu prevalence to drill down to a more accurate number. In the above example of flu mortality, if we assume a similar ratio to COVID mortality, we get a flu mortality of 115 per million, which falls nicely in the middle range of the 72-187 estimate.
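The comparative step can be written out as a short calculation. This is a minimal sketch using only the per-million figures quoted in this essay; the variable names are hypothetical, and the method simply assumes, as the paragraph does, that the morbidity ratio carries over to mortality.

```python
# Per-million figures quoted in the essay.
COVID_MORBIDITY = 16_666   # deaths per million cases
FLU_MORBIDITY = 1_589      # deaths per million cases
COVID_MORTALITY = 1_214    # deaths per million people
FLU_MORTALITY_RANGE = (72, 187)  # low and high estimates per million people

# Ratio of COVID to flu morbidity (about 10.5 to 1).
ratio = COVID_MORBIDITY / FLU_MORBIDITY

# Apply the same ratio to COVID mortality to estimate flu mortality.
flu_mortality_estimate = COVID_MORTALITY / ratio

low, high = FLU_MORTALITY_RANGE
in_range = low <= flu_mortality_estimate <= high

print(f"ratio: {ratio:.1f} to 1")
print(f"estimated flu mortality: {flu_mortality_estimate:.1f} per million")
print(f"within {low}-{high}: {in_range}")
```

The estimate lands between 115 and 116 per million, depending on how the ratio is rounded, which sits inside the 72-187 range.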

The Virginia Appellate Lawyer’s Court of Appeals of Virginia Blog
