MGrogie: Election 2022 -- Stats 101: A Primer

Our late brother Tim was an expert at statistical analysis. And I’m no slouch at it myself, having spent a big chunk of my career doing financial analysis, which is just statistics as applied to corporate numbers, and having taken and thrived in several courses in statistics and calculus (the branch of mathematics that underpins much of statistical theory) in college and grad school. So I’ve become rather disheartened by how so very many people (who really should know better), especially in the media, have interpreted and characterized the recent political polls as showing so many races as dead heats. They are not.

I don’t really blame the media, as they are laboring under the handicap of needing to explain things in a way that a layperson can easily understand. Still, one of my key gripes is when they oversimplify things to the point of inaccuracy, and then that inaccuracy becomes forever the accepted truth with the public. It’s very frustrating. One key example: in the Roman-numeral notation long used in finance and accounting, “thousand” is expressed by the symbol “M” and “million” by an “M” with a bar over it (“M Bar”; sorry, MS Word won’t put the bar above). But the media, in order to not confuse the public, always expresses “million” as simply “M,” and this has become so accepted that the average person will not even believe an analyst who tells them otherwise. Thus the rub: if it’s in the news, people believe it must be correct.

And so I am disheartened by how the recent poll results are being interpreted to the public. I will focus on two key races which are actually pretty typical of many of the races around the country: The Pennsylvania contest between Fetterman and Oz, and the Michigan contest between Whitmer and Dixon. I will do a brief primer in Stats 101 using these races to demonstrate the fallacies. And I know that if Tim were here, he’d agree with all of it.

Let’s begin with an analysis of Fetterman vs Oz. Before the debate, the numbers were decidedly in Fetterman’s favor. Afterward the polling became quite a lot tighter, but then it always does after a debate and as the closing days and saturation political advertising arrive. From the debate until a few days ago, the numbers were 47/46 in Fetterman’s favor. Just in the last few days, at least according to Emerson College polling, it is now 48/46 in favor of Oz. Because the candidates are only 2 points apart and the margin of error is 3 points, it is being called a tie. Real Clear Politics still shows a 48/47 contest favoring Fetterman. Monmouth University shows 48/44 in favor of Fetterman, and because the margin of error in many of these polls is almost 5 points, that too is being called a tie.

What about Michigan, Whitmer vs Dixon? As recently as June there was a whopping 35-point difference with Whitmer in the lead. But that lead has been closing month after month, and as of yesterday most of the polls show a mere 1 to 5 point difference, Whitmer still leading. Again, because that falls within a margin of error of between 3 and 5 points depending on the poll, it is being called a statistical tie, with the numbers ranging from 49/47 (Wick) to 54/43 (Detroit Free Press).

Now for the Stats 101 primer. Falling within the margin of error does not make the contest a statistical tie. It is quite the opposite. For the sake of illustration, let’s take the larger of the margins at 5 points. Applied in Whitmer’s favor, this would mean that the real numbers could be as lopsided as 54 to 42 (Wick) or 59 to 38 (Free Press), roughly 3 to 2. Applied the other way, the Wick numbers become 52 to 44 in favor of Dixon. And the truth could be anything in between.

The same is true of Pennsylvania. It is not a tie but rather anywhere from 51 to 43 (Emerson, at its 3-point margin) to 53 to 39 (Monmouth, at 5 points) in favor of either candidate. And, of course, as before, it can mean anything in between. The margin of error does not signal a tie; it is the statistical estimate of how far off the polling can be for each candidate, and for both at once. This margin must be both added to and subtracted from both candidates’ numbers to render the true picture.
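For anyone who wants to check the arithmetic, here is a minimal Python sketch of the add-and-subtract rule, using three of the polls quoted above and the 5-point margin throughout. The function name `poll_range` is my own invention for illustration, and treating the margin as a flat number of points applied to each candidate is the simplification used in this post, not a full statistical model.

```python
def poll_range(leader, trailer, margin):
    """Return the two extreme readings of a poll.

    The first pair pushes the margin in the leader's favor,
    the second pair pushes it in the trailer's favor.
    """
    leader_best = (leader + margin, trailer - margin)
    trailer_best = (leader - margin, trailer + margin)
    return leader_best, trailer_best


# Poll numbers as quoted in this post: (leader, trailer).
polls = {
    "Wick (Whitmer/Dixon)": (49, 47),
    "Detroit Free Press (Whitmer/Dixon)": (54, 43),
    "Monmouth (Fetterman/Oz)": (48, 44),
}
MARGIN = 5  # the larger margin used for illustration

for name, (leader, trailer) in polls.items():
    leader_best, trailer_best = poll_range(leader, trailer, MARGIN)
    print(f"{name}: from {leader_best[0]}/{leader_best[1]} "
          f"to {trailer_best[0]}/{trailer_best[1]}")
```

For the Wick poll this gives 54 to 42 at one extreme and 44 to 52 at the other, which is the squeaker-to-blowout spread described above.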

And it doesn’t end there, as all the polls have shown that anywhere from 3% to 8% of the voters are undecided. It is statistically invalid to assume that those undecided voters will ultimately divide the same way as the decided voters. Instead, one must allow that all of the undecided voters could go to either candidate. That means the end result could be as lopsided as 58 to 34 in favor of either candidate, or anything in between.
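The undecided-voter point can be sketched the same way. This is only an illustration of the mechanism, under the loud assumption that everyone not counted for the two candidates is undecided (no third-party voters) and that the whole group breaks one way. The helper name `undecided_extremes` is hypothetical, and the sketch does not apply the margin of error on top.

```python
def undecided_extremes(first, second):
    """Show both all-or-nothing splits of the undecided vote.

    Assumes every respondent not backing one of the two
    candidates is undecided (an assumption, not poll data).
    """
    undecided = 100 - first - second
    all_to_first = (first + undecided, second)
    all_to_second = (first, second + undecided)
    return undecided, all_to_first, all_to_second


# Monmouth's 48/44 Fetterman/Oz numbers as quoted in this post.
undecided, all_to_first, all_to_second = undecided_extremes(48, 44)
print(f"Undecided: {undecided}%")
print(f"All to Fetterman: {all_to_first[0]}/{all_to_first[1]}")
print(f"All to Oz: {all_to_second[0]}/{all_to_second[1]}")
```

Even before the margin of error is layered on, the same poll can read as a 12-point Fetterman win or a 4-point Oz win depending on where the undecided voters land.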

It is statistically invalid to interpret these polls as a tie. Rather, the correct interpretation is that the race can be anywhere from a squeaker to a landslide. Where it falls along that spectrum is anyone’s best guess, and since the guesses diverge so significantly depending on the poll, nobody really has any idea. By improperly using the margin of error, the polling reports incorrectly interpret the race as a tie.

And I won’t even get into the importance of considering the Standard Deviation, which is probably the most important statistical tool financial analysts use in making their decisions. The margin of error is built on what statisticians call the standard error, which is itself a standard deviation: for the usual 95% confidence level it is roughly two standard errors on either side of the poll number. And the Standard Deviation is the square root of the Variance, one very important mathematical step beyond it, which is why analysts rarely use the Variance directly. But let’s see the media try to use (and explain) to the public what the Standard Deviation is, how it is used, and why it is so important. You will see most people just tuning out.
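For readers who do want to see how the variance, the standard deviation and the margin of error relate numerically, here is a short sketch using the textbook formula for a simple random sample. The sample sizes of 400 and 1,000 respondents are hypothetical, chosen by me for illustration and not taken from any of the polls above, and real pollsters adjust for weighting and design in ways this sketch ignores.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Textbook margin of error for one candidate's share.

    share: the candidate's proportion (0.5 means 50%).
    sample_size: number of respondents (hypothetical here).
    z: multiplier for the confidence level (1.96 for 95%).
    """
    variance = share * (1 - share) / sample_size  # variance of the estimate
    std_error = math.sqrt(variance)               # its standard deviation
    return variance, std_error, z * std_error


for n in (400, 1000):
    variance, std_error, moe = margin_of_error(0.5, n)
    print(f"n={n}: variance={variance:.6f}, "
          f"standard error={std_error:.4f}, "
          f"margin of error={moe * 100:.1f} points")
```

Under these assumptions a sample of 1,000 gives a margin of about 3 points and a sample of 400 gives almost 5, which is the range of margins quoted for the polls discussed here.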

We use the margin of error because it is easy to understand. But it is not accurately interpreted. What is easy to understand, what is so terribly simple to understand, is that so many people look at these polls and believe they are a tie and become either disheartened believing their side is now going to lose, or energized believing their side now might win. How many Democrats have I talked to who are very sad believing that November 8th is going to be a Republican sweep? How many Republicans, despite the fact that they remain behind, feel the opposite?

Anything can and almost always does happen. If these things really could be accurately predicted, we wouldn’t even need elections, just let the polling decide. The only way to lose is to not show up. The saddest fact at this moment is that, per today’s news, the voter turnout is so far astoundingly light compared to 2020 when 2 million ballots were cast in Michigan in early voting. As of today, there have only been 1 million. Considering what’s at stake, a 50% drop is disgraceful. My big hope right now is that most people I’ve talked to have said they are voting in-person this year rather than by absentee. I certainly hope that makes the difference.

And please, I have one final request. Don’t anyone blame my optimism should the Republicans follow the historical trend and prevail tomorrow. You know who you are … Val! (The national polls today show the Republicans with a 1% advantage.) If the Democrats do have light turnout, or worse, vote their pocketbooks as do so many Americans, then they get the government they deserve. Throw off this cloud of pessimism, buck up and get out to the polls.

MGrogie

Monday, November 7, 2022
