Now, to clear this up right away:
Saying “Hillary is 99% to win” and then Hillary losing is not in itself evidence that a forecast was bad or wrong.
Similarly, if I say “there's a 97% chance you DON'T roll a 12”
And then you roll a 12…that doesn't mean I was wrong.
Events with 1/100 probability happen
Events with 3/100 probability happen about 3x more often
With dice, we have the benefit of an infinite sample size (thanks, computers) and also extremely accurate data.
With elections, we have no such benefit.
While the saying goes “God does not play dice with the universe”
God - or your metaphorical higher power of choice - does play dice with elections, as far as we mere mortals are concerned. Anyone who claims any outcome is certain (at least before votes are being counted) is a liar and a fraud.
Which brings me to the next point:
Just because the Electoral College treats states independently doesn't mean they are.
Unlike dice rolls, which are independent, the share of people voting for a Presidential candidate in one state correlates strongly with the share in another state.
It makes sense that people in Pennsylvania and Michigan (with similar geography, demographics, etc) would vote similarly.
I'll reiterate because it's important: “similarly” does not mean “the same.”
While it's hard to predict who people will eventually vote for, especially people who say they are “undecided” in polls, those people tend to choose at a similar proportion across state lines.
This “independent state” error is the only way someone could have had a forecast in which Hillary was ~99% to win
And they admitted it! Except, they still blame the polls for being wrong. And it gets much worse.
In an interview with Wang conducted by PSmag:
No reputable forecaster would ever make a “call” such as this.
It's not just hacks like Wang. Otherwise reputable forecasters like Silver (whose forecast for that election was much better), and Morris and Jackson (whose weren't)…they all do the same little dance.
They assume that the poll margin is a prediction, and that any deviation is “the polls” fault.
Even when they admit mistakes (like Wang does at arm's length here), they still pretend their extremely flawed assumptions (namely, that the margin of the poll should predict the margin of the election) were reasonable.
On the note of polls
Poll data is the best possible data you can have to help inform a prediction. Bold, highlight, underline:
Inform a prediction.
Polls are not predictions, polls are not tools that try to predict anything, and anyone who says any variation of “the polls predicted” is wrong.
I'll repeat because it always comes up:
I don't care how many degrees you have, how many years you've worked in the field, or how much academic literature you've published:
Anyone who says “the polls predicted” (or any variation) is wrong. You're welcome to debate me on that topic but I promise you it won't take long.
Now, you, dear reader, may already understand that polls aren't predictions, but understanding why is essential to building a good forecast, so I'll explain.
Let's say we know with certainty that 48% of the population prefers D, and 48% of the population prefers R, and 0% will vote 3rd party.
If this sounds too hypothetical or that I'm making lots of assumptions, good. We're scratching the surface of what a good forecaster must account for.
But let's just pretend you want to forecast from this (known) starting point.
What's left?
Well, even if you know with absolute, unrealistic, magical certainty that 48% of voters will vote D, 48% will vote R, and a tiny 4% are undecided…
Your forecast still bears responsibility for…forecasting
And here we have the first, big step in why major analysts and forecasters demonstrate they do not understand how polls work.
They think, say, and confidently assert that any discrepancy between the poll and eventual result indicates a poll error.
That's wrong.
Even in our unrealistically certain (but useful) hypothetical, the forecaster must still forecast where the final 4% undecided will go.
And therein enters some fun debate.
How should undecideds be forecasted? That's an entirely different debate (one the alleged experts don't agree on, but choose not to discuss; very healthy field!)
Here is a simple simulation (run 10,000 times) in which I input the known variables:
48%-48% starting point
100% total (4% undecided and 0 third-party)
And this is where the forecaster decision comes in:
For this simulation, I assigned each voter unit (in increments of 0.1%) a probability of 50-50.
The simulation gives each candidate an exactly equal chance to gain 0.1%, starting from their known “base” of 48%.
Predictably, this simulation ends with a pretty neat bell curve around 50%. Both candidates have a 50% chance to win in this forecast.
Neat.
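The setup above can be sketched in a few lines. Python is my choice here (the post doesn't say what the author used), and the parameters are the ones stated: known 48-48 floors, 4% undecided allocated in 0.1% units, each unit a fair coin flip, 10,000 trials. One wrinkle of a discrete simulation: some trials land on an exact 50-50 tie; by symmetry, each candidate's win chance is 50% once those are broken.

```python
# A minimal sketch of the simulation described above, under the stated
# assumptions: known 48-48 floors, 4% undecided in 0.1% "units", each
# unit a 50-50 coin flip, 10,000 trials.
import random

random.seed(0)

TRIALS = 10_000
BASE = 48.0   # known floor for each candidate, in percent
UNITS = 40    # 4% undecided, in 0.1% increments

def simulate_election(p_d=0.5):
    """Split the undecided units, return (d_share, r_share)."""
    d_units = sum(random.random() < p_d for _ in range(UNITS))
    return BASE + 0.1 * d_units, BASE + 0.1 * (UNITS - d_units)

results = [simulate_election() for _ in range(TRIALS)]
d_wins = sum(d > r for d, r in results) / TRIALS
exact_ties = sum(d == r for d, r in results) / TRIALS
print(f"D strict wins: {d_wins:.1%}, exact ties: {exact_ties:.1%}")
```

The final D shares pile up in a bell curve around 50%, exactly as described; very few individual trials land exactly on 50-50.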
But notice, even with a “randomly assigned” undecided voter and a 50-50 chance, very few of the simulations generate exactly 50-50 (or, within 0.1% of 50-50).
That's the nature of the unknown: you have a range of possible outcomes.
And remember, in this simulation, we have the enormous benefit of knowing with 100% certainty where both candidates are starting from!
And still, simulated elections in which one candidate wins by a margin of 1+ points are not uncommon.
But starting at 48-48 (and magically knowing neither side could possibly be less than 48) is very restrictive. Of course the range of outcomes is narrow.
But now, let's use the same starting conditions, except instead of a 50-50 undecided split, let's say we forecast (for whatever reason) that undecided voters lean very slightly D.
55-45.
Tiny. Minuscule. Undetectable, in fact, unless you have a deity telling you.
What do you think that does to the win probability?
When I ask this question, I get answers ranging from “55%” to, very rarely, in the 60s.
Here's how it works out:
A forecast that gives each undecided voter a 55% chance of voting D, as opposed to a 50% one, moves the whole forecast from 50/50 to 78/22.
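Re-running the same sketch with a 55/45 undecided lean shows the jump. The exact figure depends on granularity choices the post doesn't specify (how finely the undecided pool is sliced, how ties are handled), so treat the output as illustrative of the effect, not a reproduction of the 78/22 above.

```python
# A sketch comparing the 50/50 undecided split to a 55/45 lean.
# Same assumed setup as before: known 48-48 floors, 4% undecided in
# 0.1% units, exact ties broken by coin flip.
import random

random.seed(1)

TRIALS = 10_000
UNITS = 40  # 4% undecided in 0.1% increments

def win_probability(p_d):
    d_wins = 0
    for _ in range(TRIALS):
        d_units = sum(random.random() < p_d for _ in range(UNITS))
        if 2 * d_units > UNITS:          # D took a majority of undecideds
            d_wins += 1
        elif 2 * d_units == UNITS:       # exact tie: coin flip
            d_wins += random.random() < 0.5
    return d_wins / TRIALS

p_even = win_probability(0.50)  # ~50% by symmetry
p_lean = win_probability(0.55)  # far above 50%, from a "tiny" lean
print(f"50/50 undecideds -> D wins {p_even:.1%}")
print(f"55/45 undecideds -> D wins {p_lean:.1%}")
```

The qualitative point holds regardless of the exact granularity: a lean too small for any poll to detect moves the forecast far from 50/50.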
What did the poll tell us about how undecideds would eventually vote?
Nada.
Yet, tiny movements in undecided preference - nothing to do with the poll, everything to do with the forecast - can make even a certain starting point like 48-48 highly favorable for one side.
Every forecast makes assumptions that have nothing to do with the polls. They must.
There is no such thing as a “polls-only” forecast. Every forecast makes assumptions - with lots of sources of error - that have nothing to do with the accuracy of the poll data.
Real forecasting is much more complex
While it is impossible to know with certainty that two candidates are starting at exactly 48%, polls do provide a very good estimate of that starting point.
But of course, polls have a margin of error.
What that means is very often misunderstood.
While the hypothetical “we know for certainty that they're starting at exactly 48%” was a simplistic exaggeration, it illustrates an important point about polls and forecasts:
Polls are not predictions; polls are an approximation of each candidate's “floor”
If I believed that polls were very accurate (or that my aggregation method of choice was) I might assign a small possible error to each of the approximated floors.
For a (more realistic) scenario of a 48-48 poll average, understanding that polls are imperfect approximations of the “floor,” I might assign a +/- 2% to each of those starting points.
That could mean anything from a floor of:
D = 50, R = 46 to
D = 46, R = 50
And just that easy, right? Model from an uncertain starting point?
Nope!
The margin of error in a poll applies to every number in the poll*
*This is an oversimplification in the sense that large numbers (like 48%) are more likely to have a 2% error than small ones (like 4% undecided) but it's important to note because…
The polls' margin of error has nothing to do with the “spread” or “lead” or “margin.” It applies to each number in the polls, and that's it.
What that means is:
A starting point of 49-48 (with either candidate leading) and only 3% undecided, or 48-47 with 5% undecided, is entirely possible.
Tiny, undetectable, negligible (1%) errors in a poll or poll average can swing win probability a ton.
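To put a number on that, here's the same toy model with a 1-point shift in the starting point: 48-48 with 4% undecided versus 49-48 with 3% undecided, both well within any poll's margin of error. Everything except the floors is my assumption carried over from the earlier sketch (50/50 undecided split, 0.1% units, ties broken by coin flip).

```python
# A sketch of how a "negligible" 1-point error in the starting point
# moves the win probability. Only the floors change between the two runs.
import random

random.seed(2)

TRIALS = 10_000

def win_probability(d_floor, r_floor):
    # Work in integer units of 0.1% to avoid float comparisons.
    d_base, r_base = 10 * d_floor, 10 * r_floor
    units = 1000 - d_base - r_base       # undecided pool, in 0.1% units
    d_wins = 0
    for _ in range(TRIALS):
        d_units = sum(random.random() < 0.5 for _ in range(units))
        d_final = d_base + d_units
        r_final = r_base + (units - d_units)
        if d_final > r_final:
            d_wins += 1
        elif d_final == r_final:
            d_wins += random.random() < 0.5  # break exact ties
    return d_wins / TRIALS

p_tied = win_probability(48, 48)  # 48-48 floors, 4% undecided
p_lead = win_probability(49, 48)  # 49-48 floors, 3% undecided
print(f"48-48, 4% undecided -> D wins {p_tied:.1%}")
print(f"49-48, 3% undecided -> D wins {p_lead:.1%}")
```

Under these assumptions, a starting point shifted by a single, undetectable point takes the forecast from a coin flip to a heavy favorite.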
Starting to see how many variables a good forecast must account for, without even getting into state-to-state covariance?
For this now-not-so-hypothetical scenario, we will consider a large statewide election, whether Senate or Presidential.
Now, predicting the behavior of millions of humans, even if we could reduce their choices to a binary “D” or “R,” is not an easy task.
But it's not an impossible one, either.
Let's use the 48% +/- 2% as a starting point for our model.
Below, I've run the same simulation as the original - a 50/50 assignment of undecideds - but instead of a known 48% starting point, I assigned an unknown 48% +/- 2% starting point.
There is still a 50/50 chance of D (or R) winning…but the range of possible outcomes is much wider.
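That uncertain-floor version can be sketched like this. Each floor is drawn as 48 +/- 2 (uniform here, purely an illustrative choice; the post doesn't specify a distribution), and whatever remains splits 50/50 as before.

```python
# A sketch of the "uncertain floor" simulation: floors drawn as
# 48 +/- 2 (uniform is my assumption), remainder split 50/50.
import random

random.seed(3)

TRIALS = 10_000

def simulate_margin(known_floor=False):
    """Return the final D-minus-R margin, in points."""
    if known_floor:
        d_floor, r_floor = 48.0, 48.0
    else:
        d_floor = 48.0 + random.uniform(-2, 2)
        r_floor = 48.0 + random.uniform(-2, 2)
    units = round((100 - d_floor - r_floor) * 10)  # undecideds, 0.1% units
    d_units = sum(random.random() < 0.5 for _ in range(units))
    d_final = d_floor + 0.1 * d_units
    r_final = r_floor + 0.1 * (units - d_units)
    return d_final - r_final

known = [simulate_margin(known_floor=True) for _ in range(TRIALS)]
unknown = [simulate_margin() for _ in range(TRIALS)]

def spread(margins):
    """Standard deviation of the simulated margins."""
    mean = sum(margins) / len(margins)
    return (sum((m - mean) ** 2 for m in margins) / len(margins)) ** 0.5

d_win_rate = sum(m > 0 for m in unknown) / TRIALS
print(f"std of margin, known 48-48 floors: {spread(known):.2f} pts")
print(f"std of margin, 48 +/- 2 floors:    {spread(unknown):.2f} pts")
print(f"D win rate with uncertain floors:  {d_win_rate:.1%}")
```

By symmetry the win probability stays near 50/50; what changes is the spread of plausible margins, which roughly triples here.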
This is a simple, intuitive, and true reality of forecasting:
More undecideds = more uncertainty.
More uncertainty = wider range of possible outcomes.
There is no way to look at the 2016 poll data and conclude that Hillary was 99+% to get enough votes to become President, unless:
You assumed states were independent, or
You did not understand how undecideds increase uncertainty and widen the range of outcomes.
Both of those happened to almost all forecasters, in a field with truly smart people. It's baffling.
As it happens, we know (and knew) that there were a lot of undecided voters in 2016.
Hillary Clinton wasn't polling above 47% in any swing state!
The bad forecasters and analysts obsessed (and still obsess) over “margin” and how much she was “up by” because they assumed the poll margin was supposed to be predictive. Oops.
Naturally, most Presidential elections do not have double-digit undecideds (or a substantial third-party presence) but 2016 had both.
That made forecasting both hard and uncertain.
Not understanding those simple facts is yet another reason why anyone who made a “call” is a hack, and 99% was wrong.
For the sake of time, space, and technicalities, let's keep things simple.
If I asked you to give me a probability that both “D” and “R” received AT LEAST 10% of the vote in a random swing state in 2024, you'd probably say “100%”
At least 20%?
At least 30%?
At least 40%?
With a lack of substantial third-party presence - which is almost always the case in the US - both major parties are virtually guaranteed at least 40% in any swing state in their worst foreseeable years, certainly an election as near as 2024.
And you can do this, by the way, with no real data other than the basic understanding that US elections are largely binary, and major party candidates in highly contested states almost never win by a huge amount.
So here, the job of a forecaster starts not with dice, but with loaded dice.
Not only do we have to identify the number of people who will choose “D” and “R,” we also have to identify the number of “IDK.”
Then, we have to figure out how many of that “IDK” will eventually vote (if at all) for which candidate.
It’s a very hard job, as illustrated above even with KNOWN floors of 48%-48%.
But Hillary wasn’t above 47% anywhere. She was polling higher than Trump in most swing states.
That means, even with a higher-than-normal third-party turnout, forecasters (not pollsters) had to account for nearly 10% undecided.
Who would those undecideds favor?
Again, we have the “50/50” vs “55/45” issue.
A good forecast does not simply assign a value and run the simulation, but assigns probabilities to each of those outcomes.
For example:
There’s X% chance undecided voters favor either candidate 50/50 +/- X%
An X% chance undecided voters favor either candidate 55%…and so on.
And a similar process for each candidate’s starting point or “floor.”
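That scenario-weighting process can be sketched as a mixture model. The weights and leans below are hypothetical placeholders (the post deliberately leaves them as “X%”), chosen symmetric so the toy forecast stays near 50/50; the structure, not the numbers, is the point.

```python
# A sketch of assigning probabilities to undecided-lean scenarios,
# rather than picking one value. Weights and leans are hypothetical
# placeholders, not values from the post.
import random

random.seed(4)

TRIALS = 10_000
UNITS = 40  # 4% undecided, in 0.1% units (assumed, as before)

# (probability of scenario, chance an undecided unit goes D)
SCENARIOS = [
    (0.50, 0.50),  # most likely: a true 50/50 split
    (0.25, 0.55),  # modest D lean
    (0.25, 0.45),  # modest R lean
]

def sample_lean():
    """Draw one scenario's lean according to the scenario weights."""
    r = random.random()
    cumulative = 0.0
    for weight, lean in SCENARIOS:
        cumulative += weight
        if r < cumulative:
            return lean
    return SCENARIOS[-1][1]

d_wins = 0
for _ in range(TRIALS):
    p_d = sample_lean()                  # a fresh scenario each trial
    d_units = sum(random.random() < p_d for _ in range(UNITS))
    if 2 * d_units > UNITS:
        d_wins += 1
    elif 2 * d_units == UNITS:
        d_wins += random.random() < 0.5  # break exact ties
mixture_p = d_wins / TRIALS
print(f"D win probability under the mixture: {mixture_p:.1%}")
```

The same layering would apply to each candidate's floor: instead of one assumed starting point, a distribution of starting points, each carrying its own weight.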
Add it all up:
For a forecaster to arrive at Clinton being ~99% to win, they must have concluded there was ~0% chance that Clinton’s floor of support was under the 46%-47% she was polling at (indefensible) AND
Of the larger-than-normal number of undecideds, there was almost no chance that Trump would win more than a small majority (also indefensible) AND
If any of the above unlikely events happened in one swing state, it probably wouldn’t happen in another (oops)
I know why they concluded those things:
Because their models, their words, their analyses are OBSESSED with ‘MARGIN’
They thought that Clinton’s “five-point lead” was bulletproof, because “five” is a big number! FIVE!
How could anyone overcome FIVE?!
FiveThirtyEight even, proudly, released this horrible video obsessing over margin as an explanation for how polls work.
When the experts in a field are spreading misinformation, it’s forgivable for the public to be misinformed. I hope to start on the path to correcting that - both from within the field, and in the public.
Understanding how and why they’re wrong has made me a better forecaster, and it can make you a more informed consumer of data. Or forecaster!