Nate Silver Wasn’t Wrong…We Were: What Hollywood Can Learn from the Election – Most Important Story of the Week 6-Nov-20

Let’s not kid ourselves. This week was about one story. Everyone in America–and around the globe–was watching for the results of Tuesday’s election. I didn’t get any work done on Tuesday or Wednesday because I was distracted by following the news.

So it’s our story of the week!

(As always, sign up for my newsletter to get all my writings and my favorite entertainment business picks from the last 2 weeks or so.)

Most Important Story of the Week – What an Electoral Polling Mistake Says About Decision-Making

I won’t discuss the impact of the election from a political perspective. That is for political pundits, which I am not.

If I’m anything, though, I’m a Nate Silver fanboy. Silver has been incredibly influential for this site. I regularly cite him and his work. I use the phrase “signal and the noise” a lot. And I try to build forecasting models for entertainment. That’s my political affiliation: data truther.

Which is why the narrative bothers me. Imagine this scenario.

Pennsylvania’s legislature allows votes to be counted several days before the election, whereas Florida votes to end early counting. 

Then, on election night Joe Biden is declared the winner in Pennsylvania. 

Would the narrative have been different? Heck yes.

(To be clear, while Biden leads in Pennsylvania, the race hasn’t been called by networks or the AP nor have final votes been tallied, the latter being the true “count”.)

But the “narrative” is not reality. The narrative is the collective chattering of talking heads, social media conversation, and news coverage. Reality is reality. See how easily the narrative changes if just two states count their votes differently?

On top of the narrative are the “expectations” that came into the election. These were set, overwhelmingly, by Nate Silver and his 538 model. His model is based on the polls, which, as in 2016, had an error that underestimated Donald Trump. (We don’t know the true magnitude yet, which I’ll get to.) This set up a lot of optimistic expectations for Democrats. That, in turn, set up another narrative: that Silver screwed up his models, that polls are irrevocably broken, or both. Even more than the results, this could be the narrative driving Twitter, “How did 538 lead us so astray?!?!?”

Let me provide just one example from my experience. In my previous role at a streamer, I gathered all the data to help the key decision-makers decide what shows to order and renew. Yet, the data wasn’t mine alone. Often, executives wanted the data immediately. Meaning a streaming show premieres on a Friday morning, and the executives wanted email updates for how the show was performing. Sometimes hourly! Several times, a show would start slow, for whatever reason, and finish strong. Or vice versa. But executives checking every hour would often use their first impression as the takeaway for how the show did.

In other words, at a big American company I routinely saw the same mistakes now being made with this election.

If you’re a decision-maker, your goal is to focus on reality, not the narrative. And where they diverge, you take advantage. If you’re an investor, you invest to make money when the narrative clashes with reality. If you’re in business, you build a competitive strategy off it. And if you’re in politics, you win future elections.

In general, the biggest “error” was not from the polling. The biggest error was how we consumed election night information. (And I’m guilty here too.) Understanding that will help us all–and especially business leaders–make better decisions.

What Nate Silver’s 538 Is Trying to Do (And Other Modelers)

Before we can understand what went wrong, we have to understand what both polls and models of polls are trying to do.

The goal of a poll is to forecast the feelings, opinions or thoughts of potential voters for upcoming elections. It’s a survey! Of course, the goal of this survey is to be accurate. If you could, you’d survey everyone in America. But that would be really expensive! The compromise is to survey a sample and draw conclusions from it. 

The challenge is how pollsters gather that sample. If their sample is biased, the survey won’t be accurate. That is why pollsters get paid the big bucks. Eh, “big” is too strong. Let’s say “some bucks”. Surveys are easy to do, but really hard to do well.

In America, recent technological developments have weakened the ability to survey various populations. Specifically, the rise of cell phones with caller ID has decreased the number of respondents who talk to pollsters. Unlike the past when landlines were ubiquitous, many homes do not have a landline. The potential replacement–internet polls–comes with its own sampling problems.

The solution is to adjust the sample population and weight it by various demographic categories, like age, gender, location, income, past voting history and now educational attainment. Various pollsters use different methods to adjust these results and have done this for decades. Yet, this introduces its own uncertainty. It means that polls are models of what pollsters anticipate the electorate to look like in a given election.
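To make the weighting idea concrete, here is a minimal sketch. Every number in it is made up for illustration (the group names, sample shares, electorate shares and support levels are my assumptions, not any pollster's data), and real pollsters weight on many more dimensions at once.

```python
# Hypothetical numbers for illustration only; not from any real poll.
# sample_share:     how much of the raw sample each group makes up
# electorate_share: the pollster's ASSUMED share of the actual electorate
# support_a:        candidate A's support within the group
groups = {
    "college":     {"sample_share": 0.60, "electorate_share": 0.40, "support_a": 0.58},
    "non_college": {"sample_share": 0.40, "electorate_share": 0.60, "support_a": 0.44},
}

def raw_estimate(groups):
    """Support for A if we take the sample at face value."""
    return sum(g["sample_share"] * g["support_a"] for g in groups.values())

def weighted_estimate(groups):
    """Support for A after reweighting each group to its assumed electorate share."""
    return sum(g["electorate_share"] * g["support_a"] for g in groups.values())

raw = raw_estimate(groups)
weighted = weighted_estimate(groups)
print(f"Raw sample estimate: {raw:.1%}")
print(f"Weighted estimate:   {weighted:.1%}")
```

With these invented inputs the raw sample puts candidate A at 52.4%, while the weighted estimate is 49.6%. Same respondents, different answer, and the difference comes entirely from the pollster's guess about who shows up.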

This brings us to Nate Silver. He makes a model based on polls. Or a “poll of polls”. But since the polls are models, really he makes a “model of models”. The logic is that the average of multiple data points will be more accurate than picking any individual poll by itself. (He’s right here, by a long shot.) And his model crunches tons of additional data and historical evidence to make it as representative of potential outcomes as possible. 

Yet, if all the polls Nate Silver uses have the same correlated error, his model won’t be accurate.

In other words, garbage in, garbage out. But garbage is too strong. So “slightly biased data in, slightly biased data out”. (As it stands, his model correctly predicted 47 of 50 states, but we have to wait to find the margin of error, which is more important.) A model is only as good as the data going into it.
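Here is a rough sketch of why averaging helps with one kind of error but not the other. It assumes a deliberately simple setup of my own (each poll equals the truth, plus a bias shared by every poll, plus independent noise); this is not how 538's model works, and the 4-point noise and 3-point bias are hypothetical.

```python
import math

def rmse_of_average(n_polls, noise_sd, shared_bias):
    """Root-mean-square error of the average of n polls.

    Assumes each poll = truth + shared_bias + independent noise (sd = noise_sd).
    Independent noise shrinks with 1/sqrt(n); the shared bias does not shrink.
    """
    return math.sqrt(shared_bias ** 2 + noise_sd ** 2 / n_polls)

for n in (1, 4, 16, 64):
    independent_only = rmse_of_average(n, noise_sd=4.0, shared_bias=0.0)
    with_shared_bias = rmse_of_average(n, noise_sd=4.0, shared_bias=3.0)
    print(f"{n:>3} polls: no shared bias {independent_only:.2f} pts, "
          f"3-pt shared bias {with_shared_bias:.2f} pts")
```

Under these assumptions, the error of the average keeps falling toward zero when the polls miss independently, but it flattens out just above 3 points when every poll shares the same miss. More polls can't fix an error they all have in common.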

The Models Weren’t Wrong…We Were

Yet, the most misleading thing about the election wasn’t polling errors. The bigger mistakes are still being made.

  • A lot of folks saw the results on Tuesday night, and then rushed out to provide their takes.
  • Worse, even more folks consumed a lot of data about the election on election night. And then they stopped. (Or they will stop when the election is called.)

If you’re a voracious news consumer, you’re actually more at risk of this. For example, can you tell me what the polling error was in California in 2016? Did you know that polling actually underestimated Democrats in that election? (If you listen to Nate Silver’s 538 podcast, then yeah, you probably heard him say this, which is where I got it from.)

Yet, because California wasn’t a swing state and folks didn’t check in for final results (which take weeks, unfortunately, to get) most folks never internalized this lesson. They only internalized the miss in Rust Belt swing states. In other words, most folks were not properly informed about what happened politically in 2016 if they focused on election night. Meaning if they had to draw conclusions from 2016, they were more likely to draw the wrong ones.

Let’s explain what these decision-making errors are (via their logical fallacies if possible) to correct these mistakes.

Using biased samples/Drawing conclusions too early

This is perhaps the biggest problem with drawing any conclusions from the election:

We don’t have all the data in!

As I write this, California only has about 74% of its vote counted. Many other states are like this as well, and states are still certifying their results. Frankly, you can’t draw conclusions until you have a complete data set, otherwise you risk a biased sample.

Which is really ironic, isn’t it? 

The problem with the polls is they have some correlated error which makes them biased…and we judged that on election night with a biased sample!

Specifically, many urban centers are very slow in counting ballots since they have orders of magnitude more votes to count. Yet, that definitionally makes conclusions biased towards rural communities. This is definitional bias in the “poll” of current vote tallies.
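A toy example shows how the running tally can mislead. The precincts, vote totals and counting times below are invented for illustration, and I'm assuming each area reports all its votes at once, which real counties don't do.

```python
# Hypothetical precincts for illustration; the vote totals are made up.
# "hours_to_count" stands in for how long each area takes to report.
precincts = [
    {"name": "rural_1", "votes_a": 3_000,  "votes_b": 7_000,  "hours_to_count": 2},
    {"name": "rural_2", "votes_a": 4_000,  "votes_b": 6_000,  "hours_to_count": 3},
    {"name": "suburb",  "votes_a": 25_000, "votes_b": 25_000, "hours_to_count": 8},
    {"name": "city",    "votes_a": 70_000, "votes_b": 30_000, "hours_to_count": 30},
]

def share_a_at(hour, precincts):
    """Candidate A's share of the votes that have been reported by `hour`."""
    reported = [p for p in precincts if p["hours_to_count"] <= hour]
    votes_a = sum(p["votes_a"] for p in reported)
    total = sum(p["votes_a"] + p["votes_b"] for p in reported)
    return votes_a / total if total else None

for hour in (4, 12, 48):
    share = share_a_at(hour, precincts)
    print(f"After {hour:>2} hours, A's share of counted votes: {share:.1%}")
```

In this made-up state, candidate A sits at 35.0% after four hours, 45.7% after twelve, and finishes at 60.0%. Nobody changed a vote. The only thing that changed was which votes had been counted.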

This happens outside of elections. For example, folks evaluate a feature film’s performance on its opening weekend box office. Which is pretty correlated with final performance, as I’ve written. But the two-week box office numbers are even more correlated with performance. If you wait two weekends, in other words, you can have a more accurate read on box office numbers.

The “Temporal data fallacy”/Drawing narratives from sequential data drops.

Obviously, the order of revealing the data shouldn’t impact our conclusions from the data as a whole. What matters are the final results. I’ve taken to calling this the “Temporal Data Fallacy”:

The “Temporal Data Fallacy” is drawing a narrative based on the sequential release of data, when the timing of release is uncorrelated with the outcome.

And it doesn’t just happen for elections. In sports, we often weigh what happens at the end of a game much more strongly than what happens in the middle. But a missed basket in the second quarter impacts the outcome just as much as a missed basket at the end, for example.

Availability heuristic/Rare events are easier to recall than common ones.

If you watch all of election night, but have to go back to work the next two nights, then when you recall the election later, you focus on events in the moment, but not the outcomes that happen later. Moreover, the stronger the emotions you feel (like despair at losing), the more vividly you recall those events. Which is why Biden could wind up winning with a greater margin than George Bush in 2004, yet it will feel like he lost because he lost Florida early on Tuesday night.

Folks like to mention Capital in the 21st Century as the most common book that is purchased but not read by intellectuals. I’d offer Thinking, Fast and Slow by Daniel Kahneman as the book that was most read but least applied. We all know the availability heuristic is a thing, but we still are walloped by it on a regular basis.

The curse of small sample sizes/Overconfidence in results.

Elections are a pretty small sample. Which Nate Silver repeatedly tries to tell us, but we usually forget. (They only occur every two years, and Presidential elections every four.)

That’s why his model had everything from a close Biden win to a big Biden blow out in its range of outcomes. With small sample sizes comes greater uncertainty. While Silver has tremendous uncertainty in his model, most folks only focus on the average outcome.
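To put a rough number on “small samples mean more uncertainty”, here is a sketch using the textbook normal-approximation interval for a proportion. That approximation is crude for very small samples, and the sample sizes are just illustrative, so treat this as a back-of-the-envelope picture rather than anything Silver actually does.

```python
import math

def half_width_95(n, p=0.5):
    """Approximate 95% interval half-width for an observed proportion p.

    Uses the normal approximation 1.96 * sqrt(p * (1 - p) / n), which is
    rough for small n but good enough to show how uncertainty scales.
    """
    return 1.96 * math.sqrt(p * (1 - p) / n)

for n in (5, 20, 100, 1000):
    print(f"n = {n:>4}: plus or minus {half_width_95(n) * 100:.1f} points")
```

With five observations the interval is roughly 44 points wide in each direction; with a thousand it is about 3. A handful of elections simply can’t tell you much about how good a forecaster is.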

Valuing Process over Results/Expectations

This also relates to the previous problem, which is the focus on results over process. Maybe this is philosophical, but I’d rather be wrong for the right reasons, than right for the wrong reasons. The former means I’m still making accurate predictions; the latter means I don’t know. In Nate Silver’s case, his model only gets tested once every four years. Sure, sometimes he’s going to miss, but the value is in the model, not the results.

Meanwhile, we often only care about results in terms of the expectations. Thus, Biden will likely win, but since folks thought Democrats would win the Senate too, it feels like a loss. (Winning the Senate, House and Presidency had about the same odds as Hillary winning in 2016, 70-77% in 538’s model.) But the Democrats will likely win an election against an incumbent, which is really, really, really hard to do! That’s worth celebrating, though folks won’t. 

This happens in entertainment even more often. Say two films have the same budget. One is expected to gross $300 million and gets $280 million. The other is expected to gross $100 million and gets $180 million. Sometimes the narrative praises the latter film, even though it made less money. But it beat expectations.
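The arithmetic on those two films is worth spelling out. This is a small sketch with placeholder titles (“Film A” and “Film B” are my labels), using the grosses from the example in millions of dollars.

```python
# Figures in millions of US dollars, taken from the two-film example.
films = [
    {"title": "Film A", "expected_musd": 300, "actual_musd": 280},
    {"title": "Film B", "expected_musd": 100, "actual_musd": 180},
]

def vs_expectation(film):
    """Fractional over- or under-performance relative to the expected gross."""
    return film["actual_musd"] / film["expected_musd"] - 1

for film in films:
    print(f'{film["title"]}: grossed ${film["actual_musd"]}M, '
          f"{vs_expectation(film):+.1%} vs. expectations")

reality_winner = max(films, key=lambda f: f["actual_musd"])
narrative_winner = max(films, key=vs_expectation)
print("Made the most money:  ", reality_winner["title"])
print("Beat expectations most:", narrative_winner["title"])
```

Film A misses expectations by about 7% and Film B beats them by 80%, yet Film A still grossed $100 million more. Which one “won” depends entirely on whether you rank by reality or by the narrative.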

My Advice

So I have recommendations. To make you slightly better at analyzing data in your everyday life and professional role:

– Get rid of dashboards. Dashboards are the election night of data. They take a stream of data in and folks can check them whenever. Even if they don’t need the data or the data is wrong or a decision doesn’t need to be made.

– Determine what numbers are your signal, what numbers are noise, and don’t check the latter. Checking data that isn’t tied to key outcomes will only jumble the narrative and pollute your thinking.

– Check data less often. And check later. This is the hardest thing in the world for decision-makers. A new TV show comes out, so everyone wants to see the ratings. But if you don’t have a reason to check the data–and reason means a decision to make–then checking the data could mean you’re absorbing misleading data. Which the availability heuristic shows will be tough to forget later. 

– Have a “data” plan for your company. (And a communication plan while you’re at it.) This plan should explain what numbers you value and when you check them. And that should be tied to decisions.

– Lower or raise your threshold for statistical significance. One of the crazy parts of statistical analysis is how much we still rely on the 5% threshold for statistical significance. This is an artifact of pre-computer calculations. But for some measurements, we need more confidence, and for others we actually need less. You should analyze your data with this in mind.

– Ignore most headlines with statistics you haven’t tracked. And please don’t repeat them if you don’t know how they were calculated. Those are likely datecdotes.
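On the point about significance thresholds, here is a minimal sketch of how the same result can clear one bar and miss another. The z-score of 1.7 and the three thresholds are hypothetical choices of mine, and the p-value uses a plain two-sided normal test.

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for a z-score under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

def clears_threshold(z, alpha):
    """Return True if the result is significant at the chosen threshold."""
    return two_sided_p_value(z) < alpha

z = 1.7  # hypothetical test statistic from some experiment
print(f"p-value: {two_sided_p_value(z):.3f}")

thresholds = [
    ("cheap, reversible decision", 0.20),
    ("conventional default", 0.05),
    ("expensive, hard-to-undo decision", 0.01),
]
for label, alpha in thresholds:
    print(f"{label:<34} alpha={alpha:.2f} -> act on it? {clears_threshold(z, alpha)}")
```

The p-value here is about 0.09, so the result passes the loose 20% bar and fails both the 5% and 1% bars. The data didn’t change; only the amount of confidence the decision demands did.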

Entertainment Implications/Entertainment Strategy Guy Updates to Old Ideas

Entertainment is Filled with Small Sample Sizes

If you take away nothing else, remember this. Much data in entertainment tends to be annual, and that means your sample size is only as big as the number of years in your data set. In other words, when drawing conclusions, be careful about overconfidence.

(Think box office year over year. Most folks will willingly tout all sorts of reasons for why the box office declined or rose year-over-year, but most likely it’s just statistical noise.)

Surveying Customers Is Still Valuable

One result of the election is folks questioning all political polling. Or asking if this is the end of quantitative data. It isn’t.

In general, I’m a fan of more data: surveys, polling, quantitative, behavioral, even focus groups! They all have a role.

The key is finding which data matters and when it matters. But will “qualitative” data replace quantitative? Hardly. Surveys will still be better than relying on anecdata or datecdotes. 

(In TV in particular, if you get rid of ratings, and rely only on making TV shows using “qualitative” data, that could mean making TV for folks like you. Since you aren’t a representative sample, this is a bad decision.)

The Presidential Race is Logarithmically Popular

My favorite chart returns! The spending/awareness for Presidential races is orders of magnitude larger than for dozens of senate races, hundreds of house races and thousands of state legislature races. That’s logarithmically distributed!

Streaming Analytics Firms Have Polling Error…But What Is It?

This is a lens I plan to analyze all the streaming analytics companies through, some day. Some firms have potentially biased samples (the users of their service are not representative), others have limited sampling (potential bias), others are limited by their own data (streamers know exactly how many folks watch their content, but not other streamers’ data) and some firms have models with unknown weighting (so you can’t judge the process).

Given they are the best data we have, I will use their data. Heavily. But I’m aware of its limitations, which lots of news coverage doesn’t seem to be.