A few weeks back, I explained why “small sample size” dooms any effort to use big data to predict box office performance of feature films. But what about TV shows? What about streaming services? Can’t they use advanced algorithms to predict success there?
Nope.
In “No, Seriously, Why Don’t You Use Data to Make Movies?” I explained, in a “mini-statistics lesson,” how small sample size and multiple variables combine to make forecasting movie performance very inaccurate. Today, I want to take the lessons of that article and apply them to making TV shows in the streaming era.
Here are the key reasons why “big data” can’t solve the problem of making hit TV shows.
1. It’s also a data-poor environment.
To start, TV has long had fewer data points than feature films. Only recently did the number of scripted TV seasons surpass the number of feature films (depending on how you count). Currently, there are over 500 scripted TV series per year in the US. As I wrote last time, that’s still a small sample size.
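To make that concrete, here’s a toy simulation in Python. Every number in it is invented, not industry data: even if a single “quality” signal genuinely drove ratings, roughly 500 noisy observations a year leaves the estimated effect bouncing around a lot.

```python
# Toy illustration (invented numbers, not real industry data): with ~500 shows a
# year and lots of unobserved noise, even a "correct" model is estimated imprecisely.
import numpy as np

rng = np.random.default_rng(0)

n_shows = 500                                 # roughly one year of scripted US series
true_effect = 0.5                             # hypothetical lift from a quality signal
quality = rng.normal(size=n_shows)            # an imaginary "script quality" score
noise = rng.normal(scale=3.0, size=n_shows)   # everything the model can't see
ratings = true_effect * quality + noise

# Re-estimate the effect on many resampled "years" to see how much it jumps around.
estimates = []
for _ in range(1000):
    idx = rng.integers(0, n_shows, n_shows)
    x, y = quality[idx], ratings[idx]
    slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)   # OLS slope on the resample
    estimates.append(slope)

print(f"true effect:      {true_effect}")
print(f"estimated effect: {np.mean(estimates):.2f} ± {np.std(estimates):.2f}")
```

The spread on that estimate is the whole point: one year of TV simply doesn’t pin down much.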
2. It’s even smaller when you factor in returning series.
Most new “seasons” aren’t brand new; they’re returning seasons of TV series that have been on the air for several years. That kills your forecasting model.
Take Game of Thrones season 8. Yes, you could call “season 8” a unique data point to study. But with TV shows, to have an accurate model, you’d need to introduce a categorical variable: “has had a previously successful season.” The answer for Game of Thrones on that categorical variable is “Yes!” In other words, it’s super easy to predict that subsequent seasons of Game of Thrones and The Walking Dead will have high ratings because their previous seasons had high ratings. (Though not always.)
The challenge is predicting successful new shows, and that data set is much, much smaller than the 500 or so scripted seasons produced every year.
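Here’s a minimal sketch, with made-up titles and numbers, of why that distinction matters: once you add the “had a successful prior season” dummy, returning series are trivially easy to call, and the slice you actually need to forecast is much smaller.

```python
# Minimal sketch (made-up data): returning hits are "predicted" by their own history,
# while brand-new shows are the small, hard slice of the yearly total.
import pandas as pd

seasons = pd.DataFrame({
    "title":            ["GoT s8", "TWD s10", "Ozark s3", "New Drama", "New Comedy"],
    "prior_hit_season": [1, 1, 1, 0, 0],   # the categorical variable in question
    "hit":              [1, 1, 1, 0, 1],   # outcome once the season airs
})

# "Prediction" for returning series is mostly just reading their own track record.
returning = seasons[seasons["prior_hit_season"] == 1]
print("returning hit rate:", returning["hit"].mean())

# The forecasting problem that matters is the new-show slice, and it's far smaller.
new_shows = seasons[seasons["prior_hit_season"] == 0]
print("new shows to forecast:", len(new_shows), "of", len(seasons))
```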
3. The number of categorical variables for a TV show at “pitch” is near infinite.
When a TV show is being pitched or is at the script stage, it has a huge number of categorical variables still in flux. Each of these could influence the final dependent variable, which is viewership (depending on whether you’re a network or a streaming platform, you could define this multiple ways).
Everything from the director who ultimately directs the first episode, to the acting talent who signs on, to the story plan for season one could impact the ratings. Even variables most studios don’t care about, like “who is the production manager?” or “can the showrunner manage a room of people?”, are categorical variables that could affect the final outcome. Without a large sample size, it’s just tough to predict anything. (And some of these variables are very, very difficult to quantify.)
Many good or great scripts or TV pitches become bad TV series, for a lot of reasons that have nothing to do with the script. This is why “algorithms” can’t predict things with high confidence. This also applies to feature films, though I didn’t mention it last time.
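A quick back-of-the-envelope sketch of the problem. Every category count below is invented, but it shows how fast the parameters you’d need to estimate outrun the handful of genuinely new shows you can learn from each year.

```python
# Back-of-the-envelope sketch (all counts invented): one-hot encoding even a few
# categorical variables produces far more parameters than there are new shows.
n_new_shows_per_year = 200        # assumed: rough count of genuinely new series

# Hypothetical categorical variables, each with a guess at how many levels it has.
candidate_variables = {
    "director of first episode": 300,
    "lead actor":                500,
    "showrunner":                250,
    "genre":                     12,
    "network / platform":        30,
    "production manager":        200,
    "can showrunner run a room": 2,
}

# One-hot encoding turns each level into its own model parameter.
n_parameters = sum(candidate_variables.values())
print(f"parameters to estimate: {n_parameters}")
print(f"new shows per year:     {n_new_shows_per_year}")
# With far more parameters than observations, no model can be estimated reliably.
```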
4. Most pitches/scripts/pilots will never get made. Hence no “dependent variable”.
Most claims to use advanced metrics or analytics or data to pick TV series utterly discount this key fact. Sure, you get thousands of pitches and scripts to read, but most of them never become TV series. Replace “dependent variable” with “performance” and you see the challenge. You have three scripts, and you pick one to become a pilot. The other two scripts don’t get made into TV shows. So can we use them in our equation to forecast success? No, because they don’t have a dependent variable that would let us use them as data. All you can say is you didn’t make them into TV shows. But that’s not a data point.
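Here’s the same idea as a tiny sketch (hypothetical scripts, hypothetical numbers): the scripts you pass on never get an outcome, so they drop out before any model is ever fit.

```python
# Minimal sketch of the missing-dependent-variable problem: scripts that never get
# made have no performance outcome at all, so they vanish from the training data.
import pandas as pd

scripts = pd.DataFrame({
    "title":      ["Script A", "Script B", "Script C"],
    "made":       [True, False, False],
    "viewership": [3.2, None, None],   # only the produced show ever gets a number
})

# The "training data" is only what survived the greenlight process.
trainable = scripts.dropna(subset=["viewership"])
print(trainable)   # one row; the two passed-on scripts aren't data points
```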
5. Finally, most of the time, you can only control your own decisions.
The best way to control a data-driven process is to own all the data. And for a TV studio or streaming service, that means understanding all the decisions that went into making a TV show. If you don’t make a show yourself, you can’t really understand what decisions were made. So for a streaming service, that “n” is very, very small.
So let’s use Netflix as an example. They’ve made, what, eighty TV shows to date? (Not counting the international productions, which again are their own categorical variables.) So the maximum for their sample size is eighty. Break it down even further by separating kids shows from adult shows and previous IP versus new IP, and then you can break it down by genre. You see where I am going with this. The “n” is dwindling rapidly.
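A rough sketch of that dwindling, with invented proportions and an assumed ceiling of eighty comparable originals: slice those shows the way a greenlight decision is actually framed, and each cell holds only a handful of titles.

```python
# Rough sketch (invented proportions): ~80 originals split by audience, IP status,
# and genre leave only a few shows per comparable cell.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 80  # assumed ceiling on comparable originals

shows = pd.DataFrame({
    "audience": rng.choice(["kids", "adult"], size=n, p=[0.25, 0.75]),
    "ip":       rng.choice(["existing IP", "new IP"], size=n),
    "genre":    rng.choice(["drama", "comedy", "crime", "sci-fi"], size=n),
})

# Each comparable cell ends up with only a handful of shows to learn from.
print(shows.groupby(["audience", "ip", "genre"]).size())
```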
What about all the customer viewing data they have from the TV shows on their platform? Well, it doesn’t give Netflix that much of an advantage over traditional networks. Even if traditional networks don’t have Netflix streaming data specifically, they have Nielsen TV viewing data and box office data. Netflix uses that data too.
Which isn’t to say Netflix doesn’t have tons of data and doesn’t use it a lot. But they don’t use it to “pick TV shows”; they use it broadly. The “data analysis” Netflix does is pretty simple: it sees what is popular with its user base. So do traditional TV networks and studios. And what has Netflix learned? People like broad-based comedies and dramas featuring crime and/or police. It also knows some people like quirky comedies and others like arty shows. (Netflix’s key advantage is that it just pays a lot more for the same shows, with less opportunity to monetize. That’s a problem for another post.)
So is Netflix “using data” to decide on TV shows? Yes, but it isn’t that much better than the rest of the industry. Do they have an algorithm that tells them which shows will do well on their platform? Yes, but it is wrong a lot of the time.