Hypothetical question: in any given week, do more people in America watch CBS or Netflix?
Think about it for a moment, but you know why I’m asking: the firing of Les Moonves is the most important story in entertainment. Absolutely for last week, definitely for the month and in competition for the year. I almost put some other articles that were overwhelmed by the news cycle in today, but the Moonves/CBS thoughts went long. Tune in Friday for those other ideas.
The Most Important Story of the Week – Les Moonves is fired/removed as CEO of CBS Corporation
Take that question I asked at the start. My guess is that many folks who live on the “coasts” would say Netflix. Many twenty-somethings and thirty-somethings would say Netflix. (I won’t use that term to describe them.) Heck, many people in entertainment & media would say Netflix, especially if they themselves cut the cord. I wish I knew the answer, but I don’t.
Here’s a bad data approach to comparing CBS and Netflix, one I expect anyone answering Netflix would use. Grab CBS’ highest rated show, The Big Bang Theory, and note that it had 18.6 million viewers. Then note that Netflix has about 55 million plus subscribers in the US. Since 55 is greater than 18, Netflix wins!
If only it were so simple. That comparison isn’t “apples-to-apples” (my explanation of that term here). Netflix only releases subscribers. CBS only has TV ratings. The comparison above is subscribers to highest rated show, and logically the highest rated show is only a subset of all subscribers.
We don’t know Netflix’s highest rated show, so we can’t continue our comparison that way. But we do know CBS’ subscribers. Since it is a broadcast channel featured in nearly every cable package, if we know the universe of TV viewing homes (via cable, broadband, satellite or over the air), we know its subscribers. That’s a number of something like 95-100 million households. (Note: I’m not counting Showtime or CBS All-Access subscribers either, since I don’t know the crossover.) Thus, the question hinges on the number of weekly viewers as a percentage of total subscribers. If 100% of Netflix subscribers watch every week, then CBS needs only 55% to 58% of its potential audience to tune in. In other words, CBS has a huge head start in this hypothetical.
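To make that head start concrete, here is a minimal sketch of the arithmetic. It uses only the rough figures quoted above (55 million Netflix subscribers, 95-100 million TV households); the function name and the alternative Netflix weekly-usage rates are my own assumptions for illustration, and it glosses over the fact that one number counts subscribers and the other counts households.

```python
# Rough figures from the text above; neither is a measured weekly audience.
NETFLIX_SUBS_M = 55        # US Netflix subscribers, in millions
CBS_REACH_LOW_M = 95       # TV households that can receive CBS, low estimate
CBS_REACH_HIGH_M = 100     # same, high estimate

def required_cbs_share(netflix_weekly_share, cbs_reach_m, netflix_subs_m=NETFLIX_SUBS_M):
    """Share of CBS's potential audience needed to match Netflix's weekly viewers."""
    netflix_weekly_viewers_m = netflix_subs_m * netflix_weekly_share
    return netflix_weekly_viewers_m / cbs_reach_m

# The 80% and 60% usage rates are hypothetical, only there to show the sensitivity.
for weekly_share in (1.0, 0.8, 0.6):
    low = required_cbs_share(weekly_share, CBS_REACH_HIGH_M)
    high = required_cbs_share(weekly_share, CBS_REACH_LOW_M)
    print(f"Netflix weekly usage {weekly_share:.0%}: CBS needs {low:.0%} to {high:.0%} of its reach")
```

The lower Netflix's real weekly usage turns out to be, the smaller the slice of its reach CBS needs.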
If I had to bet, I’d bet on CBS. And that quiet difference between the perception of Netflix and the performance of CBS says a lot about the entertainment industry as we head deeper into the 21st century. Netflix may be the future, but CBS made a lot of money the last few decades as the number one broadcast channel.
Before I go on to Les Moonves’ tenure as head of CBS, let’s provide a couple of caveats.
Caveat 1: This is my “gut” thinking.
I would really like to dig in deeper to the numbers behind Moonves’ tenure at CBS. But even something like CBS financial performance isn’t something I’ve studied in-depth. So as a reminder, this is my “gut” thinking as opposed to an analysis. (See the explanation for the difference here.) I’ll pull some data and links, but not a full-blown analysis.
Caveat 2: This is from a business/strategy perspective.
Following the financial crisis, the big question for business schools was ethics. Should/does business have any? Being a card-carrying liberal, in that tradition, I think it should. Conservatives in the religious sense should tend to agree. Only soulless free-marketeers would disagree. That said, today I’m writing about Moonves’ impact on the entertainment industry, and that means evaluating his performance while largely ignoring the ethical and social implications of the #MeToo movement. More importantly, others have written about it better, and my goal isn’t to echo good ideas, but to create new ones.
Regarding Moonves in particular, if a CEO commits unethical and/or illegal behavior, he needs to be fired without pay. A strong, independent board of directors should facilitate that based on a fair reading of the evidence. It’s pretty clear where the evidence led in this situation so despite his success, he shouldn’t be at CBS anymore.
So my conclusions/thoughts/predictions:
1. CBS was hugely popular in unpopular ways.
If you created a word cloud to describe CBS in the popular perception of Hollywood, you’d get something along the lines of…
…middle to lower class
Here’s the thing: that’s the perception, but is it the reality? Was CBS successful in middle America? Sure, but its shows were still among the most popular shows in LA and New York. Was CBS only successful among lower class viewers? Maybe it over-indexed there, but you’d be surprised how many wealthy people watch CSI or NCIS or Blue Bloods. Were its shows popular and hence lowbrow? I think this is fair in that critics couldn’t wait to pan most of CBS’ shows, but The Big Bang Theory is an awards juggernaut. (Which is a conundrum. You can’t call the Emmy voters out of touch when they vote for The Big Bang Theory or Modern Family, then praise them for awarding Transparent or Veep.)
Les Moonves’ success as a TV executive mostly went unremarked by TV critics. Or at least it wasn’t a buzzy topic. Being popular tends to make you “unpopular” with critics, so CBS was generally not buzzed about. Or at least less buzzed about compared to the coverage of Netflix, Hulu or even Amazon Studios/Prime/Video during awards seasons.
CBS appealed to the masses and by doing so it managed to be popular with just about every group. Not everyone, but every group. Since The Big Bang Theory and the NCIS/CSI families were super popular, they were popular across almost every demographic, geographic and social category you could find.
Why don’t we know this as a business community? Well that will take some time to explain, but to summarize, poor segmentation driven by “over-indexing” means that the entertainment business tends to stereotype certain networks/companies.
(This little section inspired two future articles for me: 1. “Indexing…Explained!” and 2. An analysis of CBS to find the data or lack of data supporting the CBS stereotypes.)
2. Les Moonves really was a hit maker.
The watchers of the media world weren’t focusing on CBS, so it kept accumulating viewers even if it wasn’t accumulating reams of Emmy and Golden Globe awards. This is a very “gut” statement, and I hope to do the analysis on it, but it seems like every season CBS trotted out successful new shows to replace the ones leaving, across drama, comedy and reality. If we charted out all the successful broadcast/cable shows in the 2000s and 2010s, we’d see that Moonves/CBS shows would take an outsized portion of the top 20% of shows. Given the logarithmic distribution of returns (so excited to use that already!), that means he had an outsized impact in creating hugely popular shows.
That’s why I call him a hit maker. Moreover, I didn’t realize until this week he was at Warner Bros. television during the dawn of Friends and ER. He really did seem to have the talent to make hit TV shows. Or at least identify those people who could make hit TV shows.
(This section is just the tip of an iceberg for a third article I’m writing, this one on “development executives” and how many hit makers there truly are. It’ll be fun and super controversial.)
The only caution to all this is the idea of “network effects”, which isn’t quite the right word, but close enough. Network effects are when a business that has a network gains additional benefits as the network grows in size. Facebook or Amazon Marketplace are the best examples; if everyone is on Facebook, you benefit more from joining the social network; if everyone is selling on Amazon, customers go there more often to buy things.
In TV, though, “owned-and-operated” media is the one place where size can beget size. So if you’re launching a new TV show, would you rather have Comedy Central advertising it to its hundreds of thousands of viewers, or CBS advertising it to the millions of The Big Bang Theory viewers? You want to be on the latter, which means it can be easier to launch new TV shows. CBS definitely benefited from this effect, but it can’t explain all of CBS’ sheer dominance.
3. This makes the CBS/Viacom merger more likely.
Pretty simply, the CBS board would have backed Moonves against the Redstones in the takeover. Now I can’t see that happening.
4. And I would agree with this merger from CBS’ perspective.
Clearly, I hate industry consolidation. But I hate it because it is a prisoner’s dilemma: if everyone else is consolidating (instead of growing by adding value), then everyone has to consolidate. If you don’t, as your competitors grow, they can use size as a weapon to negotiate with buyers, suppliers and customers. That’s bad for customers, and for the remaining small firms.
An independent CBS would have been fine with Moonves. Probably. Then, CBS could have bulked itself up to prepare for the impending “streaming wars”. Without Moonves, CBS really risks becoming an also-ran. Moonves was a hit maker and I have no guarantees his successors (whether in the CEO role or as head of development) will have the same ability. If his successors are just average, over time the network effects will wane and CBS will go away. Instead, I’d recommend that CBS join Viacom and let size help them negotiate.
5. Who will step into the CBS void?
I don’t know.
It doesn’t have to be another broadcast channel. It could be, but no network has shown they have a reliable hit maker. It could be a cable channel, but again, no obvious example jumps to mind. (If this were the mid-2000s, I’d have bet on USA, but they’ve underperformed compared to their 2000s performance.) It could be a streaming service or premium cable, but only Netflix has flirted with popular programming for popularity’s sake. The downside with Netflix is that their hit rate could be the lowest in the industry, which is the opposite of “hit making”.
Or no one. There isn’t a law that one channel/platform has to rise up and achieve dominance. But if a streaming/cable/broadcast platform wanted to seize TV market share, now is the time. If you have a hit maker, you could take CBS’ place.
The challenge is the development execs. In Hollywood, it’s sexy to make award winning, buzzy, prestige shows for peak TV. TNT, FX, Netflix, Starz, HBO, Amazon, Hulu, NBC and USA/Syfy have all dabbled or tried to pursue this strategy. It isn’t sexy to make cop shows. It isn’t sexy to make multi-cam sitcoms. They can make lots of money, though.
If you run a content company, do you have development executives willing to risk industry scorn for making popular shows that don’t appeal to critics? A lot of money and market share can be won by making popular things.
Long Reads of the Week – Other Good Reads on Les Moonves’ Exit
I enjoyed a few reads this week.
To start, listen to the emergency banter with Kim Masters and Matt Belloni of The Hollywood Reporter and KCRW on The Business. Always worth a check in, especially their banter this week on the reporting on the CBS board.
Next, The Ankler’s Richard Rushfield published in Vanity Fair about Hollywood protecting its powerful men, with the flair he usually writes with.
I said others wrote better about the larger #MeToo explanations, and I’d point to Todd VanDerWerff at Vox (my go-to general news site right now) as the best example.
Finally, I’d point out Joe Adalian’s piece for the “CBS will be fine” narrative that I sort of challenge above. That said, I’m simplifying Adalian’s point. He has faith that CBS is the work of a few people, many of whom could have absorbed Moonves’ style.
My dad didn’t like the ending of Empire Strikes Back. He felt that it didn’t finish the story; it left off with a “See you next movie!” conclusion. That irritated him. He hasn’t seen Avengers: Infinity War yet, so you know he won’t like that.
My article yesterday probably did sort of the same thing to the audience. I come up with this big conclusion—the logarithmic distribution—but then barely touch on it.
Well, since we’re already talking about the movies, we might as well use that as the ur-example of my magic trick, “Logarithmically distributed returns”. I first learned this law by analyzing movie performance, and it’s my best tool for teaching it to others. But I’m not just going to show you this phenomenon, I’m going to show you it multiple ways, in multiple categories. Then I’ll explain the biggest statistical mistake I’ve seen when forecasting box office performance.
Logarithmically Distributed Returns…What is it?
Let’s start with the last word. What I’m describing today is the “output” of most entertainment or media processes. So my examples are about the “result” or the “y-value” or the “dependent variable”, to describe it in three different statistical terms.
In other words, performance. This means how well something does. Box office for movies. Ratings for TV. Sales for music. Attendance for theme parks. No matter what the format, the success (or very frequent failure) is logarithmically distributed.
What does logarithmically distributed mean? Essentially, orders of magnitude. The returns don’t grow on a linear scale, they grow on an exponential scale. This means that the highest example can be in the billions while the smallest can be in the dollars. That’s a difference in magnitude of 9 zeroes.
The most common summation of this is the “Pareto principle”, named for the economist Vilfredo Pareto, which describes a “power law” distribution. Roughly speaking, Pareto is summarized by the 80-20 rule, or 20 percent of the inputs deliver 80 percent of the returns. And like any mathematics/statistics topic, there are obviously a ton of variations on this law and specifics that I’m not going to get into.
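If you want to see the 80-20 idea in numbers, here is a minimal sketch on synthetic data. Nothing in it comes from real box office figures: the shape parameter of 1.16 is my assumption (it is the Pareto shape usually associated with an 80-20 split), and `top_share` is a hypothetical helper name.

```python
import random

random.seed(2017)   # fixed seed so the sketch is repeatable
ALPHA = 1.16        # assumed Pareto shape, commonly tied to the 80-20 rule

# Synthetic "returns" for 10,000 imaginary titles, sorted from biggest to smallest.
returns = sorted((random.paretovariate(ALPHA) for _ in range(10_000)), reverse=True)

def top_share(values, fraction=0.20):
    """Share of the total delivered by the top `fraction` of values (sorted descending)."""
    cutoff = int(len(values) * fraction)
    return sum(values[:cutoff]) / sum(values)

share = top_share(returns)
print(f"Top 20% of titles deliver {share:.0%} of total returns")
```

The exact share moves around from sample to sample because a handful of giant values dominate the total, which is itself the point of the exercise.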
(For those who are curious, inputs have their own distributions, but aren’t as reliably distributed as outputs. A topic for the future.)
Logarithmically Distributed Returns Visualized: Feature Films in 2017
Enough talk about what it is, let’s use an example. I went to Box Office Mojo and pulled all the films from 2017 that grossed greater than $0 in theaters. I didn’t adjust for year and pulled everything, no matter how small. The result was 740 movies released. Oh, and I only pulled domestic gross.
I’m going to show you the data two ways to help you visualize it. First, is the less accurate way, but I love it because it shows scale. This is all 740 movies plotted from lowest to highest, with the y-value as the domestic gross in dollars.
Source: Box Office Mojo.
I love how smooth the curve looks. But the true measure of the data is the “histogram”, where you count the number of examples per category. I set up the categories myself at $25 million intervals, starting from zero.
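For anyone who wants to build the same kind of histogram, here is a rough sketch of the binning step. The six grosses are made up for illustration and are not the Box Office Mojo data; `histogram` and `BIN_WIDTH` are hypothetical names.

```python
from collections import Counter

BIN_WIDTH = 25_000_000  # $25 million intervals, starting from zero

def histogram(grosses, width=BIN_WIDTH):
    """Count films per bin; each key is the lower edge of the bin in dollars."""
    return Counter((int(gross) // width) * width for gross in grosses)

# Made-up domestic grosses, only to show the mechanics.
sample_grosses = [40_000, 500_000, 9_000_000, 30_000_000, 110_000_000, 620_000_000]
counts = histogram(sample_grosses)

for lower in sorted(counts):
    upper = lower + BIN_WIDTH
    print(f"${lower / 1e6:.0f}M to ${upper / 1e6:.0f}M: {counts[lower]} film(s)")
```

Even in this toy list, half the films land in the first bin and one sits alone far out on the right.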
Source: Box Office Mojo.
Most people don’t realize how many films are written, produced and even released every year. Like I said, last year was over 700. So let’s add a threshold of $1 million at the box office to our list. If I had production budget estimates, I’d sort by that, but the result gets you to the same place. (The reason for using production budget is that when you scan the films that “almost grossed $1 million”, you see some legitimate films such as Patti Cake$ and Last Flag Flying, from Fox Searchlight and Lionsgate/Amazon Studios respectively. Those films cost a lot more than $1 million to make.)
Source: Box Office Mojo.
All the charts show the same story in different ways: there are hundreds of films that made less than $1 million at the box office, around 150 that did less than $25 million (many of which probably lost money), a range of movies in the middle and then a few monsters (Star Wars: The Last Jedi, Wonder Woman, Jumanji and Beauty and the Beast).
I think I can hear some of you insisting that I give you the “counting statistics”. You still want to know the average, right? Well here they are, for all 740 films. I mainly did this because I’m going to use them in the next section.
How Logarithmic Distributions Differ from Other Distributions
Perhaps the best way to describe the logarithmic distribution is to show how it isn’t other distributions. In other words, to show how inadequately the normal distribution and uniform distribution capture the performance of feature films.
Let’s start with the uniform distribution. The idea that, “Hey, a movie can gross anywhere between $600 million (Star Wars) and $0, and everywhere in between.” What if we had an equally likely chance of that? In decision-making, the human brain often defaults to uniform distributions when assessing possibilities, so this isn’t completely academic. Here’s how that would look:
If only this were how to finance movies! The industry would green light a lot more movies. But it isn’t; only a few films hit that rarefied air of $200 million plus.
What about the normal distribution? I tried to chart this, using our mean of $15 million and standard deviation of $50 million. Unfortunately, that gives us a lot of “sub-zero” grosses, which I just cut off at zero. The problem with the normal distribution is it makes misses as rare as hits. That just isn’t the case. Also, the odds of a giant hit become astronomical in a normal distribution. In this case, a hit like Star Wars: The Last Jedi would be 10+ standard deviations from the mean, meaning it has far less than a 1 in a million chance. Obviously, hits like that happen every year, so more like 1 in 200.
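Here is a quick sketch of why the normal distribution breaks down, using the mean and standard deviation quoted above and the rough $600 million Star Wars figure from the uniform example. The function name is my own; the math is just the standard normal tail probability.

```python
import math

MEAN_GROSS = 15e6    # mean quoted in the text
SD_GROSS = 50e6      # standard deviation quoted in the text
HIT_GROSS = 600e6    # rough Star Wars figure used in the uniform example

def normal_upper_tail(x, mean, sd):
    """P(X > x) for a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

z_score = (HIT_GROSS - MEAN_GROSS) / SD_GROSS
p_hit = normal_upper_tail(HIT_GROSS, MEAN_GROSS, SD_GROSS)
p_negative = 1 - normal_upper_tail(0, MEAN_GROSS, SD_GROSS)

print(f"z-score of a $600M hit: {z_score:.1f}")
print(f"Chance of a hit that big under a normal curve: {p_hit:.1e}")
print(f"Share of films the normal curve puts below $0: {p_negative:.0%}")
```

Both ends fail at once: the curve says a large share of films gross less than nothing, and that a giant hit should essentially never happen.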
Let’s put them all on the same chart, to really show how logarithmic distribution of returns just looks different.
Source: Box Office Mojo
This chart shows how quickly the results drop off in reality compared to other hypothetical distributions. If someone tells you Hollywood isn’t normal, show them this chart and say, “You’re sure right!”
Variations on the Initial Theme
I might still have skeptics in the crowd.
Maybe, they’d say, I just got lucky. That distributed returns happen to be power-law-based for the year 2017, but this lesson doesn’t really apply to other parts of film. Well, that would be wrong.
Spoiler alert: no matter how you slice the inputs, you get the same result.
First, I could expand the number of years I’m using. I happen to have box office gross from a project I did that covers 2012-2014. Here’s that chart.
Source: SNL Kagan
Here’s the next fun trick: the distribution of returns still applies for sub-categories. Take horror, which I looked at a couple of months back. Here are all the horror movies going back to The Exorcist, according to Box Office Mojo. Specifically, “Horror-R-rated”, which is 504 films:
Source: Box Office Mojo
The rule still holds! In this case, there has been one monster horror film, It, then some other smaller ones. Of course, I could take all the box office and adjust it into 2018 dollars. Does that change the picture? No, if anything it amplifies it. In this case, The Exorcist did $1 billion in adjusted US gross, and The Amityville Horror did $319 million. But for those increases, a lot of other smaller films drop down even more, especially recent films.
I’ve done this for a ton of different genres. Superhero movies. Foreign films. And it always holds. The only caution is that sometimes the “ceiling” of the range gets compacted.
What about sorting by something else? Say, rating? Do R-movies have more hits versus PG-13 or PG? Fortunately, my 2012-2014 data set has ratings. First, know that G, NC-17 and Not Rated just don’t have a lot of examples (only 45) so I deleted them from this analysis. Here are the other three, in line chart form:
Source: SNL Kagan
As we can see, for R, it holds. For PG-13, it holds. For PG, it looks like it holds, but honestly since we only have 39 examples, it doesn’t show as clearly. Increase sample size and we’re going to see this.
You could do this analysis setting for production budget and studio and even types of studios. As long as the input is independent, it holds.
Two Examples Where This Works Less Well
Listen, I believe in being up front with my data analysis. Even though this is a magic trick, I’m not trying to hide or obscure data that doesn’t make my case as well. That’s why I left PG rated movies in above, even though it’s the least logarithmic looking line in my analysis.
So in my experience, have I come across sub-sets of movies where my rule/law/observation doesn’t hold? Absolutely, so I’ll share those with you next. To clarify, it’s not that my magic trick fails, it is that the floor disappears. So look at this chart, from my series on Lucasfilm:
Source: Box Office Mojo
This is my data set of “franchises” that included Star Wars, Marvel, DC, X-Men, Harry Potter, Lord of the Rings, Indiana Jones and Transformers. As you can see, those films just don’t have flops. The “floor” is about $200 million in domestic box office, with only 14% of all films dropping below that. So it isn’t logarithmic on one end. I actually think my timeline of films by box office, with their names, shows this floor pretty clearly over time:
Source: Box Office Mojo
My rule doesn’t hold—this is important—when I sort by another output, not by an input. In other words, I’m sorting by the result.
A franchise is a series of films made off a successful first film. In other words, it is sorting by “success” of the first franchise film. Many aspiring franchises therefore didn’t make my data set. Four examples off the top of my head that I did not include, from three different genres: The Golden Compass, Battleship, The Lone Ranger and John Carter. If I included all aspiring franchises, the list would have looked more exponential. Also, this data set is small, only 50 movies.
What about that huge data set I just pulled to look at Oscar grosses? Well, I haven’t even histogrammed that yet, so I don’t know what it looks like. So we’ll see. Again, though, this is in a way a “success” metric in that these are all “good” films. Obviously, a lot of films at the bottom of our list—meaning getting sub $1, $10 and $25 million grosses—were just bad, so no one saw them. With the Academy Awards, we’ve deliberately sorted that out.
Source: Box Office Mojo
The rule holds! Mostly. Now, with adjusted gross we do see a bit of a floor. Historically, a best picture film tended to get more than $50 million in domestic box office. But with both Oscars and Franchise Films, we can see that “super-hits” are still rare, but present.
Final Lesson: This is Why Linear Regression Doesn’t Work in Entertainment.
I have one final lesson for the data heads in the crowd.
Let’s say you’re an aspiring business school student who hopes to go into entertainment. Or you’re a junior financial analyst. Or a statistician diving into entertainment. (Three real world examples I’ve encountered.) You’re given a mess of data on the performance of feature films at the box office. And you want to draw some conclusions.
Well now that we know how our data is distributed—logarithmically—we should come to one clear conclusion: linear regression WILL NOT WORK!
It’s really just right there in the name. Linear regression works on things that have linear growth, and our things have exponential growth, which throws off all conclusions. The workaround is that you can convert our data points to logarithms, and then have a “log-normal” distribution, which gets you closer to accuracy. (Though, as I wrote here, you still have a sample size problem.) In general, as well, since you have so few examples of success (the long tail at the right), you just can’t draw statistically meaningful conclusions.
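The log conversion can be sketched in a few lines. This uses synthetic grosses drawn from a log-normal distribution, with parameters I picked arbitrarily (they are not fitted to any real data), just to show how the raw numbers are badly skewed while their logarithms are roughly symmetric.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Synthetic grosses; mu=15 and sigma=2 are arbitrary assumptions, not estimates.
grosses = [random.lognormvariate(15, 2) for _ in range(5_000)]
log_grosses = [math.log10(gross) for gross in grosses]

raw_mean, raw_median = statistics.mean(grosses), statistics.median(grosses)
log_mean, log_median = statistics.mean(log_grosses), statistics.median(log_grosses)

print(f"Raw dollars: mean ${raw_mean:,.0f} vs median ${raw_median:,.0f}")
print(f"Log10 scale: mean {log_mean:.2f} vs median {log_median:.2f}")
```

On the raw scale the mean is dragged far above the median by a few giant values; on the log scale the two nearly agree, which is the shape a regression is much happier with.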
Conclusion – What’s Next?
Well, I didn’t say this was a law of media and entertainment because it applies to feature films. I said it applies to everything. And it does.
But that’s for our next installment and another dozen or so tables and charts!
You want to know a secret? The underlying secret to all media and entertainment? The peek behind the curtain that explains all you see in film, TV, music and more?
Here it is.
“Logarithmically distributed returns.”
Once you learn it you can’t forget it. Like how to do a magic trick, which is what I call it, my magic trick for the business of entertainment. I didn’t discover logarithmic distributions. I first read it in Vogel’s Entertainment Industry Economics, the wonk bible of entertainment financial analysis. (Figure 4.8 in chapter 4 if you’re really curious.) I also assume it’s the theoretical underpinning of Anita Elberse’s Blockbusters, which I haven’t read. (Her book is one of those books that has been on my “to read list” for years.)
Unfortunately, I can’t just show you that logarithmic distribution undergirds all of entertainment. As important as the “logarithm” part of the statement is, the “distribution” part is even more crucial. I don’t want to gloss over that. The value comes in not just seeing one chart, but seeing the value of distributions as a tool.
Today, I’m going to teach you about distributions. What they are and why you need them. This is a mini-statistics lesson to pair with my other mini-statistics lesson on why you can’t use data to pick TV series. I won’t use any equations, because they’re boring, but I’ll show you what the distributions look like. Then, tomorrow I’ll show you the ubiquity of logarithmic distribution.
(As I recommended before, go pick up The Cartoon Guide to Statistics for the best reader on statistics. Learn them in a weekend. It’s way better than this very useful, but very technical Wikipedia page.)
Before we get to the “what” of distributions, let’s get to the “why”.
We live in a statistical distribution
A lot of news coverage on most issues (politics, sports, criminal justice, business) might mislead you on this point. The world seems like an either/or world. This or that. One or the other. Binary choices.
But it isn’t. It’s a distributional world. What that means is that most outcomes fall on a spectrum of possible outcomes. An election could be won by a thousand votes, a million votes or ten million votes. A team can win by fifty points, tie or lose by fifty, and everything in between. A blockbuster movie could earn a billion dollars or 100 thousand dollars or anywhere in between. A range of outcomes.
We often try to summarize our distributional life in “averages”. Let’s use an example to make it concrete. Since the NFL season just started, we’ll use that. I found the margins of victory for all NFL games (2,668) going back to 2002 here. (The data set didn’t include ties.) If I calculated the mean average, I’d find that, on mean average, NFL teams won their games by 11.9 points. By median average, that number is 9. Of course, the mode, or most frequent scoring margin, is 3 points, followed by 7 and 10.
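If the three “averages” blur together, here is a tiny sketch using Python’s standard library. The ten margins below are made up to mimic the shape of NFL results; they are not the 2,668-game data set, so the outputs will not match the real figures quoted above.

```python
import statistics

# Hypothetical margins of victory, in points. Not the real NFL data.
margins = [3, 3, 3, 7, 7, 10, 14, 17, 21, 35]

mean_margin = statistics.mean(margins)      # add them up, divide by the count
median_margin = statistics.median(margins)  # the middle value once sorted
mode_margin = statistics.mode(margins)      # the most frequent value

print(f"mean {mean_margin}, median {median_margin}, mode {mode_margin}")
```

Same pattern as the real data: a few blowouts pull the mean above the median, and the most common result is lower still.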
Those numbers, though, aren’t very helpful. We know something about the data, but in general, we still don’t know what it looks like. Knowing what it looks like is a visual way of interpreting the data’s shape, size and characteristics. That’s where distributions come in. Here’s the above data in chart form:
A distribution, at its core, is a description of data, most frequently using a visualization to show you the percentage of outcomes. You could use tables too, but I’m a visual person. The key is that distributions come in lots of different shapes and sizes. Some fall into similar forms, but many are unique. Those shapes and sizes can have a huge impact on what the data means…impact that is lost in averages.
The Flaw of Averages
At its simplest, the flaw of averages is the old saying that a statistician drowned in a river with an average depth of 3 feet. See this cartoon from the San Jose Mercury News:
“Plans based on average conditions are wrong on average.”
Here’s an example of that in action. Say a manufacturing process has ten steps to it, and each has a 75% chance of staying on time. That’s pretty good, seventy-five percent. So, how often is the process delayed? Many would say, “Oh, only 25% of the time”. Actually, the result is that the process is almost always delayed! If any one late step delays the whole process, it ends up delayed 94% of the time.
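The 94% figure falls out of a one-line calculation, sketched below. It assumes the ten steps are independent of each other and that a single late step delays the whole process; the variable names are mine.

```python
STEPS = 10
P_STEP_ON_TIME = 0.75

# Every step has to be on time for the whole process to be on time
# (assumes the steps are independent of one another).
p_all_on_time = P_STEP_ON_TIME ** STEPS
p_delayed = 1 - p_all_on_time

print(f"Whole process on time: {p_all_on_time:.1%}")
print(f"Whole process delayed: {p_delayed:.1%}")
```

Ten pretty-good odds multiplied together leave you with a process that is on time less than 6% of the time.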
Most businesses, academics and journalists rely on the “average” when it is usually phenomenally misleading. The reason is simple: it’s easy. You have a long column of data, and one Excel function returns you the median or mean average. You have to set up an entire chart to show the distribution, and make decisions for how you frame that. If you’re writing to publish on the web quickly, the average is easiest. Often, it’s the sexiest number too.
This has real world consequences. Have you ever seen a five year plan? Of course you have. A five year plan, 90% of the time, is a collection of estimates of the average performance of a firm. A CFO took the average revenue projected and subtracted the average costs projected. See where I’m going with this? Financial plans based on average conditions are wrong, on average.
If you’re reading carefully, you’ll notice I switched from an example of a data set in the first section (NFL scores) to predictions about future financial performance of firms. This is really the key learning point for distributions: Once we have a description of the real world, be it for sports or finance or entertainment or biology or anything, we can convert our “counts” of real world phenomena into “percentages”. Those percentages become probabilities when we use them to predict the future.
The power of distributions is they help us predict the future, more accurately.
When I write about distributions today and tomorrow, I’ll use data set and probability examples interchangeably. Basically, if you’re describing data in the past, that’s a description of the data. If we use that to forecast the future, we’re in the realm of probabilities. Two sides of the same coin, the past and the future, split by the now.
Since predicting the future is tough—have I written about that yet?—we should use the best tools we have. And averages are poor tools compared to distributions.
Distribution Shape 1: Uniform Distribution
So let’s start with the simplest distribution: uniform. This means that in a scenario every outcome is equally likely. What’s the easiest one to show? Dice!
Quick, what is the average roll of a single die?
This is one of those brain tricks that I believe Daniel Kahneman and Amos Tversky used to show how behavioral economics works.
Did you say 3? A lot of people do. Take a look at the chart below, showing our first distribution, the uniform distribution:
I’m going to explain the axes so we’re on the same page. The left hand axis, the Y-axis, shows the probability of a specific outcome. The X-axis, the one running along the bottom, shows the potential outcomes. For a six-sided die, you have six outcomes, returning a 1 to 6. If you only had a coin, you’d have only two outcomes. If you’re playing Dungeons and Dragons and had a ten-sided die, you’d have ten outcomes. The more outcomes, the lower the odds of each one in a uniform distribution.
Mathematically speaking, this is a “discrete distribution” where you have a specific number of possible outcomes. You could also run a uniform distribution as a “continuous distribution”, where it has an infinite range of outcomes. In today’s article, I’m not going to dive deep into the differences between continuous and discrete probabilities, because I mainly want to show the shapes of different distributions, not how to calculate them. I used the dice example above because it’s hard to find good real world examples of continuous uniform distributions. (I went to my statistics textbook on my bookshelf, and it had an example about a pipe bursting, which wasn’t great. Yes, I keep my statistics textbook close at hand.)
Back to the brain teaser, most people just naturally think that since three is half of six, it is the expected value of a die roll, not 3.5. Again, the “average” of 3.5 tells you hardly anything about rolling a die; the distribution says everything is equally likely.
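For the doubters, here is the 3.5 worked out as a short sketch. It uses exact fractions so there is no rounding to argue about; the variable names are my own.

```python
from fractions import Fraction

faces = range(1, 7)                 # a fair six-sided die
p_each = Fraction(1, len(faces))    # uniform: every face has probability 1/6

# Expected value: each outcome weighted by its probability.
expected_roll = sum(face * p_each for face in faces)

print(f"Each face: {p_each}, expected roll: {float(expected_roll)}")
```

Note that 3.5 is not a face on the die at all, which is a nice reminder that an average can be a value that never actually happens.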
Distribution Shape 2: Discrete Probabilities
What if we don’t have a uniform set of probabilities, but different ones? So we still have a limited (discrete) set of outcomes, but all sorts of different probabilities? To use the dice game, some board games skew the odds for rolls. So if you roll a six you “win” a prize, if you roll a 3-5 nothing happens, and if you roll a 1 or 2 you “lose”. Scenarios like this happen in certain cooperative or advanced board games like Eldritch Horror. Yes, I’m a nerd who has a stats textbook and plays board games. This outcome would look something like this:
This type of distribution is great for scenarios where you know all outcomes aren’t equally likely, but you may not have good data so have to make estimates. I did this for my Lucasfilm series in the section of feature film projections. I don’t have data that predicts how many future Star Wars films Lucasfilm will make, but I know all outcomes aren’t equal. Same with box office performance. So I made some assumptions. So here’s how that turned out.
I converted those percentages to the total box office as a percentage of initial price, which gets us a range of outcomes. (This should look familiar to fans of Nate Silver’s 538 website.)
The key for discrete probabilities is they still need to add up to 100%. Otherwise, you’re missing something. That said, you can quickly complicate it by having correlated variables and other interactions. Again, just know that discrete distributions can look all sorts of different ways, like my Star Wars example or the NFL scores above.
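Here is a small sketch of the board game example from this section, assuming a fair six-sided die and the win/nothing/lose rule described above. The helper names (outcome_for, collapse) are hypothetical. It also runs the "adds up to 100%" check.

```python
from fractions import Fraction

def outcome_for(roll):
    """Rule from the example: a 6 wins, a 3-5 does nothing, a 1 or 2 loses."""
    if roll == 6:
        return "win"
    if roll >= 3:
        return "nothing"
    return "lose"

def collapse(faces, rule):
    """Group equally likely die faces into labeled outcomes with their probabilities."""
    distribution = {}
    for face in faces:
        label = rule(face)
        distribution[label] = distribution.get(label, Fraction(0)) + Fraction(1, len(faces))
    return distribution

game = collapse([1, 2, 3, 4, 5, 6], outcome_for)
total = sum(game.values())
assert total == 1, "probabilities must add up to 100%, otherwise an outcome is missing"
for label, prob in game.items():
    print(f"{label}: {float(prob):.1%}")
```

When you estimate probabilities by hand instead of deriving them from a die, the same final check is the quickest way to catch a forgotten scenario.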
Distribution Shape 3: Binomial Distribution
Regarding uniform distributions, there is really one even simpler than a six-sided die. It’s the simplest game of chance, and I would have put it first if it didn’t make such a great bridge to the next distribution. That’s the outcome of a single coin flip. In chart form, it looks like this:
A coin flip is a uniform distribution with two outcomes. Yes or no. Heads or tails. Odds or evens. So on. They’re “mutually exclusive”, meaning they can’t both occur at the same time. The name for this in statistics is “binomial”. Now you can alter binomials in two key ways and ask a lot of fun questions about those alterations: first, you can change the probability to anything above 0% and below 100%. Second, you can change the number of “experiments”, which is what you call a single coin toss.
What if you take that outcome and run the scenario multiple times? So you flip the coin twice, or three times or four times and so on? Well, you get a binomial distribution. This is a way to show the possible outcomes and their various probabilities. It looks like this:
If that looks familiar, well hold on a moment. The key to remember right now is that this type of distribution is the “discrete” scenario where you have a limited number of tests. In the real world, with natural phenomena, you have a continuous range. And that looks different.
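To make the repeated coin flip concrete, here is a minimal sketch using the standard binomial formula. The function name binomial_pmf is my own, and the four-flip fair coin is just an illustrative choice.

```python
from math import comb

def binomial_pmf(n, p):
    """Probability of exactly k successes in n independent trials, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Four flips of a fair coin: chance of 0, 1, 2, 3 or 4 heads.
four_flips = binomial_pmf(4, 0.5)
for heads, prob in enumerate(four_flips):
    print(f"{heads} heads: {prob:.2%}")
```

Changing p away from 0.5 tilts the bars to one side, and raising n adds more bars, which is the first hint of the familiar bell shape.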
Distribution Shape 4: The Normal Distribution
You’ve heard of this one, haven’t you? The chart that shows a peak in the middle, that tapers out to its ends? Of course you have. It’s called normal because it is so widely taught, but as I was looking for information, I was reminded that technically this is a “Gaussian” distribution. Here’s a chart from the Wikipedia page that captures the ideal normal distribution.
However, for how common it is, it very rarely occurs perfectly in nature. The classic example is height. Here’s that ur-example:
The funny thing about showing real values is that you can see this isn’t a perfectly even normal distribution. And I pulled this from the US census (and then the link broke on me).
To explain, the x-values, along the bottom axis show the various heights we’ve measured. So we start at five foot four and continue to six feet four inches. The left hand axis, the y-value, is the output which in this case is the count of the sample of people. Or it would be, except in this case it’s already been converted to a percentage to show the population of America.
The results cluster around the middle of the range. So the vast majority of things are close together around the average. This is why height is such a good explanation for the normal distribution. The majority of men are around 5’10 in height, according to the above statistics. And the vast majority fall within about 4 inches of that, between 5’7 and 6’2. The people who are much, much taller, say 6’8 and above, are very, very rare.
The clustering around the mean average is what makes a normal distribution normal. As the Wikipedia example two charts above shows, 68% of things fall within one “standard deviation” of the mean. Standard deviation is a measure of how much a data set is spread out, which I probably should have mentioned earlier. In a normal distribution, really rare examples start at “3 standard deviations” from the mean average. So if something is “5 standard deviations” from the mean, like say seven foot tall men, it’s really, really rare.
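The 68% figure can be checked directly. This is a sketch for an ideal normal distribution only, using the error function from Python’s math module; the name share_within is my own.

```python
from math import erf, sqrt

def share_within(k):
    """Share of an ideal normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3, 5):
    inside = share_within(k)
    print(f"within {k} SD: {inside:.5%}   outside: {1 - inside:.5%}")
```

The "outside" column is the useful one for rarity: it shrinks very quickly as you move from one standard deviation to three to five.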
Of course, if something isn’t normally distributed, those same conclusions don’t hold as well.
Which is a good time to revisit the caution I put at the start. I said that I would be using distributions to show both probabilities and descriptions of data, and height shows how they interact. If you know the historical height of a group of people (and the sample is statistically significant, which is another stats topic for another day), then you can use the outcomes in the sample group to form probabilities, which you can use to predict outcomes.
In other words, given that we know that less than 0.1% of people in the American population are taller than seven feet (and we have a sample of hundreds of millions showing this), we know that the odds that any given baby will grow to be seven feet tall are minuscule.
Distribution Shape 5: Variations on the Normal Distribution
The normal distribution can be tweaked in all sorts of ways. First, it can be either very skinny or very wide, as these charts show from Wikipedia.
Second, the distribution can lean one way or the other. It could lean right or left. Here’s two examples of that, again from Wikipedia.
The Most Important Distribution Shape for Entertainment: The Logarithmic Distribution
Well, I hope some of you are still with me. Cause here’s where the magic starts.
The final chart is for distributions whose variance isn’t linear. It’s exponential. So the numbers at the top aren’t multiples of the samples at the bottom, they’re orders of magnitude larger. I call this a “logarithmic” distribution because it increases by orders of magnitude, most often expressed in “base 10”.
(I say logarithmically distributed, even though technically that’s for discrete distributions. Also, some power-law distributions can turn into a normal distribution by adjusting the numbers to logarithms. Again, a lot of specifics that I won’t get into. I just want you to see the shape.)
Anyways, take a look at a logarithmic-exponential distribution and a Pareto distribution.
To explain one last time: the x-axis running left to right shows the various outcomes. This could be wealth owned. Or the population of cities. Or the value of oil reserves. Or the returns on owning various stocks. The y-axis is the probability of that occurring or the count of the observed phenomena in a sample. So most people (say 80%) have hardly any wealth. Or most stocks return very little money. But a few at the far right of the distribution have an inordinate amount of wealth. Or a few stocks have incredible returns (Apple, Amazon). Or a few countries have incredibly valuable oil fields (Saudi Arabia).
Or become massive blockbusters at the box office.
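To see why the average misleads for this shape, here is a rough sketch using a Pareto distribution. Everything in it is an assumption for illustration: the shape parameter alpha=1.16 (a value often associated with the "80/20" rule), the minimum value of 1, and the use of 10,000 evenly spaced quantiles instead of real data. None of it is box office data.

```python
from statistics import mean, median

def pareto_quantiles(alpha, n):
    """n evenly spaced quantiles of a Pareto distribution with minimum value 1.

    Uses the inverse of the Pareto CDF, x = (1 - u) ** (-1 / alpha),
    evaluated at the midpoint of each of n equal probability slices.
    """
    return [(1 - (i + 0.5) / n) ** (-1 / alpha) for i in range(n)]

values = pareto_quantiles(alpha=1.16, n=10_000)  # already sorted low to high
top_1_percent = values[-100:]
top_share = sum(top_1_percent) / sum(values)

print(f"median outcome: {median(values):.2f}")
print(f"mean outcome:   {mean(values):.2f}")
print(f"share of the total held by the top 1%: {top_share:.0%}")
```

The mean lands well above the median because a handful of enormous outcomes drag it up, so the "average" describes almost none of the individual outcomes.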
Tomorrow, I’m going to show you a bunch of examples of this distribution, so that hopefully you never use the “average” in entertainment again.
I try to write these updates to post by late afternoon on a Friday. Often–most weeks actually–I miss that optimistic target and finish them over the weekend/first thing Monday morning, then back date them to the week they cover.
Obviously the biggest story in entertainment was Les Moonves being fired, but that happened on September 9th, two days after this “update”. So I’ll cover another story this week, but rest assured I’ll chime in on the Les Moonves controversy at the end of this week. (Or early next Monday morning.)
The Most Important Story of the Week – Broadcast TV Ratings Continue a Slide
A few weeks back, I checked in on the box office results for the summer so far. In my ideal world, all senior executives–heck, all managers period?–would “react” to data by not reacting. That’s right, in my opinion, the real-world-ification of data hasn’t made us better at making decisions. If anything, it causes us to react to bad data or uncorrelated data. (This includes “real-time dashboards” and “email alerts” for data. Even weekly updates can be misleading if the trends are sustained.)
Let’s apply this philosophy to TV ratings. Do executives “need” to know how a show did in overnight ratings, especially since they focus on C+3? For instance, the Thursday night football game from last night had a three year low in viewership. Does this portend down ratings all season? Maybe, but we won’t know. What if Sunday has a high in viewership, and Thursday’s low came from some combination of the teams involved, the rain delay and the fact that a lot of people (like me and Bill Simmons) hate the idea of Thursday night football games?
So we can step back and look at the ratings from the season as a whole, which the Hollywood Reporter did for us, including emphasizing that broadcast generally, and scripted shows particularly, were down. So that trend continues. I also love learning that a show I’d never heard of–Yellowstone–was a beast in the ratings on a network most people haven’t heard of. (The new Paramount TV, converted from Spike.) Also, for all the buzz I heard about Succession, Sharp Objects actually delivered higher ratings, which I feel like happens a lot for HBO series (the more popular series have less buzz and vice versa).
Other Candidates for Most Important Story – Amazon Had Technical Problems in US Open Coverage in UK
This is one of the stories I have a feeling most people missed. In short, Amazon Prime Video is distributing live sports in various territories, as it did with the NFL Thursday night games last year. The big debut in the UK was its coverage of the US Open in tennis, but it had a lot of technical issues such as a limited number of games and lagging.
This isn’t THE most important story because surely Amazon can throw engineers at the problem. But it’s a good lesson. As a community, the mantra goes that “content is king”. Don’t forget, though, that “UX is the bishop”. Or hand of the king? So the metaphor isn’t great, but know that a crappy
Big Bad Data of the Week – The Hollywood Reporter on International Film Sales of African-American movies
Honestly, I hesitate to even write this little blurb for fear of offending people. So let’s be clear: I want more “variety” in my movies (wait until my listen of the week to explain that term). I love diverse movies on a variety of topics. I celebrate those. And celebrate diverse voices in directing, acting and writing. I also think I have a better grasp on the problem than most execs (panels and reports don’t solve problems; economics do), but they will never solve it because of self-interest. (Basically, nepotism, self-dealing and bias towards class prevent true diversity/variety.)
To solve our problem–a serious lack of diversity–we need to be precise in diagnosing the problem. We have to let the data guide our decisions. The old axiom, “multiple anecdotes don’t make data” applies here. Unfortunately, too often the latter happens when discussing diversity.
I see this a lot in coverage about the success of films featuring diverse casts, including African-American, Latino and, recently, Asian-American casts. Instead of drawing an entire data set of all movies, articles such as this prominent one by The Hollywood Reporter rely on a self-selected dataset featuring a biased sample of successful movies.
To start, this is an example of the availability heuristic at work. The availability heuristic is when your brain calls out easily “available” examples. Often, these are misleading examples and not representative samples. In films, it’s easy to think of popular/successful movies–especially if you have an emotional connection to them. It’s much harder to think of flops.
Take the sample set from the above article. These movies are hardly representative of all movies. They feature films nominated for Oscars. Oh, and a Marvel movie, either the first or second most successful franchise in film history. The alternative is to capture all movies in a given time period, give them all diversity categorizations, then measure performance. That takes time, and a lot of journalists and companies don’t take the time to do that analysis.
This is really important for the decision makers. I’ve first hand seen the availability heuristic and, more importantly, a biased sample get a 9 figure business plan launched. It later lost the company lots of money. (The key to the success of the plan? HiPPO. See here or me writing on it here.)
We have a diversity/representation/variety/inequality problem throughout our industry. We need to solve it, and bad data doesn’t do that.
Listen of the Week – “Variety” episode on Martini Shot by KCRW/Rob Long
I loved two things about this episode:
- The word play between “variety” and “diversity”. You can just tell by listening that Rob Long is a writer; he’s a wordsmith. Sometimes changing one word can have profound effects on how you look at an issue. This wordplay did that for me. As he points out, the examples of “diverse” films don’t feature diverse casts, they feature in some cases uniform casts, just different than traditional films. So the better word to describe that is “variety”. Rob Long says it better.
- He gets at why variety is so valuable. Sometimes we focus on diversity for diversity’s sake. Which may be okay. But from a business standpoint, a well-executed film featuring a unique subject matter can offer audiences something they don’t usually see. That leads to higher box office returns in general, and this applies to all sorts of films.
(This is Part VII of a multi-part series answering the question: “How Much Money Did Disney Make on the Lucasfilm deal?” Previous sections are here:
Part I: Introduction & “The Time Value of Money Explained”
Appendix: Feature Film Finances Explained!
Part II: Star Wars Movie Revenue So Far
Part III: The Economics of Blockbusters
Part IV: Movie Revenue – Modeling the Scenarios
Part V: The Analysis! Implications, Takeaways and Cautions about Projected Revenue
Part VI: Disney-Lucasfilm Deal – The Television!)
In business school, as I said in my first article in this series, I was super bullish on The Walt Disney Company. The Lucasfilm acquisition followed on the heels of the Pixar and Marvel acquisitions, which were already doing well, and at the time ESPN was a cash juggernaut. Strategically, they’d made a series of great decisions.
Still, those moves, while good, weren’t the core reason why Disney has succeeded so much over the last forty or so years. I believed then, and still do now, that Disney is one of the few movie studios that has a business model derived from a distinct competitive advantage. As others have written about, this competitive advantage goes back to drawings by Walt Disney in the 1950s.
Basically, while having great content is at the center of the plan, they develop and reinforce their relationship with customers through everything else. Or, to be cynical, they make their money off everything else. To start, Walt Disney created an iconic character in Mickey, then another in Snow White, then another in Cinderella, and so on. Then Walt Disney (the person and the company) would monetize the characters through music and books and comics and eventually television. Then they pioneered the concept of theme parks. Michael Eisner took this approach and applied it to home entertainment and acquiring TV networks.
When I was in b-school, I took the famous chart and summarized it in economic terms thusly:
This is the simplest description of supply and demand in the marketplace, the core model at the heart of economics. Basically, along any curve, you maximize your price and quantity sold to yield the highest profit. I’ll cover this more when I write an article on “Transaction Business Models Explained!” (the sequel to my two articles on subscriptions) but for movies you basically can only charge the same price per movie ticket, regardless of movie. As a result, to maximize revenue you need to maximize customers, and hence Hollywood makes blockbusters.
Most studios stop there. But not Disney. They aren’t just selling movie tickets, they’re selling merchandise on top of that. And then, for the piece de resistance, they sell theme park admissions (and in-park up-sales) in an experience they own outright. Other studios do this, but nobody does it as well or as consistently as Disney.
In my adventures after business school, I’ve only become more convinced that Disney knows its business model, knows its competitive advantage and makes moves to sustain that model. They may be the only movie studio, er, “giant media conglomerate” that has a competitive advantage. To continue our series on Lucasfilm, I’m going to add in the rest of those boxes going up, starting with merchandise.
When I wrote “Theme 2: It’s Not Value Capture, It’s Value Creation” last week, I made things seem really simple. Probably too simple. That said, I hold to my core point: most businesses could benefit by pulling out that chart and answering three simple questions,
“What price do we charge customers?”
“What is their willingness to pay (WTP)?”
“What are our costs?”
Then they could ask the forward looking questions: “How can we raise the willingness to pay for customers?” or “How can we lower our costs?” In short, how do we create a competitive advantage derived by creating value, not capturing it?
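The three questions reduce to simple subtraction. This sketch is my reading of the chart, not a reproduction of it: it assumes the consumer keeps willingness to pay minus price, the seller keeps price minus cost, and total value created is willingness to pay minus cost. The function name value_split and the beer numbers are hypothetical.

```python
def value_split(willingness_to_pay, price, cost):
    """Split one transaction into the pieces of the value creation chart."""
    return {
        "consumer_surplus": willingness_to_pay - price,   # value the customer keeps
        "producer_profit": price - cost,                  # value the seller captures
        "total_value_created": willingness_to_pay - cost,
    }

# Hypothetical six-pack: worth $12 to me, priced at $9, costs the store $6.
beer = value_split(willingness_to_pay=12, price=9, cost=6)
print(beer)

# The forward looking questions: raise WTP or lower costs, price unchanged.
better_beer = value_split(willingness_to_pay=14, price=9, cost=6)
cheaper_beer = value_split(willingness_to_pay=12, price=9, cost=5)
```

Notice that changing only the price moves value between the two parties without changing the total, which is the difference between capturing value and creating it.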
Real life, unfortunately, is never that simple. That simple chart gets complicated. Really quickly. Here are some ways.
The entire value chain
I kept the chart and examples from the last post relatively simple. I only used one buyer and one seller. But this transaction is repeated down the chain. I pay the store for the beer, the store pays the beer distributor, the beer distributor pays the beer producer and the beer producer pays its suppliers of water, hops and aluminum. Each stage has its own version of this chart.
This applies to film: the production company pays the talent (who pays a piece to their agent), the studio pays the production company, the distributors (theaters, TV networks, streaming platforms) pay the studio and the distributors collect the money from the customers.
One time transactions versus relationships
Of course, I don’t just go into the store to buy beer once, I go in regularly. (Not that often. Well, maybe.) For customers, regular trips like this can develop habits or a sensitivity to changes in the price. So I could choose to measure the WTP/Price/Costs as one time events, or over the course of a month, or over a year or even longer. That’s a great way to make something simple complicated.
For example, say you lower the price of a good, which causes a customer to buy it more frequently or in larger quantities. In other words, on the chart it looks like a single transaction where profits went down, but profits would go up with repeated purchases. Of course, a customer could just stock up on items and store them, which means you did lose value, but the customer gained in consumer surplus. This is an age old challenge in “consumer packaged goods” that can offer regular discounts. Like I said, it gets complicated quickly.
The biggest ramification of this for entertainment is evaluating subscription services. Analyzing MoviePass last week, I focused on the per month value chain. Arguably, MoviePass could consider their relationships annually and evaluate them on that basis. Maybe any given month is a bad deal, but over a year it saves you money. Or take HBO: I usually subscribe for a year, but the biggest TV show by far that I devour is Game of Thrones. Is a year’s subscription worth that one show? Maybe, so being too lazy to aggressively cancel isn’t that bad of a deal, overall.
Distributions of people
I hate averages. Telling me the average almost never tells me anything useful about a data set. Take height: the average man is five foot eight inches tall. Is everyone clustered around that point, or are there outliers? (Maybe an excellent explainer on this next week.)
Same with movie box office grosses. Chart it next to height and they look completely different. One is logarithmic and one is normally distributed.
So the value creation chart is basically the averages, especially for WTP. To extend the beer analogy, some people would pay a lot for a very bitter IPA, other people would pay a little more, many wouldn’t pay anything. And even among the people who would pay for it they have different values attached to the IPA. You can’t really summarize this as one number, though that’s exactly what I did.
When in doubt, use distributions, even with value creation. Understand who gains the most and try to emphasize that, but don’t stop with the averages.
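Here is a toy version of the IPA example to show what the average hides. The ten willingness-to-pay values are invented for illustration, as are the helper names (buyers_at, revenue_at); the only assumption about behavior is that someone buys when their WTP is at least the price.

```python
from statistics import mean

# Hypothetical willingness to pay (in dollars) of ten drinkers for a very bitter IPA.
wtp = [0, 0, 0, 0, 0, 0, 2, 3, 8, 12]

def buyers_at(price, wtp_values):
    """Count the people whose willingness to pay is at least the price."""
    return sum(1 for value in wtp_values if value >= price)

def revenue_at(price, wtp_values):
    """Revenue if everyone willing to pay the price buys exactly one."""
    return price * buyers_at(price, wtp_values)

average_wtp = mean(wtp)
candidate_prices = sorted({value for value in wtp if value > 0})
best_price = max(candidate_prices, key=lambda price: revenue_at(price, wtp))

print(f"average WTP: ${average_wtp:.2f}, buyers at that price: {buyers_at(average_wtp, wtp)}")
for price in candidate_prices:
    print(f"price ${price}: {buyers_at(price, wtp)} buyers, revenue ${revenue_at(price, wtp)}")
print(f"revenue-maximizing price in this toy example: ${best_price}")
```

In this made-up sample the average WTP describes nobody: most people would pay nothing, and the best price sits far above the average because a few fans value the beer so highly.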
You can’t measure parts of the chain
Especially “willingness to pay”, which is an imaginary value. How do you measure imaginary? Well you have to guess, and there are complicated and often unreliable ways to do that. (The worst way? Ask someone what they would pay for something. That never works.) The most reliable way is a conjoint analysis, but even that can get unwieldy with too small a sample size.
Streaming services are bedeviled by this problem, especially when they have to figure out what consumers actually love on their platform. Is it Stranger Things? Or GLOW? Or both, in some combination? Or is it actually the Disney movies, but the other shows are filler? That’s an epically tough problem to sort out.
Costs can be tricky
The “costs of goods sold” can be difficult to allocate. Especially for support functions that don’t directly tie to a good. Allocating the value correctly can be the difference, in a big conglomeration, between profitability and loss. Right now, content costs and how companies allocate those costs versus the prices customers pay is the biggest accounting/economics/finance question in the industry. Getting that answer right could determine the future of entertainment, for good or ill.
With this update, we’re officially out of the slowest, dumbest month of news, August. Here’s my round-up of the “Most Important Story of the Week”, a few days late because of that blasted long weekend. (I’ll save my rants on how much better America would be with more 3 and 4 day weekends for a future article.)
The Most Important Story of the Week – The Fall of Global Road
So I held this story for a week. Coincidentally, I’d been mining some box office data for another project, and had looked into Open Road’s film history. I’ll admit when I first saw that Global Road went bankrupt, I thought, “Wait, what is Global Road? Oh, it was Open Road.” Then I thought, “What happened?”
The story has been well covered. Since I spend so much time “reacting” to negative news stories, it’s worth praising when the trades really dig in well. (Hat tip to the Hollywood Reporter and Variety.) That said, I have a theory that the trades usually know the dirt on companies, they just wait to dig in until after an adverse incident (bankruptcy, firing, scandal). I, on the other hand, have no problems calling out what I perceive to be bad strategy.
If I had one single take away from the demise of Global Road, it’s this: “content is hard”. Especially when someone is keeping track. Looking at their slate, Global Road, and Open Road before it, didn’t have a huge blockbuster in the US they could hang their hat on. Without that huge hit–and not owning any IP outright–they couldn’t sustain operations.
Who should we watch out for as possibly being next? Well, a candidate off the top of my head–and note this down for a great future project for the Entertainment Strategy Guy, predicting who could go bankrupt next–is STX Entertainment. I devoured the New Yorker profile of that company, and frankly couldn’t understand their competitive advantage beyond “China money”. Let’s compare Open/Global Road’s US domestic box office performance and STX’s same numbers for the last three and a half years:
In chart form, with each film’s gross as the Y value:
In table form, counting the number of films at various box office levels:
(Source: Box Office Mojo. Open Road. Global Road. STX.)
(I used unadjusted box office gross from Box Office Mojo, going back to 2015 and deleting any films less than $1 million in total box office, which was three films for Open Road.)
Why did I think of STX? Well, Global Road just released the underwhelming AXL and STX released the underwhelming Mile 22 and Happytime Murders. Both are new mini-majors backed by Chinese money and headed by execs with long careers in Hollywood. But looking at the data, we can see that for the near term, STX simply has a higher trajectory overall. In addition, STX has had a “hit” that spawned a sequel, Bad Moms and A Bad Moms Christmas. Now the “hit” wasn’t tremendous ($113 million US) but that’s enough with their supposedly huge line of credit.
Of course, STX may have higher aspirations and may lose more money when you factor in production budgets and P&A spend. Arguably, they shot the highest with Valerian and the City of a Thousand Planets, which only did $41 million in the US with a big marketing spend. It had franchise potential, but didn’t live up to the billing. (It did do $184 million in foreign box office; I don’t know how much STX kept of that.)
You know what is really cool, though? The ability to “keep track” of how well movie studios are doing. You know who I can’t do that for? Television shows that premiere on digital platforms like Netflix, CBS All-Access or Amazon Prime/Video/Studios. Instead, everything is a winner based off buzz. With movies, you still need a good box office performance to justify your existence. Enough flops and you go out of business. (First, Relativity, now Global Road.)
Which also brings us to the “successful” part of Global Road: their TV business. (Paramount is having the same story right now; STX has moved into TV too.) How is it that a company can’t make enough successful movies to stay in business but can for TV? Well, because SVOD platforms buy TV shows by the boatload, and pay profit up front, instead of on success. Since every show is renewed–no one fails in streaming–everyone in the TV production business is finding buyers for shows.
Which doesn’t mean people aren’t losing money in TV streaming, it’s just that they can afford to lose money and Global Road couldn’t afford it in movies.
Long Read of the Week – How Hollywood is Racing to Catch Up with Netflix in Variety
I’m going to stop writing on the above topic before it turns into a “Long Read” of the week. Instead, you should head to Variety for this good summary of the state of “direct-to-consumer” offerings in the marketplace. The most useful part is the summary of each DTC service, its pricing and some basic information about the services, then the summary of the streaming video players. The most glaring omission is something author Cynthia Littleton doesn’t have: the daily, monthly and annual active users and subscribers by platform. (It is also a little too praising of Netflix for losing billions every year, but isn’t everyone?)
I’ll also say there remains a glaring disconnect between the huge volume Netflix offers and its low, low cost compared to all these new DTC options. How is it possible? Well, Netflix loses money and Disney needs to earn a profit, as Littleton points out. This disconnect for me tarnishes the entire Netflix narrative, or at least challenges how disruptive it truly is, but that’s for later articles.
Listen of the Week – Malcolm Gladwell’s Revisionist History on “12 Rules for Life” and Pulling the Goalie
All the development executives of the world should listen to this episode. It argues you should “think the unthinkable” and ignore the responses of fellow humans. For me, the episode illuminated the key challenge of our industry: “relationships”. If you listen to Kim Masters regularly (and you should!), then you can hear her skeptically address outsiders coming into entertainment who don’t understand this is a relationship-driven business.
This is why the hockey coaches in the podcast–not to spoil Malcolm Gladwell too much–won’t pull the goalies. Their relationship with the fans would suffer.
But it’s also why the people who recommend this strategy–pull your goalie! Be unconventional!–work in one of the few fields where you don’t need relationships (or as many), which is hedge funds. They can do their trades automatically via computers…so they don’t really need to worry about pissing off people. In the parts of finance where relationships matter, like investment banking or wealth management, this strategy wouldn’t work. But certain hedge funds can get away with it.