How I’m Using AI (And Why I’m Still Underwhelmed)

(Welcome to the Entertainment Strategy Guy, a newsletter on the entertainment industry and business strategy. I write a weekly Streaming Ratings Report and a bi-weekly strategy column, along with occasional deep dives into other topics, like today’s article. Please subscribe.)

So many newsletters use AI to generate intro images these days, so I figured I might as well. Based on my inside joke about heading to my “data bunker”, I used this prompt: “worker in a data bunker, in the style of a World War II cartoon”, but it looks more Studio Ghibli to me. Also, “DATA BUNKER” just written on the wall feels pretty on-the-nose.

I’m not sure why this hasn’t become industry standard, but I believe that every website/publication/newsletter/blog/what-have-you should disclose whether and how they’re using AI, just like I’m doing today, updating this exercise from eight months ago.

Since I wrote about this last October, I stopped using LLMs for a bit (out of frustration since it wasn’t worth what I was paying), but I made a second push a few months ago after reading numerous articles about AI improvements. Specifically, LLMs can now search the web, which is really useful to me, and I wanted to test out LLMs’ newer (alleged) logic functions. Frankly, I don’t want to be left behind in knowing how to use this potentially ground-breaking technology…if large language models can fix some of their bigger issues. 

Unlike last summer, AI/LLMs are finally/sorta useful, though still wildly overhyped in my opinion. And yes, my personal experience does tie to my thinking on the strengths and limitations of AI.

How I’m Using AI Update

First off, two assurances: 

  • AI will never write anything on the EntStrategyGuy website or newsletter.
  • Any AI data collection will be rigorously double-checked by a human.

So with those two caveats out of the way, here’s how I am using LLMs/AI:

  • My LLM is an EXCELLENT companion to Google; I regularly use both search engines now. (This probably says more about Google atrophying as a product due to market power than AI itself.)
  • I use LLMs to format links for me, with spotty success.
  • I’m regularly using AI to convert images, like top ten lists, into spreadsheets. The conversions are mostly solid (it applies some logic to transform them), but additional logic requests completely flummox it. Right now, this transcription is the best, most time-saving use case of AI for me.
  • I experimented with using an LLM to write social media content, and while it performed better than last summer, it’s still not good enough to make it usable. (I hold social content to a lower standard than my published writing.) With some training, though, it might get there.
  • I plan on integrating AI/LLMs into doing more data/coding work for me in the second half of the year, and when I do, I’ll let you know.
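Since I promise that any AI data collection gets double-checked, the transcription step above can at least be partly verified with ordinary code. Here’s a minimal sketch of the kind of sanity check I mean, using made-up sample rows (in real use you’d load the LLM’s transcribed output from a CSV):

```python
# Made-up sample of an LLM-transcribed top-ten chart.
rows = [{"rank": i, "title": f"Title {i}"} for i in range(1, 11)]

def validate_top_ten(rows):
    """Return a list of problems found in a transcribed top-ten chart."""
    errors = []
    if len(rows) != 10:
        errors.append(f"expected 10 rows, got {len(rows)}")
    # Ranks should form a clean 1..N sequence with no gaps or repeats.
    ranks = sorted(r["rank"] for r in rows)
    if ranks != list(range(1, len(rows) + 1)):
        errors.append("ranks are not a clean 1..N sequence")
    # No title should appear twice in a single week's chart.
    titles = [r["title"].strip() for r in rows]
    if len(set(titles)) != len(titles):
        errors.append("duplicate titles found")
    return errors

print(validate_top_ten(rows))  # prints [] when the transcription passes
```

Checks like these can’t prove the transcription matches the original image, but they catch the obvious hallucinations (a phantom eleventh row, a repeated title, a mangled rank) before a human does the final pass.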

That’s about it for now, but my key takeaway is this:

AI/LLMs are really, really subpar at some very basic tasks. 

Honestly, using LLMs numerous times per week, I keep getting frustrated at their inability to perform some simple tasks without making obvious mistakes. Comparing my experience to the breathless headlines, I feel like the Mugatu GIF in real life:

I pay for a top-end model from a premier LLM, yet for everything I listed off above, the LLM makes mistakes, if not huge mistakes:

  • When it comes to formatting links, my LLM constantly flubs the style. And if it can’t format the link—because it can’t visit the website—it never tells me. Instead, it will always deliver incorrect info rather than say that it can’t deliver a result.
  • If I ask the LLM to search within a specific time frame, at some point, it will just forget the time frame. And its search ability has seemingly gotten worse.
  • Any attempt at asking for logic completely breaks down, like providing an LLM a series of spreadsheets and asking how many weeks a title made the charts. This fact, in particular, is why I have to rigorously double-check everything, but also why LLMs aren’t saving me as much time as they could.
  • I literally couldn’t get the LLM to count a list of films. When I just asked it to count a list, it put the wrong number at the top (so I had to ask it to put the number at the bottom instead). That’s really, really basic. Maybe you can chalk this up to a lack of “knowing how to use LLMs well” expertise, but it makes me very skeptical when people brag about LLMs doing very complex tasks.
  • Even the tone is inconsistent. I tried to give it personality, but after a little bit of conversation, the personality disappears, and it sounds like an obsequious (but inept) toadie who uses a thesaurus way too much. Also, frankly, I’m constantly begging my LLM to just say less and give me information more concisely, but it always goes back to being a talkative son of a bitch.

I really can’t overstate how often the LLM I use hallucinates or delivers inaccurate information and subpar results. Just as frustratingly, it doesn’t fix the problem despite my repeated prompts with explicit instructions. The LLM will promise me that it’s fixed the problem and it won’t happen again; then that same problem happens within the hour.

Literally, just yesterday, my LLM had the worst performance yet for my regular, repeated data requests, even flubbing chart conversions. The service seems to be getting worse. 

Why This Matters

I can’t trust any data analysis/data collection that entirely relies on AI or LLMs to find and compile the results. I just can’t. Recently, I listened to this Planet Money podcast about a DOGE worker who developed an AI program (in about fifteen minutes) to flag potentially wasteful contracts; I don’t know how an LLM can make judgment calls on government spending but can’t figure out whether The Four Seasons was on JustWatch for three weeks or four. (Even after I asked my LLM to spell out the workflow that got these inaccurate results, it still got the answer wrong.)

AI gets so much hype, and I have a feeling that, right now, examples of AI screwing up are getting “file drawered”, both in academia and the media. Physics researcher Nick McGreivy wrote about this for Understanding AI, describing how the hype around solving a particular problem in physics didn’t live up to the reality. But there are many more published papers (and much more funding) for AI research than for research tamping down the hype. McGreivy also mentions a materials-science example that got a lot of hype for AI’s accomplishment but failed to replicate. Another very influential materials-science paper by an MIT graduate student was retracted by MIT.

In my field, I know of two or three other data wonks who use AI/LLMs to synthesize datasets. What holds me back from doing the same is that I’m just terrified that an LLM would start making up results midway through the analysis, and I’m not sure how you’d double-check if it did. 

If corporations are using LLMs en masse, my god, I’m afraid for a lot of business data in the future. (Unless more advanced AIs in the future can go back and fix these problems.) If you can’t personally verify every piece of data found and compiled by an LLM, I’m skeptical. 

To repeat, I’m very worried that a reliance on AI is introducing data errors across our economy, and we may not be ready for that.

(To be clear, my skepticism applies to large language models specifically, not the broader “machine learning” applications, which often don’t have the same issues. The former have well-known hallucination issues, while the latter have been used for decades.)

Two Theories on AI’s Limitations

First, many mistakes are just due to how LLMs get trained. They’re trained to predict words, and many of their mistakes are just the LLM producing what “sounds” right, for lack of a better word—more precisely, the word most likely to follow the preceding text—rather than what is true. When it comes to data analysis, that’s not ideal.

Second, I think LLMs are using shortcuts to save on energy/memory use. 

Why? My LLM’s performance degraded significantly last month. It went from being able to embed links in its results to not being able to do that (was it a user-interface change?), plus it started making way more mistakes. The LLM was (by its own admission) taking a ton of shortcuts to save energy and memory.

Now, why was it taking those shortcuts? My theory is that my LLM’s company released a new photo creation update which used way too much memory/data processing, which costs a ton of money, so cost savings had to be found elsewhere, lest they burn too much money. That meant the text program suffered as a result.

I’ve noticed this with web search. Using LLMs for web search is incredibly valuable to me, but I’m guessing it also uses a lot of tokens, so the AI company is doing its best to train its LLM to search the web as little as possible.

For me, personally, the cost of AI barely pencils out. Right now, $20 a month is just on the edge of how much value this service provides me. Is it worth what I pay for it? Maybe? If it cost any more, I probably wouldn’t use it.

But should it cost more? Probably. Of course, now I’ve stumbled into the huge topic of “will AI change the world…or not?” And to properly explore that, well, I’ll need another article.

The Entertainment Strategy Guy

Former strategy and business development guy at a major streaming company. But I like writing more than sending email, so I launched this website to share what I know.
