In case you missed it, yesterday Netflix departed from its modus operandi of keeping all but its most successful viewership numbers under wraps. It actually published a public dataset containing all the titles with more than 100,000 viewership hours from the six-month period between January 2023 and June 2023.
“In total, this report covers more than 18,000 titles, representing 99% of all viewing on Netflix, and nearly 100 billion hours viewed,” the company wrote in a blog post announcing the new report, entitled “What We Watched: A Netflix Engagement Report.” Netflix also committed to updating and releasing the report twice a year.
Netflix counts “viewership hours” as opposed to viewers or households (though it likely also has this information) because, of course, some people may watch things more than once.
While the streamer highlighted some of the findings, I decided to download the report (it’s available on Netflix’s blog as an .xlsx file, or Excel spreadsheet file) and run it through OpenAI’s ChatGPT (using GPT-4 on a personal ChatGPT Plus subscription) to test out its data analysis capabilities of late.
Spoiler alert: ChatGPT did a decent job providing a clear, simple, if brief analysis of the data contained therein. It suffered hiccups though, presenting an error when asked to generate a chart and struggling to create what I asked for in my prompts.
Take a look at my process below. I started with a simple request: “Can you please perform a data analysis of this data?” ChatGPT dutifully complied and provided a nice description of what was contained therein.
ChatGPT also highlighted “key points” and “key insights” for me, including something that I probably wouldn’t have caught if I were looking at the data with my own (untrained) human eyes: “The ‘Release Date’ column has a significant number of missing values (13,359), which may limit certain types of time-based analyses.”
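For readers who want to check a figure like that themselves, here is a minimal sketch of counting blank entries in the release-date column. The column names and the four rows are made-up placeholders for illustration, not figures from the Netflix report, and the helper `count_missing` is hypothetical.

```python
# Hypothetical rows standing in for the spreadsheet; NOT real Netflix figures.
# The column names ("Title", "Release Date", "Hours Viewed") are assumptions.
rows = [
    {"Title": "Show A", "Release Date": "2023-01-12", "Hours Viewed": 5_000_000},
    {"Title": "Show B", "Release Date": None, "Hours Viewed": 1_200_000},
    {"Title": "Film C", "Release Date": "2019-07-04", "Hours Viewed": 300_000},
    {"Title": "Film D", "Release Date": "", "Hours Viewed": 150_000},
]


def count_missing(records, column):
    """Count records whose value for `column` is absent, None, or blank."""
    return sum(1 for r in records if not str(r.get(column) or "").strip())


missing_release_dates = count_missing(rows, "Release Date")
print(f"{missing_release_dates} of {len(rows)} titles have no release date")
```

A large count here is a warning that any chart built on release dates will silently leave out those titles.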
Intriguingly, though the first section ChatGPT gave me from its “key insights” was titled “The Top 10 Most-Watched Titles (Jan-Jun 2023)” it didn’t actually list these out. I had to ask for them separately.
I also asked for the lowest-viewed titles during this period by viewership hours, the median viewed title, and the average hours viewed and the title that was closest to this value. ChatGPT provided all of them for me.
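These requests are simple enough to reproduce by hand. The sketch below shows one way to compute them, assuming the data has been reduced to (title, hours viewed) pairs. The titles and numbers are invented for illustration, and `summarize` is a hypothetical helper, not what ChatGPT ran.

```python
import statistics

# Invented (title, hours viewed) pairs for illustration; NOT real Netflix data.
titles = [
    ("Show A", 5_000_000),
    ("Show B", 1_200_000),
    ("Film C", 300_000),
    ("Film D", 150_000),
    ("Show E", 900_000),
]


def summarize(records, top_n=2):
    """Return the most and least watched titles plus median and mean statistics."""
    ranked = sorted(records, key=lambda r: r[1], reverse=True)
    hours = [h for _, h in records]
    mean_hours = statistics.mean(hours)
    median_hours = statistics.median(hours)
    return {
        "top": ranked[:top_n],
        # Least watched first.
        "bottom": ranked[-top_n:][::-1],
        "median_hours": median_hours,
        # Title whose hours sit closest to the median value.
        "median_title": min(records, key=lambda r: abs(r[1] - median_hours))[0],
        "mean_hours": mean_hours,
        # Title whose hours sit closest to the mean value.
        "closest_to_mean": min(records, key=lambda r: abs(r[1] - mean_hours))[0],
    }


summary = summarize(titles)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With viewing figures this skewed, the mean lands well above the median, which is why asking for both is worthwhile.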
However, ChatGPT struggled when I asked it to generate a line plot showing the viewership hours for titles on a month-by-month graph (note: the dataset didn’t include this data to begin with, and only included total viewership hours of each title for the entire six-month period measured).
It initially generated an almost illegible plot that included dates going back to 2010 on its x-axis, which represented the earliest release dates among the titles in the set.
However, when I asked it to correct the error and focus only on the six-month span included in the dataset, it provided a more legible, if still ultimately misleading, plot.
Because Netflix’s data didn’t include a breakdown of how many viewership hours each title accrued per month, let alone a compiled version of total viewership hours per month, the chart actually only represents total six-month cumulative viewership hours for new titles released in each month.
The hours viewed for a title released in January, for example, actually represent all the hours it was watched during the entire January-June period.
ChatGPT is not smart enough on its own, without prompting, to figure out how to correctly label this chart so that it’s clear to the human reader what data is being presented: it’s not total hours viewed in each month, as the y-axis label suggests. Instead, it’s just the total hours viewed over the Jan-June 2023 period for all titles released in each month. That’s ultimately not a very helpful chart, unfortunately.
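To make the distinction concrete, here is a minimal sketch of the aggregation the chart appears to be built on: six-month totals grouped by each title's release month, with an axis label that says so. The rows are invented, and the names `hours_by_release_month` and `Y_AXIS_LABEL` are hypothetical. This is my reading of the chart, not ChatGPT's actual code.

```python
from collections import defaultdict
from datetime import date

# Invented (title, release date, Jan-Jun hours viewed) rows; NOT real Netflix data.
records = [
    ("Show A", date(2023, 1, 12), 5_000_000),
    ("Show B", date(2023, 1, 30), 1_000_000),
    ("Film C", date(2023, 3, 3), 300_000),
    ("Film D", date(2019, 7, 4), 150_000),  # released before the reporting window
    ("Show E", None, 900_000),              # no release date in the data
]

WINDOW_START, WINDOW_END = date(2023, 1, 1), date(2023, 6, 30)

# An honest label: these are six-month totals, bucketed by release month.
Y_AXIS_LABEL = "Total Jan-Jun 2023 hours viewed, by release month"


def hours_by_release_month(rows):
    """Sum each title's six-month hours under the month the title was released."""
    totals = defaultdict(int)
    for _title, released, hours in rows:
        # Skip titles with no release date or one outside the reporting window.
        if released is None or not (WINDOW_START <= released <= WINDOW_END):
            continue
        totals[released.strftime("%Y-%m")] += hours
    return dict(sorted(totals.items()))


totals = hours_by_release_month(records)
print(Y_AXIS_LABEL)
for month, hours in totals.items():
    print(f"{month}: {hours:,}")
```

Note that the January bucket contains every hour its titles were watched through June, and that titles with missing or older release dates drop out entirely. Neither fact is visible on the chart unless the label spells it out.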
It took me several attempts to get ChatGPT to create a useful and correctly labeled chart, going back and forth with it as it created version after version that was not quite what I asked for, until I finally got something decent.
So, while ChatGPT may be a helpful analysis partner for the casual user like myself, it still has a long way to go to being a trustworthy, reliable and intrinsically helpful data analyst.