✦ SHADES ✦

How does a movie's genre impact its box office revenue?

The movie Shades aims to provide an understanding of whether certain genres are associated with higher revenues and how the relationship between genre and revenue evolves over time. By examining trends, correlations, and revenue distributions, we uncover key insights into the dynamics of movie genres and their box office success.


Setting the Scene

Let's explore the relevant aspects of our dataset. In this analysis, we will focus on attributes like the list of genres and the release year of the movies. There are 316 unique genres in the dataset. The top 20 most common genres are shown in the plot below and include Drama, Comedy, Adventure, Romance Film, Action, Thriller, and others. Interestingly, 97.24% of movies (9243 out of 9505) have at least one genre from the top 20 most common ones.
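
For readers who want to reproduce this kind of count, here is a minimal sketch on toy data. The `movies` list, its titles, and the helper `genre_coverage` are hypothetical stand-ins rather than the code behind the figures above, and `top_n` is set to 2 instead of 20 to suit the tiny sample.

```python
from collections import Counter

# Hypothetical toy data: each movie carries a list of genre labels.
movies = [
    {"title": "A", "genres": ["Drama", "Comedy"]},
    {"title": "B", "genres": ["Drama", "Thriller"]},
    {"title": "C", "genres": ["Comedy"]},
    {"title": "D", "genres": ["Space Opera"]},
]

def genre_coverage(movies, top_n):
    """Return (unique genre count, top-n genres, share of movies with a top-n genre)."""
    # Count each genre at most once per movie.
    counts = Counter(g for m in movies for g in set(m["genres"]))
    top = [g for g, _ in counts.most_common(top_n)]
    # A movie is covered if it has at least one of the top-n genres.
    covered = sum(1 for m in movies if set(m["genres"]) & set(top))
    return len(counts), top, covered / len(movies)

n_unique, top_genres, share = genre_coverage(movies, top_n=2)
print(n_unique, sorted(top_genres), f"{share:.2%}")
```

With the real dataset, the same three quantities would correspond to the number of unique genres, the top 20 list, and the coverage percentage quoted above.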

As you might have already guessed, the movies in our dataset are often categorized under multiple genres. But what percentage of movies have more than one genre, and what does the distribution of the number of genres per movie look like? Let's take a closer look.

The chart above shows that multi-genre movies are very common in our dataset. In fact, 94.19% of the movies have more than one genre. The majority of movies are assigned between 3 and 6 genres, with the highest counts observed for movies with 4 and 5 genres. We notice a steady decrease in the number of movies with more than 6 genres, indicating that it is uncommon for a movie to be categorized into a very high number of genres, for example 10 or more.
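
The distribution described here can be computed along the following lines. This is only a sketch: `genre_lists` is made-up toy data and `genres_per_movie` is a hypothetical helper, not the original analysis code.

```python
from collections import Counter

# Hypothetical genre lists, one per movie (toy data, not the real dataset).
genre_lists = [
    ["Drama"],
    ["Drama", "Comedy"],
    ["Action", "Adventure", "Fantasy"],
    ["Action", "Thriller", "Crime Fiction"],
    ["Comedy", "Romance Film", "Drama", "Indie"],
]

def genres_per_movie(genre_lists):
    """Return ({number of genres: number of movies}, share of multi-genre movies)."""
    sizes = [len(set(genres)) for genres in genre_lists]
    histogram = dict(sorted(Counter(sizes).items()))
    multi_share = sum(1 for s in sizes if s > 1) / len(sizes)
    return histogram, multi_share

histogram, multi_share = genres_per_movie(genre_lists)
print(histogram)             # number of genres -> number of movies
print(f"{multi_share:.2%}")  # share of movies with more than one genre
```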


Genre and Box Office Revenue: Exploring the Connection

How does genre impact the box office success of a movie? Our goal is to uncover whether certain genres are linked with higher earnings. We will explore the top 20 genres with the highest average box office revenues.

The chart above lists Space Opera, Demonic Child, and Roadshow Theatrical Release as the highest-earning movie genres on average. This finding is unusual because none of the top 20 highest-earning genres are among the top 20 most common genres in the dataset. By hovering over the bars in the interactive plot, you can view the number of movies associated with each genre. For instance, Space Opera and Demonic Child each have just 5 movies, while genres like Coming-of-Age Film and Archeology have only 1. So, what’s going on here?

Many of these high-revenue genres, such as Space Opera and Demonic Child, have very few movies in the dataset. Hence, their high average revenues are likely driven by outliers. For example:

  • Space Opera includes Star Wars. (Yes, Star Wars, the saga that made “May the Force be with you” a cultural phenomenon and turned galaxies far, far away into box office gold.)
  • Demonic Child includes The Exorcist, the terrifying classic that still sends chills down spines.

Given the multi-genre nature of our dataset, these movies are likely listed under other, more common genres whose average revenues are lower. For instance:

  • Star Wars is also classified under Fantasy, Action, Science Fiction, and Adventure.
  • The Exorcist is also categorized under Horror and Drama.
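
To see how a single hit can inflate the average of a small genre while barely standing out in a larger one, consider the sketch below. All revenue figures are invented for illustration, and `mean_revenue_by_genre` is a hypothetical helper; we assume each movie counts once toward every genre it is listed under.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (genres, revenue in USD) pairs; the figures are made up.
movies = [
    (["Space Opera", "Science Fiction"], 700_000_000),
    (["Science Fiction"], 40_000_000),
    (["Science Fiction", "Action"], 60_000_000),
    (["Action"], 50_000_000),
    (["Action", "Drama"], 30_000_000),
]

def mean_revenue_by_genre(movies):
    """Map each genre to (mean revenue, number of movies)."""
    revenues = defaultdict(list)
    for genres, revenue in movies:
        for genre in set(genres):  # a movie contributes to each of its genres
            revenues[genre].append(revenue)
    return {g: (mean(r), len(r)) for g, r in revenues.items()}

stats = mean_revenue_by_genre(movies)
for genre, (avg, n) in sorted(stats.items(), key=lambda kv: -kv[1][0]):
    print(f"{genre:<16} mean=${avg / 1e6:6.1f}M  movies={n}")
```

In this toy sample the one-movie genre tops the ranking, while the same film is averaged with two smaller releases under Science Fiction. Reporting the movie count next to each mean makes such cases easy to spot.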

To better understand revenue trends, the analysis will now focus on the top 20 most common genres, which, as noted earlier, account for 97.24% of the movies in the dataset. Let us take a look.

Here we show the average box office revenue for the top 20 most common genres in our dataset. We observe that the Fantasy and Family Film genres have the highest average revenues, exceeding $175M. Science Fiction, Adventure, and Action also show strong average revenues, possibly due to their popularity in big-budget blockbusters. On the other hand, genres such as World Cinema and Indie have much lower average revenues, below $25M.

While the average box office revenue provides an insight, it does not fully capture the distribution of revenues across genres. The boxplots below will offer a more detailed view.

The boxplots show the distribution of log-transformed revenues. The log transformation stabilizes the variance and reduces the skewness caused by extreme values in the revenue data. The plots reveal that many genres have wide revenue ranges with noticeable outliers, i.e., movies that earned significantly more or less than others in the same genre.
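
The effect of the log transformation can be illustrated with a small sketch. The revenues are invented, and the base-10 logarithm is our assumption, since the text does not say which base was used.

```python
import math
from statistics import mean, median, quantiles

# Hypothetical revenues in USD for one genre; one blockbuster dominates.
revenues = [2e6, 5e6, 8e6, 1.2e7, 2e7, 3.5e7, 9e8]

# Assumes strictly positive revenues; base 10 is an assumption.
log_revenues = [math.log10(r) for r in revenues]

# On the raw scale the mean sits far above the median;
# on the log scale the two are much closer.
print(f"raw:   mean={mean(revenues):.3g}, median={median(revenues):.3g}")
print(f"log10: mean={mean(log_revenues):.2f}, median={median(log_revenues):.2f}")

# Quartiles of the log revenues, roughly the box edges and center line of a boxplot.
q1, q2, q3 = quantiles(log_revenues, n=4, method="inclusive")
print(f"log10 quartiles: {q1:.2f}, {q2:.2f}, {q3:.2f}")
```

The gap between mean and median shrinks considerably after the transformation, which is why the boxplots are easier to compare across genres on the log scale.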

To quantify the influence of the top 20 genres on box office revenue, we also built a linear regression model. The model explains 19.7% of the variance in box office revenue (R-squared), indicating that genres alone are only modest predictors of revenue. Genres with significant positive coefficients included Fantasy, Family Film, Adventure, Musical, Action, and Thriller, all associated with higher revenue. On the other hand, genres with significant negative coefficients included Indie, World Cinema, and Crime Thriller, which are associated with lower revenue.
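
As a rough illustration of this kind of model, the sketch below fits ordinary least squares on 0/1 genre indicators and reports R-squared. The specification is our assumption: the text only says "linear regression", so the log-scale target, the toy data, and the helpers `solve` and `fit_ols` are all hypothetical. It uses only the standard library to stay self-contained.

```python
# Hypothetical toy data: genre lists and log10 revenues (made-up values).
movies = [
    (["Fantasy", "Adventure"], 8.6),
    (["Fantasy"], 8.3),
    (["Adventure", "Action"], 8.1),
    (["Action"], 7.8),
    (["Indie", "Drama"], 6.4),
    (["Indie"], 6.1),
    (["Drama"], 7.0),
    (["Drama", "Action"], 7.6),
]
genres = ["Fantasy", "Adventure", "Action", "Indie"]  # dummy-coded predictors

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(movies, genres):
    """Fit y = b0 + sum(b_g * has_genre_g); return (coefficients, R-squared)."""
    X = [[1.0] + [1.0 if g in gs else 0.0 for g in genres] for gs, _ in movies]
    y = [target for _, target in movies]
    k = len(genres) + 1
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return dict(zip(["Intercept"] + genres, beta)), 1 - ss_res / ss_tot

coefficients, r_squared = fit_ols(movies, genres)
for name, value in coefficients.items():
    print(f"{name:<10} {value:+.3f}")
print(f"R-squared: {r_squared:.3f}")
```

A positive coefficient means that movies carrying the genre tend to earn more than otherwise similar movies without it, and a negative coefficient means they tend to earn less. The sketch does not compute significance tests.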


Genre and Box Office Revenue: Trends over Time

We want to understand how the box office revenue trends for different genres evolved over time. Were there time periods when some genres were more profitable than others? Let's take a closer look.

The heatmap above shows the average box office revenue per genre over time, masking the years with no data. Clear trends are difficult to observe due to a few visible outliers that skew the distribution. Let’s explore this further.

Revenue spikes are visible for genres like Family Film, Fantasy, and Period Piece in specific years, likely due to outliers:

  • The revenue spikes in 1937 in Family Film and Fantasy are due to Snow White and the Seven Dwarfs.
  • The revenue spike in 1939 in Period Piece is due to Gone with the Wind.

It’s difficult to determine the most profitable genres over the years based on the previous visualization. What was the most profitable genre for each decade? Let’s find out.

This interactive chart visualizes the ranking of movie genres based on their average box office revenue per decade. The horizontal bars represent the average revenue for each genre, with longer bars indicating higher revenue. The timeline slider below allows you to explore changes in genre rankings over time. As you move the slider across decades (from 1910 to 2010), the chart will update to show how the profitability of genres has evolved.

We can see that Fantasy and Family Films have consistently dominated since the 1930s, maintaining top rankings for multiple decades. Science Fiction gained significant relative success starting in the 1970s. Indie and World Cinema genres consistently ranked at the bottom, showing lower average box office revenue over time.
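
A per-decade ranking like the one in the chart can be sketched as follows. The records are invented, and `rank_genres_by_decade` is a hypothetical helper; we assume a movie's decade is its release year rounded down to the nearest ten.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, genres, revenue in USD) records; values are illustrative.
movies = [
    (1937, ["Family Film", "Fantasy"], 180_000_000),
    (1939, ["Drama"], 20_000_000),
    (1977, ["Science Fiction", "Fantasy"], 300_000_000),
    (1979, ["Drama"], 60_000_000),
    (1979, ["Science Fiction"], 100_000_000),
]

def rank_genres_by_decade(movies):
    """Return {decade: [(genre, mean revenue), ...]}, highest mean first."""
    buckets = defaultdict(lambda: defaultdict(list))
    for year, genres, revenue in movies:
        decade = year // 10 * 10  # e.g. 1937 -> 1930
        for genre in set(genres):
            buckets[decade][genre].append(revenue)
    return {
        decade: sorted(
            ((g, mean(r)) for g, r in by_genre.items()),
            key=lambda item: (-item[1], item[0]),  # ties broken alphabetically
        )
        for decade, by_genre in sorted(buckets.items())
    }

rankings = rank_genres_by_decade(movies)
for decade, ranking in rankings.items():
    print(decade, ranking)
```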

A logical follow-up question arises: What is the relationship between the number of movies produced over the years and the average box office revenue per genre? We want to explore whether periods with higher box office revenues for genres are associated with increased production. Let’s find out.

This interactive line chart visualizes the average box office revenue (in pink) and the number of movies produced (in green) for a selected genre over time. The horizontal axis represents the years, while the two vertical axes correspond to the respective metrics: average revenue (on the left, in dollars) and movie count (on the right).

As we can observe, the late 1970s and early 1980s stand out as a period of particularly high average box office revenue across most genres, suggesting a potential "golden age" for certain types of films. In more recent decades, production volume has increased across many genres. However, higher production does not automatically translate into higher average revenue (more isn't always more). To understand this relationship better, we can explore the correlation between the number of movies produced and the average box office revenue per genre. Let's have a look at the correlation plot below.

This bar chart shows the correlation coefficients between the number of movies produced in each genre and their average box office revenue over time. The Pearson correlation measures the strength of a linear relationship, while the Spearman correlation is computed on ranks and measures how consistently one variable increases or decreases with the other (a monotonic relationship).

From the plot, genres like Mystery, Science Fiction, and Action show positive Pearson and Spearman correlations of moderate strength (for example, Spearman values above 0.3). This suggests that periods with more movies in these genres also tend to have higher average revenue, although correlation alone does not show that producing more movies increases revenue. In contrast, Indie films have negative Pearson and Spearman correlations of moderate strength (for example, Pearson around -0.36), indicating that average revenue tends to decline as more movies in this genre are produced.

Several genres, such as Fantasy, Crime Fiction, and Drama, have a positive Spearman correlation but a negative Pearson correlation. The strength of these correlations is generally weak to moderate, with Spearman values slightly above 0 and Pearson values below 0. This mixed result suggests that the relationship is not well described by a straight line: Pearson, which is sensitive to extreme values, comes out negative, while the rank-based Spearman still picks up a weak increasing trend. This may reflect more complex patterns, such as diminishing returns or varying movie success.
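
To make the Pearson versus Spearman contrast concrete, here is a small sketch with hand-written implementations of both coefficients. The two series are invented so that a single extreme year flips the sign of Pearson while the rank-based Spearman stays positive; they are not taken from the dataset.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based), so tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical yearly series for one genre: movies produced and
# average revenue (in millions of USD). The last year is an extreme case.
movie_counts = [1, 2, 3, 4, 5, 6, 50]
avg_revenue = [20, 30, 40, 50, 60, 70, 5]

print(f"Pearson:  {pearson(movie_counts, avg_revenue):+.2f}")
print(f"Spearman: {spearman(movie_counts, avg_revenue):+.2f}")
```

Six of the seven years follow an increasing pattern, which the ranks preserve, but the one year with very high production and very low average revenue dominates the linear fit.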


The Final Scene: Bringing It All Together

What did we discover from all this analysis?

After checking out the plots and digging into the stats, here are the main things we’ve learned:

  • The top 20 most common genres, like Drama, Comedy, and Adventure, dominate the movie market, accounting for 97.24% of all movies in the dataset.
  • The genres with the highest average revenues, such as Space Opera and Demonic Child, contain very few movies, so their impressive averages are driven by a handful of outlier hits rather than by the genre as a whole.
  • Among the most common genres, Fantasy, Family Film, and Science Fiction stand out with the highest average box office revenues, often tied to large-scale, big-budget productions.
  • Correlations reveal interesting trends: genres like Mystery and Science Fiction show positive relationships between movie production and revenue, while Indie films exhibit a negative correlation.