Ganna Pogrebna

Behavioural Data Science for Movies and Media

Movies and media have been a vital part of human culture for decades, captivating audiences and leaving lasting impressions. But why do some movies become blockbusters while others fall short? In recent years, behavioural data science has provided us with insights into the emotional journeys that viewers undergo while watching movies and how these journeys correlate with the success of the film. This research made a significant mark on the movie-making industry.


A series of behavioural data science studies has explored the link between human emotions and motion picture content. These studies showed that emotions are the key to a movie's success: even the best plot in the world will not go far if the story around it fails to make an emotional connection with the viewer. Recent advances in behavioural data science allow us to understand emotions better and to use this knowledge to predict viewers' preferences. Drawing on the theory of human emotions and natural language processing methods, we explored whether, and to what extent, emotions shape consumer preferences for media and entertainment content.

We obtained over 156,000 scripts and complemented them with data on movies from the IMDb website as well as data on revenues. After a complex filtering procedure, our final dataset consisted of 6,147 movies with complete scripts, together with the following information about each movie: gross domestic revenue in the country of first release, IMDb motion picture ID number, date of release, average IMDb user satisfaction rating from 1 (very bad) to 10 (excellent), critic satisfaction meta score from 0 (very bad) to 100 (excellent), all IMDb genres of the movie, rating count, number of user reviews, number of critic reviews, number of awards, name of the motion picture director, runtime in minutes, and age appropriateness rating.

The sentiment for each movie was aggregated and plotted against the movie's running time, normalised from 0% (beginning of the movie) to 100% (end of the movie). We found that (1) all movies produced in the English language can be partitioned into 6 emotional arcs and (2) consumers tended to prefer emotional trajectories that could be described as U-shaped emotional roller-coasters.
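To make the idea of a sentiment trajectory over normalised movie time more concrete, here is a minimal Python sketch. It is not the pipeline used in the study: the tiny lexicon, the function name emotional_arc, and the sliding-window averaging are all assumptions made for illustration only.

```python
from typing import Dict, List

# Toy sentiment lexicon with made-up scores (hypothetical, for illustration only).
TOY_LEXICON: Dict[str, float] = {
    "love": 1.0, "happy": 1.0, "win": 0.8, "hope": 0.6,
    "lose": -0.8, "fear": -0.7, "dead": -1.0, "alone": -0.5,
}


def emotional_arc(
    words: List[str],
    n_points: int = 101,
    window_frac: float = 0.2,
    lexicon: Dict[str, float] = TOY_LEXICON,
) -> List[float]:
    """Mean sentiment in a sliding window, sampled at n_points positions
    spread evenly from 0% (start of the script) to 100% (end)."""
    if not words:
        raise ValueError("script is empty")
    if n_points < 2:
        raise ValueError("need at least two sample points")

    # Words missing from the lexicon are treated as neutral (score 0).
    scores = [lexicon.get(w.lower(), 0.0) for w in words]
    n = len(scores)
    window = max(1, int(n * window_frac))

    arc = []
    for i in range(n_points):
        # Word index at the i-th position on the 0%..100% axis.
        centre = round(i / (n_points - 1) * (n - 1))
        # Keep the window inside the script and at full width near the edges.
        lo = max(0, centre - window // 2)
        hi = min(n, lo + window)
        lo = max(0, hi - window)
        chunk = scores[lo:hi]
        arc.append(sum(chunk) / len(chunk))
    return arc


# Usage: a script that starts happy, turns dark, and recovers.
script = ["happy"] * 10 + ["dead"] * 10 + ["happy"] * 10
arc = emotional_arc(script, n_points=11)
```

In this toy example the arc starts high, dips in the middle and recovers at the end, which is the fall-then-rise pattern described above. A real analysis would need proper tokenisation and a validated sentiment lexicon or model.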

Six Emotional Arcs in Movies

The analysis also revealed that the highest box office revenues are associated with the Man in a Hole shape, which is characterised by an emotional fall followed by an emotional rise (resembling a U-shape). Interestingly, we found that a carefully chosen combination of production budget and genre may produce a financially successful movie with any emotional shape. For example, if you want to shoot a successful tragedy (the Riches to Rags shape), making it epic with a large budget of over 100 million dollars is a good idea. Other surprising results tell us that Sci-Fi, mystery, and thriller movies with happy endings (the Rags to Riches shape) do not do well at the box office. Also, Oedipus-shaped movies on average do not seem to do well at award ceremonies and festivals (other than the Oscars). By using sentiment analysis grounded in behavioural data science, movie makers can potentially engineer content that consumers would want to see. This shift towards data-driven decision-making can help the industry produce more successful movies and media content that resonates better with viewers.
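The arc shapes named above can be illustrated with a short Python sketch that compares a sentiment trajectory with six stylised templates and reports the closest one. This is template matching by Pearson correlation, used here as a simple stand-in and not the partitioning method of the study. The cosine templates, the names ARC_TEMPLATES, pearson and closest_arc are assumptions for illustration. Icarus and Cinderella are not named in this post; they are the labels commonly used in this literature for the remaining two arcs.

```python
import math
from typing import Dict, List, Tuple

# Each stylised template is sign * cos(k * pi * t) for t in [0, 1].
# These idealised shapes are an assumption, not fitted to any data.
ARC_TEMPLATES: Dict[str, Tuple[int, int]] = {
    "Rags to Riches": (-1, 1),  # steady rise
    "Riches to Rags": (+1, 1),  # steady fall
    "Man in a Hole": (+1, 2),   # fall, then rise (U-shape)
    "Icarus": (-1, 2),          # rise, then fall
    "Cinderella": (-1, 3),      # rise, fall, rise
    "Oedipus": (+1, 3),         # fall, rise, fall
}


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def closest_arc(arc: List[float]) -> Tuple[str, float]:
    """Return the best-matching template name and its correlation."""
    n = len(arc)
    if n < 4:
        raise ValueError("arc needs at least four points")
    best_name, best_r = "", -2.0
    for name, (sign, k) in ARC_TEMPLATES.items():
        template = [sign * math.cos(k * math.pi * i / (n - 1)) for i in range(n)]
        r = pearson(arc, template)
        if r > best_r:
            best_name, best_r = name, r
    return best_name, best_r


# Usage: a U-shaped trajectory sampled at 51 points from 0% to 100%.
u_shape = [(2 * i / 50 - 1) ** 2 for i in range(51)]
name, r = closest_arc(u_shape)
```

For the U-shaped trajectory in the usage lines, the closest template is Man in a Hole. A completely flat trajectory has zero correlation with every template, so the returned correlation should be checked before reading anything into the label.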

The goal of the project is not to slow down human creativity. It is feasible to create successful movies in any emotional arc. Yet, considering that the motion picture business is expensive and labour-intensive, scriptwriters may use the tool developed in this project to make more informed decisions about their scripts at the early stages of movie ideation. The results of the project were discussed by an expert panel at the Stockholm Film Festival.

Photo courtesy of the Stockholm International Film Festival

Work on this project is ongoing. Several spin-offs of the original tool have been developed to analyse trailers, TV series, motion picture supply chains, and more.

Selected References

Del Vecchio, M., Kharlamov, A., Parry, G., & Pogrebna, G. (2020). Improving productivity in Hollywood with data science: Using emotional arcs of movies to drive product and service innovation in entertainment industries. Journal of the Operational Research Society, 1-28.

The Emotions that Make a Film a Hit or a Miss by Ganna Pogrebna in collaboration with BBC

