
Digital Video and Image Analytics

Analyzing videos and images using digital tools


This LibGuide is an initial exploration of visual data capture and analysis. It is meant to help students navigate the basic tools and methodologies used in digital video and image content analysis and research.

This LibGuide was composed by Ping Feng, former graduate researcher in the Digital Scholarship Center at Temple University.

What is Cinemetrics?

Cinemetrics takes its name from an eponymous website initiated in 2005 by Yuri Tsivian, a professor in the Department of Cinema and Media Studies at the University of Chicago. The website introduces a data collection tool that facilitates the annotation of films, making it easier to collect editing data and other content analysis data (e.g. action or dialogue scenes) manually. The website also acts as a repository: it hosts a crowd-contributed database of quantified data on thousands of films from across film history, and it presents several published articles that draw extensively on that database. It is a notable resource for scholars and researchers interested in taking a quantitative approach to film studies.

Cinemetrics can thus be understood as the "statistical analysis of quantitative data, descriptive of the structure and content of films that might be viewed as aspects of style" (Baxter, 2014, p. 2).

Basic Terms for Quantitative Analysis of Digital Videos

Baxter's Notes on Cinemetric Data Analysis (2014) defines basic terms that are frequently used in the quantitative analysis of films and digital videos as follows:

Frame rate: The number of frames or images that are projected or displayed per second, usually expressed as FPS (frames per second).

Shot: The footage between one cutting point and the next, as recorded in the edit decision list during digital post-production.

Shot Length:

  • SL: Shot Length
  • ASL: Average Shot Length
  • MSL: Median Shot Length
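As a minimal sketch of how these shot length measures relate to frame rate and cutting points, the Python below converts a list of cut positions into shot lengths and then computes ASL and MSL. The cut positions, the 24 fps frame rate, and the function name `shot_lengths` are illustrative assumptions, not data or code from Baxter (2014) or the Cinemetrics website.

```python
from statistics import mean, median


def shot_lengths(cut_frames, fps):
    """Convert cut points (frame numbers) into shot lengths in seconds.

    Each shot runs from one cut point to the next, so n cut points
    yield n - 1 shot lengths.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [(end - start) / fps for start, end in zip(cut_frames, cut_frames[1:])]


# Hypothetical cut points, in frames, for a short clip at 24 fps.
cuts = [0, 48, 120, 144, 360]

lengths = shot_lengths(cuts, fps=24)  # SL of each shot, in seconds
asl = mean(lengths)                   # ASL: average shot length
msl = median(lengths)                 # MSL: median shot length

print("Shot lengths (s):", lengths)
print("ASL (s):", asl)
print("MSL (s):", msl)
```

In this made-up clip one long shot pulls the ASL well above the MSL, which is one reason the two measures are often reported together.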

Shot Scale:

  • BCU: Big Close-up
  • CU: Close-up
  • MCU: Medium Close-up
  • MS: Medium Shot
  • MLS: Medium Long Shot
  • LS: Long Shot
  • VLS: Very Long Shot

"Hard Data" versus "Soft Data" of Digital Video Analysis

In text analysis of literature, an author's language and narrative style can be evaluated by analyzing the frequency of key words, the density and distribution of those key words, variation in sentiment, and so on. Digital video and film offer elements that can be analyzed in similar ways. An editor's or director's editing style can be analyzed through data such as the total shot count, the distribution of shot lengths, frequently appearing images, color, and so on.
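To make the idea of a shot length distribution concrete, here is a small sketch that groups shot lengths into fixed-width bins and prints a text histogram. The sample lengths, the 2-second bin width, and the function name `shot_length_histogram` are assumptions chosen for illustration only.

```python
from collections import Counter


def shot_length_histogram(lengths, bin_width=2.0):
    """Count shots per bin; keys are (low, high) bounds in seconds."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter(int(length // bin_width) for length in lengths)
    return {
        (b * bin_width, (b + 1) * bin_width): counts[b]
        for b in sorted(counts)
    }


# Hypothetical shot lengths in seconds.
sample_lengths = [1.0, 1.5, 2.0, 3.0, 3.5, 9.0]

hist = shot_length_histogram(sample_lengths)
for (low, high), count in hist.items():
    # One '#' per shot whose length falls in [low, high).
    print(f"{low:4.1f}-{high:4.1f} s: {'#' * count}")
```

A film cut mostly in short shots would show its counts piled into the lowest bins, while a slower editing style would spread them toward the higher ones.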

Quantitative analysis of digital video and image materials resembles text analysis in many ways, but it can be more complex because these materials incorporate many dimensions of data at once.

Generally speaking, I would divide the metadata of digital video and image materials into two types: hard data and soft data. “Hard data” is usually associated with production-related information generated at the time the digital videos or images were created, for example the production time, location, material size, run time, frame rate, resolution, number of shots, average shot length (ASL), theme color, and so on. This type of data is typically regarded as comparatively objective and can be used directly in descriptive statistical analysis.

In contrast, “soft data” is usually associated with the content of the digital videos. For example, you could categorize the content of a video into dialogue, action, and silence, or recognize and transcribe its speech and image content into text descriptions. Because “soft data” is qualitative in nature, it is subject to a certain level of interpretation and is therefore comparatively subjective. This type of data is not natively available, so a secondary step of data extraction and analysis is usually required. However, once the categories of “soft data” are defined, its precision depends largely on the speech and image recognition technology used, so it also has an objective side.
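As a rough illustration of how such content categories can be turned into numbers, the sketch below takes a list of manually annotated segments and computes the share of run time that falls into each category. The segment times, the labels, and the function name `category_shares` are hypothetical and would in practice come from your own annotation or recognition step.

```python
from collections import defaultdict

# Hypothetical annotations: (start_seconds, end_seconds, category).
segments = [
    (0.0, 12.0, "dialogue"),
    (12.0, 20.0, "action"),
    (20.0, 24.0, "silence"),
    (24.0, 40.0, "dialogue"),
]


def category_shares(segments):
    """Return each category's share of the total annotated run time."""
    totals = defaultdict(float)
    for start, end, label in segments:
        if end < start:
            raise ValueError("segment ends before it starts")
        totals[label] += end - start
    run_time = sum(totals.values())
    if run_time == 0:
        return {}
    return {label: seconds / run_time for label, seconds in totals.items()}


shares = category_shares(segments)
for label, share in shares.items():
    print(f"{label}: {share:.0%} of annotated run time")
```

The interpretive work lies in deciding the categories and segment boundaries; once those are fixed, the arithmetic is as mechanical as it is for hard data.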

Which type of data to select, or how to combine the two, depends on the research question. Whichever type is used, both serve the purpose of better understanding the video and image materials. The following tabs introduce useful tools and methods for extracting and visualizing digital video data in support of content analysis.
