Gollum, Caesar, Thanos: The Algorithmic Body Across Twenty Years of Motion Capture

“We wants it. We needs it. Must have the precious.”

It is 2002, and this now-iconic line has just been croaked by gangly ring-fetishist Gollum in Peter Jackson’s The Lord of the Rings: The Two Towers. Or rather, this line has been uttered by Gollum in dialogue with his more benevolent split personality, Smeagol, through a cunning pattern of 180-degree continuity editing. Or again, these words have, at some point, been uttered by actor Andy Serkis, dressed in a Lycra suit fitted with numerous dot-like markers to enable the tracking of motion by several cameras, on location in southern New Zealand (and later in a pure blue post-production studio).¹ Gollum was recorded as data about movements and gestures that, once melded with a computationally generated body, became a character that coexists with compelling verisimilitude alongside the human actors of the scene.

Production still showing Andy Serkis in MoCap suit as Gollum/Smeagol in The Lord of the Rings: The Two Towers (Peter Jackson, 2002)

Gollum signified a pioneering development in computer-generated imagery in the cinema, in terms of both technological innovation and the opening-up of new formal possibilities. In the twenty years since Gollum appeared on screen, the first time a virtual character was “filmed” on location, motion capture (“MoCap”) has become one of the central techniques of CGI, and, it can be argued, one of the distinctive aesthetics of post-Millennial Hollywood filmmaking.² MoCap characters like Gollum signify a convergence of analog and digital modes, a grafting together of a human body that has moved through tangible space, and an algorithmic body that has been fleshed out through code. The creation of any MoCap character involves various forms of intensive technical, artistic, and computational labour.

Demonstrating that the preoccupation with anatomy and movement has long been at the heart of the persistent entwinement between cinema and scientific enquiry, the history of motion capture predates that of cinema itself, beginning with the proto-cinematic experiments of figures such as Eadweard Muybridge and Étienne-Jules Marey, well-known pioneers of the method of chronophotography. Rigging up to 24 cameras with electrically triggered shutters, Muybridge conducted studies of the movement of horses, birds, and men from 1878 onwards, according to which gestures were fragmented into sequential parts in individual images.³ Slightly later, using a circular photographic plate with a trigger and lens mounted on a gun, Marey captured the continuity of a body in movement across the unified space of a single image, rendered as reflective panels and markers illuminating the movement of a subject’s skeleton through space.⁴ Marey’s experiments bear a striking resemblance to the data-tracking images gleaned by MoCap today.

Étienne-Jules Marey, Man Walking (c. 1880)

Typical MoCap frame model

Some time later, the analog predecessor of digital MoCap as a cinematic special effect emerged in the early 20th century with rotoscoping, a laborious animation technique invented by Betty Boop creator Max Fleischer in 1915 that involved tracing human movement by hand, frame by frame, from a film strip into drawings.⁵ Rotoscoping remained a part of animation production for several decades and was used in many Disney productions such as Peter Pan (Clyde Geronimi, 1953). From the mid-twentieth century onwards, experiments in motion capture for applications in cinema and beyond were undertaken in laboratories around the world, including the development of a “graphical marionette” by the Massachusetts Institute of Technology Architecture Machine Group in 1983.⁶ Today, MoCap has a range of applications beyond cinema, including in orthopaedic medicine and mechanical engineering. Adjacent to cinema, MoCap is commonly used in the production of video games, a practice beginning with Prince of Persia in 1989.⁷

Production still from Peter Pan (Clyde Geronimi, 1953)

Hollywood’s experiments with digital motion capture techniques began in the early 1990s, among them a failed attempt to use nascent MoCap to animate Arnold Schwarzenegger as an x-rayed skeleton in Total Recall (Paul Verhoeven, 1990) and a successful one to set a priest on fire in a single shot of The Lawnmower Man (Brett Leonard, 1992).⁸ In the years following the appearance of Gollum – the second virtual character to be created through digital MoCap, following the much-detested Jar Jar Binks in The Phantom Menace (George Lucas, 1999) – numerous films made use of MoCap with varied and at times inauspicious outcomes.

In 2004, Robert Zemeckis’s Christmas-themed film The Polar Express transliterated actors such as Tom Hanks into an animated feature where all characters were created through motion capture. The Polar Express was criticised as a prime example of the uncanny valley at work in CGI, described by Peter Travers in Rolling Stone as “a failed and lifeless experiment.”⁹ Despite this, the technique of producing an animated film through MoCap was once again used in Beowulf (Robert Zemeckis, 2007) to graft a serpentine tail onto an otherwise human-appearing Angelina Jolie as Grendel’s Mother, as well as in the production of The Adventures of Tintin (Steven Spielberg, 2011). Interestingly, the use of MoCap as a technique for animation dwindled after these early efforts, giving way to a prevailing use of the technology for creating CG characters within otherwise live-action films. MoCap technology can be put towards all sorts of purposes in film, including stunt work (a strategy used for some action sequences in The Wolverine (James Mangold, 2013)), creating multiples of an actor (such as in Gemini Man (Ang Lee, 2019), featuring a younger/older Will Smith duality), and, theoretically, bringing actors back from the dead (evidenced by an as yet unrealized 2018 proposal to cast the late James Dean in a new film through MoCap).¹⁰ Since the release of Avatar in 2009, however, cinematic uses of MoCap seem to have gradually been streamlined in the service of a unified aesthetic and narrative logic, whereby the technology has been consistently devoted to the conjuring of animal, alien, or magical characters in photorealistic style within big-budget science fiction and fantasy franchise films.

The reasons for this formal hegemony may be connected to broader industrial configurations. Practically speaking, since the turn of the 21st century, CGI production in Hollywood has coalesced around a handful of major companies, among which Wētā FX, based in Wellington, New Zealand, has emerged as the predominant “creature effects” company. Following the creation of Gollum, Wētā has engineered MoCap characters such as the genetically enhanced, horse-riding simian Caesar in the Planet of the Apes trilogy (Rupert Wyatt, 2011; and Matt Reeves, 2014 and 2017), the disfigured evil clone Supreme Leader Snoke in Star Wars: The Force Awakens (J.J. Abrams, 2015), the Na’vi aliens of James Cameron’s Avatar series (2009 and 2022), and the eggplant-hued antagonist Thanos from Avengers: Infinity War (Anthony and Joe Russo, 2018), among numerous others. Julie Turnock argues that while film critics and theorists have generally considered the development of CGI as informed by an impulse towards “a commonly held notion of effects photorealism,” CGI has historically been shaped by the “in-house style” and techniques of the company Industrial Light and Magic in terms of lighting, texture, and scale.¹¹ A similar argument could be made about the predominance of Wētā FX in the field of MoCap, whereby the company’s amassing of decades of research and expertise, technological infrastructure, and working connections with numerous production houses has contributed to the establishment of overarching aesthetic modes and applications. More pragmatically, the look of MoCap characters in recent films rests, again and again, on the replication of code developed by Wētā FX.

There are two central techniques for MoCap. The first involves tracking a performer’s body through a suit and/or helmet, while the second is called “markerless” MoCap, entailing the algorithmic analysis of movements recorded on video, such as the study of an athlete’s movements in sports medicine.¹² In suit-based MoCap, there are two strategies for recording movement, either “optical” or “inertial.” Optical MoCap is a process by which markers placed on a performer’s joints and muscle areas are tracked by several infrared cameras. Information about the movement of the body in three-dimensional space is reconstructed from the data yielded by these two-dimensional visual sources.¹³ Similar principles apply to MoCap of the face, with the exception that reflective markers (sometimes hundreds, as in the case of the making of Thanos) are attached directly to the performer’s skin, and recorded by an individual camera mounted on a headpiece.¹⁴ In contrast, inertial MoCap does not require a camera, instead using what are termed “inertial measurement units” (IMUs), including gyroscopes and accelerometers, to record movement. Thus far, use of inertial MoCap in filmmaking has been largely experimental.
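The step from two-dimensional camera views to a three-dimensional position can be illustrated with a minimal Python sketch. It is not drawn from any studio pipeline: the `Camera` class, its idealised pinhole model looking along the z-axis, and the `triangulate` function are all assumptions made for illustration. Each camera turns a marker’s image coordinates into a ray, and the marker is estimated as the midpoint of the shortest segment between two such rays.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]
Vec2 = tuple[float, float]


@dataclass(frozen=True)
class Camera:
    """Hypothetical ideal pinhole camera at `center`, looking along +z."""
    center: Vec3
    focal: float = 1.0

    def project(self, point: Vec3) -> Vec2:
        """Map a 3D point (in front of the camera) to 2D image coordinates."""
        x, y, z = (p - c for p, c in zip(point, self.center))
        return (self.focal * x / z, self.focal * y / z)

    def ray(self, uv: Vec2) -> Vec3:
        """Direction of the ray from the camera centre through an image point."""
        return (uv[0], uv[1], self.focal)


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def triangulate(cam1: Camera, uv1: Vec2, cam2: Camera, uv2: Vec2) -> Vec3:
    """Estimate a marker position as the midpoint between two viewing rays."""
    d1, d2 = cam1.ray(uv1), cam2.ray(uv2)
    w = tuple(a - b for a, b in zip(cam1.center, cam2.center))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the marker cannot be located")
    s = (b * e - c * d) / denom  # distance parameter along the first ray
    t = (a * e - b * d) / denom  # distance parameter along the second ray
    p1 = tuple(o + s * k for o, k in zip(cam1.center, d1))
    p2 = tuple(o + t * k for o, k in zip(cam2.center, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


if __name__ == "__main__":
    left = Camera(center=(0.0, 0.0, 0.0))
    right = Camera(center=(1.0, 0.0, 0.0))
    marker = (0.5, 0.2, 2.0)  # invented marker position for the demo
    print(triangulate(left, left.project(marker), right, right.project(marker)))
```

A working system would rely on many calibrated cameras and would also have to decide which marker is which from frame to frame; the sketch leaves all of that out.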

Production still of Josh Brolin as Thanos in Avengers: Infinity War (Joe and Anthony Russo, 2018)

Significant technological advances in MoCap over the past several years include the rendering of highly detailed crowd scenes in Dawn of the Planet of the Apes (Matt Reeves, 2014), the MoCap of actors rendered in varied scale for The BFG (Steven Spielberg, 2016), and the rendering in real time, including intricate facial data capture, of Josh Brolin’s performance as Thanos in Avengers: Infinity War.¹⁵ The aspect of performance intrinsic to MoCap has been much mythologized, and “making of” featurettes and images of actors moving in sync with their digital counterparts have been a key part of the marketing strategies of creature-effects-laden films, for example, the well-circulated footage of Benedict Cumberbatch performing perhaps overzealously in full facial-capture rig as the eponymous dragon in The Hobbit: The Desolation of Smaug (Peter Jackson, 2013).¹⁶ As Lisa Bode describes in her book Making Believe: Screen Performance and Special Effects in Popular Cinema, this discourse is ideologically motivated, and functions in two ways. On the one hand, the emphasis on performance works to assuage audience anxiety about technological shifts in film production. By emphasising the fact that a trustworthy actor (such as Andy Serkis, now well-known for his work as Gollum, Caesar, and Snoke) has performed in front of a camera at some stage, the computational otherness of the creature on screen is reassuringly humanised. On the other hand, these “before and after” images that studios circulate are an act of misdirection, accentuating the magic of the technology while concealing the laborious, lengthy – and, in terms of the common practice of outsourcing elements of effects production labour to low-wage countries, potentially exploitative – process of special effects production.¹⁷

Benedict Cumberbatch performing as Smaug in The Hobbit: The Desolation of Smaug (Peter Jackson, 2013)

The lifecycle of every MoCap character involves a complex journey between the recording of performed movement by an actor, the calculation of binary-code algorithms simulating a body that is imaginative yet anatomically plausible, and the transformation of this model of a body into pixels on a screen. When engineers design CG bodies such as Gollum within the interface of computer software, this process begins with a mathematical modelling of how different components of the creature behave in three-dimensional space according to a system of partial differential equations. One of the most significant techniques for the building of MoCap characters involves a system of equations that Aleka McAdams terms “deformable object simulations,” which work to model the dynamics of moveable, organic objects such as skin, hair, and cloth according to different environmental scenarios.¹⁸ Generally, MoCap creatures are designed according to a layered approach; in order for the surface of the creature, whether in the form of skin, fur, or scales, to move and interact with the non-CG cinematic environment in a plausible fashion, underlying layers of the body, including the skeletal system, musculature, subcutaneous tissue, and dermis, are initially modelled to provide a base for what will ultimately be visible.¹⁹ Once the MoCap character has been encoded, the final stage of production involves transforming this code into moving images through the use of render farms, large-scale computational infrastructures that require immense electrical and hardware capacity, a process which can take many months to complete.²⁰
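As a rough illustration of what simulating a deformable body involves, the Python sketch below steps a hanging chain of point masses joined by springs forward in time. It is a deliberately simplified stand-in, not the partial-differential-equation methods cited above and not Wētā FX’s tissue system; the function name `simulate_chain` and every parameter value are assumptions chosen for illustration.

```python
def simulate_chain(
    n: int = 3,               # number of free point masses below the anchor
    mass: float = 1.0,        # mass of each point (illustrative value)
    stiffness: float = 100.0, # spring constant (illustrative value)
    rest: float = 1.0,        # rest length of each spring
    damping: float = 2.0,     # velocity damping, so the chain settles
    gravity: float = 9.81,
    dt: float = 0.001,        # time step
    steps: int = 20000,
) -> list[float]:
    """Return the final downward positions of a damped mass-spring chain.

    Node 0 is a fixed anchor at position 0; positions grow downward.
    Integration uses semi-implicit (symplectic) Euler steps.
    """
    y = [i * rest for i in range(n + 1)]  # start with unstretched springs
    v = [0.0] * (n + 1)
    for _ in range(steps):
        forces = [0.0] * (n + 1)
        for i in range(1, n + 1):
            force = mass * gravity - damping * v[i]
            # The spring above pulls the node back up when stretched.
            force -= stiffness * (y[i] - y[i - 1] - rest)
            # The spring below (if any) pulls the node down when stretched.
            if i < n:
                force += stiffness * (y[i + 1] - y[i] - rest)
            forces[i] = force
        for i in range(1, n + 1):
            v[i] += dt * forces[i] / mass  # update velocity first
            y[i] += dt * v[i]              # then position, using new velocity
    return y


if __name__ == "__main__":
    print(simulate_chain())
```

The chain stretches under its own weight and comes to rest, each spring extended in proportion to the mass hanging beneath it. Layered creature models work at a very different scale and in three dimensions, but the underlying idea of advancing positions and velocities from computed forces, step by step, is comparable.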

Production still from The Hobbit: An Unexpected Journey (Peter Jackson, 2012) showing the subdermal layers of Gollum

The aesthetic and perceptual effects of MoCap springing from this techno-industrial process are multifaceted. As Kristen Whissel reminds us, contrary to popular criticism, CGI is not always a matter of narratively disruptive, “attention seeking spectacle.”²¹ Rather, Whissel proposes the notion of the “effects emblem,” whereby special effects appear prominently “at key turning points in a film’s narrative to emblematize the major themes, desires, and anxieties with which a film (or a group of films) is obsessed.”²² A similar argument can be made for the interpretative possibilities offered by MoCap characters across a broad body of films. In the films discussed so far, the meaning of nonhuman MoCap characters extends beyond spectacle. Instead, their appearance frequently thematizes technological fantasies and foregrounds nonhuman experiences, ecological concerns, or the effects of prosthesis on subjectivity and the body. Such examples include the human consciousness transplanted into an alien host in Avatar; a normal ape technologically enhanced to superintelligence while humans are rendered extinct by a genetically engineered disease in Dawn of the Planet of the Apes; Thanos assembling the mystical technology of the Infinity Gauntlet, able to control time and space; and Gollum/Smeagol given immortal life by the One Ring. In a phenomenological sense, these MoCap bodies afford viewers an unusually potent, sensorily engaging encounter with the realm of the nonhuman by representing a unique synthesis between the digital and the corporeal. MoCap characters serve both an allegorical and affective function in cinematic comprehension through their mediation of the body, cinema, and computational technology.

While MoCap characters can play a significant role in narrative patterning, thematization, and immersion, and while advances in MoCap technology have in some ways been driven by the conventions of photorealism, this is not to say that the function of MoCap creatures is solely one of verisimilitude, seamlessness, and concealment. Instead, across the plethora of MoCap-driven features released in the past twenty years, moments of ostentation, exaggeration, and effect-as-effect are witnessed repeatedly. The glow of bioluminescence on alien skin in Avatar, the webbing of blue veins visible beneath Gollum’s translucent skin, and the delicate clumps of snow clustered in Caesar’s fur are all examples which emphasise that, in terms of spectatorial experience, MoCap characters can be seen as highly contradictory. In these examples, the achievement of an ever-more compelling variety of expressive visual capabilities is tied up with an awareness of these capabilities as technologically contingent. MoCap features are laden with moments that emphasise technological finesse and achievement as a source of visual pleasure.

War for the Planet of the Apes (Matt Reeves, 2017)

Affectively, the experience of a MoCap alien/dragon/talking ape is not just a matter of these bodies appearing in themselves to be impressive or visually pleasing. Rather, the spectatorial experience of MoCap characters intrinsically involves a continually reinforced appreciation of the technological grandeur represented by the character. To borrow from Tom Gunning, the “attraction” effect of these MoCap characters springs from the awareness that they are the result of inanimate, mathematical abstraction that can nonetheless be transformed into something substantially life-like and vital.²³ Instead of solely working towards the suspension of disbelief, MoCap creatures often engender a pleasurable reinforcement of disbelief.

Despite the persistence of MoCap as a central feature of Hollywood filmmaking, recent releases leave few hopeful clues as to the future of this practice. Technological advances in MoCap mean that its creative possibilities are increasingly unbridled, though the purposes of these advances in terms of cinematic expression are not self-evident. Recently, MoCap has been deployed in service of such cinematic atrocities as Cats (Tom Hooper, 2019), wherein performers such as Sir Ian McKellen and Dame Judi Dench were grotesquely transformed into humanoid felines. (For insider commentary on the VFX of Cats, see: vfxanon00002345, “Somethings for Us to Learn after Cats Fiasco,” Reddit, R/Vfx, December 26, 2019.) During that same year, two ultra-high-definition facial capture cameras were used in the creation of the eponymous, bug-eyed heroine of Robert Rodriguez’s Alita: Battle Angel.²⁴

Cats (Tom Hooper, 2019)

Considering these crude examples of extremely expensive and time-consuming innovation applied to disappointing creative ends, one wonders whether MoCap is destined for obsolescence, in the pattern of special effects like 3D projection. A lot is riding, once again, on the Gatorade-coloured aliens of James Cameron’s Avatar: The Way of Water (2022), and the renewed novelty of motion capture, this time underwater.

Production still from Avatar: The Way of Water (James Cameron, 2022)

Endnotes

1. Caroline Conrad, “How Lord of the Rings’ Gollum Changed the Course of SFX,” Vulture, December 11, 2018.
2. Tanine Allison, “More than a Man in a Monkey Suit: Andy Serkis, Motion Capture, and Digital Realism,” Quarterly Review of Film and Video 28, no. 4 (July 1, 2011): 325.
3. See: Marta Braun, Eadweard Muybridge (London: Reaktion Books, 2012).
4. Arthur Shimamura, “Picturing Motion in Photography: When Time Stands Still,” Art21 Magazine, January 4, 2016.
5. Matt Delbridge, Motion Capture in Performance: An Introduction (New York: Springer, 2015), 15.
6. Ron Fischer and Demian Gordon, “The History and Current State of Motion Capture,” Motion Capture Society, 2023.
7. Andrew Liszewski, “How Prince of Persia’s Groundbreaking Animations Were Created,” Gizmodo, April 1, 2020.
8. The failed MoCap scene in Total Recall was eventually completed using rotoscoping. See: Ian Failes, “The Failed Motion Capture That Still Resulted in One of ‘Total Recall’s’ Best Scenes,” befores & afters, June 1, 2020.
9. Peter Travers, “The Polar Express,” Rolling Stone, November 18, 2004.
10. William Brown, “Beowulf: The Digital Monster Movie,” Animation 4, no. 2 (July 1, 2009): 154.
11. Julie A. Turnock, The Empire of Effects: Industrial Light & Magic and the Rendering of Realism (Austin: University of Texas Press, 2022), 3.
12. For example: Bhrigu K. Lahkar et al., “Accuracy of a Markerless Motion Capture System in Estimating Upper Extremity Kinematics during Boxing,” Frontiers in Sports and Active Living 4 (2022). Markerless MoCap is not, thus far, commonly used in cinematic production.
13. Ian Failes, “‘Computer Pajamas’: The History of ILM’s IMocap,” befores & afters, September 9, 2019.
14. Mike Seymour, “Making Thanos Face the Avengers,” Fxguide, May 7, 2018.
15. On the making of Dawn of the Planet of the Apes see: “Dawn of the Planet of the Apes – VFX Sizzle | Visual Effects + | Wētā FX,” July 21, 2014; on The BFG see: Carolyn Giardina, “‘The BFG’ Visual Effects Whiz Joe Letteri Makes the Case for Motion-Capture Performances,” The Hollywood Reporter, July 1, 2016; on Thanos, see: “How VFX Teams Brought Josh Brolin’s Thanos to Life in ‘Infinity War,’” The Hollywood Reporter, January 25, 2019.
16. ABCP, “Benedict Cumberbatch – Behind-the-Scenes of The Hobbit: Desolation of Smaug,” YouTube, September 24, 2016.
17. Lisa Bode, Making Believe: Screen Performance and Special Effects in Popular Cinema (New Brunswick, NJ: Rutgers University Press, 2017), 6.
18. Aleka McAdams, Stanley Osher, and Joseph Teran, “Crashing Waves, Awesome Explosions, Turbulent Smoke, and beyond: Applied Mathematics and Scientific Computing in the Visual Effects Industry,” Notices of the AMS 57, no. 5 (2010): 614–23.
19. See: “Weta Digital’s Tissue System,” Fxguide (blog), January 30, 2013.
20. On the rendering of Avatar: The Way of Water, for example, see: Sebastian Moss, “Avatar: The Way of Water Was Rendered in Amazon Web Services,” Data Center Dynamics, December 6, 2022.
21. Kristen Whissel, Spectacular Digital Effects: CGI and Contemporary Cinema (Durham, NC: Duke University Press, 2014), 12.
22. Ibid.
23. Tom Gunning, “The Cinema of Attractions: Early Film, Its Spectator and the Avant-Garde,” Wide Angle 8, no. 3–4 (1986): 63–70. On the connection between Gunning’s notion of the “cinema of attractions” and digital visual effects, see: Vivian Sobchack, “‘Cutting to the Quick’: Techne, Physis, and Poiesis and the Attractions of Slow Motion,” in Wanda Strauven, ed., The Cinema of Attractions Reloaded (Amsterdam: Amsterdam University Press, 2006), 337–51.
24. “Alita: Battle Angel VFX Breakdown – Cinematography - Weta Digital,” Vfxexpress, November 21, 2020.