Volumetric capture is in its infancy but Microsoft and Intel are leading the land grab with new studios and new customers
Words Julian Mitchell
When the VFX Oscar-winning The Jungle Book arrived on our screens a couple of years ago, we were amazed at how realistic the digital recreation of our world was. The Lion King is currently being given the same treatment, and the entire toolset is being revamped again for virtual cinematography, with VR at its core. The production team are creating the set, animation, puppeteering, lighting and virtual camera, and can walk around the set as if it were real.
They can even capture Steadicam, handheld, dolly and crane moves in a live-action style via a conventional dolly or other camera platform. Of course, they still have all the tools of CG alongside the feel of analogue live-action operating and camera movement.
But in this exciting new world of capture there are other pathways into the digital realm. If you haven’t heard of volumetric capture, it is a way of capturing true 3D video that you can watch from any angle. We’re really talking about creating a hologram, but one you can insert into VR, AR, 3D or even 2D programming. You can then take your sequence, which could be anything from a golfer swinging to a dancer pirouetting, and place it in 2D space for commercials and perhaps drama. But the real purpose of volumetrics is to create digital humans who look and act like real ones when placed in immersive content. Other ways of creating digital humans involve CGI built around performance capture, which is open to the ‘uncanny valley’ weirdness we’ve all seen.
Large corporations, including Microsoft and Intel, are already practising the art of volume recording. Microsoft has built three volumetric studios, two in the USA and one in South London. Intel Studios has just opened in Los Angeles and features a 10,000 sq ft volumetric-capture dome producing immersive media with brands, sports teams and filmmakers.
Yush Kalia from Dimension, which has licensed the Microsoft technology, is already servicing a wide range of clients keen to bring holographic content into their programming. “Our partner companies were striving to create virtual humans in virtual worlds and the tie-up with Microsoft has made that happen,” she explains. “We were the first licensees of this technology outside of Microsoft’s head office, and the industries interested in using it are numerous, including healthcare, theatre, education and sports, alongside the early adopters you would expect: gaming, advertising and marketing.
“We’ve put this technology out there and are inviting people to come and have a look. We’re definitely not saying we know who all the customer types are or where they are from.”
So, what do you need to capture video in this way? Yush gives us a tech rundown: “In the studio we have 106 cameras, 53 of which are RGB and the other 53 infrared for depth; 106 cameras need 106 matched prime lenses from Kowa. We can record at 30fps or 60fps, and we’re testing up to 90fps. We light the room with Cineo lighting to eliminate as many shadows as possible, so it’s completely flat lit, and this allows us to be creative with the lighting afterwards.
“We record the audio as eight directional channels from eight shotgun microphones. We don’t do anything fancy with the audio except capture it as a .wav file. Our sound engineers can make spatial mixes from those uncompressed files if needed.
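The uncompressed eight-channel capture described above can be sketched with Python’s standard wave module. The channel count and .wav container come from the quote; the sample rate, bit depth, file name and silent test tone are assumptions for illustration only.

```python
import wave
import struct

# Hypothetical sketch of an eight-channel uncompressed capture file.
# 48 kHz / 16-bit are assumed values; the article only specifies
# eight directional channels stored as .wav.
CHANNELS = 8          # one per shotgun microphone
SAMPLE_RATE = 48000   # assumed broadcast-standard rate
DURATION_S = 1

with wave.open("capture.wav", "wb") as w:
    w.setnchannels(CHANNELS)
    w.setsampwidth(2)  # 16-bit samples
    w.setframerate(SAMPLE_RATE)
    # One frame of silence across all eight channels, repeated per sample
    silence = struct.pack("<" + "h" * CHANNELS, *([0] * CHANNELS))
    w.writeframes(silence * SAMPLE_RATE * DURATION_S)

with wave.open("capture.wav", "rb") as w:
    print(w.getnchannels(), w.getnframes())  # 8 48000
```

Because the file stays uncompressed PCM, each channel can later be pulled out individually for the kind of spatial mix the engineers describe.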
“We capture Raw footage at 10Gb/s. A mesh sequence, stored as an .obj (a file format that supports both polygonal and free-form objects), and a texture sequence are then rendered from it at roughly 10Gb per 30 seconds, and subsequently compressed so that the 10Gb/s capture can be delivered at Mb/s speeds. That gives you the ability to stream it, or embed it in an app to play on native devices, for instance.”
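Taking the figures in that quote at face value, a quick back-of-the-envelope sketch shows the scale of the reduction. The 30-second clip length, 10Gb/s raw rate and 10Gb mesh sequence come from the quote; the 20Mb/s streaming budget is an assumed example of “Mb/s speeds”.

```python
# Back-of-the-envelope compression arithmetic for a 30-second capture.
GBIT = 1_000_000_000
raw_rate_bps = 10 * GBIT           # 10Gb/s off the 106-camera rig
clip_seconds = 30

raw_bits = raw_rate_bps * clip_seconds   # total raw capture
mesh_bits = 10 * GBIT                    # rendered .obj + texture sequence
stream_rate_bps = 20 * 1_000_000         # assumed Mb/s-class delivery rate

print(raw_bits // GBIT)                # 300 Gb of raw data for the clip
print(raw_bits // mesh_bits)           # ~30x smaller after meshing
# How much further the mesh must shrink to fit the streaming budget:
print(round(mesh_bits / (clip_seconds * stream_rate_bps), 1))
```

In other words, rendering to mesh-plus-texture already discards around 97 per cent of the raw data, and the final encode squeezes out roughly another order of magnitude.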
When you’re capturing at such extreme data rates you can select your destination, from movie VFX and commercials down to VR headsets with their much lower data-handling abilities. “For VFX, TV and film we would hand off .obj sequences and texture sequences, the .pngs. For VR, for example, we would expect a 20k to 40k polymesh sequence with a 2k texture sequence; it’s important to remember that the output file is encoded as an .mp4. Mobile resolution would work from a 5k to 10k polymesh sequence with a 1k texture sequence as a .png.”
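The delivery targets just described amount to a simple lookup from destination to mesh and texture budget. A minimal sketch, with the numbers taken from the quote and the table structure, field names and helper function being our own illustration:

```python
# target: (polygon budget per mesh frame, texture resolution, hand-off format)
# Values from the interview; None means no fixed budget (full-quality VFX hand-off).
DELIVERY_TARGETS = {
    "vfx":    (None,             "full-res", ".obj + .png sequences"),
    "vr":     ((20_000, 40_000), "2k",       ".mp4"),
    "mobile": ((5_000, 10_000),  "1k",       ".png"),
}

def fits_target(target: str, polygon_count: int) -> bool:
    """Return True if a mesh frame's polygon count suits the named target."""
    budget = DELIVERY_TARGETS[target][0]
    if budget is None:      # VFX hand-off: deliver at full resolution
        return True
    low, high = budget
    return low <= polygon_count <= high

print(fits_target("vr", 30_000))      # True
print(fits_target("mobile", 30_000))  # False
```

The same captured performance can therefore be decimated once per destination rather than re-shot, which is the whole point of recording at the rig’s full data rate.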
If you look at the Dimension website you can see that the early adopters of volumetrics are advertising agencies, performance-related companies and some broadcasters. “With 106 camera views you can be incredibly creative about where your camera goes,” says Yush. “A couple of music artists have shot their performances for music videos in the rig. They’re exporting that and placing it into a separately created music video bed, if you like. You can decide afterwards how you’re going to view the content; you can move the camera around as you wish and, if it’s a VR project, decide where to stand as a viewer.
“The creatives we’re talking to are very excited that they can keep being creative after the fact, with the ability, for example, to pause a scene on your mobile phone screen with your finger and pan around while it’s paused. If you have a basketball player jumping in the air, you can stop him, move around, and then watch him finish his shot.”
A new customer for Dimension is NCAM, which is a growing company in the AR space for broadcasting and movies. They booked Dimension to produce a photorealistic golfer to be incorporated into a production for a broadcaster.
Nic Hatch from NCAM explains the process: “We want to have photorealistic human beings that we can put through our other technologies, so you can walk around what looks like a real live 3D person but actually they’re just not there. So we volume-captured a golfer who is to talk with a real person but on a virtual golf course.
“The presenter would wonder what kind of shot you could make approaching the green for instance,” explains Nic, “then this virtual, volume-captured golfer appears and talks you through a shot. Our technology with its depth abilities lets us walk around this virtual golfer and we can replay his swing and interact with it.
“Volume capture is not quite there yet, it’s not bad but just needs to be higher resolution. There are some fundamental flaws that need ironing out. But it’s early days and volume capture is really interesting. At the moment it’s not a live event, it’s not like live motion capture.”
In our next issue we will be reporting on new capture techniques including NCAM’s new products and the latest from the world of virtual sets.