This Augmented Reality App Could Make Instant CGI a Reality
Inside Pinscreen's highly secretive new tech
Augmented reality and face-swapping seem to be hot candidates for the next wave of mobile apps; just look at Facebook’s purchase of MSQRD or Snapchat’s acquisition of Looksery. But one AR startup called Pinscreen, currently backed to the tune of $1.8 million by Lux Capital and Colopl VR Fund, is looking to do things differently in this area. Inverse spoke exclusively to Pinscreen founder Hao Li about what might be ahead for this secretive company.
You might not have heard of Hao Li, but you’ve probably seen this USC assistant professor’s research in action in several blockbuster films. While working at visual effects studio Industrial Light & Magic, Li helped develop tools that enabled real-time performance capture. The idea was that an actor could sit in front of a computer with a web camera, act out a scene, and have that performance translated, in real time, to a CG character.
Li has carried out extensive research in other areas too, including crafting 3D models of real people in real time and capturing important parts of their likeness, especially faces and hair. Most recently, Li showed off research that he and others had conducted on capturing a performance from a person with just a single camera and translating it to a digital character.
At first, this doesn’t sound so novel. There are already a number of face-tracking solutions in development, from mobile apps to much more sophisticated systems such as Faceshift (purchased by Apple in 2015). But many of these rely on depth sensors, which Li’s technology does not require. His approach, he says, is very different.
“If you have a depth sensor, all you need is to optimize parameters of a face model so that the model fits the 3D input as close as possible,” explains Li. “But in the case of a pure RGB input, the entire world is projected onto a two dimensional image without known camera parameters such as focal lengths. So an accurate 3D face model needs to be inferred from this projected image and be able to handle a wide range of lighting conditions as well as appearances of different subjects.”
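To make the distinction concrete, here is a minimal toy sketch, in Python with NumPy, of the depth-sensor case Li describes: when 3D input is available, fitting reduces to optimizing face-model parameters until the model matches the observed points. The linear blendshape model, sizes, and numbers below are invented purely for illustration and are not Pinscreen’s code.

```python
# Toy illustration of depth-based face fitting: the "face model" is a mean shape
# plus a linear blendshape basis, and we recover the blendshape weights that
# best explain a noisy 3D observation via least squares.
import numpy as np

rng = np.random.default_rng(0)

n_points, n_blendshapes = 500, 10
mean_shape = rng.normal(size=(n_points, 3))                   # neutral face vertices
blendshapes = rng.normal(size=(n_blendshapes, n_points, 3))   # deformation basis

# Synthetic "depth sensor" observation: the model deformed by hidden weights, plus noise.
true_weights = rng.normal(size=n_blendshapes)
observed = mean_shape + np.tensordot(true_weights, blendshapes, axes=1)
observed += 0.01 * rng.normal(size=observed.shape)

# Least-squares fit: find weights minimizing ||mean + sum_i(w_i * B_i) - observed||^2.
A = blendshapes.reshape(n_blendshapes, -1).T                  # (3 * n_points, n_blendshapes)
b = (observed - mean_shape).ravel()
fitted_weights, *_ = np.linalg.lstsq(A, b, rcond=None)

print("weight recovery error:", np.linalg.norm(fitted_weights - true_weights))
```

With only an RGB image, that clean 3D target disappears: the same parameters have to be inferred from a 2D projection under unknown camera and lighting, which is what makes Li’s single-camera setting so much harder.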
Hence the importance of Li’s research on 3D avatars. When mapping a human face, a lot of mapping programs run into trouble dealing with visual obstacles like hair and glasses. To avoid that problem, Pinscreen “built a deep convolutional neural network that can learn how to segment a face region in a fully unconstrained image.”
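Pinscreen hasn’t published that network, but as a rough illustration of the kind of model the quote describes, here is a tiny fully convolutional network in PyTorch that maps an RGB image to a per-pixel face/background mask. The architecture, layer sizes, and the `TinyFaceSegmenter` name are all invented for this sketch; a production system would be much deeper and trained on large labeled datasets.

```python
# Illustrative only: a minimal fully convolutional segmenter that predicts,
# for every pixel of an input image, the probability that it belongs to a face.
import torch
import torch.nn as nn

class TinyFaceSegmenter(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                  # downsample 2x
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 1, kernel_size=3, padding=1),       # 1 channel: face logit per pixel
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Usage: an RGB image batch in, a per-pixel face mask out.
model = TinyFaceSegmenter()
image = torch.rand(1, 3, 128, 128)       # stand-in for an unconstrained photo
mask = torch.sigmoid(model(image))       # values near 1 = "face" pixels
print(mask.shape)                        # torch.Size([1, 1, 128, 128])
```

Given enough labeled training images, a network of this general shape can learn to treat hair, glasses, and hands as “not face” rather than being thrown off by them, which is the property the quote above is pointing at.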
So, what is Pinscreen using this crazy advanced technology for, anyway?
“We will be a new type of social media/communication platform with some interesting AR capabilities,” Li says. “It is not going to be an app like Snapchat lenses or MSQRD, which is for most people only interesting for a few minutes.”
So, that’s not much more than buzzwords for now. But, looking deeper, Li’s prior work suggests that Pinscreen is hoping to develop the ability to track someone’s face in spite of any hair, glasses, or other objects obstructing the view of a single camera. Add in some augmented reality, and the sky’s the limit for movies, social media, and games.
It does sound rather groundbreaking, but it’s important to note that Li and his team aren’t the only ones researching in this area. Disney recently presented its take on real-time facial performance capture, as did the researchers behind a tool called Face2Face. There’s also the aforementioned Faceshift, along with Facebook and Snapchat, which have begun demonstrating their wares.
So what makes Pinscreen’s tech different? Li suggests his company’s solution is “a lot more robust and can infer more accurate 3D models, since we explicitly handle occlusions.”
“Our most important innovation is a technology that allows us to build a complete 3D head model, including hair, from any image automatically,” he states. “This type of task traditionally requires a skilled CG modeler and rigger to produce, but we can generate this fully automatically. The latest papers presented at SIGGRAPH (the leading computer graphics and interactive techniques conference) also require multiple images as input; we focus on the solution with the minimal input requirement, a single 2D image.”
Of course, until people see and use any app from Pinscreen, the jury will remain out on its capabilities. The difference here, though, is that Li certainly has a solid history of research and innovation behind him, especially in the crafting of digital avatars.
Li is clearly confident about Pinscreen’s possibilities in the social media space. “As our tech will democratize the generation of 3D avatars, it will have obvious applications in games, VR/AR applications for immersive communication, or AI agents,” he says, “but I think we will create the coolest application ourselves, which is the social media platform we are developing.”
“Most importantly,” he adds, “we want to build something that allows people to create really interesting content without the need for today’s expensive VFX pipeline, and to use really fun AR content to connect with each other.”
It sounds as if Pinscreen will include the range of technologies Li and his team have been working on. But, he says, that is only part of the plan. “The tech demo is really only a feature of what we are planning to use,” he notes. “Pinscreen will be something much bigger, and the tech will be an interesting feature in the beginning. We will also be quite different from other social media platforms.”
For now, Pinscreen is still a bit of a mystery and it looks like we will just have to wait to see what they have in store.