Ever since cameras first existed, people have been pointing the lens at themselves to snap so-called selfies (a term coined back in 2002). At this point they're long overdue for an upgrade, so a team of researchers has created what could be the next generation of selfies: easy-to-capture 3D models of yourself, dubbed nerfies, that let others see what's happening all around you.
Why do we take and share selfies? Is it vanity? A desperate need for the praise and approval of our peers? Whatever the reason, they represent a carefully controlled and curated glimpse of our lives, but why not share more? Researchers from the University of Washington and Google Research have come up with a way to create a more comprehensive snapshot of a given moment in time. Instead of just looking at a 2D image, nerfies allow the viewer to zoom and pan around the subject in 3D, but without requiring any special equipment: just a camera-equipped smartphone and some CPU processing power.
Capturing a scene in 3D usually requires special hardware such as a LIDAR (Light Detection and Ranging) scanner, which uses lasers to measure the distances to objects so a 3D representation can be reconstructed. Smartphones like the iPhone 12 Pro and 12 Pro Max now ship with LIDAR sensors built in, but those devices aren't exactly cheap and accessible. To create a nerfie, a user simply records a video of themselves on a smartphone from many different angles, waving the device back and forth in front of them while making sure they always stay in frame.
Creating a 3D model from this video data uses a method called Neural Radiance Fields (or NeRF, for short) that takes multiple images of an object from various angles and combines that two-dimensional data into a three-dimensional representation that can then be manipulated and viewed from different perspectives. The problem with the NeRF method is that it requires the subject being captured to remain completely still during the entire process. For inanimate objects that's not an issue, but humans are always subtly moving, so NeRF captures of people often rely on a large camera array that snaps images from multiple angles all at the same time. But like LIDAR hardware, a camera array can be expensive and cumbersome.
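The core of NeRF is volume rendering: for each pixel, a ray is marched through the scene, a learned network is queried for color and density at sample points along the ray, and the samples are alpha-composited into a final color. The sketch below illustrates that compositing step in plain Python; the `radiance_field` function is a hypothetical stand-in (a solid orange sphere) for the neural network the real method trains from photos.

```python
import math

def radiance_field(point):
    """Stand-in for NeRF's learned network: maps a 3D point to an
    (r, g, b, density) tuple. Here: a hypothetical solid sphere of
    radius 1 at the origin, colored orange."""
    x, y, z = point
    inside = math.sqrt(x * x + y * y + z * z) < 1.0
    density = 10.0 if inside else 0.0
    return (1.0, 0.5, 0.0, density)

def render_ray(origin, direction, near=0.0, far=4.0, n_samples=64):
    """Alpha-composite color samples along one camera ray — the volume
    rendering at the heart of NeRF."""
    step = (far - near) / n_samples
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for i in range(n_samples):
        t = near + (i + 0.5) * step
        point = tuple(o + t * d for o, d in zip(origin, direction))
        r, g, b, density = radiance_field(point)
        alpha = 1.0 - math.exp(-density * step)  # opacity of this segment
        weight = transmittance * alpha
        color[0] += weight * r
        color[1] += weight * g
        color[2] += weight * b
        transmittance *= 1.0 - alpha
    return color

# A ray aimed through the sphere picks up its orange color...
print(render_ray((0.0, 0.0, -3.0), (0.0, 0.0, 1.0)))
# ...while a ray that misses it stays black.
print(render_ray((0.0, 2.0, -3.0), (0.0, 0.0, 1.0)))
```

Because the composited color is differentiable with respect to the field, the real method can train the network by comparing rendered rays against the pixels of the captured photos.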
Simply having someone wave a smartphone back and forth in front of themselves while taking a video is a much easier way to generate stills from multiple angles, but the process can take quite a few seconds to complete, and that means the subject is constantly moving, despite their best efforts to stay still. To solve this, the research team developed a new method they call Deformable Neural Radiance Fields (or D-NeRF, for short). It compares frames to determine how much the subject has moved between them, then automatically calculates the necessary deformations so that the imperfect two-dimensional image data can be adjusted and still be used to create an accurate, interactive 3D model.
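The idea behind those calculated deformations can be sketched very simply: before any observed point is used, a per-frame warp maps it back into one shared "canonical" space, so every frame agrees on a single static model despite the subject's motion. In the paper this warp is a second learned network; the toy version below just assumes a hypothetical subject who drifted 0.01 units per frame along the x-axis, and undoes that drift.

```python
def deformation_field(point, frame_index):
    """Stand-in for the learned per-frame warp: maps a point observed
    in frame `frame_index` back into canonical space. The 0.01-per-frame
    drift is an assumed toy motion, not anything from the paper."""
    x, y, z = point
    return (x - 0.01 * frame_index, y, z)

# The tip of the subject's nose as observed in three different frames,
# each slightly shifted because the subject could not hold still:
observations = [((0.10, 0.0, 0.5), 0),
                ((0.12, 0.0, 0.5), 2),
                ((0.15, 0.0, 0.5), 5)]

# After warping, every observation lands on (essentially) the same
# canonical point, so all frames can train one consistent static model.
canonical = [deformation_field(p, i) for p, i in observations]
print(canonical)
```

In the real system the warp isn't hand-written like this; it is optimized jointly with the radiance field so that the warped frames explain all the captured video at once.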
One day, assuming nerfies actually catch on, someone looking at the photo of a fancy meal shared on Instagram would potentially be able to pan around and examine the restaurant itself. Or, if an amateur fashionista shared a nerfie of themselves trying on a new top, others would be able to adjust the position of the camera to see the matching pants that went with it. It’s a technology that could potentially provide an entirely new perspective on social media, but at the same time, as many of us probably take video calls while secretly wearing pajamas under our desks, perhaps nerfies offer a look at our lives that’s a little too invasive to be comfortable.