There are a thousand reasons to hate hopping on a video call every 10 minutes to chat with your co-workers, among them the ugly video compression artifacts that can occasionally make your face completely unrecognizable. Nvidia has a potential solution to the problem, but instead of fixing compression algorithms, it wants to use neural networks to render a recreation of your face in real time.
As with all video streamed across the internet, from YouTube to Netflix, video calls rely on compression algorithms to reduce the amount of bandwidth needed, so that calls can happen in real time regardless of the speed of a user’s internet connection. These algorithms use many tricks, from reducing color fidelity, to dropping frames and re-interpolating them later, to even lowering the resolution of the video, which is what often leads to people looking like they’re calling on a late ’90s webcam. Video compression algorithms will slowly improve over time, offering better quality with smaller file sizes, but Nvidia has demonstrated a solution that offers remarkable improvements right now.
It’s no secret that neural network-powered video processing tools are now capable of some impressive feats that, until recently, would have required the skills of a talented visual effects artist. In addition to convincing face swaps, these tools can also enhance stills and videos, generate views from angles where a camera wasn’t originally positioned, or create entirely original footage of a person doing or saying something they never actually did or said. There are good reasons to be concerned about the nefarious uses of these tools, but just as many reasons to be excited about their potentially useful applications.
Nvidia is calling this new application AI video compression. Instead of sending a stream of video across the internet at 15 or 30 frames per second, it sends only a smaller number of frames at specific time intervals, known as keyframes. Played back on their own, these keyframes would look like a stuttering slideshow, so the system also analyzes, extracts, and shares data about the position and motion of specific points on the subject’s face, which is a trickle of data by comparison. On the receiving end, a neural network powered by a capable graphics card uses that point data to generate additional frames in between the keyframes, resulting in full-motion video with smooth playback, and without the ugly visual artifacts commonly associated with over-compressed video.
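To make the idea concrete, here is a minimal Python sketch of the sending side as described above: a full image goes out only every so often, while a handful of facial point coordinates goes out with every frame. This is an illustration under stated assumptions, not Nvidia's implementation. The names `Packet`, `encode_stream`, `payload_bytes`, and `receiver_inputs`, the 30-frame keyframe interval, and the 4 bytes per coordinate are all hypothetical, and the neural network that would actually generate the in-between frames is left out entirely.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float]  # (x, y) position of one facial point


@dataclass
class Packet:
    frame_index: int
    keypoints: List[Keypoint]          # sent for every frame
    keyframe: Optional[bytes] = None   # full image, sent only occasionally


def encode_stream(frames, keypoints_per_frame, keyframe_interval=30):
    """Attach the full image only to every `keyframe_interval`-th packet."""
    packets = []
    for i, (frame, points) in enumerate(zip(frames, keypoints_per_frame)):
        is_keyframe = i % keyframe_interval == 0
        packets.append(Packet(i, list(points), frame if is_keyframe else None))
    return packets


def payload_bytes(packet, bytes_per_coordinate=4):
    """Rough payload size: two coordinates per point, plus any keyframe."""
    size = len(packet.keypoints) * 2 * bytes_per_coordinate
    if packet.keyframe is not None:
        size += len(packet.keyframe)
    return size


def receiver_inputs(packets):
    """Pair each packet's keypoints with the most recent keyframe.

    A real receiver would hand each pair to a neural network that
    synthesizes the frame; that model is not sketched here.
    """
    latest_keyframe = None
    pairs = []
    for packet in packets:
        if packet.keyframe is not None:
            latest_keyframe = packet.keyframe
        pairs.append((latest_keyframe, packet.keypoints))
    return pairs
```

With 60 dummy frames of 1,000 bytes each and 10 points per frame, this sketch sends 6,800 bytes in total, compared with 60,000 bytes if every frame were sent in full. Those numbers only illustrate the mechanism; they say nothing about real-world image sizes or about how Nvidia encodes its point data.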
Not only do the results on the receiver’s end look better, but Nvidia’s researchers estimate that the bandwidth needed to stream video using AI video compression could be reduced to as little as one-tenth of the bandwidth needed for videos compressed with popular standards like H.264. That potentially means that even if you had to dial in to a video call on your smartphone with spotty reception, you’d still look as good as if you were sitting in the office with a fast, reliable connection, and you wouldn’t quickly max out your monthly data limit either.
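For a rough sense of what a one-tenth reduction could mean for a data plan, the arithmetic can be sketched in a few lines of Python. The baseline bitrate passed in is purely an assumption for illustration, since the article gives no figure for a typical H.264 call, and the function names are hypothetical.

```python
def call_data_megabytes(bitrate_kbps, duration_seconds):
    """Data used by a call at a constant bitrate, in megabytes (10**6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_seconds
    return bits / 8 / 1_000_000


def compare_call_data(baseline_kbps, duration_seconds, reduction_factor=10):
    """Return (baseline MB, reduced MB) for the same call length."""
    baseline_mb = call_data_megabytes(baseline_kbps, duration_seconds)
    reduced_mb = call_data_megabytes(baseline_kbps / reduction_factor, duration_seconds)
    return baseline_mb, reduced_mb
```

Assuming a hypothetical 1,000 kbps call, `compare_call_data(1000, 3600)` works out to 450 MB for an hour at the baseline rate versus 45 MB at one-tenth of it.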