Software developed by researchers at Cornell University and Google Research may revolutionize filmmaking by letting filmmakers stabilize shaky footage, change viewpoints, and create various effects without shooting any new footage. The software, called DynIBar, synthesizes new views using pixel information from the original video, and it can handle moving objects and unstable camerawork. This is a significant improvement over previous efforts, which often produced blurry or glitchy output and could handle only short videos.
“While this research is still in its early days, I’m really excited about potential future applications for both personal and professional use,” said Noah Snavely, a research scientist at Google Research and an associate professor of computer science at Cornell Tech. Snavely presented the work at the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, where it received an honorable mention for the best paper award. The lead author of the study is Zhengqi Li, Ph.D. ’21, of Google Research.
Previous methods for rendering new views of still scenes reconstruct the 3D shape and appearance of objects from 2D images. DynIBar goes further by also estimating how objects move over time, which makes the underlying mathematical problem far harder because the system must account for three spatial dimensions plus time. To keep the problem tractable, the researchers drew on an image-based rendering approach developed in the 1990s, which synthesizes each new view by reusing pixel information from nearby frames of the original video, allowing complex scenes and longer videos to be handled more efficiently.
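To give a rough sense of the idea, the sketch below shows the classic image-based rendering recipe: warp pixels from nearby source frames into the target viewpoint and blend them. This is a conceptual illustration only, not the DynIBar algorithm; the `reproject` helper, the 4x4 camera-pose convention, and the inverse-distance weighting are assumptions made for the example.

```python
import numpy as np

def synthesize_view_ibr(target_pose, source_frames, source_poses, reproject):
    """Toy image-based rendering: blend pixels warped from nearby source
    frames into the target viewpoint, weighted by camera proximity.

    `reproject` is a hypothetical helper that warps a source frame into the
    target view; a real system such as DynIBar also models how the scene
    moves between frames before blending.
    """
    warped, weights = [], []
    for frame, pose in zip(source_frames, source_poses):
        warped.append(reproject(frame, pose, target_pose))
        # Closer cameras contribute more (simple inverse-distance heuristic).
        dist = np.linalg.norm(pose[:3, 3] - target_pose[:3, 3])
        weights.append(1.0 / (dist + 1e-6))
    weights = np.asarray(weights) / np.sum(weights)
    # Weighted average of the warped frames gives the synthesized view.
    return np.tensordot(weights, np.stack(warped), axes=1)
```

Because each output pixel is assembled from real pixels in the input video, this style of rendering sidesteps building a full volumetric model of the scene, which is part of what lets the approach scale to longer videos.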
Despite its significant potential, the software currently takes several hours to process just 10 or 20 seconds of video, even on powerful computers, so in the near term it is better suited to offline video editing software. The next challenge for the researchers is rendering new images when pixel information is missing, such as when a subject moves too quickly or the viewpoint must be rotated a full 180 degrees. One possible solution is to incorporate generative AI techniques, similar to those behind text-to-image generators, to fill in these gaps.
The researchers see promise in DynIBar for both personal and professional use, but given its processing demands, it may be some time before these features appear in commercial video editing tools or on smartphones. Even so, the ability to stabilize footage, change viewpoints, and create new effects without shooting additional footage opens promising possibilities for filmmaking.
The code for DynIBar is freely available, although the software remains in the early stages of development. The researchers hope future advances will let it handle still more complex scenes and longer videos, and they envision generative AI techniques eventually filling in missing pixel information. The work marks a significant step forward in view synthesis and opens new possibilities for filmmakers and video editors.