
In work that could improve autonomous vehicles and AR/VR technology, a team from MIT and Meta has introduced a computer vision technique capable of generating 3D models that include areas typically concealed from view. The technique, named PlatoNeRF, analyzes shadows captured from a single camera position to model obstructed parts of a scene. It may one day allow self-driving cars to "see" and react to conditions beyond their immediate visual field, and help augmented and virtual reality systems map environments without physical measurement, as reported by MIT News.
The research builds on single-photon lidar, a technology known for precise mapping: it times how long a pulse of light takes to reflect back to the sensor. Lidar is already familiar to many drivers from advanced driver-assistance systems. PlatoNeRF goes further by capturing light that has bounced twice, gathering additional information about a scene's depth and about the shapes of shadows cast by obstructions. Tzofi Klinghoffer, an MIT graduate student affiliated with the MIT Media Lab, said that this combination of multibounce lidar and machine learning opens up many new opportunities to explore.
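The timing principle the article describes can be sketched in a few lines. This is a minimal illustration, not code from the PlatoNeRF project: it assumes a single-photon detector reports the round-trip travel time of a pulse, so halving that time multiplied by the speed of light gives the depth of a directly lit point, while a two-bounce detection constrains only the total path length.

```python
# Minimal sketch of single-photon lidar timing (illustrative, not the
# PlatoNeRF implementation). Function names are hypothetical.

C = 299_792_458.0  # speed of light, m/s

def depth_from_tof(t_seconds: float) -> float:
    """Depth of a directly illuminated point from one-bounce round-trip time:
    light travels sensor -> surface -> sensor, so depth is half the path."""
    return C * t_seconds / 2.0

def two_bounce_path_length(t_seconds: float) -> float:
    """Total distance travelled by light detected after two bounces
    (sensor -> first surface -> second surface -> sensor); the individual
    segment lengths must be disentangled by the reconstruction model."""
    return C * t_seconds

# A pulse returning after ~66.7 nanoseconds corresponds to roughly 10 m.
print(depth_from_tof(66.7e-9))
```

The second-bounce returns are what carry the extra information: their arrival times and the shadows they reveal constrain geometry the camera never sees directly.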
PlatoNeRF's effectiveness comes from pairing neural radiance fields (NeRF), a type of machine-learning model adept at interpolating scenes, with multibounce lidar; the combination yields more accurate scene reconstructions, according to Ramesh Raskar, an associate professor at MIT and leader of the Camera Culture Group. In comparisons with existing methods, the researchers report that PlatoNeRF performs especially well with lower-resolution sensors, suggesting greater feasibility for real-world deployment, since commercial devices commonly carry such sensors.
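At the heart of the NeRF approach the article mentions is volume rendering: a learned field assigns a density to each 3D sample along a ray, and alpha compositing turns those densities into an expected depth or color. The toy below shows only that compositing step, with a hand-written density profile standing in for a trained network; names and values are illustrative.

```python
import numpy as np

# Toy NeRF-style compositing (illustrative; densities here are hand-written,
# whereas a real NeRF queries a trained neural network at each sample).

def composite_weights(densities: np.ndarray, deltas: np.ndarray) -> np.ndarray:
    """Standard volume-rendering weights: alpha_i = 1 - exp(-sigma_i * delta_i),
    weight_i = alpha_i * prod_{j<i} (1 - alpha_j)."""
    alphas = 1.0 - np.exp(-densities * deltas)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    return alphas * transmittance

# One ray through empty space until a dense surface near 0.5 m.
densities = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 50.0, 50.0, 50.0])
deltas = np.full(8, 0.1)          # 10 cm between samples
depths = np.arange(8) * 0.1
w = composite_weights(densities, deltas)
print(float((w * depths).sum()))  # expected depth, concentrated at the surface
```

In PlatoNeRF, supervision from the two-bounce lidar timings and shadow masks constrains where such density can live, which is why the combination outperforms either signal alone.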
According to the research team, led by Klinghoffer under Raskar's guidance, PlatoNeRF surpasses alternatives that use either lidar alone or NeRF with a color image. The approach builds on principles established by a pioneering camera capable of "seeing" around corners, technology conceptualized by the same MIT group over a decade ago. Newer iterations include enhancements that could improve safety in various industries and in everyday technology, including smartphones.
The research team plans to track light beyond two bounces and to apply advanced learning techniques to color image data to improve texture capture, increasing the realism and detail of reconstructions. David Lindell, an assistant professor at the University of Toronto, noted that capturing shadow information with lidar significantly improves accuracy in revealing hidden geometry, highlighting the value of pairing smart algorithms with common sensors.