Published on March 10, 2024
MIT's Vision Quest Shows AI Gaining Human-like Peripheral Sight, Poised to Shift Gears in Driving Safety
Source: Massachusetts Institute of Technology

In a move poised to revolutionize the field of artificial intelligence, MIT researchers have made strides in mimicking human peripheral vision within AI models. The development, which could yield significant improvements in driver safety and provide insights into human visual processing, centers on an innovative dataset for training machine learning models that enhances their ability to detect objects in the visual periphery.

The research centers on a hallmark of human perception: the ability to detect shapes outside our direct line of sight, a capability current AI models lack. By closely simulating human peripheral vision, the MIT team aims to equip AI systems with the ability to better detect impending hazards and even predict whether a human driver would notice an approaching object. Vasha DuTell, a postdoc at MIT, noted that across various tests, AI models trained with the new dataset performed better but still fell short of human capability. "There is something fundamental going on here," DuTell told MIT News, highlighting the enigmatic gap between human and AI visual processing.

The study involved modifying a technique used to model human peripheral vision, known as the texture tiling model. The adaptation produced a new dataset, which the team used to train several computer vision models, then compared the models' performance against humans in an object detection task. Lead author Anne Harrington MEng '23 emphasized the project's potential impact, stating, "Modeling peripheral vision, if we can really capture the essence of what is represented in the periphery, can help us understand the features in a visual scene that make our eyes move to collect more information."
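
To make the idea of such a dataset concrete, the sketch below shows one rough way an image could be transformed to mimic how detail falls off away from the point of gaze before being fed to a detector. It is only an illustration, not the texture tiling model used in the study, which pools local summary statistics over regions that grow with eccentricity rather than simply blurring; the function peripheral_degrade and its parameters (fixation, max_sigma, bands) are hypothetical names chosen for this example.

```python
# Illustrative sketch only: simulate loss of peripheral detail by blending
# progressively blurred copies of an image in bands of increasing distance
# (eccentricity) from a fixation point. Assumes numpy and scipy are available.
import numpy as np
from scipy.ndimage import gaussian_filter

def peripheral_degrade(image, fixation=(0.5, 0.5), max_sigma=8.0, bands=6):
    """Return a copy of `image` with blur that increases away from `fixation`.

    `image` is an (H, W) or (H, W, C) numpy array; `fixation` is the gaze
    point expressed as fractions of image height and width.
    """
    h, w = image.shape[:2]
    fy, fx = fixation[0] * h, fixation[1] * w
    yy, xx = np.mgrid[0:h, 0:w]
    ecc = np.hypot(yy - fy, xx - fx)
    ecc /= ecc.max()                      # normalized eccentricity in [0, 1]

    out = image.astype(float).copy()
    # Keep the innermost band (the simulated fovea) sharp; blur the rest,
    # with blur strength growing in steps toward the edge of the image.
    for b in range(2, bands + 1):
        lo, hi = (b - 1) / bands, b / bands
        sigma = max_sigma * hi            # stronger blur at larger eccentricity
        sig = (sigma, sigma, 0) if image.ndim == 3 else sigma
        blurred = gaussian_filter(image.astype(float), sigma=sig)
        mask = (ecc >= lo) & (ecc <= hi)
        out[mask] = blurred[mask]
    return out.astype(image.dtype)
```

A dataset built along these lines would apply a transform of this kind to every training image, so that an object detector learns from views resembling what a person sees away from where they are looking.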

While the models trained from scratch with the newly developed dataset showed notable improvements, they nonetheless underperformed compared to humans, especially in detecting objects in the far periphery. "That might suggest that the models aren't using context in the same way as humans are to do these detection tasks. The strategy of the models might be different," Harrington shared with MIT News. The research's significance was further endorsed by Justin Gardner, an associate professor at Stanford University who was not involved in the study. Gardner argued that human peripheral vision should be understood not as an inferior representation limited by photoreceptor counts, but as one optimized for real-world tasks, and he encouraged further AI research inspired by the neuroscience of human vision.

Partially funded by the Toyota Research Institute and the MIT CSAIL METEOR Fellowship, the work was spearheaded by a team including Mark Hamilton, Ayush Tewari, Simon Stent, and senior authors William T. Freeman and Ruth Rosenholtz. Celebrated by the scientific community, this pioneering study will be presented at the upcoming International Conference on Learning Representations. As a testament to its potential, Rosenholtz remarked, "Any time you have a human interacting with a machine — a car, a robot, a user interface — it is hugely important to understand what the person can see. Peripheral vision plays a critical role in that understanding."
