
A recent study by MIT and Penn State University has uncovered that AI used in home surveillance systems may produce inconsistent outcomes, particularly in deciding whether to notify law enforcement. The research suggests that large language models (LLMs) like GPT-4, Gemini, and Claude made varying decisions on whether incidents captured in surveillance footage required police attention. This variability is concerning for privacy advocates and tech ethicists, signaling that the rapid deployment of AI might need reconsideration, as reported by MIT News.
Moreover, the AI’s decisions revealed an unexpected pattern of bias based on the racial makeup of neighborhoods. The models were less likely to suggest police intervention in predominantly white areas compared to those with different demographics, raising concerns about potential unintended bias within these systems. This inconsistency in response is causing concern among researchers.
The study utilized a dataset of Amazon Ring camera recordings, with the research team, led by co-senior author Ashia Wilson of MIT and including Shomik Jain and Dana Calacci of Penn State, asked the models two pertinent questions: if a crime was occurring in the footage and whether or not to recommend police action—"The move-fast, break-things modus operandi of deploying generative AI models everywhere, and particularly in high-stakes settings, deserves much more thought since it could be quite harmful," Wilson told MIT News.
The implications of the research, slated for discussion at the upcoming AAAI Conference on AI, Ethics, and Society, extend beyond the scope of residential surveillance; they further touch on broader apprehensions about AI's role in other critical areas like healthcare, lending, and employment and the current resistance researchers hit when attempting to unpack the "black box" of proprietary AI systems.
Data based on human-annotated elements such as time of day, visible actions, and the demographics of individuals were compared to the AI's judgments, leading to varying conclusions. Jain remarked, "Maybe there is something about the background conditions of these videos that gives the models this implicit bias," citing difficulties in identifying the root causes of such discriminatory behavior due to the opaque nature of the systems’ training data and algorithmic foundations, according to MIT News.
The study's findings point toward the need for more rigorous and holistic bias-mitigation techniques and a system conducive to identifying and reporting AI biases, as Calacci outlines a mission to aid users and authorities in redressing potential AI generated inequities—an endeavor supported in part by the IDSS's Initiative on Combating Systemic Racism.









