Research finds that artificial intelligence struggles to understand human social interactions.

Although AI excels at solving complex logical problems, it falls short at understanding social dynamics. A new study by researchers at Johns Hopkins University shows that AI systems still struggle to read the nuances of human behavior, an ability that is crucial for real-world applications such as robots and autonomous cars.
To test AI’s ability to navigate human environments, the researchers designed an experiment in which both humans and AI models watched short three-second videos of people interacting with one another at varying levels of intensity. According to findings presented at the International Conference on Learning Representations last week, each participant, human or machine, was asked to rate the interactions they observed.
The stakes are high for technologies such as self-driving cars, where human drivers rely not only on traffic signals but also on anticipating how other drivers will behave. “AI needs to be able to predict what people nearby are doing,” said Leyla Isik, a professor of cognitive science at Johns Hopkins. “For an AI to drive a vehicle, it is crucial that it can recognize whether people are just hanging out, interacting with each other, or about to step into the street.”
The experiment revealed a clear gap between human and machine performance. The 150 human participants rated the videos with striking consistency. By contrast, the ratings produced by the 380 AI models were scattered and inconsistent, regardless of the models’ complexity.
Dan Malinsky, a professor of biostatistics at Columbia University, told Observer that the study highlights key limitations of current AI technology, especially “when predicting and understanding how dynamic systems change over time.”
Understanding the minds and emotions of multiple people at once can be challenging even for humans, says Konrad Kording, a professor of bioengineering and neuroscience at the University of Pennsylvania. “There are a lot of things, like chess, where AI is better; there are things where we may be better. There are a lot of things I would never trust AI to do, and some I wouldn’t trust myself to do,” Kording told Observer.
The researchers believe the problem may stem from the underlying architecture of AI systems: the neural networks behind today’s models are based on the parts of the human brain that process static images, which differ from the brain regions that process dynamic social scenes.
“There are a lot of nuances, but the big takeaway is that none of the AI models can match human brain and behavioral responses to these scenes the way they do for static scenes,” Isik said. “I think there is something fundamental about the way humans process scenes that these models are missing.”
“It’s not enough to just see an image and recognize objects and faces. That was the first step, and it took us a long way in AI. But real life isn’t static. We need AI to understand the story that is unfolding in a scene. Understanding the relationships, context, and dynamics of social interactions is the next step, and this research suggests there might be a blind spot in AI model development,” said Kathy Garcia, a co-author of the study.