Why AI can’t understand a flower the way humans do

Even with all its training and computer power, an artificial intelligence (AI) tool like ChatGPT can’t represent the concept of a flower the way a human does, according to a new study.

That’s because the large language models (LLMs) that power AI assistants are usually trained on language alone, and only sometimes on images as well.

“A large language model can’t smell a rose, touch the petals of a daisy or walk through a field of wildflowers,” said Qihui Xu, lead author of the study and postdoctoral researcher in psychology at The Ohio State University.

“Without those sensory and motor experiences, it can’t truly represent what a flower is in all its richness. The same is true of some other human concepts.”

The study is published in the journal Nature Human Behaviour.

Xu said the findings have implications for how AI and humans relate to each other.

“If AI construes the world in a fundamentally different way from humans, it could affect how it interacts with us,” she said.

Xu and her colleagues compared humans and LLMs in their knowledge representation of 4,442 words—everything from “flower” and “hoof” to “humorous” and “swing.”

They compared the similarity of representations between humans and two state-of-the-art LLM families from OpenAI (GPT-3.5 and GPT-4) and Google (PaLM and Gemini).

Humans and LLMs were tested on two measures. One, called the Glasgow Norms, asks for ratings of words on nine dimensions, such as arousal, concreteness and imageability. For example, the measure asks for ratings of how emotionally arousing a flower is, and how much one can mentally visualize a flower (or how imageable it is).

The other measure, the Lancaster Norms, asks how strongly a word’s concept is related to sensory information (such as touch, hearing, smell and vision) and to motor information, meaning actions performed with the mouth, hand, arm and torso.

For example, the measure asks for ratings on how much one experiences flowers by smelling, and how much one experiences flowers using actions from the torso.

The goal was to see how the LLMs and humans were aligned in their ratings of the words. In one analysis, the researchers examined how much humans and AI were correlated on concepts. For example, do the LLMs and humans agree that some concepts have higher emotional arousal than others?
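To make this first comparison concrete, here is a minimal sketch of how one could correlate human and LLM ratings on a single dimension across words. It is an illustration, not the authors’ actual pipeline; the words, ratings and scale below are invented.

```python
# Illustrative only: correlate human vs. LLM ratings on one dimension.
# The real study used the Glasgow and Lancaster norms for 4,442 words;
# the numbers below are made up.
from scipy.stats import spearmanr

# Hypothetical "arousal" ratings on a 1-7 scale.
human_arousal = {"flower": 4.1, "hoof": 2.3, "humorous": 5.0, "swing": 3.8}
llm_arousal = {"flower": 3.9, "hoof": 2.8, "humorous": 5.2, "swing": 4.5}

words = sorted(human_arousal)
human_vec = [human_arousal[w] for w in words]
llm_vec = [llm_arousal[w] for w in words]

# A high rank correlation means the LLM orders concepts on this dimension
# roughly the way people do.
rho, p = spearmanr(human_vec, llm_vec)
print(f"arousal: Spearman rho = {rho:.2f} (p = {p:.3f})")
```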

In a second analysis, the researchers compared how humans and LLMs judge the way different dimensions jointly contribute to a word’s overall conceptual representation, and how different words are related to one another.

For example, the concepts of “pasta” and “roses” might both receive high ratings for how much they involve the sense of smell. However, pasta is considered more similar to noodles than to roses, at least for humans, not just because of its smell but also because of its visual appearance and taste.
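As a rough sketch of what that second kind of comparison could look like (again purely illustrative, with invented words, dimensions and numbers): each word is summarized by a vector of ratings across dimensions, pairwise similarities between words are computed separately from human and from LLM ratings, and the two similarity patterns are then compared.

```python
# Toy sketch of comparing similarity structure; not the study's actual method.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

dims = ["smell", "vision", "taste", "hand_action"]  # hypothetical dimensions
words = ["pasta", "noodles", "rose"]

# Invented ratings: rows are words, columns are dimensions (0-5 scale).
human = np.array([[3.0, 4.0, 5.0, 4.0],   # pasta
                  [2.5, 4.0, 5.0, 4.0],   # noodles
                  [4.5, 4.5, 0.5, 2.0]])  # rose
llm = np.array([[3.5, 3.0, 4.0, 2.0],
                [3.0, 3.5, 4.5, 2.5],
                [4.0, 4.0, 1.0, 1.5]])

# Pairwise distances between words within each rater's "conceptual space".
human_dist = pdist(human, metric="cosine")
llm_dist = pdist(llm, metric="cosine")

# If the LLM organizes concepts the way humans do, the two distance patterns
# should agree (e.g. pasta closer to noodles than to rose in both).
rho, _ = spearmanr(human_dist, llm_dist)
print(f"similarity-structure agreement: rho = {rho:.2f}")
```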

Overall, the LLMs matched humans quite well in representing words that have no connection to the senses or to motor actions. But for words tied to things we see, taste or interact with using our bodies, the AI models failed to capture human concepts.

“From the intense aroma of a flower, the vivid silky touch when we caress petals, to the profound joy evoked, human representation of ‘flower’ binds these diverse experiences and interactions into a coherent category,” the researchers say in the paper.

The issue is that most LLMs are dependent on language, and “language by itself can’t fully recover conceptual representation in all its richness,” Xu said.

Even though LLMs can approximate some human concepts, particularly those that don’t involve the senses or motor actions, this kind of learning is not efficient.

“They obtain what they know by consuming vast amounts of text—orders of magnitude larger than what a human is exposed to in their entire lifetime—and still can’t quite capture some concepts the way humans do,” Xu said.

“The human experience is far richer than words alone can hold.”

But Xu noted that LLMs are continually improving, and it’s likely they will get better at capturing human concepts. Indeed, the study found that LLMs trained on images as well as text did better than text-only models at representing vision-related concepts.

And when future LLMs are augmented with sensor data and robotics, they may be able to actively make inferences about and act upon the physical world, she said.

Co-authors on the study were Yingying Peng, Ping Li and Minghua Wu of the Hong Kong Polytechnic University; Samuel Nastase of Princeton University; and Martin Chodorow of the City University of New York.

More information:
Large language models without grounding recover non-sensorimotor but not sensorimotor features of human concepts, Nature Human Behaviour (2025). DOI: 10.1038/s41562-025-02203-8

Provided by
The Ohio State University


Citation: Why AI can’t understand a flower the way humans do (2025, June 4), retrieved 4 June 2025.

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.



