‘ChatGPT isn’t even close to being intelligent,’ says Tel Aviv University expert

The technology is still “far from human cognition,” and although it will have useful applications, there are dangers involved, warns Prof. Roni Katzir.

By Viki Auslender, CTech via Calcalist

In 1773, the French engineer Jacques de Vaucanson introduced the “digesting duck.” The machine was made of copper plated with gold and included over four hundred parts designed to reproduce every bump and bone. Its movements were modeled on studies and meant to imitate real ducks, and the machine – so it was described – walked like a duck, quacked like a duck, stretched its neck, flapped its wings and ate corn from the operator’s palm like a duck. Finally, and this was the highlight, it defecated like a duck.

Vaucanson explained that all the processes were “copied from nature” and the food was digested as it happens in “real animals” using a small chemical laboratory that he placed in the heart of the machine. He would later use his invention to prove to researchers in the field of physiology that the digestive process is actually mechanical.

The duck became a sensation and a testimony to the ability of a talented engineer to reproduce a mechanism that is at the base of the life process. The philosopher Voltaire, who was captivated by the blur that the duck embodied between natural and synthetic life, said at the time that if it weren’t for the duck, there would be “nothing to remind us of the glory of France”, and called Vaucanson “the rival of Prometheus,” the Titan who gave man fire. Years later, the mechanism of the fraud was revealed: the corn did not continue down the neck but remained at the base of the mouth tube, and green-painted breadcrumbs were released from a separate container.

Exactly 250 years have passed since the digesting duck show, and engineers continue to invent machines and, along the way, as is their habit, to be confused by the machines they have built, poetically attributing to them what they cannot do – the imitation of life. This time on the operating table are language, consciousness and human intelligence, which are being reconceptualized using natural language models. The most famous of them, ChatGPT, has grabbed the headlines and the public imagination in recent months and, according to some, has become evidence of a machine that masters language and possesses an intelligence fundamentally similar to that of a human.

“Given the breadth and depth of GPT-4’s capabilities, we believe it can reasonably be considered an early (and still incomplete) version of an artificial general intelligence (AGI) system,” a team of researchers from Microsoft, the largest investor in OpenAI, wrote about GPT-4. “[The models are] the most powerful technology that humanity has yet developed,” OpenAI CEO Sam Altman said (apparently forgetting antibiotics, electricity, the light bulb, the Internet, etc.) of the technology that he believes will ultimately represent “the collective power, and creativity, and will of humanity.”

“ChatGPT isn’t even close to being intelligent,” Prof. Roni Katzir says emphatically in an interview with Calcalist. “People recognize patterns and generalize very well – in a sense we are born scientists – but ChatGPT and the other current models do not understand and are not good at finding generalizations. By the way, it is not that there is any reason to think that machines cannot become intelligent at some point in the future, but the current models are really not going in the right direction. They represent real progress in terms of usefulness, but not beyond that.”

Katzir, a theoretical and computational linguist who heads the Laboratory for Computational Linguistics at Tel Aviv University, finds himself these days, like a handful of other linguists, fighting a rearguard battle. As engineers build models like ChatGPT, Meta’s Galactica or Google’s Bard, he and his colleagues are forced to fend off growing claims that what we thought about language, intelligence or how language is acquired is wrong.

“Even after these models have read the entire Internet, huge amounts of information that are hard to imagine,” Katzir explains, “they still fail to understand simple aspects of syntax that children understand after a very short time.”

The models called “large language models” (or LLMs) are statistical tools for predicting words in a sequence. That is, after training on large data sets, the models guess in a sophisticated way which word is likely to come after which word (and which sentence after which sentence). Today’s models are particularly large and have been trained on data from diverse sources such as Wikipedia, collections of books and raw web crawls. This intensive training allows the model, given a certain input text, to estimate probabilistically which of the words it knows is likely to come next. Although the model knows how to build convincing combinations, the result it produces is not grounded in, and does not intentionally communicate, any idea about the world.
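The idea of “predicting words in a sequence” can be illustrated with a deliberately tiny sketch. This is not how ChatGPT is built: it is a simple bigram counter, and the function names and the toy corpus are invented here for illustration. It only shows the principle of turning counts of which word follows which into next-word probabilities.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def next_word_probs(follows, prev):
    """Turn the raw follow-counts for `prev` into a probability distribution."""
    counts = follows.get(prev.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the word was never seen with a successor
    return {word: count / total for word, count in counts.items()}

# Toy corpus, invented for this example.
corpus = "the duck ate corn . the duck quacked . the boy ate bread ."
model = train_bigram(corpus)

print(next_word_probs(model, "duck"))
print(next_word_probs(model, "the"))
```

The sketch has no notion of what a duck is. It only records that “ate” and “quacked” each followed “duck” once. Real models replace the counting table with a neural network and far longer contexts, but the output is still a probability distribution over the next word.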

Still, at least according to OpenAI and others, these models show a glimmer of understanding or logic, a resemblance to human intelligence. These chatbots are already routinely credited with “creativity”, which supposedly redefines human cognition, and at Google a programmer was fired after he concluded that a language model developed at the company was “conscious” and tried to hire a lawyer to represent its interests. As mentioned, Microsoft explained in a press release that “GPT-4 achieved a form of general intelligence [as] evidenced by its core mental abilities (such as reasoning, creativity, and deduction).”

However, the idea that we are moving towards “artificial general intelligence” (AGI) is controversial, especially among linguists who explain that grammatical fluency, even if at a high level, is still a long way from a machine that can think.

‘The model doesn’t understand anything’

“The models are very good at training on a large collection of information,” Katzir continues, “each new model generation gets more information to train on than the previous generation and also more artificial neurons, and they are very good at collecting many statistics on observations of words. They don’t really understand what these sequences say and don’t know what they are talking about, but they are very good at gathering this information and creating statistics of what texts look like and then they throw all kinds of sequences back to us in a way that seems very convincing, because it is very similar to all kinds of things they have seen many times in all kinds of places.

“But the model doesn’t understand anything, and it doesn’t even try to understand. It’s an engineering tool with a useful purpose, like automatic completion. Humans, on the other hand, organize the information we receive in a way that allows us to formulate interesting and new things, often in a way that really doesn’t match the existing statistics, and we mostly know what we are talking about.”

And if they learn more? Will they be able to master the language like we can?

“We humans come to this task innately prepared. We have a certain brain base that allows us to organize the information, and we have learning methods that allow us to learn well and generalize in systematic ways from little information. This is how children come to master language within a few years of life.

“These large engineering models are something very different. They train for what is equivalent to thousands of years of life, and still do not come close to the level of knowledge and understanding that children have.

“For example, every child who grew up in a Hebrew-speaking environment knows that the sentence ‘The boy Dina saw yesterday and Yossi will meet tomorrow is Danny’ is a good sentence in Hebrew, while ‘The boy Dina saw you yesterday and Yossi will meet tomorrow is Danny’ is a bad sentence. But when we tested a whole collection of current models, some of them trained on a collection an order of magnitude larger than what children hear, the models preferred the second, worse sentence.

“It’s not that the models can’t learn the distinction between the two sentences given enough information. To the best of our knowledge, they can. But the fact that they don’t reach this distinction given the amount of information children receive shows that the models are very different from us. This is just one example, of course. Linguistic research provides many more examples of this kind.

“For all their engineering refinements, the models are simply too far from human cognition. They come to the learning task with a different representation system than ours and with a limited and different inference ability than ours, and the results reflect this, even after all the intense training they go through. Their abilities may be good enough to be an engineering tool, but you shouldn’t get confused and think that this is something that models us.”
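The kind of test Katzir describes, checking which of two sentences a model prefers, can be sketched as a comparison of the probabilities the model assigns to each. The code below is a hypothetical illustration, not the procedure his lab used: it scores sentences with a toy bigram model with add-one smoothing, and every function name and the training text are assumptions made for this example.

```python
import math
from collections import Counter

def train(text):
    """Collect unigram and bigram counts from whitespace-separated text."""
    words = text.lower().split()
    return Counter(words), Counter(zip(words, words[1:]))

def log_prob(sentence, unigrams, bigrams):
    """Sum of add-one-smoothed bigram log-probabilities for the sentence."""
    words = sentence.lower().split()
    vocab_size = len(unigrams)
    total = 0.0
    for prev, nxt in zip(words, words[1:]):
        total += math.log((bigrams[(prev, nxt)] + 1) / (unigrams[prev] + vocab_size))
    return total

def preferred(first, second, unigrams, bigrams):
    """Return whichever of the two sentences the model scores higher."""
    if log_prob(first, unigrams, bigrams) >= log_prob(second, unigrams, bigrams):
        return first
    return second

# Toy training text, invented for this example.
unigrams, bigrams = train("dina saw you yesterday . yossi will meet you tomorrow .")

good = "the boy dina saw yesterday and yossi will meet tomorrow is danny"
bad = "the boy dina saw you yesterday and yossi will meet tomorrow is danny"
print(preferred(good, bad, unigrams, bigrams))
```

A raw sum of log-probabilities tends to favor shorter sentences, so a careful comparison would control for length. The point of the sketch is only that a model’s “preference” is a comparison of scores derived from statistics, which can diverge from what a speaker knows about grammar.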

Useful tool for people wanting to do harm?

We have seen that bots have started to be used to write books and some fear that AI is becoming so powerful that it poses an existential threat to the human race.

“I think that books written with ChatGPT will simply be bad books, because the model doesn’t really understand and isn’t really original. Maybe it can sometimes write seminar papers that at a superficial glance will look serious, but that’s fine, that’s not what worries me. It will only force the lecturers to find better ways to assess students’ knowledge and understanding.

“My concerns are different. One problem that is talked about a lot is that these models are environmentally damaging – their carbon footprint is very large. Also, because they are trained on existing texts, they reproduce stereotypes and dark views. Personally, I find it particularly worrying that a tool of this type can allow the distribution of fake information on an indescribable scale. Anyone who wants to can use it to flood social networks, Wikipedia and journals with false or biased information and biased or just confusing arguments. This can impair people’s ability to understand reality and make informed decisions, and the result is social damage and damage to the ability of democracies to exist.

“The danger right now is not that the model itself is intelligent and will want to do harm – this may happen in the future, but the existing models are not intelligent and do not want anything. The danger is that the model will be a useful tool for people who want to do harm. Such a tool requires regulation. Just as there is regulation regarding which viruses may be engineered in laboratories, regulation is also needed for tools like these.”

According to Katzir, the blurring between what these models do and human intelligence, whether intentional or not, stems from a confusion between concepts and goals. To clarify the point, Katzir brings us back for a moment to Vaucanson’s world of imitation.

“Science and engineering are two different things. In most fields this is not something that confuses us. For example, we can do science and study how birds fly, and we can do engineering and build flying machines. Most of us know that these are two different things. But when it comes to language, many people get confused even though it’s the same distinction. Linguistics, as a cognitive science, studies the mechanism that humans have in their heads and that is responsible for human linguistic ability. It’s like studying birds. And engineering builds models like ChatGPT. It’s like building flying machines. It can be very useful to build airplanes, but that’s something different, a completely separate project from understanding human language.

“Nobody thinks that by building a better plane they have solved the question of how birds fly. Engineering is not trying to understand how the world works. It’s trying to build a useful tool. In the case of models like ChatGPT, it’s a tool that can help us write emails and lines of code faster. We admire ChatGPT because it’s a significant improvement over previous engineering tools we’ve known. The texts it completes really look very good. I think there’s also some pleasure in imagining that this thing is intelligent, though that’s a mistake of course.

“Anyway, just as an airplane doesn’t explain birds to us, ChatGPT does not explain to us how the linguistic ability of humans works.”