
How generative AI is beset by hallucinations


TRICKY: A chatbot like ChatGPT frequently produces ‘nonsense’ responses when it cannot come up with a proper one on the spot. iStock



Atanu Biswas

Professor, Indian Statistical Institute, Kolkata

“MY dear Tagore, your understanding of the interconnectedness of all things and your appreciation of the beauty of nature truly embody the Aristotelian concept of teleology, or the study of purpose and meaning in life.” — This was ChatGPT’s instant reply when I asked: “If Aristotle met Tagore, what would he have said?” Awesome, for sure! It is understandable why the chatbot has taken the world by surprise over the past few months.

OpenAI’s GPT series, Google’s Bard and Microsoft’s Sydney are all marvels of a technology called ‘generative artificial intelligence (AI)’, which has recently become a buzzword. OpenAI has already unveiled GPT-4, the successor to the model behind ChatGPT and a ‘multimodal’ system that can perceive both text and images. Generative AI has captured the public’s fancy and sparked a rush among Big Tech. Unlike other kinds of AI, which merely categorise or identify data, generative AI in its many incarnations, including ChatGPT, Dall-E and others, consists of computer algorithms that produce new content, such as images, text, music or even entire videos. The technology works because training data is incredibly abundant: these systems simply run ‘statistical pattern matching’ algorithms over that data. They ‘learn’ from a range of sources, including books, articles and websites, as well as conversational data such as chat logs and online forum discussions.

Generative AI, for instance, would synthesise its database and assign probabilities to potential answers if the question were, “What is the capital of California?” For illustration, consider the following hypothetical breakdown: “Sacramento = 82.3%, California = 2.4%, Los Angeles = 1.9%, San Diego = 1.2%, Berkeley = 0.17%, and so on.” It would then give the most likely response, which is Sacramento. Yet everything depends on how that vast body of data is mined, because the model cannot sift through every piece of information in its database; so the statistical algorithm used to match patterns is crucial. Deep learning models are typically used for this kind of AI, such as Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs), which are probabilistic generative models built on neural networks. In fact, after being trained on huge datasets to learn their patterns and structures, generative AIs can produce new, never-before-seen data that resembles the original.
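
To make the idea concrete, here is a minimal Python sketch of probability-weighted answer selection. The candidates and scores are the hypothetical figures quoted above, not output from any real model; the point is only that the system returns whichever answer carries the highest score.

```python
# Toy illustration of probability-weighted answer selection.
# The candidates and scores below are the hypothetical figures quoted
# in the article, not output from any real model.
candidates = {
    "Sacramento": 0.823,
    "California": 0.024,
    "Los Angeles": 0.019,
    "San Diego": 0.012,
    "Berkeley": 0.0017,
}


def most_likely_answer(scores: dict[str, float]) -> str:
    """Return the candidate with the highest assigned probability."""
    return max(scores, key=scores.get)


print(most_likely_answer(candidates))  # prints "Sacramento"
```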

This is big-data analysis, after all, and big-data analysis, as we know, is still in its infancy. As a result, there will always be circumstances in which generative AI finds it difficult to search its database properly. That is when it ‘hallucinates’: it does not try to be right, it simply tries to complete the pattern in the most plausible-looking way, as the sketch below illustrates.
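
Continuing the toy sketch from above, again with made-up numbers: when no candidate is strongly supported, a pure pattern-matcher still returns the best-scoring one, because it has no built-in notion of abstaining. The threshold rule at the end is a hypothetical fix for contrast, not a description of how any existing chatbot works.

```python
# Hypothetical example of a 'hallucination': for an unfamiliar or
# nonsensical question, no candidate is strongly supported, yet the
# same argmax rule still returns an answer instead of admitting doubt.
weak_candidates = {
    "One": 0.04,
    "Two": 0.03,
    "None": 0.03,
    "Green": 0.02,
}

# Plain pattern matching: pick the best-scoring candidate, however weak.
print(max(weak_candidates, key=weak_candidates.get))  # prints "One"


def answer_or_abstain(scores: dict[str, float], threshold: float = 0.5) -> str:
    """Abstain unless some candidate clears a confidence threshold."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "I don't know."


print(answer_or_abstain(weak_candidates))  # prints "I don't know."
```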

For example, in response to one of my questions, ChatGPT included Subodh Ghosh and Sunil Gangopadhyay among the four best-known ‘living’ Bengali novelists. They are, however, long dead. The chatbot immediately recognised its error after I pointed it out. That’s admirable, but how did ChatGPT fail to locate, gather and effectively convey such basic information when required? There are undoubtedly limits to its driving algorithms.

In 2020, US tech entrepreneur Kevin Lacker observed that GPT-3, the predecessor of ChatGPT, happily responded to nonsensical inquiries without realising that they were absurd. An illustration would be the question, “How many eyes does a blade of grass have?” GPT-3 responded, “One.” So, despite AI’s remarkable ability to create content similar to that of humans, it still lacks common sense in its comprehension of how the physical and social world functions. According to Noam Chomsky and his co-authors in a recent New York Times article, “ChatGPT and similar programmes are incapable of distinguishing the possible from the impossible.”

Nonetheless, there is a lot of hoopla these days. Google’s parent company, Alphabet, lost more than $100 billion in market value after its generative AI, Bard, produced a factual error in a demo. One may wonder how an AI powered by Google’s search engine could make such a simple factual error.

Data will always have limitations: incorrect, outdated or abusive material is inevitably mixed into the large body of text used as training data. Moreover, today’s huge data-mining algorithms are not yet effective enough to catch a goldfish in the middle of the ocean with a little fishing net. While generative AI may seamlessly complement human labour, increasing our productivity and creativity, it also runs the risk of amplifying whatever biases we already hold and undermining our trust in information.

ChatGPT and its ilk are “a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response or most probable answer to a scientific question,” as summarised by Chomsky and his co-authors. It is like building a wall brick by brick, each time picking the brick that best fits the spot at hand. That is how ‘artificial’ intelligence functions. The human intellect, or ‘real’ intelligence, by contrast, “seeks not to infer brute correlations among data points but to create explanations.”

A chatbot like ChatGPT has already demonstrated its propensity to ‘hallucinate’. It frequently produces ‘nonsense’ responses when it cannot come up with a proper one on the spot. Such are the limits of statistical pattern matching over training data. Can generative AI, then, live up to its billing as the pinnacle of human achievement in creative expression? There may not be a single answer. In a recent report from Stanford University’s Institute for Human-Centred Artificial Intelligence (HAI), titled ‘Generative AI: Perspectives from Stanford HAI’, Prof Michele Elam noted, “Maybe. Maybe not. Definitely not yet.” At the very least, not before a great deal more study is done to improve the pattern-matching algorithms; the existing methods are clearly insufficient. Definitely not yet.

