There is a deep disorder in the discourse of generative artificial intelligence, aka AI — or what I like to call sparkling intelligence, because everyone is using ✨✨✨ emojis and icons to signify AI. And, of course, intelligence can only be officially labeled artificielle if it comes from a properly designated region in France, #amirite? ChatGPT? Chatte, j’ai pété! Cat, I have farted!
All joking aside, the disorder in AI discourse is the way everyone keeps talking about hallucinations when AI makes mistakes, leading us to anthropomorphize AI and imagine that AI both experiences reality and sometimes loses it.
Anyone who knows how AI really works knows that’s wrong — AI is not hallucinating, because it’s not conscious. As much as chat interfaces may lead us to believe AI already is or is soon to be conscious, we are better off recognizing clearly how AI’s outputs are generated by technologies and practices organized by humans. We generated the training data (at least for now) and designed the algorithms that fuel AI outputs, and, most importantly, we decide what those outputs mean.
To address this blind spot in the discourse of AI, I’m pleased to announce that Anna Mills and I have published our call to change the terminology for AI’s mistakes from hallucination to mirage. Read our article, Are We Tripping? The Mirage of AI Hallucinations, for all the great reasons why you should start thinking and talking about AI mirages instead of AI hallucinations.
As our argument becomes public, I want to gesture toward some additional thinking that we didn’t fit in the article.
Fixes for Mirages
The core of our article’s argument to shift AI terminology from hallucination to mirage rests on our understanding of the current capabilities of AI systems that use large language and diffusion models to revolutionize the generation of synthetic media. It is widely recognized that these AI systems generate what have been called hallucinations — what we now call mirages — alongside what one might call their acceptable outputs. There have been many attempts to reduce or eliminate mirages from these AI systems by modifying training data, refining model design, engineering prompts, linking language models to live search, and more.
Most of the attempts to address AI mirages that I have read about seem to be secondary systems that evaluate and adjust primary AI model outputs to make them more accurate — in some cases calling on even more AI to fix AI outputs. I expect that there will continue to be substantial technical advances in reducing unwanted AI mirages, but I don’t expect AI mirages to be eliminated entirely. We cite some papers in our article that make this very point using math and logic that are a bit over my head. Just as desert contexts continue to generate optical mirages, the AI we have today will continue to generate content mirages — in both cases because the underlying material conditions are not changed by any intermediary mechanisms we might introduce to reduce the manifestation of mirages. The mirages don’t go away just because we filter them out.
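To make that last point concrete, here is a toy Python sketch. It is entirely illustrative: the mirage rate, the secondary check, and the numbers are made up, and no real model works this simply. A stand-in generator produces mirages at a fixed rate, and a stand-in secondary check catches some of them before we see them. The rate at which mirages are produced never changes; only the rate at which we encounter them does.

```python
import random

MIRAGE_RATE = 0.15   # hypothetical: fraction of raw outputs that are mirages
FILTER_RECALL = 0.6  # hypothetical: fraction of mirages a secondary check catches

def generate_output():
    # Stand-in for a primary model: every output has the same chance of being a mirage.
    return "mirage" if random.random() < MIRAGE_RATE else "acceptable"

def passes_secondary_check(output):
    # Stand-in for a secondary system that intercepts some mirages before we see them.
    if output == "mirage" and random.random() < FILTER_RECALL:
        return False  # caught and discarded
    return True

random.seed(0)
produced_mirages = 0
shown_outputs = 0
shown_mirages = 0
for _ in range(10_000):
    output = generate_output()
    if output == "mirage":
        produced_mirages += 1
    if passes_secondary_check(output):
        shown_outputs += 1
        if output == "mirage":
            shown_mirages += 1

print(f"mirage rate at generation: {produced_mirages / 10_000:.1%}")      # the underlying rate
print(f"mirage rate in what we see: {shown_mirages / shown_outputs:.1%}")  # lower, but not zero
```

Better filters lower the second number, but as long as the first number is above zero, some mirages will slip through to us.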
I contend that the easiest and best thing we can do to deal with AI mirages now and in the future is to understand how they come to be, so that we know their provenance and learn to question all AI outputs rather than taking them at face value. After all, we don’t trust every human utterance, so why should we place more trust in the machines?
AI Rainbows
While we argue in the article for the terminology of AI mirages to reference AI mistakes, that perspective may emphasize a negative valuation of AI outputs. Yet many artists and practitioners are leveraging AI mirages to generate, inspire, or contribute to positive, constructive, and productive works. Without diving too deeply here, I would suggest a term like AI rainbows to signify when AI outputs deviate from reality and yet produce something of value. Just like mirages, rainbows are atmospheric events generated by the reality of specific physical contexts and perspectives — which also gestures toward the material circumstances under which AI produces what we might call a rainbow. We humans tend to see rainbows as glorious and beautiful, while we treat mirages as deceptive and illusory. We might use the term rainbows to talk about AI’s welcome outputs that stretch beyond our shared understanding of reality.
AI & Mental Health
I credit my daughter, Twyla Angell, with raising for me the idea that by calling AI mistakes hallucinations, we liken AI’s purely mechanistic outcomes to human mental states in ways that invoke and yet do not honor conditions of human mental health. It does not help us understand AI to think of it as a conscious entity that experiences irreality. Nor do I see how imagining AI as hallucinating helps us understand or engage beneficially with humans who are experiencing irreality. This is just another way in which talking about AI hallucinations is inaccurate and counterproductive.
AI & Disciplines
Yes, AI is cutting across all aspects of human life right now, but we might pause to consider how we talk about AI in different disciplines. For example, as above, talking about AI hallucinations in mental health contexts might have a very different impact than the same term would have in computer science. I would argue that the term AI mirage may be appropriate in more contexts than the term AI hallucination because it is tied to an actual physical phenomenon rather than to human mental states. However — as we call for in the article — it’s worth exploring how any of these terms might resonate in any discipline. After all, physics may just seem more neutral than psychology because we think of physics as a basic natural science with less fraught discourse. Don’t get me started on the discourse of physics 🤣
Publishing & Preprints
I would be remiss not to remark on the delay and friction involved in publishing this article, Are We Tripping? The Mirage of AI Hallucinations, which, after a long process of exploring different publishing venues, ended up as a preprint on SSRN. Many others have covered well how the delays in traditional scholarly publication hamper the dissemination and advance of research and ideas, so I will not belabor that topic here. However, I will note that the process of publishing our article even as a preprint was filled with blockages and possible missteps.
Our road to preprint publication was filled with many barriers that were only overcome by circumstance and privilege. Our experience made it clear that we still have a long way to go to realize a truly frictionless open exchange of ideas.
Engage
All that said, we see this paper as just another entry in a continuing discussion around how we talk about AI. There are a few ways for you to get involved in the conversation:
Contribute to the Bibliography
First, you can view and suggest entries for our collaborative bibliography on this topic. Feel free to send me resources that you think should be in the bibliography. If you are a Zotero user and want to contribute directly, reach out and let me know your Zotero ID and I can add you to the mix.
Use the Data
Second, you can explore and reuse the shared data that Anna collected in her work to evaluate possible alternative terms for AI hallucination. As we offer in the paper, you are welcome to look through this openly licensed collection of proposed terms in the original online spreadsheet and copy it, add new terms, change the evaluation criteria, rescore the options, and/or change how the criteria are weighted — or use the data in any other way you see fit, as long as you include attribution following its CC BY license.
Teach and Learn
Third, as we suggest in the paper, we believe that this conversation around terminology can be a fruitful entry into learning activities around AI, its technologies, and its relationship with human culture and society. We’d love to hear about how you have incorporated this kind of conversation into your teaching and learning. Share any stories or artifacts you may have about your activities and we will amplify them.
Join the Conversation
Fourth, participate in the annotated conversation about this paper in its margins using the Hypothesis social annotation platform. View the conversation so far and join in once you get a free account from Hypothesis. Anna and I will also be talking about this paper and continuing the conversation on social media, where you can find us on Bluesky (Anna, Nate), LinkedIn (Anna, Nate), and Mastodon (Anna, Nate).
Comments
Thanks for your thorough and inclusive approach to publishing these thought-provoking ideas.
Two quick points about your paper: 1. The mirage term is still problematic in the sense that it places the “fault” on the user, rather than highlighting the flaws of the LLM process (let’s leave the AI naming issue aside for now). This is reminiscent of the carbon footprint issue, which tries to fault the consumer rather than highlight the power dynamic in a consumer society.
Which leads me to:
2. The issue of literacy is important, but it should not cloud the power dynamic at play here. No one is asking for this LLM-generated reality – it is being pushed by a top-down dynamic that we have seen since the industrial revolution and advertising. I would argue that, for example, Google results were better before the introduction of “AI summary” because the algorithms were attuned to crowd-sourcing rather than the Russian roulette of LLM-produced “facts”. These are not mirages produced in the human brain – they are artifacts of an error-prone hijacking of social convention. When an encyclopedia makes a mistake, we issue errata to correct it. Putting all the onus on literacy short-circuits the process of trust.
Thanks for reading, Kevin! We likely share many viewpoints on the larger role that what I like to call ✨sparkling intelligence✨ plays in wider culture and society. All that is a bit beyond the narrower focus of this paper, which has the goal of changing how we talk about one AI-related phenomenon, or at least fostering more conversation about AI terminology to support deeper understanding of how LLMs actually work.
On your first point: I disagree. One of the major points of our paper was to highlight the role the AI user/consumer plays in the evaluation of AI outputs. I do not believe AI is “at fault” when it produces a mirage, because in producing a mirage, AI is just doing what it does when it produces something even you and I would agree is a factual statement. I’m not really interested in questions of fault — in the case of this paper, I’m interested in how we can better understand how AI actually works by using better metaphors when we talk about it.
On your second point: I would argue that increasing AI literacies can help people understand the power dynamics in the development, adoption, and dissemination of AI. The point of using the “mirage” metaphor is in fact that mirages are precisely NOT produced in the human brain, but are real physical artifacts that may be interpreted by some witnesses as something they are not — just like AI mirages.
Congrats to you and Anna for this impeccably referenced treatment and also for the shared process elements. Plus, there is a human voice in the writing. I can only imagine what it took to sell the publisher on the “Are we tripping” title.
As I read, it seemed like the questions that came up were answered in the next section, even the challenge of turning the noun into the verb. It’s likely hard to change the vernacular broadly; “hallucinating” has been repeated so much it may be hard to pull it back, but as I read it, it’s about what we can do in our own language and thinking. I would like to think I stopped using it a while ago because it just “did not feel right”.
There is something to how humans relate to the experience they are having. The conversational exchanges, with all the crap that lies behind them, can still deceptively feel like the mirage of conversation. In early exchanges I found myself yelling words at it or sarcastically insulting the responses I was getting. I think back (tech age showing) to the times in Second Life when people were embarrassed or shamed because, while changing their avatar’s appearance, they were seen in the virtual world as naked (blocky pixels). It’s hilarious, but I know from talking to people that the embarrassment was real.
The question I have left is: if it’s a mirage when GenAI returns “wrong” or “non-factual” or gibberish responses, what is the name for when it returns something we might not argue with, be it “right” or “sensible”? Isn’t it always a mirage, because the process and methods that return the result are exactly the same? Or is that the rainbow?
Thanks again, looking forward to more conversations and hearing what feedback you both get.
#CogDogPbzzragOybttvat2025
Thanks for these comments, Alan! I hope that at the very least, some people will do what you suggest and change their thinking and words to be more thoughtful and accurate about AI. We’ll see how widely “mirage” catches on!
We do call out the same question you have left: “It is, of course, possible to classify all AI outputs as mirages because all AI outputs are generated by the same processes, regardless of whether we might deem them to be mirages or not. The need for a general term like mirage or hallucination is to enable us to differentiate between those AI outputs that don’t match our expectations of facticity or realism and those that do…” So I agree, AI is using the same processes to produce all its outputs — mirage or not. For me, the differentiation is left in the eye of the beholder — us, the thirsty travelers across the terrain of information. Depending on our judgment, we might deem an AI output to be a mirage, a rainbow, or acceptable. If we call all AI outputs “mirages” all the time, then we lose the opportunity to label those outputs that disappoint…
I like the rainbows idea, and I buy the basic argument that mirage is better than hallucination.
It all feels like part of a larger shift from deterministic to non-deterministic computing, including the machine learning practice of quantization, in which systems are made cheaper and faster by reducing the numerical precision of their weights and inputs while still generating value. Extreme precision is expensive: can we get what we need for a particular use case with less of it?
To draw an analogy, maybe we humans need to watch out that we don’t spend too much time trying to sharpen our AI lenses to be mirage-free. We could instead allow the machines to say “you know what I mean” and get on with making great and knowing use of their imprecise outputs, with the need for precision determined on a per-use-case basis. And maybe we’re already doing that. So maybe the kids are alright!
Thanks for bringing ideas of quantization and specificity to the conversation, Marshall!
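For readers who have not met the term, here is a minimal numpy sketch of the kind of quantization Marshall is pointing to. The weights here are random made-up numbers, not from any real model, and the scheme is the simplest possible one: values stored at 32-bit precision are mapped onto 8-bit integers and back, trading a little precision for a much smaller representation.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=8).astype(np.float32)  # made-up "full precision" weights

# Quantize: map the float range onto the 256 levels an int8 can hold...
scale = np.abs(weights).max() / 127
quantized = np.round(weights / scale).astype(np.int8)

# ...then dequantize back to floats when the values are needed.
restored = quantized.astype(np.float32) * scale

print("original:", weights)
print("restored:", restored)
print("max error:", np.abs(weights - restored).max())  # small, and often tolerable
```

Whether the error in that last line matters depends entirely on the use case, which I take to be Marshall’s point about precision being a per-use-case decision.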
While I like the idea of humans working to be better equipped to “handle” AI mirages — this is of course the literacies we call for in the paper — I do think about what we might need to do to make sure those literacies develop in a world where folks have AI at hand throughout their lives. I can imagine a “kid” being very differently “alright” if, for example, they honed their intellectual skills in classic high school debate activities than if they honed those skills in an environment where nearly infinite, seemingly authoritative content were always just a button click away…