Impressionistic image of a dark alley in a gritty city. A shadowy figure in a hat and trench coat lurks as if waiting for his next victim. The colors are dark and muted, creating a moody and foreboding atmosphere.

(Image by DALL-E. Prompt: An impressionistic image representing ‘the problems of generative AI.’ The colors are dark and muted, creating a moody and foreboding atmosphere. The scene is a dark alley in a gritty, noir-style city. A shadowy figure with a hat and trench coat lurks as if waiting for his next victim.)

This is the third in a four-part series of posts on IA and AI, mainly inspired by talks at IAC24, the information architecture conference. Read Part 1 and Part 2.

In this post, I’m riffing on the work of several IAC24 speakers, predominantly presentations by Emily Bender and Andrea Resmini.

In Part 2 of this series, I looked at the benefits of Generative AI. The use cases at which GenAI excels are generally those in which the AI is a partner with humans, not a replacement for humans. In these cases, GenAI is part of a process that a human would do anyway and where the human is ultimately in control of the output.

But GenAI output can have numerous issues, from hallucinating (making up false information) to generating content with biases, stereotypes, and other subtle and not-so-subtle errors. We can be lulled into believing the AI output is true because it’s presented as human-like language, so it seems like a human-like process created it. But although GenAI can create convincing human-like output, at its core it’s just a machine. It’s important to know the difference between what we think AI is and what it actually is, and between what we think it’s capable of and what it’s actually capable of.

AI isn’t human

Here is basically what Generative AI does: it makes a mathematical representation of some large corpus of material, such as text or images. It then makes new versions of that material using math. For example, if you feed an LLM (large language model) a lot of examples of text and have it run a bunch of statistics on how different words relate to each other, you can then use those statistics to recombine words in novel ways that are eerily human-like. A process called fine-tuning will teach the LLM the boundaries of acceptable output, but fine-tuning is necessarily limited and can’t always keep GenAI output error- or bias-free. (And the process may introduce new biases.) LLMs produce statistically generated, unmoderated output that sounds human-generated.
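
To make the "statistics on how words relate to each other" idea concrete, here is a minimal sketch of a toy bigram model in Python. It is a drastic simplification and not how a real LLM is built: it only counts which word follows which in a tiny made-up corpus, then samples from those counts. The function names, the corpus, and the seed are all assumptions for illustration.

```python
import random
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, length=8, seed=0):
    """Pick each next word at random, weighted by how often it followed the last one."""
    rng = random.Random(seed)  # fixed seed so a run is repeatable
    word = start
    output = [word]
    for _ in range(length - 1):
        options = follows.get(word)
        if not options:
            break  # the model has never seen anything follow this word
        candidates = list(options.keys())
        weights = list(options.values())
        word = rng.choices(candidates, weights=weights, k=1)[0]
        output.append(word)
    return " ".join(output)

# Hypothetical toy corpus; a real model ingests vastly more text.
corpus = "the cat sat on the mat . the dog sat on the rug . the cat saw the dog ."
model = train_bigrams(corpus)
print(generate(model, "the"))
```

Nothing in this sketch knows what a cat or a rug is. It replays patterns from its input, which is why the output can look plausible without anything resembling understanding behind it.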

Therein lies the main problem: Generative AI produces human-like output while not being human, or self-aware, or anything like an actual thinking thing. As Andrea Resmini put it in his talk at IAC24, “AI is technology that appears to be intelligent.” Emphasis on “appears.” Since GenAI output seems human but isn’t, it tricks our brains into believing that it’s more reliable and trustworthy than it actually is.

Humans want to see human qualities in everything. We anthropomorphize; we invest objects in the world with human attributes. We see human faces in things where they don’t exist.[1] We get easily wrapped up in the experience of seeing human qualities in other things and forget that the thing itself is just a thing; we’ve projected our own experience onto it.

It’s not too dissimilar from experiencing art. We bring meaning — our filters, our subjective experience — to the experience of art. We invest art with meaning. There is no experience of art without our participation in it. Your experience of a piece of culture might be very different from mine because of our different life experiences and thinking patterns and the different ways we interact with the work.

In the same way, when we participate in AI experiences we invest those experiences with meaning that comes from our unique subjective being. GenAI isn’t capable of delivering meaning. We bring the meaning to the experience. And the meaning we’re bringing in the moment is based on how we’ve learned to interact with other humans.

In a recent online article, Navneet Alang wrote about his experience of asking ChatGPT to write a story. He sought an explanation for his sensation of how human-like the story felt:

Robin Zebrowski, professor and chair of cognitive science at Beloit College in Wisconsin, explains the humanity I sensed this way: “The only truly linguistic things we’ve ever encountered are things that have minds. And so when we encounter something that looks like it’s doing language the way we do language, all of our priors get pulled in, and we think, ‘Oh, this is clearly a minded thing.’”

AI simulates human thought without human awareness

Attributing human thought to computer software has been going on long enough to have a name: The ELIZA Effect. ELIZA was an early computer program that would ask simple questions in response to text input. The experience of chatting with ELIZA could feel human-like – almost like a therapy session – though the illusion would eventually be broken by the limited capacity of the software.
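
To show how little machinery it takes to produce that feeling, here is a toy sketch of a rule-based responder in the same spirit. It assumes simple pattern matching that turns a statement back into a question. The rules and the fallback line are invented for illustration and are not taken from ELIZA itself.

```python
import re

# Hypothetical rules for illustration only; not ELIZA's actual script.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."

def respond(message):
    """Return a canned question built from the first rule that matches."""
    cleaned = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            # Echo the user's own words back inside the template.
            return template.format(match.group(1))
    return FALLBACK  # no rule matched, so stall politely

print(respond("I feel tired today."))
print(respond("The weather is nice"))
```

The program has no model of feelings or tiredness. It only reflects our own words back at us, and we supply the sense that someone is listening.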

GenAI is a way better ELIZA, but it’s still the same effect. When we impute human qualities to software, we’re making a category error: we’re equating symbolic computations with the ability to think.

GenAI can simulate the act of communicating intent and make statements that appear authoritative. But, as Dr. Bender said in her IAC24 keynote, AI doesn’t have intent. It can’t know what it’s saying, and it’s very possible for it to say something very confidently that is very wrong and possibly very harmful.

GenAI engines lack (at least) these important human qualities:

  • humility: the ability to say “I don’t know”
  • judgment: the ability to say “that’s not possible” or “that’s not advisable” or “that’s a racist question”, or to evaluate and question data that doesn’t seem right[2]
  • the ability to learn from feedback mechanisms like pain receptors or peer pressure
  • sentience: the ability to feel or experience through senses
  • cognition: the ability to think

Those last points are really important: Computers can’t think and they can’t feel. They can’t know what they’re doing. They are machines made of code that can consolidate patterns of content and do some fancy math to recombine those patterns in novel ways. No more. And it’s right to be skeptical and maybe even scared of things that can’t reason and have no empathy but that can imitate external human behavior really well. They are not human, just human-like.

And so here’s the problem: if we confuse the human-like output from a GenAI engine with actual human output, we will have imputed all sorts of human-like qualities to that output. We will expect it to have passed through human filters like judgment and humility (because that’s what a human would do, and what we’re interacting with appears human). We will open ourselves up to accepting bad information because it sounds like it might be good. It doesn’t matter that we might also get good information. If we’re not able to properly discern the bad information, we make ourselves vulnerable to all sorts of risks.

AI training data, however, is human… very, very human

So, human-like behavior without human thought behind it is a big problem with GenAI. Another major problem has to do with the data that’s been used to train the large language models GenAI engines are based on. And for the moment I’m setting aside the ethical questions around appropriating people’s work without notice or compensation. Let’s just focus for now on the issue of data quality.

GenAI scales the ability to create text and images based on what humans have already created. It does this by ingesting vast quantities of human-created content, making mathematical representations of probabilities inherent in that content, and then replaying those patterns in different combinations.

Through this process, GenAI reflects back to us a representation of ourselves. It’s like looking in a mirror in a well-lit bathroom, in a way, because it reveals not only our best qualities but also our flaws and biases. As Andrea Resmini pointed out, data is always dirty. It’s been created and edited by humans, and so it has human fallibility within it. Whatever reality was inherent in the training data will be reflected in the GenAI output. (Unless we take specific steps to hide or moderate those flaws, and those don’t always go well.)

If GenAI is sometimes a bathroom mirror, it’s also sometimes a funhouse mirror, distorting the interactions that it produces due to deficiencies in the training data (and because it is not self-aware and can’t correct itself). For any question that matches a sufficient amount of training data, the GenAI agent can sometimes give a reasonably accurate answer. Where there is little or no training data, a GenAI agent may make up an answer, and this answer will likely be wrong.

It’s garbage in, garbage out. Even if you get the occasional treasure, there’s still a lot of trash to deal with. How do we sort out which is which?

Context cluelessness and psychopathy

So, a GenAI engine might give you a response that’s 100% correct or 0% correct or anywhere in between. How do you know where on the spectrum any given response may lie? Without context clues, it’s impossible to be sure.

When we use search engines to find an answer to a question, we get some context from the sites we visit. We can see if it’s a source with a name we recognize, or if obvious care has been taken to craft a good online experience. And we can tell when a site has been built poorly, or is so cluttered with ads or riddled with spelling and grammar errors that we realize we should move on to the next source.

GenAI engines smooth all of their sources out like peanut butter on a slice of white bread, so that those context clues disappear into the single context of the AI agent.

And if the agent is responding as if it were human, it’s not giving you human clues to its veracity and trustworthiness. Most humans feel social pressure to be truthful and accurate in most situations, to the best of their abilities. A GenAI agent does not feel anything, much less social pressure. There is nothing in a GenAI engine that can be motivated to be cautious or circumspect or to say “I don’t know.” A GenAI agent is never going to come back to you the next day and say, “You know, I was thinking about my response to you and I think I might have gotten it wrong.”

Moreover, a human has tells. A human might hedge or hesitate, or look you in the eye or avoid your gaze. A GenAI agent won’t. Its tone will be basically the same regardless of the truth or accuracy of its response. This capacity to be completely wrong while appearing confident and authoritative is the most disturbing aspect of GenAI to me.

As GenAI gets more human-like in its behaviors it becomes a more engaging and convincing illusion, bypassing our BS detectors and skepticism receptors. As a result, the underlying flaws in GenAI become more dangerous and more insidious.

Are there even more problems with GenAI? Yeah, a few…

I’ve focused here on some high-level, systemic issues with GenAI, but there are many others. I would encourage you to read this Harvard Business Review article titled “AI’s Trust Problem” for an excellent breakdown of 12 specific AI risks, from ethical concerns to environmental impact. If AI’s lack of humanity alone isn’t enough to make you cautious about how you use it, perhaps the HBR article will do the trick.

All that said…

For all its faults, Generative AI is here to stay, like it or not. Businesses are gonna business, and we humans love a technology that makes our lives easier in some way and damn the consequences. And, as I pointed out in Part 2, GenAI can be truly useful, and we should use it for what it’s good at. My point in going through the pros and cons in such detail is to help set up some structure for how to use GenAI wisely.

So, if we accept that this type of AI is our new reality, what can we do to mitigate the risks and make it a better, more useful information tool? Is it possible information architects have a big role to play here? Stay tuned, dear reader… These questions and more will be answered in the final part of this series, coming soon to this very blog.

  1. This is called pareidolia.  ↩

  2. At IAC24, Sherrard Glaittli and Erik Lee explored the concept of data poisoning in an excellent and entertaining talk titled “Beware of Glorbo: A Use Case and Survey of the Fight Against LLMs Disseminating Misinformation.” ↩