In the early days of generative artificial intelligence, laptops would take hours to chug through cumbersome code as AI models slowly learned to write, spell and ultimately output strange and hilarious Halloween costume ideas, pickup lines or recipes. Optics researcher Janelle Shane was intrigued enough by one list of such recipes—which called for ingredients such as shredded bourbon and chopped water—to put her own laptop on the task. Since 2016 she has blogged through such neural networks’ rapid advancement from endearingly bumbling to surprisingly coherent—and, sometimes, jarringly wrong. Shane’s 2019 book You Look Like a Thing and I Love You broke down how AI works and what we can (and can’t) expect from it, and her recent posts on her blog AI Weirdness have explored image-generating algorithms’ bizarre output, ChatGPT’s attempts at ASCII art and self-criticism and AI’s other rough edges. Scientific American talked with Shane about why a spotless giraffe stumps AI, where these models absolutely should not be used and whether a chatbot’s accuracy can ever be fully trusted.
[An edited transcript of the conversation follows.]
How has generative AI changed in the years you’ve been training and playing around with chatbots?
There is a lot more commercial buzz about AI than there was when I first got into it. In those days, Google Translate was, I think, one of the first big commercial applications that people would see out of this whole machine-learning AI constellation of techniques. There was a hint that there might be more out there, but in those days it was definitely more the domain of researchers.
Some things haven’t changed, [such as] the tendency of people to read deeper meaning into the text you get out of these techniques. We’ll see meaning in the random flopping of a leaf blowing across the sidewalk.... As the text has gotten more complex, [hype is] making it into major op-eds and major newspapers. As these tools get more accessible, we have also been seeing more of a tendency of people to try it for everything and see what sticks.
And that provides even more fodder for your blog, right?
I’ve always focused on the differences between how AI generates text and how humans write because to me, that is where you can come across something interesting, unexpected and novel.... Seeing all of these glitchy answers and weird text generation is [also] a fun way to get some intuition to take with you. This is what you can remember if you’re trying to think, “Ah, yes, can I use this to label all of the images in my presentation so I don’t have to write accessible captions?” The answer is, it will produce labels, but you really need to check them over because of all these glitches.
It’s one thing to say, hey, it’s not completely accurate. It’s another thing to keep in mind the story of the spotless giraffe. There was a giraffe born in a zoo in Tennessee [in 2023] with no spots. The last [known] time that had happened was before the Internet, so the Internet had [hardly any] pictures of a spotless giraffe. It was very interesting to see how all these image-labeling algorithms would describe this giraffe and include descriptions of a spotted coat because that was just expected.
This is an example of something unexpected that the algorithm had not had a chance to memorize, so it couldn’t skate past it or hide its lack of deeper understanding. Suddenly you have a case that exposes that it’s not really looking at the spots. This is why glitch art is important, why these mistakes are important.
You also occasionally point out places generative AI excels—I’m thinking in particular of a post where you asked GPT-3 to answer questions as if it were secretly a squirrel, showing how it can demonstrate a fictional internal life.
I really wanted to poke holes in the argument that if these text generators can describe the experience of being sentient AI, they must be sentient AI, because that was, and still is, a narrative that’s going around: “Look, it said it’s sentient and has thoughts and feelings and doesn’t just want to be put to work generating text.” That is a distressing thing to see come out of text generation. I did want to make the point that although AI can describe the experience of being a squirrel, that doesn’t mean that it’s actually a squirrel.
Do you feel like there has been an actual big qualitative change in generative AI, or has the journey from chopped water to secret squirrels felt incremental?
Just like in a string of predictive text, what happens next follows from what happened before. So in that sense, it’s been incremental, but there have sure been a lot of increments—millions of dollars’ worth of compute time. And a whole global industry will do that to a project, to a technology. So it’s definitely grown and changed. On the other hand, the kinds of mistakes that you see out of these algorithms are the same ones that have been present going back to the very beginning. And that was one of the things that made me willing to write a book about AI in 2019, when things were still changing so quickly: I could still see these undercurrents, these through lines that were remaining the same.
You named your book after an AI-generated pickup line that’s so strange it circles around to be charming. Would an AI pickup line today have that same charm, or would it just be a depressing version of an Internet pickup line?
I suspect now it would be a depressing remix of Internet pickup lines. It would be hard to get a unique one because it would have memorized so many of them from the past.
I really dearly love the glitchy, half-garbled text from these early recurrent neural networks that were running on my laptop. There’s something about that simplicity and sheer messed-up-ness of the text that’s just sidesplittingly funny to me. ChatGPT, [Gemini] and all these text generators that people have available to play with now—it’s almost a shame that they are generating such coherent text.
I feel like that coherence, too, is a little scary in that I see people ask AI something like “Is so-and-so poisonous to dogs?” I know it’ll answer you, but please don’t ask it that!
Exactly. There are so many examples of toxicologists saying, “Okay, this specific advice is dangerous.... Do not do this.” And it can just come out of the algorithm. Because it’s so coherent, and often because it’s packaged as something that looks up information, people are being led to trust it. There are some infamous AI-generated mushroom-hunting books that contain downright dangerous advice. I did not predict that people would be generating them and selling them in order to make a buck, not really caring how much people’s time is wasted or [that they would put people] in actual danger.... I [didn’t predict] how willing people would be to use text that was glitchy or not really correct or sort of a waste of time to read—how there would be a market for that.
Do you foresee generative AI eventually becoming accurate?
The way that we’re trying to use these algorithms now as a way of retrieving information is not going to lead us to correct information, because their goal during training is to sound correct and be probable, and there’s not really anything fundamentally tied back to real-world accuracy or to exactly retrieving and quoting the correct source material. And I know people are trying to treat it this way and sell it this way in quite a lot of applications. I think it’s a fundamental mismatch between what people are asking for, this information retrieval, and what these things are actually trained to do, which is to sound correct.
Anything you’re doing where having the correct answer would be important is not a good use for generative AI.
A lot of the surface nastiness has been smoothed away by fine-tuning and extra training, but that hasn’t fundamentally changed the input data we gave these algorithms. It’s still there, still measurable and still influencing what we’re getting out of them.
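[To make that training objective concrete, here is a minimal, hypothetical sketch in PyTorch; the toy model, sizes and random “text” below are illustrative stand-ins, not any production system. The loss rewards assigning high probability to whatever token comes next in the training text; nothing in it measures real-world truth.]

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32  # arbitrary toy sizes

# A stand-in "language model": embed each token, then score every
# possible next token. Real systems are far larger, but the objective
# below is the same in spirit.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)
loss_fn = nn.CrossEntropyLoss()  # "be probable," not "be accurate"

tokens = torch.randint(0, vocab_size, (1, 16))  # stand-in for training text
logits = model(tokens[:, :-1])                  # predict each next token
loss = loss_fn(logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
loss.backward()  # the only signal: match what the text usually looks like
```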
Does generative AI hype edge out other good uses of AI?
There are plenty of people who are quietly getting on with the job of using AI techniques for useful things that they couldn’t solve in other ways. For example, in drug-discovery research, that’s been a pretty big success because you can use more tailor-made AI techniques to try out different combinations of drugs and come up with promising formulations, and then, crucially, go and test those in the lab and find out if they’re actually going to pan out.
People are also applying these models to cases where a little bit of inaccuracy is okay. I’m thinking of, for example, voicemail transcription. If it’s inaccurate enough that you’ve got to listen to the recording anyway, fine, but usually you get the gist without having to sit through the whole voicemail. These kinds of small AI applications, I think, are where the value actually is and where long-term success might be.
AI transcription software is really useful, but now the version I use also gives these little auto-generated action points based on the discussion as if you’re in a work meeting, regardless of whether that makes any sense in context. I’m just talking about someone’s research, not setting an agenda!
I’d be interested to see what homework it decides to assign based on this interview—if it tells you to go get to work chopping that water.