
What Finnegans Wake Teaches Us about AI

Mapping out the latent spaces where machines learn

"Any sufficiently advanced technology is indistinguishable from magic," wrote Arthur C. Clarke, the author of 2001: A Space Odyssey, in his 1962 book Profiles of the Future: An Inquiry into the Limits of the Possible. So it is with artificial intelligence (AI). We type a command into Adobe Firefly, for instance, and, Abracadabra! Out comes an image of a cat in a tuxedo, in the style of Modigliani. How that happens, the exact process of going from prompt to picture, not even the platform's developers can really explain.

That's because AI doesn't work the way traditional computing does. Instead of writing software that is a set of exact instructions, AI developers code a set of parameters, feed the model immense amounts of data, and tell it to learn by trial and error. What happens inside the "black box" when all that learning is going on is a mystery the masters of AI seem to have relatively little interest in exploring.

Nina Beguš, PhD '20, a postdoctoral researcher at the University of California (UC), Berkeley’s Center for Science, Technology, Medicine, and Society, thinks that's a mistake. As AI becomes ever more interwoven in daily life—perhaps faster than any technology in human history—Beguš says it's critical to get a better understanding of AI's "latent space," the hidden mathematical area that processes information and eventually produces an output.

In a new paper for Antikythera: Journal for the Philosophy of Planetary Computation, a peer-reviewed journal published in parallel with the Antikythera book series by MIT Press, Nina and Gašper Beguš, PhD '18, director of UC Berkeley's Speech and Computation Lab, partnering with the artist collective Metahaven, begin to map out this space. Using generative adversarial networks (GANs), a type of AI that pits two neural networks against one another to produce data, and James Joyce's notoriously difficult novel, Finnegans Wake, the authors were able to get a picture of AI in the process of learning. They showed, as Nina Beguš describes in the interview below, that GANs acquire language much like humans do.

What pointed you in the direction of doing this research, and what was the question that you were trying to answer? 

There were a couple of reasons why Gašper and I started working together with our Metahaven collaborators. One motivation was to tell people about GANs and the exciting work Gašper has been doing in his Biological and Artificial Language Lab. When people talk about AI, they generally think about large language models (LLMs), but this is a very different way of doing AI. It doesn't have the same energy costs; you can actually build and probe your own models in an academic lab.

We wanted to bring this different approach to a wider audience, including humanistic and artistic communities. There is a lot of value in using these models for research. When an academic researcher sees an error produced by a model, that is what is interesting to us. While someone in industry might discard an error because they are working toward a viable product, in academia, an error is exactly what you are looking for. It makes you ask: why is it producing it this way? This brings us to latent spaces.

Latent spaces are becoming a prominent topic in machine learning, interpretability, and humanistic research. While many in the humanities work on latent spaces within the visual arts, we are using them for language and speech. These models are data agnostic—Gašper has even used them to study whale communication. They help with scientific discovery, and that will be the focus of our next paper.


What is "latent space"? 

Every machine learning and deep learning model has an inner geometry that establishes itself during the training process. This geometry represents abstract relations; it maps similar or related things close together and keeps unrelated things further apart. Latent spaces are so vast, with thousands of dimensions, that they are basically invisible and unimaginable to humans.
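As a toy illustration of that geometry, one can measure "closeness" in a latent space with cosine similarity. The vectors below are made up by hand for the sake of the example; in a real model they would have thousands of dimensions and emerge from training.

```python
import numpy as np

# Hypothetical 3-D "latent space": hand-made vectors standing in for
# the learned embeddings of a real model.
emb = {
    "cat":  np.array([0.9, 0.8, 0.1]),
    "dog":  np.array([0.8, 0.9, 0.2]),   # related to "cat": a nearby vector
    "plan": np.array([0.1, 0.2, 0.9]),   # unrelated: points elsewhere
}

def cosine(u, v):
    # Cosine similarity: near 1.0 for aligned directions, near 0 for unrelated.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(emb["cat"], emb["dog"]))   # high: related things sit close
print(cosine(emb["cat"], emb["plan"]))  # low: unrelated things sit apart
```

The same distance logic scales up to the thousands of dimensions of an actual model, which is what makes these spaces measurable even though they are unimaginable.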

It is important to study them because they are what you might call "epistemic frameworks." They shape what an AI can see, imagine, and produce. We wanted to bring more attention to this interior because so much of the public discourse focuses on the exterior. Although a latent space is non-narratable, we are trying to make it navigable.

[Image: Gašper Beguš, PhD ’18, and Nina Beguš, PhD ’20. Photo by Matevž Granda]

So, it’s an inherent structure in an AI model that shapes what the AI can know? 

Yes, exactly. Every model has a different latent space based on its training data and how it was trained. It is literally an inner topography. It is very architectural and has a physicality to it. We use the analogy of the interior of the Statue of Liberty: people see it from the outside and can visit the top, but they don’t see what is holding it together. That is what we are trying to uncover by working with GANs.

It is easier to do this with GANs because they are smaller models than LLMs, which devour the entire internet. GANs can be trained on just a couple of hundred English words. We can then look at the interiority of these layers in artificial neural networks to see which "neurons" are sparking when the model performs a certain task.
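A minimal sketch of what such probing can look like (a hypothetical toy network, not the lab's actual models): run an input through a small feed-forward net and record which hidden units fire.

```python
import numpy as np

# Hypothetical probe: a tiny random feed-forward network standing in for
# a trained model. We inspect the hidden layer to see which "neurons"
# are sparking (nonzero after ReLU) for a given input.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 8))   # input -> hidden weights
W2 = rng.normal(size=(8, 2))   # hidden -> output weights

def forward(x):
    hidden = np.maximum(0.0, x @ W1)   # ReLU hidden activations
    return hidden, hidden @ W2

x = rng.normal(size=4)                 # a stand-in input
hidden, out = forward(x)
active = np.flatnonzero(hidden > 0)    # indices of the active "neurons"
print("active hidden units:", active.tolist())
```

With a small model like a GAN trained on a few hundred words, this kind of layer-by-layer inspection is feasible in a way it is not for an internet-scale LLM.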

What is the broader definition of a generative adversarial network? 

The name tells you there are two networks in there that are adversarial. It is like having a teacher (the discriminator network) and a student (the generator network). Production and perception are baked in, with no humans intervening.

One important distinction is that GANs are the only models where the generator—the network that produces the output—never actually sees the data. It only learns based on feedback from the discriminator. It starts from scratch, beginning with noise and no knowledge of what it is supposed to produce. We set it up not just to fool the discriminator, but to learn how to be informative and encode structured information in audio. Unlike LLMs, which tokenize text into discrete units, these models work with audio, which is continuous, just as the phonetics of our speech are.
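The teacher-and-student setup can be sketched in a few lines. This is a hypothetical one-dimensional toy, not the authors' speech model: the generator never touches the real data; it starts from noise and updates only on the discriminator's feedback.

```python
import numpy as np

# Toy 1-D GAN: the generator (the "student") never sees the real data.
# It learns only from the feedback of the discriminator (the "teacher"),
# which does see the data.
rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

real_mean, real_std = 4.0, 1.25   # the data distribution to be learned
a, b = 1.0, 0.0                   # generator: g(z) = a*z + b (starts as pure noise)
w, c = 0.0, 0.0                   # discriminator: d(x) = sigmoid(w*x + c)
lr_d, lr_g, batch = 0.1, 0.01, 64

for step in range(5000):
    real = rng.normal(real_mean, real_std, batch)
    z = rng.normal(0.0, 1.0, batch)
    fake = a * z + b              # generator output, built from noise alone

    # Discriminator update: learn to tell real from fake
    # (gradient ascent on log d(real) + log(1 - d(fake))).
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr_d * np.mean((1 - d_real) * real - d_fake * fake)
    c += lr_d * np.mean((1 - d_real) - d_fake)

    # Generator update: move toward whatever the discriminator currently
    # accepts as "real" (ascent on the non-saturating loss log d(fake)).
    d_fake = sigmoid(w * (a * z + b) + c)
    grad_fake = (1 - d_fake) * w
    a += lr_g * np.mean(grad_fake * z)
    b += lr_g * np.mean(grad_fake)

print(f"generator mean after training: {b:.2f} (real mean {real_mean})")
```

By the end of training the generator's output mean has drifted toward the real mean even though it never observed a single real sample, which is the sense in which production and perception are baked in without human intervention.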

In the lab, we create what we call "artificial babies." These are GANs learning human speech, much like we do. We’ve been taken aback by the similarities, including the same kinds of errors. They start from zero, go through a "babbling stage," and finally produce a word or a “nonce word”: a term that doesn’t yet exist but is plausible, or is used only once.

How does the model "imagine" new words with so little data? 

They understand how English phonology works with very little data. For example, a model might produce "carrot" as a word even though it was never trained on it. In one case, Gašper trained a model on only eight English words: ask, carry, dark, greasy, like, suit, water, and year. From that, the model could produce "start" or "sart"—a plausible word that follows English phonology but isn't in our vocabulary. They don't produce implausible words; they figure out the rules quickly.

We call this “imagitation”: imitation plus imagination. At first, it is basic imitation, approximating what they get from the environment, like a child learning from parents. They extrapolate underlying principles and then push those regularities into novel combinations. While LLMs are mostly imitation, GANs are very much about imagination.

Why did you choose Finnegans Wake for this experiment?

Historically, machines were industrial; you pushed a button or wrote a program, and the machine produced the expected output as an extension of human activity. With machine learning, we have created machines, but we don't fully know what they are capable of. You have to probe and experiment to discover what they are actually doing.

This is where "artificial humanities" comes in. These models are much more qualitative and cultural than they might seem. We can directly apply humanistic methods—reading, metaphor, and interpretation—to study these computationally established structures.

James Joyce in particular tested the limits of the novel and human language. He walked the line of the speakable in a novel that was meant to be read aloud. Joyce tasked himself with inventing an idiom for Finnegans Wake, calling it "the writing of the night" because he was looking at the interior of language. He famously said that a great part of human existence cannot be rendered sensible by "wide-awake language" or "cut-and-dried grammar".

He destabilized the boundary between signal and noise. Language in computational settings often feels too polished—a world of perfect Newtonian physics. We wanted to explore the layer of language that is not quite explicit or externalized. We claim that Joyce really charted the latent space of language in this novel. 


Can you describe in layperson’s terms the experiment where you trained the AI exclusively on Finnegans Wake?

We used a text-to-speech audiobook version of the novel rather than the text itself, so the AI—which we call "FinneGAN"—was trained solely on the novel’s audio. It started learning from scratch. We published the model so that everyone can try it out.

We also added a transcriptor model to the GAN. This second neural network acts like an "adult" to the childlike GAN, trying to make sense of its utterances. But it often misrepresents what the GAN is saying. You see FinneGAN's sentences transcribed in multiple ways—sometimes in English, sometimes in Irish, or just as onomatopoeia and glossolalia. It foregrounds the limits of interpretability for speech that isn't quite externalized. The FinneGAN model pushed the novel even further into entropy and those pre-externalized layers of language.

It’s almost like speech play. We wanted to show how one can intentionally deviate from the rules of "proper" language to create novel or even illogical words. GANs are excellent at that. They tap into a distinct aesthetic and intellectual pleasure that we, as humans, get from tinkering with language—not just in literature, but in everyday speech. It was important to demonstrate that computation can also be playful, surprising, and even endearing.

[Image: The Architecture Biennale, where Nina and Gašper Beguš exhibited and presented Latent Spacecraft]

What did the results of your experiment tell you about machine learning on the one hand and James Joyce on the other?

We paralleled artificial neural networks (GANs) with biological neural networks (brains) and Joyce’s non-narratable style, peeking into the internal layers of how speech is produced. We demonstrated that every abstraction has a physical reality; you can see the electrical activity in both biological and artificial networks.

From a more philosophical standpoint, we found that there is no formalism to this. Children don’t use formalism in speech play, and neither does Joyce. By using this GAN-like imagination, he approaches a pre-verbal world where one knows some rules of the "game" and explores the rest. Joyce is a giant in literary studies, but he uses these childlike operations and imaginations—much like our computational models. It shows that these processes are more biological and more akin to how humans acquire language than what we see with other types of AI.


How did the Finnegans Wake experiment specifically illustrate the way culture influences AI models?

Literature can pick up yet-unarticulated realities. Joyce was writing 100 years ago, and he was likely picking up on a new way of looking at reality, similar to the birth of quantum physics. We see this often; for instance, George Bernard Shaw was picking up on the nascent sciences of instilling language into machines in Pygmalion.

This paper exists because of the synthetic insight afforded by talking to people outside of your own field. Today, we have a hyper-segmentation of disciplines: engineers on one side of the river, humanities and social sciences on the other. AI is actually wonderful for breaking that down. It gives you the courage to explore the unknown.

There is significant public concern about AI seeping into every area of life. What is your biggest concern, and can this type of research help address it?

AI feels very "of the moment," but it has a deep history, from the Eliza Doolittle trajectory to the Turing test. If you look at the history of technology, things become a little less scary. You see the same concerns repeated: Socrates was famously against writing because he thought it would ruin dialogue, and there were accusations that the typewriter would disable human thinking.

However, we are making a major mistake today by leaving AI development almost entirely to the for-profit sector. Building AI models in an academic lab allows you to ask different questions. My disappointment is that there seems to be only one "marketable" form of AI out there, when that isn't the case.

We are in the knowledge production sector. One major task for us is to create our own models. I hope this paper inspires people to pursue that. Ultimately, the future of AI is a series of human decisions.
