Analysis: while the technology is relatively new, the trajectory for AI-written stories isn't necessarily upwards

In a bid to control the influx of AI-written books on its platform, Amazon recently restricted authors to self-publishing no more than three books per day. The discovery of pirated novels in the training data of Generative AI language models has caused consternation amongst writers, whose work has been used to train the models that could be used to replace them. Not surprisingly, the encroachment of generative AI was at the heart of the recent Hollywood writers' strike.

These developments illustrate just how powerful large language models have become in the last couple of years. OpenAI's GPT language models and similar systems can produce fluent prose that could plausibly have been written by what OpenAI CEO Sam Altman rather sinisterly calls the "median human".

One potential way for the careful reader of non-fiction works to detect that the text is AI-generated is the presence of "hallucinations", fake yet plausible references or claims. The propensity to hallucinate may make large language models unreliable information sources, but making stuff up doesn’t pose the same problem for fiction, where the capacity for invention is a positive.


From the Wall Street Journal, your AI questions from AI hallucinations to befriending chatbots answered

In 2021, we gave a language model a set of story openers, or prompts, and let it continue the story. Two years ago, Generative AI language models were available but not as mainstream as they are today. These language models were trained to predict the next word in a sequence, so story continuation is a natural way to use them.
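To make next-word prediction concrete, here is a deliberately tiny sketch in Python: a bigram model that counts which word follows which in a scrap of invented training text, then continues a story opener one word at a time. Real language models do the same job with neural networks trained on billions of words; the corpus and function names below are made up purely for illustration.

```python
import random
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    follows = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follows[current_word][next_word] += 1
    return follows

def continue_story(model, opening, length=10, seed=0):
    """Extend a story by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    words = opening.split()
    for _ in range(length):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        choices, counts = zip(*candidates.items())
        words.append(rng.choices(choices, weights=counts)[0])
    return " ".join(words)

# Tiny invented "training corpus", for illustration only.
corpus = ("the park was quiet . the park was full of visitors . "
          "visitors loved the park . the aliens watched the visitors .")
model = train_bigram_model(corpus)
print(continue_story(model, "the aliens", length=8))
```

However the sampling falls out, the continuation can only ever recombine patterns seen in training, which is the same basic limitation, at vastly larger scale, behind the repetitive stories described below.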

The prompts were taken from a creative writing Reddit forum called WritingPrompts, where people post a story opening and other people complete the story. Many of the prompts have a science-fiction theme, such as "a scientist discovers that misbehaving printers are the only way a species of aliens has found to communicate with humans".

The first thing we found was that the AI-generated stories became more repetitive and nonsensical the longer they went on. In fact, they were so bad that there was no point even asking people to rate their quality. We therefore constrained the language model to generate micro-stories in the form of a short paragraph. We then surveyed people to see what they thought of them. Since these language models do not tend to make grammatical mistakes, we did not ask people to judge the grammar; instead, we asked them to judge whether the stories were coherent and interesting.


From CNN, reporter Donie O'Sullivan shows people an AI-generated political ad, but can they tell it's fake?

What did people think of the AI-generated stories? In a word, meh. On average, they received middling scores for both coherence and interest. People tended to find the human-written prompt more interesting than the AI-generated story.

This is the story that was rated most highly:

Human-written Prompt: In the future, Disney purchases an entire planet and makes the whole thing into one giant amusement park.

AI-generated Story: The problem is that a large portion of the population hates the park, hates its corporate overlords, hates its theme park, and is starting to revolt. At the park itself, the people are divided. Some people are happy, some people are not. One thing that makes this story different from movies like the Matrix is that the people who live at the park have no control over the things that are happening there.

The generated story continues the prompt in a plausible enough way, although on closer reading, the inconsistent use of the word 'park' undermines its coherence.


From RTÉ News, why is ChatGPT generating so much chatter?

Let's fast forward to 2023, when the language model has undergone a second phase of training, called Reinforcement Learning from Human Feedback, on top of the next-word-prediction training. In this second training phase, the model "learns" from people how to better follow instructions and how to avoid offensive language.
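The mechanics of Reinforcement Learning from Human Feedback are involved: a reward model is trained on human ratings, and the language model's weights are then updated against it with reinforcement learning algorithms such as PPO. As a loose, invented illustration of the core idea - human preferences become a numeric score that steers which outputs win out - here is a toy Python sketch that merely re-ranks candidate continuations with a stand-in reward function. It is not the actual training procedure, and the scoring rules are made up.

```python
def toy_reward(text):
    """Stand-in for a learned reward model: as an invented example of
    'helpfulness' scoring, reward upbeat words and penalise repetition."""
    words = text.lower().split()
    positivity = sum(w in {"joy", "wonder", "welcome"} for w in words)
    repetition = len(words) - len(set(words))
    return positivity - repetition

def pick_preferred(candidates):
    """Choose the candidate the toy reward scores highest."""
    return max(candidates, key=toy_reward)

candidates = [
    "the park the park the park",           # repetitive: low reward
    "visitors found joy and wonder there",  # varied and upbeat: high reward
]
print(pick_preferred(candidates))
```

Even this crude re-ranking hints at why feedback-trained models drift towards relentlessly positive, inoffensive prose: whatever the raters reward is what the system learns to produce.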

So what does ChatGPT produce when given the same story prompt and asked to continue the story?

In a bold move, Disney acquired an entire planet, christened "WonderWorld." They transformed it into an otherworldly amusement park, where enchanted forests merged with futuristic cities. Visitors marvelled at alien creatures and explored surreal landscapes. WonderWorld became a beacon of imagination, uniting people from across the galaxy in the spirit of joy and wonder. Disney's dream had transcended worlds, creating a timeless legacy in the cosmos.

This is the type of blandly descriptive text that we have come to expect from the tool that is causing headaches for educators all over the world. It generates cliched and hackneyed phrases ("timeless legacy", "enchanted forests"), and has a much more positive (and arguably more boring) tone than the story produced by the 2021 language model. ChatGPT’s "smooth-brained politeness" - to quote writer Belinda McKeon - is related to the second phase of training, whereby OpenAI asked people to rate model responses based on their helpfulness and harmlessness, and then fed these ratings back into the training process.


From Bloomberg, inside OpenAI, the company behind AI smash hits ChatGPT and Dall-E

Another result of the second training phase is the model’s ability to follow instructions. This means that a lot of power lies in how we instruct the model. Prompting a large language model has become an art, and potentially a new career in itself. Instead of just providing the opening lines of a story and expecting the model to continue it by repeatedly predicting the next word, we can, for example, tell it what kind of story we want it to write.

Here’s ChatGPT’s response when we ask it to write a funny story. Funny haha or funny peculiar? You decide.

In a galaxy far, far away, Disney did the unthinkable: they bought an entire planet! And not just any planet, but a wild, untamed one. They called it "Mickeytopia." But when Mickey tried to shake hands with the local alien inhabitants, they mistook him for a snack. Now, it's a planet-sized game of hide-and-seek, with aliens chasing Disney characters through the jungles. Space tourists cheer from orbiting roller coasters. Who knew that buying a planet would turn into the galaxy's greatest adventure?

The future of creative writing is human

Another difference between the 2021 and 2023 models is that the later models are much better at producing longer passages of text and the stories do not degenerate into nonsense after the first paragraph. This means that they could be used to write short stories or even novels.

But whether anyone will actually want to read these stories is debatable. While it’s true that the technology is relatively new (the transformer neural network used to train these models was only invented in 2017), the trajectory for AI-written stories isn’t necessarily upwards. AI researchers will not be able to train on artists’ work as freely as they have done up to now. AI-written text may then find its way back into future models’ training data in a garbage-in-garbage-out loop. The future of creative writing is human.


The views expressed here are those of the author and do not represent or reflect the views of RTÉ