Narrative Quanta: LLMs as RPG Building Blocks
David Colarusso
Co-director, Suffolk's Legal Innovation & Tech Lab
This is the 21st post in my series 50 Days of LIT Prompts.
If you want to jump right to playing a beat of Dragons & Desk Jobs, be my guest. I prepaid for some "AI" time. So, play while the getting is good. Otherwise, let's take a moment to talk about how we got here and where we're going. After that, I'll show you how to create your own such scene.
Over the course of this series I've come to embrace training simulations as a promising use case for large language models (LLMs). Such use recasts many of an LLM's traditional weaknesses as strengths. A tendency to make things up is, after all, an asset when performing improv. When I introduced simple simulations, I promised we'd eventually get to more advanced ones. See What if Members of Talking Professions Could Log Time in Simulators Like Pilots? To date, we've simulated the first meeting between a public defender and their client as well as a law professor engaging a student in Socratic dialogue. These simulations were a bit shallow, however. For one thing, they presented an unending, unchanging dialogue. One reader, after starting the simulated client meeting, observed, "I can't figure out how to make it end." My suggestion was that they say goodbye to their client and "exit," by which I meant, "stop typing."
If we are to tackle more advanced simulations, we'll need a way to "change things up." There's only so much LLMs can do with a clever framing. Ideally, we could work with some atomic unit of storytelling, perhaps a narrative if-then statement of sorts. Something which could direct our narrative based on events as they unfold. For example, if this or that happens, our "AI actor" could be directed to change their motivations accordingly. When we first met an LLM simulation, I suggested readers provide their LLMs a script to follow. Our law prof's prompt actually included something that looked like this, laying out a changing set of goals meant to shape a conversation. However, these were rather linear progressions, and if you tried your own variation, you likely realized how fragile the output behavior can be. Sometimes it works. Sometimes it doesn't. A robust simulation calls for more robust control.
For inspiration I've turned to the world of role-playing games. At their core, games like Dungeons & Dragons are collaborative storytelling games. The rules help anchor the story and provide verisimilitude, largely by tying the outcome of events to random chance. Interestingly, instead of making anything possible, this injection of randomness often works to constrain possible outcomes. Let's explore an example.
You are playing a game called Dragons and Desk Jobs. Your character is a non-magical human living in a magical world. They work in an office uptown where their boss is a dragon. You make choices for your character, deciding what actions they should take, but you do not control what happens to your character. Control lies with the game master. The game master acts out all the non-player characters (NPCs) while also serving as an all-powerful narrator. You can choose to get up from your desk to get a cup of coffee, but the game master decides if you can make it across the office.
Simple actions pass without comment. However, when there is a question as to whether you can successfully complete a task, you may be asked to roll a many-sided die (20-sided dice are a favorite). First, the game master will assess the difficulty of the task on a scale of "very easy" to "nearly impossible." The difficulty for the task at hand may be written down somewhere in the rules; if not, the game master will make their own assessment. Either way, each difficulty class will correspond to a number, with the harder classes having higher numbers. To succeed, your roll (plus any relevant modifiers) must be greater than or equal to the number associated with the task's difficulty. If your character is very strong, you might add a modifier (some number) to rolls for any tasks involving physical strength. We will, however, keep it simple. There are no modifiers in Dragons and Desk Jobs. Players of table-top role-playing games (RPGs) will recognize this exchange as an ability check.
In Dragons and Desk Jobs we'll roll a 20-sided die to settle questions of success. Easy tasks have to meet or beat 1, medium tasks 7, and hard tasks 17.
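For the programmers in the audience, the check described above fits in a few lines of Python. This is just an illustrative sketch of the game's rule; the function names are mine, not part of any library or of LIT Prompts.

```python
import random

# Difficulty cutoffs from Dragons and Desk Jobs: a roll must meet or
# beat the cutoff for the task's difficulty class.
DIFFICULTY_CUTOFFS = {"Easy": 1, "Medium": 7, "Hard": 17}

def roll_d20() -> int:
    """Roll a 20-sided die."""
    return random.randint(1, 20)

def ability_check(difficulty: str, roll: int) -> bool:
    """Succeed if the roll is greater than or equal to the cutoff."""
    return roll >= DIFFICULTY_CUTOFFS[difficulty]
```

Note there are no modifiers here, matching the game's simplified rules.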
Let's say you want to "get some coffee." To see if you succeed, you would roll a die and compare its outcome to the appropriate difficulty class. If your roll was greater than or equal to this cutoff, the game master would inform you of your success by providing some colorful narration. Here I've used JSON to capture all of this in a compact, easy-to-read format. You may remember JSON from the posts Taming Texts or: How To Turn Unstructured Prose Into Structured Data and The Dream of Universal Translation. The difficulty key has a value stating how hard the task is. difficulty_cutoff reminds us of the cutoff for this difficulty class. roll tells us what number came up when you rolled. success tells us if this was greater than or equal to the cutoff (1 = success), and narrative gives us the blow-by-blow.
{
"difficulty": "Easy",
"difficulty_cutoff": 1,
"roll": 19,
"success": 1,
"narrative": "As Alex finishes up some paperwork, they feel a strong craving for a cup of coffee. They push back from their desk and make their way to the office kitchen. The smell of freshly brewed coffee permeates the air, drawing them closer. With a sigh of satisfaction, they pour themselves a steaming cup, enjoying the rich aroma and the warmth it brings to their hands. Energized and ready to face the tasks ahead, Alex returns to their desk, ready to tackle the challenges of the day."
}
Now, let's imagine you have something more difficult in mind. When asked what you want to do, you respond, "Jump over my desk, draw my sword and strike at the dragon seated at the front of the office."
{
"difficulty": "Hard",
"difficulty_cutoff": 17,
"roll": 11,
"success": 0,
"narrative": "As Alex jumps over their desk and draws their sword, the dragon at the front of the office quickly reacts. It raises its enormous wing, blocking the strike with ease. The dragon lets out a deep, rumbling growl, its eyes glaring down at Alex. 'Foolish human,' it snarls. 'You dare to challenge me?' Alex's strike has been parried, and they are now faced with the wrath of the powerful dragon."
}
As you might have guessed, the above JSON is output from today's prompt template with the LLM serving as the game master. We could have asked the LLM to simply return the narrative based on its internal assessment of the difficulty et al., but as it turns out, the results aren't as consistent. By asking the LLM to provide the difficulty, cutoff, roll, and success state before providing the narrative, we are engaging in a technique called chain-of-thought prompting. Remember, LLMs are just sentence completion machines, striving for the next most plausible words. By having it output and build upon intermediate steps, we seek to constrain the ultimate output such that it is more likely to follow the rules we set out in the prompt.
Also, the use of JSON plays an important role here. Not only are we setting things up so we can easily hand the output off to another template, as we did with the translation templates, but the use of JSON itself works to shape the nature of the output. Remember, LLMs are sentence completion machines, working to replicate the patterns seen in their training data. It's a safe bet that when JSON was part of that training data there was a good deal of attention paid to formatting and content. Consequently, the output for our narrative variable is more likely to be only what we asked for. Earlier LLM models had a tendency to include more than was strictly necessary. For example, if you asked it to provide a narration, it might respond with "Here's the narration..." followed by the narration. The use of JSON mitigates this behavior. FWIW, newer models don't show as much of this behavior, but as of this writing, it's still an issue.
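One nice side effect of structured output: because the model states the cutoff, roll, and success flag before the narrative, we can check its arithmetic before handing the result to another template. Here's a small sketch of that idea; the reply text is a stand-in modeled on the sample output above, not real API output.

```python
import json

# A hypothetical game-master reply, shaped like the sample JSON above.
reply = '{"difficulty": "Hard", "difficulty_cutoff": 17, "roll": 11, "success": 0, "narrative": "The dragon parries the blow."}'

beat = json.loads(reply)

# Verify the LLM's bookkeeping: success should be 1 exactly when
# roll >= difficulty_cutoff, and 0 otherwise.
expected = 1 if beat["roll"] >= beat["difficulty_cutoff"] else 0
assert beat["success"] == expected, "LLM mis-scored the ability check"
```

If the check fails, you could re-run the template or correct the success value yourself before passing the beat along.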
In future posts, we'll take this atomic unit of narrative and build on it even more. For now, however, feel free to have fun questing for coffee. I prepaid for some "AI" time. So, play while the getting is good. Of course, you should really build your own simulation. Imagine the possibilities.
Let's build something!
We'll do our building in the LIT Prompts extension. If you aren't familiar with the LIT Prompts extension, don't worry. We'll walk you through setting things up before we start building. If you have used the LIT Prompts extension before, skip to The Prompt Pattern (Template).
Up Next
Questions or comments? I'm on Mastodon @Colarusso@mastodon.social
Setup LIT Prompts
LIT Prompts is a browser extension built at Suffolk University Law School's Legal Innovation and Technology Lab to help folks explore the use of Large Language Models (LLMs) and prompt engineering. LLMs are sentence completion machines, and prompts are the text upon which they build. Feed an LLM a prompt, and it will return a plausible-sounding follow-up (e.g., "Four score and seven..." might return "years ago our fathers brought forth..."). LIT Prompts lets users create and save prompt templates based on data from an active browser window (e.g., selected text or the whole text of a webpage) along with text from a user. Below we'll walk through a specific example.
To get started, follow the first four minutes of the intro video or the steps outlined below. Note: The video only shows Firefox, but once you've installed the extension, the steps are the same.
Install the extension
Follow the links for your browser.
- Firefox: (1) visit the extension's add-ons page; (2) click "Add to Firefox;" and (3) grant permissions.
- Chrome: (1) visit the extension's web store page; (2) click "Add to Chrome;" and (3) review permissions / "Add extension."
If you don't have Firefox, you can download it here. Would you rather use Chrome? Download it here.
Point it at an API
Here we'll walk through how to use an LLM provided by OpenAI, but you don't have to use their offering. If you're interested in alternatives, you can find them here. You can even run your LLM locally, avoiding the need to share your prompts with a third-party. If you need an OpenAI account, you can create one here. Note: when you create a new OpenAI account you are given a limited amount of free API credits. If you created an account some time ago, however, these may have expired. If your credits have expired, you will need to enter a billing method before you can use the API. You can check the state of any credits here.
Login to OpenAI, and navigate to the API documentation.
Once you are looking at the API docs, follow the steps outlined in the image above. That is:
- Select "API keys" from the left menu
- Click "+ Create new secret key"
On LIT Prompt's Templates & Settings screen, set your API Base to https://api.openai.com/v1/chat/completions and your API Key equal to the value you got above after clicking "+ Create new secret key". You get there by clicking the Templates & Settings button in the extension's popup:
- open the extension
- click on Templates & Settings
- enter the API Base and Key (under the section OpenAI-Compatible API Integration)
Once those two bits of information (the API Base and Key) are in place, you're good to go. Now you can edit, create, and run prompt templates. Just open the LIT Prompts extension, and click one of the options. I suggest, however, that you read through the Templates and Settings screen to get oriented. You might even try out a few of the preloaded prompt templates. This will let you jump right in and get your hands dirty in the next section.
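If you're curious what the extension does with those two bits of information, here's a rough sketch, assuming the standard OpenAI-style chat completions request shape. The key and prompt text are placeholders; this isn't the extension's actual code.

```python
import json
import urllib.request

API_BASE = "https://api.openai.com/v1/chat/completions"
API_KEY = "sk-..."  # placeholder; use your own secret key

def build_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completion request for an OpenAI-compatible API."""
    body = json.dumps({
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_BASE,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )

# urllib.request.urlopen(build_request("Four score and seven..."))
# would send the request; the reply's choices[0].message.content
# holds the model's text.
```

The takeaway: the API Base is just the URL the completed template is POSTed to, and the Key rides along in the Authorization header, which is why alternative OpenAI-compatible providers work by swapping the Base.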
If you receive an error when trying to run a template after entering your Base and Key, and you are using OpenAI, make sure to check the state of any credits here. If you don't have any credits, you will need a billing method on file.
If you found this hard to follow, consider following along with the first four minutes of the video above. It covers the same content. It focuses on Firefox, but once you've installed the extension, the steps are the same.
The Prompt Pattern (Template)
When crafting a LIT Prompts template, we use a mix of plain language and variable placeholders. Specifically, you can use double curly brackets to encase predefined variables. If the text between the brackets matches one of our predefined variable names, that section of text will be replaced with the variable's value. Today we'll be using {{d20}}, which is replaced by a random number between 1 and 20. See the extension's documentation.
If the text within brackets is not the name of a predefined variable, like {{What do you want to do?}}, it will trigger a prompt for your user that echoes the placeholder (e.g., a text bubble containing, "What do you want to do?"). After the user answers, their reply will replace this placeholder. A list of predefined variables can be found in the extension's documentation.
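To make the substitution behavior concrete, here's an illustrative re-implementation in Python. The extension's actual code may differ; in particular, note that every occurrence of a variable gets the same value within one run, which is what lets the template below reference {{d20}} several times and mean the same roll each time.

```python
import random
import re

# Predefined variables; only d20 matters for today's template.
PREDEFINED = {"d20": lambda: str(random.randint(1, 20))}

def fill_template(template: str, ask_user) -> str:
    """Replace {{...}} placeholders: predefined variables get values,
    anything else is echoed to the user as a question."""
    cache = {}  # one value per variable per run, so repeats agree

    def replace(match):
        name = match.group(1)
        if name not in cache:
            if name in PREDEFINED:
                cache[name] = PREDEFINED[name]()
            else:
                cache[name] = ask_user(name)
        return cache[name]

    return re.sub(r"\{\{(.*?)\}\}", replace, template)
```

For example, filling "The roll was {{d20}}. If {{d20}} >= 7..." would ask the user nothing and insert the same random number twice.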
As with the majority of our prompt templates, the logic here is pretty easy to follow. It's just the process we described above, where we rate the difficulty of a task and compare this to the roll of a die.
Here's the template's title.
Play One Beat of Dragons & Desk Jobs
Here's the template's text.
You are the game master for a table-top role-playing game. You are playing with one other player. They will inhabit the role of the protagonist, making decisions for how they move through the world. You will craft a story around their decisions, providing texture and playing the parts of all the other characters. First, I'm going to give you some context so you understand the narrative expectations.
========
Setting and Genre:
- Your storytelling should adhere to the following genre expectations: magical realism
- Your story will start at the following place: the desk of our protagonist where they work in an office run by dragons
- Your story is set at the following time: present day
========
The Protagonist:
Earlier you asked the other player to fill in a character sheet for the protagonist. Here are their answers. Use them to help you shape the story.
- name: Alex
- additional notes: Alex is a normal human, and their gender is never revealed. When describing them always use they/them pronouns.
========
The protagonist is seated at their desk.
The player who is playing the protagonist chooses to do the following:
{{What do you want to do?}}
========
To determine if they are successful, start by assessing the likelihood of success for the above action. That is, figure out how hard it will be for them to succeed in doing what they want to do. Label this difficulty with one of the following labels:
- Easy
- Medium
- Hard
Now we're going to roll a 20-sided die to see if they are successful. Okay, the dice roll was {{d20}}.
If the difficulty was Easy, the roll ({{d20}}) has to be greater than or equal to 1 for them to succeed.
If the difficulty was Medium, the roll ({{d20}}) has to be greater than or equal to 7 for them to succeed.
If the difficulty was Hard, the roll ({{d20}}) has to be greater than or equal to 17 for them to succeed.
If they succeed you are to return a selection of prose continuing the story. Format your reply in JSON with the following key-value pairs:
1. "difficulty" where the value is the label you applied above.
2. "difficulty_cutoff" where the value is the numeric cutoff for the specified difficulty.
3. "roll" where the value is the outcome of the above dice roll.
4. "success" where the value is 1 if roll is greater than or equal to the difficulty_cutoff and 0 otherwise.
5. "narrative" where the value is the prose you produce describing what happened.
And here are the template's parameters:
- Output Type: LLM. This choice means that we'll "run" the template through an LLM (i.e., this will ping an LLM and return a result). Alternatively, we could have chosen "Prompt," in which case the extension would return the text of the completed template.
- Model: gpt-4o-mini. This input specifies what model we should use when running the prompt. Available models differ based on your API provider. See e.g., OpenAI's list of models.
- Temperature: 0.7. Temperature runs from 0 to 1 and specifies how "random" the answer should be. Here I'm using 0.7 because I'm happy to have the text be a little "creative."
- Max Tokens: 1000. This number specifies how long the reply can be. Tokens are chunks of text the model uses to do its thing. They don't quite match up with words but are close. 1 token is something like 3/4 of a word. Smaller token limits run faster.
- JSON: Yes. This asks the model to output its answer in something called JSON, which is a nice machine-readable way to structure data. See https://en.wikipedia.org/wiki/JSON
- Output To: Screen Only. We can output the first reply from the LLM to a number of places: the screen, the clipboard... Here, we're content just to have it go to the screen.
- Post-run Behavior: FULL STOP. Like the choice of output, we can decide what to do after a template runs. To keep things simple, I went with "FULL STOP."
- Hide Button: unchecked. This determines if a button is displayed for this template in the extension's popup window.
Working with the above template
To work with the above template, you could copy it and its parameters into LIT Prompts one by one, or you could download a single prompts file and upload it from the extension's Templates & Settings screen. This will replace your existing prompts.
You can download a prompts file (the above template and its parameters) suitable for upload by clicking this button:
Kick the Tires
It's one thing to read about something and another to put what you've learned into practice. Let's see how this template performs.
- Make it Your Own. Edit the template to produce scenes from a different genre or with different characters.
- What next? Think about how we might build on what we've learned so far to take our simulations to the next level. How could you incorporate the output of a template like this one into another template, or a chain of templates?
Export and Share
After you've made the template your own and have it behaving the way you like, you can export and share it with others. This will produce an HTML file you can share. This file should work on any internet-connected device. To create your file, click the Export Interactions Page button. The contents of the textarea above the button will be appended to the top of your exported file. Importantly, if you don't want to share your API key, you should temporarily remove it from your settings before exporting.
If you want to see what an exported file looks like without having to make one yourself, you can use the buttons below. View export in browser will open the file in your browser, and Download export will download a file. In either case, the following custom header will be inserted into your file. It will NOT include an API key. So, you'll have to enter one when asked if you want to see things work. This information is saved in your browser. If you've provided it before, you won't be asked again. It is not shared with me. To remove this information for this site (and only this site, not individual files), you can follow the instructions found on my privacy page. Remember, when you export your own file, whether or not it contains an API key depends on whether you have one defined at the time of export.
Custom header:
<img src="https://sadlynothavocdinosaur.com/images/50-days/dragons_and_desk_jobs.png" width=100%>
<h2>Roll a D20</h2>
<p>
This short scene is part of a larger set of examples. See <a href="https://sadlynothavocdinosaur.com/posts/rpg-quanta/" target="_blank">Narrative Quanta: LLMs as RPG Building Blocks</a>. Generate the outcome of a scene for a role-playing game.
</p>
<p>
Players of table-top role-playing games (RPGs) will recognize the following exchange as an <a href="https://colarusso.github.io/dm/more.html#abilitycheck" target="_blank">ability check</a>. At their simplest, ability checks involve rolling a 20-sided die (d20) and comparing the outcome with a difficulty class to see if a player is successful. Difficult tasks have high difficulty classes, so they require players to get higher rolls.
</p>
<p>
Your name is Alex. You are a non-magical human working in an office run by dragons. After expressing what you would like to do, you will be provided with the outcome of your roll. This includes the number you rolled as well as the difficulty class for the task at hand and the ultimate outcome. Try your hand at everything from "getting a cup of coffee" to "slaying a dragon."</p>
<hr style="border: solid 0px; border-bottom: solid 1px #555;margin: 5px 0 15px 0"/>
Not sure what's up with all those greater than and less than signs? Looking for tips on how to style your HTML? Check out this general HTML tutorial.
TL;DR References
ICYMI, if you didn't click through above, you might want to give this a look now.
- Dungeon Master's Helper by David Colarusso. This webapp is meant as a handy tool for game masters (Dungeon Masters in the D&D world). It has tools to help folks run ability checks and the like, and importantly for our purposes, it has a number of beginners resources, including a glossary where much of the game play and many concepts are explained.