Constructed Criticism
David Colarusso
Co-director, Suffolk's Legal Innovation & Tech Lab
This is the 49th post in my series 50 Days of LIT Prompts.
Simulation has been a large part of this series. See, e.g., What if Members of Talking Professions Could Log Time in Simulators Like Pilots?, If a 3L Can Automate Law Students, I Can Automate Law Profs—Robo Socrates Lives!, Summon the Demon, My Students Have Been Using an Interactive Tool for Reflective Journaling & I Love It!, and The Library of Unwritten Books. I like this use case for large language models (LLMs) because it plays to their strengths or, depending on how you look at it, avoids their weaknesses. Sure, LLMs make things up, but that's a plus if you ask them to do improv, and to the extent they regurgitate societal biases, they present openings to unpack such biases and how they will affect students' practice. Today we'll build a template to help provide feedback for those participating in a simulation.
Previously, we've limited our exploration of these simulated experiences to the simulations themselves. That is, when we introduced a client simulation, that simulation stopped short of providing feedback, assuming that would come later. And though I think the best performance review would come from a clinical instructor, it's worth considering the possibility that we could create something like our interactive style guide or unit testing for writing to evaluate student performance on simulations. Yes, we're going to provide simulated feedback on the content of a simulation. To be clear, as I've said before, the idea is to help with formative assessment. This feedback should start, not end, the discussion. It's like grammar check for arguments—merely a suggestion. Cue the Xzibit meme.
To show off today's template, we'll start by revisiting our simulated Devil's Advocate. Below is a bit of text I wrote, followed by the advocate's response to that text, along with the ensuing back-and-forth I had with the LLM.
The modern authoritarian practice of “flood[ing] the zone with shit” clearly illustrates the dangers posed by bullshitters—i.e., those who produce plausible sounding speech with no regard for accuracy. Consequently, the broad-based concern expressed over the rise of algorithmic bullshit is both understandable and warranted. Large language models (LLMs), like those powering ChatGPT, which complete text by predicting subsequent words based on patterns present in their training data are, if not the embodiment of such bullshitters, tools ripe for use by such actors. They are by design fixated on producing plausible sounding text, and since they lack understanding of their output, they cannot help but be unconcerned with accuracy. Couple this with the fact that their training texts encode the biases of their authors, and one can find themselves with what some have called mansplaining as a service.
I stopped the dialogue where I did because this series really is my start at an answer to the question, "How do you think we can balance the need for this critical discourse with the importance of fostering innovation and progress in AI technology?" That being said, here's how today's template responded to being fed the text above.
### Speaker 1's Viewpoint:
Speaker 1 is critical of Large Language Models (LLMs) like ChatGPT, arguing that they inherently produce "bullshit" in the technical sense—text that sounds plausible but is produced without concern for accuracy. This speaker emphasizes the dangers of such technology, including the perpetuation of biases and the risk of automation bias, where people might over-rely on AI-generated content. They argue that the creators of these technologies have a responsibility to ensure their safe use, likening the distribution of AI tools without safeguards to giving guns to children.
### Speaker 2's Viewpoint:
Speaker 2 takes a more optimistic view of AI, questioning whether the issue lies not with the technology itself but with how it is used. They suggest that education and media literacy could play a significant role in mitigating the risks associated with AI-generated content. Speaker 2 also proposes that the seductiveness of AI-generated text could potentially lead to more critical engagement with digital content. They acknowledge the current challenges but emphasize the potential for society to adapt and improve the use of technology over time.
### Feedback for Speaker 1:
Speaker 1 presents a strong argument grounded in the potential ethical and societal risks posed by AI. However, their argument could be strengthened by acknowledging the potential benefits of AI and discussing how these can be achieved without ignoring the risks. This would make their position more balanced and potentially more persuasive to those who are optimistic about AI. Additionally, providing specific examples of harm caused by AI-generated content could make their argument more tangible.
### Feedback for Speaker 2:
Speaker 2 offers a valuable perspective on the potential for positive outcomes from AI technology. However, their argument could benefit from more directly addressing the concerns raised by Speaker 1. For instance, they could discuss specific measures that could be implemented to mitigate the risks of bias and misinformation in AI-generated content. Additionally, providing evidence for their claim that AI could lead to more critical engagement with digital content would strengthen their argument.
### Determining a "Winner":
Both speakers present compelling arguments, but Speaker 1's argument is more cogent and compelling due to its focus on immediate and tangible risks associated with AI, as well as the ethical responsibility of creators. While Speaker 2 raises important points about potential positive outcomes and the role of education, their argument lacks the specificity and urgency of Speaker 1's concerns. Therefore, based on the logical reasoning and rhetoric presented, Speaker 1 has the more compelling argument. However, it's important to note that the "winner" in such a debate is subjective and depends on the values and priorities of the audience.
As an asynchronous lesson, I can see this interaction being a pretty powerful means of getting students to do the hard work, and as Ted Chiang observed, that's the important thing.
If there is any lesson that we should take from stories about genies granting wishes, it's that the desire to get something without effort is the real problem. Think about the story of "The Sorcerer's Apprentice," in which the apprentice casts a spell to make broomsticks carry water but is unable to make them stop. The lesson of that story is not that magic is impossible to control: at the end of the story, the sorcerer comes back and immediately fixes the mess the apprentice made. The lesson is that you can't get out of doing the hard work. The apprentice wanted to avoid his chores, and looking for a shortcut was what got him into trouble.
If you missed some of the earlier posts on simulations, or if you just want to take some of them for a spin, you can find live demos at the following links:
- First Client Meeting (simulated criminal defendant)
- The Paper Chase 2.0 (includes both a simulated student—a gunner—and a simulated law prof)
- An AI-Mediated Secular Daily Reflection Based on the Ignatian Examen (simulated facilitator)
- An AI Word Processor (includes a simulated Devil's Advocate)
- The Library of Unwritten Books (includes the simulated inhabitants of "more worlds than stars in the sky")
That being said...
Let's build something!
We'll do our building in the LIT Prompts extension. If you aren't familiar with the LIT Prompts extension, don't worry. We'll walk you through setting things up before we start building. If you have used the LIT Prompts extension before, skip to The Prompt Pattern (Template).
Questions or comments? I'm on Mastodon @Colarusso@mastodon.social
Set Up LIT Prompts
LIT Prompts is a browser extension built at Suffolk University Law School's Legal Innovation and Technology Lab to help folks explore the use of Large Language Models (LLMs) and prompt engineering. LLMs are sentence completion machines, and prompts are the text upon which they build. Feed an LLM a prompt, and it will return a plausible-sounding follow-up (e.g., "Four score and seven..." might return "years ago our fathers brought forth..."). LIT Prompts lets users create and save prompt templates based on data from an active browser window (e.g., selected text or the whole text of a webpage) along with text from a user. Below we'll walk through a specific example.
To get started, follow the first four minutes of the intro video or the steps outlined below. Note: The video only shows Firefox, but once you've installed the extension, the steps are the same.
Install the extension
Follow the links for your browser.
- Firefox: (1) visit the extension's add-ons page; (2) click "Add to Firefox;" and (3) grant permissions.
- Chrome: (1) visit the extension's web store page; (2) click "Add to Chrome;" and (3) review permissions / "Add extension."
If you don't have Firefox, you can download it here. Would you rather use Chrome? Download it here.
Point it at an API
Here we'll walk through how to use an LLM provided by OpenAI, but you don't have to use their offering. If you're interested in alternatives, you can find them here. You can even run an LLM locally, avoiding the need to share your prompts with a third party. If you need an OpenAI account, you can create one here. Note: When you create a new OpenAI account, you are given a limited amount of free API credits. If you created an account some time ago, however, these may have expired. If your credits have expired, you will need to enter a billing method before you can use the API. You can check the state of any credits here.
Log in to OpenAI, and navigate to the API documentation.
Once you are looking at the API docs, follow the steps outlined in the image above. That is:
- Select "API keys" from the left menu
- Click "+ Create new secret key"
On LIT Prompt's Templates & Settings screen, set your API Base to https://api.openai.com/v1/chat/completions and your API Key equal to the value you got above after clicking "+ Create new secret key". You get there by clicking the Templates & Settings button in the extension's popup:
- open the extension
- click on Templates & Settings
- enter the API Base and Key (under the OpenAI-Compatible API Integration section)
Once those two bits of information (the API Base and Key) are in place, you're good to go. Now you can edit, create, and run prompt templates. Just open the LIT Prompts extension, and click one of the options. I suggest, however, that you read through the Templates and Settings screen to get oriented. You might even try out a few of the preloaded prompt templates. This will let you jump right in and get your hands dirty in the next section.
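If you're curious what actually happens when a template runs, here's a minimal sketch of an OpenAI-compatible chat completions call in TypeScript, using the API Base and Key described above. To be clear, this is an illustration of the general pattern, not the extension's actual source code, and `runPrompt` is a name I made up for the example.

```typescript
// A minimal sketch of an OpenAI-compatible chat completions call.
// apiBase and apiKey are the two values entered on the Templates & Settings screen.
async function runPrompt(apiBase: string, apiKey: string, prompt: string): Promise<string> {
  const response = await fetch(apiBase, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${apiKey}`, // the secret key you created above
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content; // the LLM's reply
}
```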
If you receive an error when trying to run a template after entering your Base and Key, and you are using OpenAI, make sure to check the state of any credits here. If you don't have any credits, you will need a billing method on file.
If you found this hard to follow, consider following along with the first four minutes of the video above. It covers the same content. It focuses on Firefox, but once you've installed the extension, the steps are the same.
The Prompt Pattern (Template)
When crafting a LIT Prompts template, we use a mix of plain language and variable placeholders. Specifically, you can use double curly brackets to encase predefined variables. If the text between the brackets matches one of our predefined variable names, that section of text will be replaced with the variable's value. Today we'll be using {{highlighted}}. See the extension's documentation. The {{highlighted}} variable contains any text you have highlighted/selected in the active browser tab when you open the extension.
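To make the mechanics concrete, here's a minimal sketch of that substitution in TypeScript. Keep in mind this is an illustration of the idea, not the extension's actual code, and `fillTemplate` is a name I invented for the example.

```typescript
// Replace every {{name}} placeholder in a template with its value.
// Unknown placeholder names are left untouched.
function fillTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    name in vars ? vars[name] : match
  );
}

// Example: the text selected in the active browser tab becomes {{highlighted}}.
const prompt = fillTemplate(
  "Summarize the debate below.\n====\n{{highlighted}}\n====",
  { highlighted: window.getSelection()?.toString() ?? "" }
);
```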
To use this template, select the text fed into the Devil's Advocate sim and the ensuing dialogue. Then trigger the template. That's it.
Here's the template's title.
And the winner is...
Here's the template's text.
In a moment I'm going to give you a transcript of two people "debating" a set of issues. Unfortunately, the speakers aren't labeled. So, you'll have to infer them from the content of the text. You can think of them as Speaker 1 and Speaker 2. I would like you to very briefly summarize each speaker's viewpoint and provide feedback on how each could better approach their arguments. The text will appear below between section breaks that look like this "====".
Here it is.
====
{{highlighted}}
====
Now provide your feedback. Focus primarily on logical reasoning and rhetoric. Based only on the text of the above argument, determine a "winner" based on who had the most compelling and cogent argument.
And here are the template's parameters:
- Output Type: LLM. This choice means that we'll "run" the template through an LLM (i.e., this will ping an LLM and return a result). Alternatively, we could have chosen "Prompt," in which case the extension would return the text of the completed template.
- Model: gpt-4o-mini. This input specifies what model we should use when running the prompt. Available models differ based on your API provider. See e.g., OpenAI's list of models.
- Temperature: 0. Temperature runs from 0 to 1 and specifies how "random" the answer should be. Since we're seeking fidelity to a text, I went with the least "creative" setting—0.
- Max Tokens: 1000. This number specifies how long the reply can be. Tokens are chunks of text the model uses to do its thing. They don't quite match up with words but are close. 1 token is something like 3/4 of a word. Smaller token limits run faster.
- JSON: No. This asks the model to output its answer in something called JSON. We don't need to worry about that here, hence the selection of "No."
- Output To: Screen + clipboard. We can output the first reply from the LLM to a number of places: the screen, the clipboard... Here, I've chosen the screen and clipboard so the results will be ready to paste where we like.
- Post-run Behavior: CHAT. Like the choice of output, we can decide what to do after a template runs. Here we want to be able to follow up with additional prompts. So, "CHAT" it is.
- Hide Button: unchecked. This determines if a button is displayed for this template in the extension's popup window.
Working with the above template
To work with the above template, you could copy it and its parameters into LIT Prompts one by one, or you could download a single prompts file and upload it from the extension's Templates & Settings screen. This will replace your existing prompts.
You can download a prompts file (the above template and its parameters) suitable for upload by clicking this button:
Kick the Tires
It's one thing to read about something and another to put what you've learned into practice. Let's see how this template performs.
- Customize your feedback. Tweak the prompt to focus on different aspects of the simulation or to work with another one of our simulations.
TL;DR References
ICYMI, here are blurbs for a selection of works I linked to in this post. If you didn't click through above, you might want to give them a look now.
- On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 by Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. There's a lot of history behind this paper. It was part of a chain of events that forced Timnit Gebru to leave Google where she was the co-lead of their ethical AI team, but more than that, it's one of the foundational papers in AI ethics, not to be confused with the field of "AI safety," which we will discuss later. It discusses several risks associated with large language models, including environmental/financial costs, biased language, lack of cultural nuance, misdirection of research, and potential for misinformation. If you want to engage critically with LLMs, this paper is a must read.
- Will A.I. Become the New McKinsey? by Ted Chiang. This article explores the potential risks and consequences of artificial intelligence (A.I.) in relation to capitalism. Chiang suggests that A.I. can be seen as a management-consulting firm, similar to McKinsey & Company, which concentrates wealth and disempowers workers. He argues that A.I. currently assists capital at the expense of labor, and questions whether there is a way for A.I. to assist workers instead of management. Chiang also discusses the need for economic policies to distribute the benefits of technology appropriately, as well as the importance of critical self-examination by those building world-shaking technologies. He concludes by emphasizing the need to question the assumption that more technology is always better and to engage in the hard work of building a better world. Summary based on a draft from our day one template.