
A Rose by Any Other Name

Define a selected word, phrase, idiom, or initialism on a webpage

Headshot of the author, David Colarusso

This is the 2nd post in my series 50 Days of LIT Prompts.

I Google the definitions of words with surprising frequency, mostly as part of my writing process. As a dyslexic, it's largely a backstop for spellcheck. After it has reassured me that I'm only using real words, I tend to use text-to-speech to read what I've written. If a word sounds off, I look it up to make sure it's really the word I intended. It's also good for double-checking homophones (e.g., I thought I spelled "discrete", but this seems to be "discreet"). After this idiosyncratic use, I find myself using search to discover the meanings of opaque acronyms and initialisms, usually when I wander off into some unfamiliar part of cyberspace. What does this person mean when they say "SOP"? Is that business speak? After wading through alphabet soup, the next most common hunt for meaning I find myself on is that of an old looking to understand the kiddos. How cringe? And I have to say, search has worked for me, but sometimes it takes some high-level Google-fu to track down what I'm looking for. What if I could just select a word or phrase, click a button, and get a definition, be it for a word, idiom, or initialism? Well, I'm happy to say today's prompt template does just that.

Of course, before we dive in, it's worth taking a moment to consider the implications of using such definitions. What's in a name? Though the essence of a thing remains unchanged whatever we call it, the definitions attached to these labels matter. "The judicial conception of lexical meaning—i.e., what judges think about what words mean, or, more importantly, how judges arrive at the meaning of contested terms—is often outcome determinative. Vast fortunes or years of confinement may balance precariously on the interpretation of a single word," opens The Dictionary Is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning. That note, the name legal scholars give to a common form of student-written paper, helped crystallize the use of corpus linguistics as an aid in discovering the meanings of words as they existed at some point in the past, an important endeavor for the textualists who purport to see no higher source of meaning when engaging in statutory interpretation. Why this detour into corpus linguistics? Isn't this series about AI? Yes, but at the end of the day corpus linguistics, like Large Language Models (LLMs), relies heavily on the co-occurrence of words. The idea is that we can discover the meaning of a word by examining how it is used. Want to know what the founders meant by "carrying a firearm"? Look at what people were writing at the time and start counting. That sounds strangely analogous to how we build an LLM. It is here that we will engage with an important truth about LLMs: they encode the semantic meaning of words as they exist in their training data. To ask if this reflects a word's true meaning is to ask whether such meaning can be found in that training data. The answer you arrive at will likely color your impressions of both LLMs and the practice of legal corpus linguistics, not to mention your assessment of whether this match/mismatch is a feature or a bug.
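To make that counting concrete, here's a rough sketch in Python. The three-sentence "corpus" is invented for illustration (it isn't drawn from any real historical collection or from the corpus linguistics literature), but it shows the basic move: tally the words that show up near the term you care about and let the counts hint at how it's used.

from collections import Counter
# A toy stand-in for a real historical corpus; these sentences are invented.
corpus = [
    "he was carrying a firearm on his person during the march",
    "the soldier was carrying a firearm and a pack of provisions",
    "carrying a firearm in a vehicle was not what the drafters pictured",
]
target = "firearm"
window = 3  # words on each side that count as "nearby"
neighbors = Counter()
for doc in corpus:
    words = doc.split()
    for i, word in enumerate(words):
        if word == target:
            # Tally the words within the window around the target term.
            context = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
            neighbors.update(context)
print(neighbors.most_common(5))  # e.g., [('a', 5), ('carrying', 3), ('was', 2), ...]

Scale that up from three sentences to millions of documents and you have the raw material both corpus linguists and LLM builders work from.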

In On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 the authors note the well-established fact that LLMs encode a number of biases, e.g., "stereotypical and derogatory associations along gender, race, ethnicity, and disability status". This seems intuitive given what we know of LLMs. They work to predict the next word that best fits their training data. They are mirrors of sorts, encoding the bias found in the texts they consume. Which is all easy enough to say, but how? How are these sentence completion machines made? The answer won't fit in this post, but if you give me a couple of weeks, I think we can get there using bite-sized explainers like what follows.

Logistic Regression

The first step on this journey is to understand something called logistic regression. Yes, there's going to be some math, but there won't be a quiz, and I give you permission to skim over things if you like. That being said, if you can get through this, the payoff will be big. Like, "understand what this AI thing actually is" big. Anyhow, logistic regressions can be used to predict how likely something is to happen. Imagine I've been keeping track of snow days. I have a table showing how much snow fell and whether class was canceled. My first column shows the inches of snowfall and the second whether there was a snow day. 1 = snow day, 0 = no snow day. If I treat the first column as X and the second column as Y, I can plot these observations. See below. You'll notice that the snow days all lie on Y=1 and are off to the right. Just by looking at the plot, I can see that all the snow days occur when the accumulation is more than 4 inches.

What a logistic regression does is provide us with a way to better describe this division. The blue line is the one traced out by the equation and values you see to its left. If I fiddle with the values of B0 and B1, I can change the shape of the blue line, something called a sigmoid, shifting it left-right and making it more or less S-like. If you want to have a go at changing the values, you can do so here. It's not important for us to know exactly how, but logistic regression fiddles with those values until there's a pretty good fit with the data. Once this is done, we can enter a value for X (the amount of snow) and get out the value of Y. Remember, Y=0 is no snow day and Y=1 is a snow day. So, it stands to reason that Y=0.5 corresponds to a 50% chance of there being a snow day. Y=0.75? A 75% chance. That's it, logistic regression from thirty thousand feet.
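If you'd like to see that fiddling in code, here's a small sketch using scikit-learn. The snowfall numbers below are made up to match the story above, so don't read anything into the exact fitted values.

import numpy as np
from sklearn.linear_model import LogisticRegression
# Made-up observations: inches of snowfall (X) and whether class was canceled
# (Y; 1 = snow day, 0 = no snow day).
snowfall = np.array([[0.5], [1.0], [2.0], [3.0], [3.5], [4.5], [5.0], [6.0], [8.0], [12.0]])
snow_day = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
# Fitting "fiddles with" B0 (the intercept) and B1 (the coefficient) until the
# sigmoid 1 / (1 + e^-(B0 + B1*X)) matches the data reasonably well.
model = LogisticRegression()
model.fit(snowfall, snow_day)
print("B0 =", model.intercept_[0], "B1 =", model.coef_[0][0])
# Plug in a new amount of snow and get back the probability of a snow day.
print("P(snow day | 7 inches) =", model.predict_proba([[7.0]])[0][1])

Run it and the last line should print a probability well above 0.5, since seven inches sits comfortably to the right of the four-inch dividing line in our made-up data.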

It might not be clear how we get from predicting snow days based on snow fall to predicting words, but trust me we'll get there. Until then, let's build something!

We'll do our building in the LIT Prompts extension. If you aren't familiar with the extension, don't worry. We'll walk you through setting things up before we start building. If you have used the LIT Prompts extension before, skip to The Prompt Pattern (Template).

Up Next

Questions or comments? I'm on Mastodon @Colarusso@mastodon.social


Setup LIT Prompts

7 min intro video

LIT Prompts is a browser extension built at Suffolk University Law School's Legal Innovation and Technology Lab to help folks explore the use of Large Language Models (LLMs) and prompt engineering. LLMs are sentence completion machines, and prompts are the text upon which they build. Feed an LLM a prompt, and it will return a plausible-sounding follow-up (e.g., "Four score and seven..." might return "years ago our fathers brought forth..."). LIT Prompts lets users create and save prompt templates based on data from an active browser window (e.g., selected text or the whole text of a webpage) along with text from a user. Below we'll walk through a specific example.

To get started, follow the first four minutes of the intro video or the steps outlined below. Note: The video only shows Firefox, but once you've installed the extension, the steps are the same.

Install the extension

Follow the links for your browser.

  • Firefox: (1) visit the extension's add-ons page; (2) click "Add to Firefox;" and (3) grant permissions.
  • Chrome: (1) visit the extension's web store page; (2) click "Add to Chrome;" and (3) review permissions / "Add extension."

If you don't have Firefox, you can download it here. Would you rather use Chrome? Download it here.

Point it at an API

Here we'll walk through how to use an LLM provided by OpenAI, but you don't have to use their offering. If you're interested in alternatives, you can find them here. You can even run your LLM locally, avoiding the need to share your prompts with a third party. If you need an OpenAI account, you can create one here. Note: when you create a new OpenAI account, you are given a limited amount of free API credits. If you created an account some time ago, however, these may have expired. If your credits have expired, you will need to enter a billing method before you can use the API. You can check the state of any credits here.

Log in to OpenAI, and navigate to the API documentation.

Once you are looking at the API docs, follow the steps outlined in the image above. That is:

  1. Select "API keys" from the left menu
  2. Click "+ Create new secret key"

On LIT Prompts' Templates & Settings screen, set your API Base to https://api.openai.com/v1/chat/completions and your API Key to the value you got above after clicking "+ Create new secret key". You get there by clicking the Templates & Settings button in the extension's popup:

  1. open the extension
  2. click on Templates & Settings
  3. enter the API Base and Key (under the section OpenAI-Compatible API Integration)

Once those two bits of information (the API Base and Key) are in place, you're good to go. Now you can edit, create, and run prompt templates. Just open the LIT Prompts extension and click one of the options. I suggest, however, that you read through the Templates & Settings screen to get oriented. You might even try out a few of the preloaded prompt templates. This will let you jump right in and get your hands dirty in the next section.
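If you'd like to sanity-check your Base and Key outside the extension, a little Python along these lines should do it. (The model name below is just a guess on my part; swap in whichever chat model your account has access to, and never share or commit your key.)

import requests
API_BASE = "https://api.openai.com/v1/chat/completions"  # the API Base from above
API_KEY = "sk-..."  # paste the secret key you created above
response = requests.post(
    API_BASE,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-3.5-turbo",  # assumption: use a chat model available to you
        "messages": [{"role": "user", "content": "Define the following word/phrase: SOP"}],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])

If that prints a definition instead of an error, the extension should be happy with the same Base and Key.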

If you receive an error when trying to run a template after entering your Base and Key, and you are using OpenAI, make sure to check the state of any credits here. If you don't have any credits, you will need a billing method on file.

If you found this hard to follow, consider following along with the first four minutes of the video above. It covers the same content. It focuses on Firefox, but once you've installed the extension, the steps are the same.


The Prompt Pattern (Template)

When crafting a LIT Prompts template, we use a mix of plain language and variable placeholders. Specifically, you can use double curly brackets to encase predefined variables. If the text between the brackets matches one of our predefined variable names, that section of text will be replaced with the variable's value. Today we'll meet our second predefined variable, {{highlighted}}. See the extension's documentation.

The {{highlighted}} variable contains any text you have highlighted/selected in the active browser tab when you open the extension. This prompt pattern is pretty straightforward: highlight a word or words, run the template, and get a definition. Where the definition comes from, less so. Yesterday, we expressed great skepticism about prompts that sought to generate replies longer than themselves. What's different here? Maybe nothing. Maybe the definitions of words are so tightly constrained as to mitigate the appearance of hallucinations, or, more likely, we don't expect to use these definitions as part of a court opinion and can tolerate a little noise. See above. As was true yesterday, we'll be in a better position to answer questions after we've had a chance to kick the tires.
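Before we get to the template itself, it may help to see how little magic is involved in the substitution. Conceptually (this isn't the extension's actual code, just a sketch of the idea), filling the template is plain string replacement:

def fill_template(template: str, highlighted: str) -> str:
    # Swap the {{highlighted}} placeholder for the text selected on the page.
    return template.replace("{{highlighted}}", highlighted)

prompt = fill_template("Define the following word/phrase: {{highlighted}}", "SOP")
print(prompt)  # Define the following word/phrase: SOP

The filled-in prompt is what actually gets sent to the LLM.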

Here's the template text.

Define the following word/phrase: {{highlighted}}

And here are the template's parameters:

Working with the above template

To work with the above template, you could copy it and its parameters into LIT Prompts one by one, or you could download a single prompts file and upload it from the extension's Templates & Settings screen. This will replace your existing prompts.

You can download a prompts file (the above template and its parameters) suitable for upload by clicking this button:


Kick the Tires

It's one thing to read about something and another to put what you've learned into practice. Let's see how this template performs.


TL;DR References

ICYMI, here are blurbs for a selection of works I linked to in this post. If you didn't click through above, you might want to give them a look now.