Sadly Not, Havoc Dinosaur

I Turned My Scholarly Papers Into Chatbots so People Don't Have To Read Them

Turn scholarly papers into chatbots so people don't have to read every word

Headshot of the author, Colarusso. David Colarusso

This is the 6th post in my series 50 Days of LIT Prompts.

If you want to start chatting with my papers, click here. I prepaid for some "AI" time. So, chat away while the getting is good. Otherwise, let's take a moment to talk about how we got here. After that, I'll show you everything you need to create your own.

Last week we kicked things off by summarizing and questioning webpages. We closed the week by showing you how to export templates to create a free-standing web app/webpage. Today, we'll take these, add some existing texts, and produce something new. The title above can be read roughly as, "what could go wrong?" My hope is that readers will treat the bot's output as they would any secondary source. That is, if they see anything that piques their interest, they should check the primary source for confirmation.

Do you yearn for something more than a book? And yet still love books? How about a book you can query, and it will answer away to your heart's content? How about a book that will create its own content, on demand, or allow you to rewrite it? A book that will tell you why it is (sometimes) wrong? That is what I have tried to build with my latest work.

This is how Tyler Cowen introduced his "generative book" project GOAT: Who is the Greatest Economist of all Time and Why Does it Matter? Though this isn't exactly what I'm aiming for with my paper bots, the spirit is the same. They're both explorations of what it means to engage with a text given the help of a large language model (LLM).

I can think of at least three blog posts I wrote in the early 2000's that have been read by more people than any journal article I've written, more than any article I'm likely to write. Is there a way to make my scholarship more accessible? Is there a way to let folks engage with the ideas of a paper without reading it first? Could such engagement lead them to want to read it? Could someone who has read something benefit by more literally conversing with a text?

As we've noted before, our ability to answer such questions is improved if we understand how the tools we're using actually work. Which is to say, here's where we pause to share a micro-lesson. You can skim over or skip this if you like. However, if you make it through these micro-lessons, the payoff will be big. Like, "understand what this AI thing actually is" big.

Micro-Lesson: Word2vec (Part 1)

Word2vec is a method for turning words into numbers, long lists of numbers actually. For example, a single word like "the" can be represented by a collection of 300 numbers like those below, something we call an embedding. Today we'll talk about how we get these numbers, and later in the week we'll discuss why they're so special. You'll want to be sure you didn't miss last week's micro-lesson on artificial neural nets.

For the record, here's what a word embedding (numeric representation) for the word "the" looks like:

[-0.082835,0.133846,0.051245,-0.075815,-0.040481,0.028314,0.019539,0.001221,0.124486,0.086579,0.046799,-0.038843,-0.005031,0.047033,0.000289,0.033930,-0.007488,-0.045629,-0.038843,-0.013747,-0.015912,0.079091,0.019071,-0.017784,0.019656,-0.060839,-0.061775,0.059201,0.023868,0.006025,0.027729,-0.003978,-0.013572,-0.015912,0.077219,0.036035,-0.124486,0.042821,0.065051,0.002208,-0.021177,0.011115,-0.050075,-0.024687,0.040013,-0.009828,-0.010413,-0.013104,0.076751,0.091259,-0.015561,0.032526,0.049373,-0.121678,0.003042,0.098279,0.010120,-0.103895,-0.011700,0.081899,-0.104831,0.047969,-0.074411,-0.060371,-0.018369,-0.025974,0.092195,0.104363,0.058031,-0.012694,0.025389,-0.009652,0.025623,0.036737,0.021879,0.009477,0.059669,0.048905,0.073007,0.120742,0.020592,-0.088919,-0.036035,0.109042,-0.063179,-0.131974,-0.096407,0.060371,-0.089387,-0.066923,-0.060371,0.038375,-0.016263,-0.113722,-0.003671,0.135718,-0.008248,0.013045,-0.112318,0.061775,0.080495,-0.004885,0.031356,-0.063647,0.037205,-0.055925,0.007020,-0.025389,-0.058031,-0.047501,0.026793,-0.020124,0.051479,0.043757,0.070667,0.048905,-0.000669,0.096407,0.040481,0.036269,-0.116530,-0.090791,-0.037907,-0.031824,0.061307,0.072071,-0.079559,-0.045629,0.046097,-0.024102,-0.011992,-0.023634,-0.020943,-0.011232,0.031824,-0.064115,-0.119806,-0.116062,-0.054287,0.012636,0.066455,-0.033930,-0.106702,-0.054053,-0.008658,-0.091259,0.035567,-0.079559,0.003013,0.093599,-0.002091,-0.080027,0.070667,-0.023400,0.089855,-0.112318,0.069263,0.003685,0.019656,-0.032292,-0.008014,-0.024102,0.040715,0.017550,0.064583,0.038609,-0.021411,0.079559,-0.106702,0.004475,-0.057329,-0.031122,0.016029,-0.046799,0.007020,0.058265,0.059201,-0.064583,-0.025389,-0.132910,-0.018135,-0.037907,-0.032994,0.027378,0.014917,0.033930,0.038141,0.003144,0.068327,-0.040481,0.014976,-0.017082,0.011056,-0.007956,-0.082835,-0.077219,-0.051479,0.026676,-0.044225,-0.122614,0.018837,-0.097811,0.040715,0.022113,0.005791,-0.006025,0.030654,0.123550,0.088451,-0.026793,0.011232,0.050309,0.057563,0.004358,-0.002720,0.053351,0.072539,0.011817,-0.018369,0.055457,0.082367,-0.019773,0.049373,0.021411,-0.006084,-0.039311,-0.016731,0.003773,0.037907,-0.000241,0.010764,-0.077219,0.055691,0.120742,-0.022347,-0.038375,-0.047267,0.022815,-0.029016,0.056861,-0.016263,-0.032058,-0.000373,-0.023400,-0.005704,-0.039545,0.104363,-0.000867,0.071135,-0.009828,0.126358,0.081899,-0.022230,0.061307,0.023049,-0.018018,0.080027,-0.091727,0.018720,0.036971,-0.007605,0.047501,-0.075815,-0.006610,0.008892,0.074411,-0.087047,0.016146,-0.000386,0.032058,-0.008833,0.067859,0.030888,-0.003510,0.038141,-0.013396,0.059903,-0.113722,0.017433,-0.022464,-0.022464,0.026325,0.038141,0.130102,0.108106,-0.098279,-0.098279,-0.030888,-0.037205,-0.033462,-0.008482,-0.061775,0.010530,0.007078,-0.025389,-0.097343,0.029601,0.058967,0.062243,-0.087515]

To understand where all those numbers come from, let's build a neural net. We'll have an input for every word in the English language paired with an output for every word. The network would look something like the next image, but with many more inputs and outputs. The number of nodes in that first layer is up to us. Below we've picked three, but assume that the whole-language example has 300, just like the above embedding has 300 entries.

The above image should look familiar; it's last week's neural net with a few more inputs and outputs. Also, the "logistic regressions" have been replaced by circles with outputs shown on their faces. The network has been trained in the following way: if you input a particular word by changing the corresponding value to 1, the outputs should arrange themselves to tell you how likely each word is to be next to your input word. Remember, a larger value means more likely. You're playing a prediction game, trying to predict what an adjacent word will be given some "random" word. Above, we can see that "it" is likely to show up next to "let" and "go", while "be" will probably find itself next to "I'll" and "back". From this, I doubt our training data is representative of all English usage. It seems oddly influenced by movie dialogue. ;)

To turn a word into an embedding, all we have to do is take the values from the first layer's nodes. For example, "it" would be [0.2, 0.3, 0.8], and "be" would be [0.1, 0.5, 0.2] in our example above. Every word will have a slightly different set of numbers coming out of these nodes. In word2vec it is common to have 300 such nodes, hence the 300-number-long list above.
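If you like seeing ideas in code, the "an embedding is just the first layer's values" trick is easy to sketch. Here's a toy version in Python. The vocabulary and weights are made up for illustration (a real word2vec model learns its weights from training data); only the two rows for "it" and "be" match the example above.

```python
import numpy as np

# A toy vocabulary and a hypothetical "trained" weight matrix.
# Each row holds one word's first-layer values, i.e., its embedding.
vocab = ["it", "be", "let", "go", "I'll", "back"]
W_in = np.array([
    [0.2, 0.3, 0.8],   # "it"  (matches the example above)
    [0.1, 0.5, 0.2],   # "be"  (matches the example above)
    [0.7, 0.1, 0.4],   # "let" (made up)
    [0.3, 0.9, 0.1],   # "go"  (made up)
    [0.5, 0.2, 0.6],   # "I'll" (made up)
    [0.4, 0.6, 0.3],   # "back" (made up)
])

def embed(word):
    """A one-hot input times W_in just selects that word's row."""
    one_hot = np.zeros(len(vocab))
    one_hot[vocab.index(word)] = 1.0
    return one_hot @ W_in  # identical to W_in[vocab.index(word)]

print(embed("it"))  # [0.2 0.3 0.8]
```

Setting a single input to 1 and multiplying through the first layer is the same as picking out one row of the weight matrix, which is why people often describe the embedding as "looking up" the word's row.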

As for what this all means, that will have to wait. Rest assured, we're one step closer to understanding how LLMs do what they do (i.e., predict the next word in a string of words). The pieces are starting to come into focus. Of course, we have glossed over a good deal, including how such networks are trained. For the interested reader, here's the paper that introduced word2vec to the world. We'll spend a few days unpacking it. So, no need to read it unless you want to. It's rather more technical than we're shooting for here.

Let's build something!

We'll do our building in the LIT Prompts extension. If you aren't familiar with the LIT Prompts extension, don't worry. We'll walk you through setting things up before we start building. If you have used the LIT Prompts extension before, skip to The Prompt Pattern (Template).

Up Next

Questions or comments? I'm on Mastodon @Colarusso@mastodon.social


Setup LIT Prompts

7 min intro video

LIT Prompts is a browser extension built at Suffolk University Law School's Legal Innovation and Technology Lab to help folks explore the use of Large Language Models (LLMs) and prompt engineering. LLMs are sentence completion machines, and prompts are the text upon which they build. Feed an LLM a prompt, and it will return a plausible-sounding follow-up (e.g., "Four score and seven..." might return "years ago our fathers brought forth..."). LIT Prompts lets users create and save prompt templates based on data from an active browser window (e.g., selected text or the whole text of a webpage) along with text from a user. Below we'll walk through a specific example.

To get started, follow the first four minutes of the intro video or the steps outlined below. Note: The video only shows Firefox, but once you've installed the extension, the steps are the same.

Install the extension

Follow the links for your browser.

  • Firefox: (1) visit the extension's add-ons page; (2) click "Add to Firefox;" and (3) grant permissions.
  • Chrome: (1) visit the extension's web store page; (2) click "Add to Chrome;" and (3) review permissions / "Add extension."

If you don't have Firefox, you can download it here. Would you rather use Chrome? Download it here.

Point it at an API

Here we'll walk through how to use an LLM provided by OpenAI, but you don't have to use their offering. If you're interested in alternatives, you can find them here. You can even run your LLM locally, avoiding the need to share your prompts with a third-party. If you need an OpenAI account, you can create one here. Note: when you create a new OpenAI account you are given a limited amount of free API credits. If you created an account some time ago, however, these may have expired. If your credits have expired, you will need to enter a billing method before you can use the API. You can check the state of any credits here.

Login to OpenAI, and navigate to the API documentation.

Once you are looking at the API docs, follow the steps outlined in the image above. That is:

  1. Select "API keys" from the left menu
  2. Click "+ Create new secret key"

On LIT Prompt's Templates & Settings screen, set your API Base to https://api.openai.com/v1/chat/completions and your API Key equal to the value you got above after clicking "+ Create new secret key". You get there by clicking the Templates & Settings button in the extension's popup:

  1. open the extension
  2. click on Templates & Settings
  3. enter the API Base and Key (under the section OpenAI-Compatible API Integration)

Once those two bits of information (the API Base and Key) are in place, you're good to go. Now you can edit, create, and run prompt templates. Just open the LIT Prompts extension, and click one of the options. I suggest, however, that you read through the Templates and Settings screen to get oriented. You might even try out a few of the preloaded prompt templates. This will let you jump right in and get your hands dirty in the next section.
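For the curious, here's a rough sketch of the kind of request the extension makes once the Base and Key are in place, assuming an OpenAI-compatible chat-completions endpoint. The model name, prompt, and key are placeholders; this is not the extension's actual code.

```python
import json
import urllib.request

# The API Base and Key from the Templates & Settings screen.
API_BASE = "https://api.openai.com/v1/chat/completions"
API_KEY = "sk-..."  # placeholder; use your own secret key

# An OpenAI-style chat-completions payload.
payload = {
    "model": "gpt-3.5-turbo",  # placeholder model name
    "messages": [{"role": "user", "content": "Four score and seven..."}],
}

request = urllib.request.Request(
    API_BASE,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)

# Uncomment to actually send the request (requires a valid key):
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```

Because the endpoint and auth header follow a common convention, swapping in a different OpenAI-compatible provider is usually just a matter of changing the API Base and Key.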

If you receive an error when trying to run a template after entering your Base and Key, and you are using OpenAI, make sure to check the state of any credits here. If you don't have any credits, you will need a billing method on file.

If you found this hard to follow, consider following along with the first four minutes of the video above. It covers the same content. It focuses on Firefox, but once you've installed the extension, the steps are the same.


The Prompt Pattern (Template)

When crafting a LIT Prompts template, we use a mix of plain language and variable placeholders. Specifically, you can use double curly brackets to encase predefined variables. If the text between the brackets matches one of our predefined variable names, that section of text will be replaced with the variable's value. Today we'll meet our third predefined variable, {{passThrough}}. See the extension's documentation.

We use the Post-run Behavior parameter to govern what happens after a template is run. If you use Post-run Behavior to send one template's output to another template, the first template's output can be read by the second template via the {{passThrough}} variable. To turn my papers into chatbots, I set up a template that summarizes and questions the content of {{passThrough}}. Then I created templates for each of my papers that send their contents to the first template. That is:

  1. I created a template that summarizes and questions the content of the {{passThrough}} variable when passed to an LLM, like last week's but with {{passThrough}} instead of {{innerText}}.
  2. Then I created templates containing the text of the papers I wanted to turn into chatbots, setting their Output Type to Prompt, their Post-run Behavior to the name of the above template (i.e., "Summarize & question paper"), and checking the "Hide Button" checkbox.
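If it helps to see the chaining in miniature, here's a sketch in Python. The render function and the run order are my stand-ins for what the extension does with templates and {{passThrough}}; they aren't its actual code.

```python
# Stand-in for running a template: replace {{name}} placeholders
# with their values, producing the text sent to the LLM.
def render(template, variables):
    for name, value in variables.items():
        template = template.replace("{{" + name + "}}", value)
    return template

# A hidden "paper" template whose text is the paper itself.
paper_template = "Full text of the paper goes here..."

# The "Summarize & question paper" template, which reads the
# previous template's output via {{passThrough}}.
chat_template = (
    "{{passThrough}}\n\n-------\n\n"
    "Provide a short, easy-to-understand 150-word summary of the above "
    "scholarly text..."
)

# Step 1: the paper template "runs," and its output becomes passThrough.
pass_through = paper_template

# Step 2: that output lands in the chat template, yielding the prompt.
prompt = render(chat_template, {"passThrough": pass_through})
```

The key point is the hand-off: the first template's output becomes the value of {{passThrough}} in the second, so one generic "chat with a text" template can serve any number of papers.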

Here's the first template's title.

Summarize & question paper

Here's the first template's text.

{{passThrough}} 

-------
  
Provide a short, easy-to-understand 150-word summary of the above scholarly text. Present it and any subsequent answers using plain language, making it understandable for a general audience. If asked any follow-up questions, use the above text, and ONLY the above text, to answer them. If you can't find an answer in the above text, politely decline to answer, explaining that you can't find the information. You can, however, finish a thought you started above if asked to continue, but don't write anything that isn't supported by the above text. And keep all of your replies short! 
      

And here are the first template's parameters:

The subsequent templates all look like variations on the following. Download the prompts file to see them in their entirety.

Paper template's title.

Unsupervised Machine Scoring of Free Response Answers (Colarusso, 2022)

Here's what the start of paper template's text looks like.

Unsupervised Machine Scoring of Free Response Answers
Validated Against Law School Final Exams
by David Colarusso

Abstract

This paper presents a novel method for unsupervised machine scoring of short answer and essay question responses, relying solely on a sufficiently large set of responses to a common prompt, absent the need for pre-labeled sample answers, given said prompt is of a particular character. That is, for questions where good answers look similar, wrong answers are likely to be wrong in different ways. Consequently, when a collection of text embeddings for responses to a common prompt are placed in an appropriate feature space, the centroid of their placements can stand in for a model answer, providing a lodestar against which to measure individual responses. This paper examines the efficacy of this method and discusses potential applications. . . 

And here are a paper template's parameters:

Working with the above templates

To work with the above templates, you could copy them and their parameters into LIT Prompts one by one, or you could download a single prompts file and upload it from the extension's Templates & Settings screen. Note that uploading will replace your existing prompts.

You can download a prompts file (the above template and its parameters) suitable for upload by clicking this button:


Kick the Tires

It's one thing to read about something and another to put what you've learned into practice. Let's see how this template performs.


Export and Share

After you've made the template your own and have it behaving the way you like, you can export and share it with others. This will produce an HTML file you can share. This file should work on any internet connected device. To create your file, click the Export Interactions Page button. The contents of the textarea above the button will be appended to the top of your exported file. Importantly, if you don't want to share your API key, you should temporarily remove it from your settings before exporting.

If you want to see what an exported file looks like without having to make one yourself, you can use the buttons below. View export in browser will open the file in your browser, and Download export will download a file. In either case, the following custom header will be inserted into your file. It will NOT include an API key. So, you'll have to enter one when asked if you want to see things work. This information is saved in your browser. If you've provided it before, you won't be asked again. It is not shared with me. To remove this information for this site (and only this site, not individual files), you can follow the instructions found on my privacy page. Remember, when you export your own file, whether or not it contains an API key depends on whether you have one defined at the time of export.

Custom header:

<h2>Chat with some of my scholarly works</h2>
<p>
<a href="https://mastodon.social/@Colarusso" target="_blank"><img src="https://sadlynothavocdinosaur.com/images/colarusso.jpg" style="border-radius: 50%;float:left;width:50px;margin:3px 15px 5px 0;" alt="Headshot of the author, Colarusso."/></a>
As a <a href="https://www.suffolk.edu/academics/faculty/d/c/dcolarusso" target="_blank">Practitioner-in-Residence</a>, I'm not expected to publish scholarly works in the same way as my doctrinal colleagues. This is one reason the <a href="https://sadlynothavocdinosaur.com/projects/">projects</a> section of this site has so many entries. However, occasionally, I do something that looks like legal scholarship. To make these easier to engage with, I've turned some into "chatbots." I walk through how I did this <a href="https://sadlynothavocdinosaur.com/posts/papers2bots">here</a>.
</p>
<p>Keep in mind, these are based on the text of my preprints, and the tech I'm using can "<a href="https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)" target="_blank">hallucinate</a>." You should treat their output as you would any secondary source. If you see anything that piques your interest, check the primary source to confirm it. You can find the full papers, as well as lists of my co-authors, at <a href="https://scholar.google.com/citations?user=ovuch2YAAAAJ&hl=en" target="_blank">my Google Scholar listing</a>.
</p>
<hr style="border: solid 0px; border-bottom: solid 1px #555;margin: 5px 0 15px 0"/>

Not sure what's up with all those greater than and less than signs? Looking for tips on how to style your HTML? Check out this general HTML tutorial.

The export you'll see after clicking the buttons below is what you'll get out of LIT Prompts. However, I linked to a special version of this file above. See here. I edited that version to collect analytics and to provide access to some prepaid LLM credits. The following will prompt users to enter LLM API info.


TL;DR References

ICYMI, here are blurbs for a selection of works I linked to in this post. If you didn't click through above, you might want to give them a look now.