Sadly Not, Havoc Dinosaur

Wait, Have I Read This Book Already?

In which I extract and clean data from 10 years of emails and online receipts to reconstruct a list of 400+ books because I don't want to start a "new" book only to realize I've read it before

Headshot of the author, Colarusso. David Colaursso

This is the 37th post in my series 50 Days of LIT Prompts.

Do yourself a favor. Write down the books you've read. Otherwise, you'll find yourself wondering if that book looks familiar because you put it on a list to read or because you've read it already. Unfortunately, I've never had such a list. :(

So, today I created a set of LIT Prompts to help me conduct some digital archeology. They extract and clean data from old emails and online histories. So far, today alone I've been able to reconstruct about ten years' worth of online book orders, resulting in a list of 400+ books. Eventually, I'll need to add physical books I didn't order online, but I can always consult my shelves and this seems like less of a problem for such books. Anywho, once the templates were ready to go, the workflow looked like this.

Extract & Clean

  1. See if you can download a list of books you've checked out or purchased from each of the places you get books. I found about 200 books this way, but this list only went back to 2017 and didn't cover all the places I get books.
  2. Run the Clear scratch template described below. This replaces the LIT Prompts' Scratch Pad with the header for a simple csv file (i.e., "title","author","read")
  3. For those book providers who didn't offer a downloadable csv, navigate to the history section of your account, select all (ctr-A), and run the Get book info from on-page reading history template. Navigate through the history page-by-page, selecting all and re-running the template. This will extract book names, authors, and dates from the page and add them as entries to the Scratch Pad in the form of a csv row. Note: I was able to do this with one of my library accounts, Amazon, and Audible though the current prompt didn't capture dates for Amazon's list. So, you might have to tweak things.
  4. For providers who didn't have online histories, search for emails from telling you a book is available (e.g., from:donotreply@library.example.com).
  5. Open each email in the above results, select all (ctr-A), and run the Get book info from email template. This will extract book names, authors, and dates from the email and add them to the Sratch Pad in the form of a csv row. See animated GIF below.
  6. Clean up the Scratch Pad to make sure the formatting is right (e.g., I had to add line breaks so each row was on a new line).
  7. Run the Save to file template.
  8. Open the resulting spreadsheet and clean things up as needed (e.g., some dates were clearly off as they were today's date). This is also where you can combine any csv files you got elsewhere.

FWIW, here's a a partial list of what I found. Mostly, I filtered out the books I've read with my kids since that really seemed like their lists, and this list misses most of the physical books I've read, which honestly isn't that many these days. Also, you'll notice that the spreadsheet includes links where you can find the books (I added this manually). If you want to filter the entries, you'll need to open the sheet.

The prompt templates are pretty straight forward. The real trick is setting things such that new data is appended to the Scratch Pad. Should you wish to go on your own digital dig, I wish you luck. Of course, this approach should be rather generalizable. So, don't feel like you have to stick to books. Maybe you want a nice structured list of other online histories...

Let's build something!

We'll do our building in the LIT Prompts extension. If you aren't familiar with the LIT Prompts extension, don't worry. We'll walk you through setting things up before we start building. If you have used the LIT Prompts extension before, skip to The Prompt Pattern (Template).

Up Next

Questions or comments? I'm on Mastodon @Colarusso@mastodon.social


Setup LIT Prompts

7 min intro video

LIT Prompts is a browser extension built at Suffolk University Law School's Legal Innovation and Technology Lab to help folks explore the use of Large Language Models (LLMs) and prompt engineering. LLMs are sentence completion machines, and prompts are the text upon which they build. Feed an LLM a prompt, and it will return a plausible-sounding follow-up (e.g., "Four score and seven..." might return "years ago our fathers brought forth..."). LIT Prompts lets users create and save prompt templates based on data from an active browser window (e.g., selected text or the whole text of a webpage) along with text from a user. Below we'll walk through a specific example.

To get started, follow the first four minutes of the intro video or the steps outlined below. Note: The video only shows Firefox, but once you've installed the extension, the steps are the same.

Install the extension

Follow the links for your browser.

  • Firefox: (1) visit the extension's add-ons page; (2) click "Add to Firefox;" and (3) grant permissions.
  • Chrome: (1) visit the extension's web store page; (2) click "Add to Chrome;" and (3) review permissions / "Add extension."

If you don't have Firefox, you can download it here. Would you rather use Chrome? Download it here.

Point it at an API

Here we'll walk through how to use an LLM provided by OpenAI, but you don't have to use their offering. If you're interested in alternatives, you can find them here. You can even run your LLM locally, avoiding the need to share your prompts with a third-party. If you need an OpenAI account, you can create one here. Note: when you create a new OpenAI account you are given a limited amount of free API credits. If you created an account some time ago, however, these may have expired. If your credits have expired, you will need to enter a billing method before you can use the API. You can check the state of any credits here.

Login to OpenAI, and navigate to the API documentation.

Once you are looking at the API docs, follow the steps outlined in the image above. That is:

  1. Select "API keys" from the left menu
  2. Click "+ Create new secret key"

On LIT Prompt's Templates & Settings screen, set your API Base to https://api.openai.com/v1/chat/completions and your API Key equal to the value you got above after clicking "+ Create new secret key". You get there by clicking the Templates & Settings button in the extension's popup:

  1. open the extension
  2. click on Templates & Settings
  3. enter the API Base and Key (under the section OpenAI-Compatible API Integration)

Once those two bits of information (the API Base and Key) are in place, you're good to go. Now you can edit, create, and run prompt templates. Just open the LIT Prompts extension, and click one of the options. I suggest, however, that you read through the Templates and Settings screen to get oriented. You might even try out a few of the preloaded prompt templates. This will let you jump right in and get your hands dirty in the next section.

If you receive an error when trying to run a template after entering your Base and Key, and you are using OpenAI, make sure to check the state of any credits here. If you don't have any credits, you will need a billing method on file.

If you found this hard to follow, consider following along with the first four minutes of the video above. It covers the same content. It focuses on Firefox, but once you've installed the extension, the steps are the same.


The Prompt Patterns (Templates)

When crafting a LIT Prompts template, we use a mix of plain language and variable placeholders. Specifically, you can use double curly brackets to encase predefined variables. If the text between the brackets matches one of our predefined variable names, that section of text will be replaced with the variable's value. Today we'll be using {{highlighted}}, and {{scratch}}. See the extension's documentation.

The {{highlighted}} variable contains any text you have highlighted/selected in the active browser tab when you open the extension. We'll use this to select text on the scree from which to pull our data.

The {{scratch}} variable contains the text in your Scratch Pad. Remember, the scratch pad is accessible from the extension's popup window. The button is to the right of the Settings & Templates button that you have used before. Here we use the fact that you can append things to the Scratch Pad to create our list of books before saving it to a file.

For a breakdown of how to use these templates, check out the workflow above: Extract & Clean.

Here's the first template's title.

Clear scratch

Here's the template's text.

"title","author","read"

And here are the template's parameters:

Here's the second template's title.

Get book info from email

Here's the template's text.

I'm about to show you an email from my library letting me know a book, or books, I requested is/are available. For each book, produce a single csv row with the following column data all encased in double quotes: title, author, date book was available. Be sure to add a carriage return/line break to the end of all rows. Note: the date should be formatted as "full month name, day, four-digit year" (e.g., July 4, 1776).

LIBRARY EMAIL

{{highlighted}}

---

Now provide the csv row or rows.

And here are the template's parameters:

Here's the third template's title.

Get book info from on-page reading history

Here's the template's text.

I'm about to show you a list of books from my reading history. For each book, produce a single csv row with the following column data all encased in double quotes: title, author, date book was available. Be sure to add a carriage return/line break to the end of all rows. Note: the date should be formatted as "full month name, day, four-digit year" (e.g., July 4, 1776). If no date is available return "unknown."

READING HISTORY

{{highlighted}}

---

Now provide the csv row or rows.

And here are the template's parameters:

Here's the fourth template's title.

Save to file

Here's the template's text.

{{scratch}}

And here are the template's parameters:

Working with the above templates

To work with the above templates, you could copy them and their parameters into LIT Prompts one by one, or you could download a single prompts file and upload it from the extension's Templates & Settings screen. This will replace your existing prompts.

You can download a prompts file (the above template and its parameters) suitable for upload by clicking this button:


Kick the Tires

It's one thing to read about something and another to put what you've learned into practice. Let's see how this template performs.