Generating red link lists for Wikipedia

On Global Women Wikipedia Write-In Day, I was extremely impressed with user Dsp13’s lists of red links: lists of notable women who hadn’t yet been written about on Wikipedia. I used that page as a springboard to write about some notable women in American history, like the wonderful Agnes Surriage Frankland. Dsp13 took these lists of names from resources like Famous American Women: A Biographical Dictionary, which signals the notability of the listed names and gives editors a place to start their research.

I wanted to do something similar for printing history, a research interest of mine. Here is a red link list I cobbled together. As it turns out, there are tons of other red link lists, too! I’m not sure how other people are generating them — probably from databases or other lists already in digital form. (Any info?) But many good resources are in book form, sometimes keyboarded but likely scanned and OCR’d. To make my red link lists, I’m taking indexes from scanned books and generating lists in wiki format for my user page.

Index, wikified and listified
Messy OCR’d book index » cleaned up and put in wiki format » list on Wikipedia

Generating wiki lists from indexes

  1. Find an interesting, useful book with an index that’s been keyboarded or OCR’d. These will likely be on Project Gutenberg or the Internet Archive. (Example.)
  2. Copy/paste the index into a plain text editing program like TextWrangler.
  3. Strip out the unnecessary stuff in the index (like page numbers), manually remove redundant/unimportant lines (optional), and format the list for Wikipedia (switch reversed names, put in *[[title]] formatting, split into columns). I wrote a couple of messy Python scripts for this step.
  4. Copy/paste the resulting text into your user page. 
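
Step 3 can be sketched roughly as follows. This is not the script I actually used, just a minimal illustration that assumes one index entry per line, page references at the end of each line, and reversed names of the form "Surname, Forename". The function names, the regular expression, and the sample entries are all made up for the example, and wrapping the list in {{div col}} is only one way to get columns. A real OCR'd index will need more cleanup than this.

```python
import re

# Trailing page references such as ", 12" or ", 12, 45-47".
# Assumption: page numbers come last on each index line.
PAGE_REFS = re.compile(r"(?:[,\s]+\d+(?:\s*[-\u2013]\s*\d+)?)+\.?\s*$")


def flip_name(entry):
    """Turn 'Surname, Forename' into 'Forename Surname'.

    Entries without exactly one comma are returned unchanged,
    because they are probably not simple reversed names.
    """
    parts = [p.strip() for p in entry.split(",")]
    if len(parts) == 2 and all(parts):
        return f"{parts[1]} {parts[0]}"
    return entry


def index_to_wikilist(text):
    """Convert pasted index text into a wiki-formatted list of links."""
    items = []
    seen = set()
    for raw in text.splitlines():
        # Drop the page numbers, then any leftover punctuation or spaces.
        entry = PAGE_REFS.sub("", raw).strip(" \t,;")
        if not entry:
            continue
        title = flip_name(entry)
        if title in seen:  # skip repeated entries
            continue
        seen.add(title)
        items.append(f"*[[{title}]]")
    # {{div col}} / {{div col end}} wrap the list in columns.
    return "\n".join(["{{div col}}", *items, "{{div col end}}"])


if __name__ == "__main__":
    # Made-up sample lines standing in for a pasted index.
    sample = """Caxton, William, 12, 45-47
Gutenberg, Johann, 3
Kelmscott Press, 101
Caxton, William, 88
"""
    print(index_to_wikilist(sample))
```

The printed text is what gets pasted into the user page in step 4. Entries the script cannot confidently flip are left as they are, so they still need a manual pass.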

Now you can get a quick visual of how many of these entities still need to be written up!

Note that some entities might be notable, and some might not be. And of course, some blue-linked wiki pages might not describe the right entity or might lead to a disambiguation page. Regardless, it’s a place to start!

One comment

  1. Mike Wood says:

    Great job on the Agnes Surriage Frankland article. We need more people like you to take care of those red links. Thanks again and hope you have had a friendly experience on Wikipedia.
