Category: How to

Generating red link lists for Wikipedia

Red link lists

On Global Women Wikipedia Write-In Day, I was extremely impressed with user Dsp13’s lists of red links: lists of notable women that hadn’t yet been written about on Wikipedia. I used that page as a springboard to write about some notable women in American history, like the wonderful Agnes Surriage Frankland. Dsp13 took these lists of names from resources like Famous American Women: A Biographical Dictionary, signifying the notability of the listed names and giving editors a place to start their research.

I wanted to do something similar for printing history, a research interest of mine. Here is a red link list I cobbled together. As it turns out, there are tons of other red link lists, too! I’m not sure how other people are generating them — probably from databases or other lists already in digital form. (Any info?) But many good resources are in book form, sometimes keyboarded but likely scanned and OCR’d. To make my red link lists, I’m taking indexes from scanned books and generating lists in wiki format for my user page.

Index, wikified and listified
Messy OCR’d book index » cleaned up and put in wiki format » list on Wikipedia

Generating wiki lists from indexes

  1. Find an interesting, useful book with an index that’s been keyboarded or OCR’d. These will likely be on Gutenberg or Internet Archive. (Example.)
  2. Copy/paste the index into a plain text editing program like TextWrangler.
  3. Strip out the unnecessary stuff in the index (like page numbers), manually remove redundant/unimportant lines (optional), and format the list for Wikipedia (switch reversed names, put in *[[title]] formatting, split into columns). I wrote a couple of messy Python scripts for this step.
  4. Copy/paste the resulting text into your user page. 

Now you can get a quick visual of how many of these entities still need to be written up!

Note that some entities might be notable, and some might not be. And of course, some blue-linked wiki pages might not describe the right entity or might lead to a disambiguation page. Regardless, it’s a place to start!
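
For step 3 above, here is a minimal Python sketch of the cleanup. It is not the original scripts, just an illustration of the idea: the function names (clean_entry, wikify) are made up for this example, and it assumes each index entry sits on its own line with page references at the end. It strips page numbers, flips simple "Surname, Forename" entries, drops duplicates, and wraps each title in *[[title]] formatting. Splitting the list into columns is left out.

```python
import re

# Page references at the end of an entry, such as ", 12, 45-47, 103n".
PAGE_REFS = re.compile(r"(?:^|[,\s]+)\d[\d\s,\-–n.]*$")


def clean_entry(line):
    """Turn one raw index line into a page title, or None to skip it."""
    text = PAGE_REFS.sub("", line.strip())
    if not text:
        return None
    # Switch "Surname, Forename" to "Forename Surname".
    # Entries with more than one comma are left alone for manual review.
    parts = [part.strip() for part in text.split(",")]
    if len(parts) == 2 and all(parts):
        text = f"{parts[1]} {parts[0]}"
    return text


def wikify(lines):
    """Return wiki list items for the unique titles found in lines."""
    seen = set()
    items = []
    for line in lines:
        title = clean_entry(line)
        if title and title not in seen:
            seen.add(title)
            items.append(f"* [[{title}]]")
    return items


if __name__ == "__main__":
    sample = ["Baskerville, John, 12, 45-47", "Gutenberg Bible, 5, 9n"]
    print("\n".join(wikify(sample)))
```

Real OCR output is messier than this assumes (broken lines, sub-entries, "see also" references), so expect to do some manual cleanup before and after.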

Making a Superfish menu faster by removing animation

Superfish menu

I’m posting this because none of the other solutions I found through googling fixed our problem. Context: Superfish is a Drupal module for your nav menu that shows submenus on hover. Also, I don’t really know JavaScript/jQuery very well, just enough to fumble around and get solutions.

By default, the Superfish submenus fade into view when you hover on the menu’s title, which is usually a link itself. This is dumb. It’s so slow. In 100% of the usability study sessions I conducted with this default menu animation, the users clicked on the menu’s title right away and got annoyed when they saw the submenu begin to appear just as the browser loaded a new page. Ain’t nobody got an extra 400 milliseconds for a submenu! It should appear on hover instantly. Here’s how I disabled the slow animation.

  1. Changed line 118 in superfish.js, which originally looked like this…
    $ul.animate(o.animation,o.speed,function(){ sf.IE7fix.call($ul); o.onShow.call($ul); });
    …to this…
    $ul.show(0,function(){ sf.IE7fix.call($ul); o.onShow.call($ul);});
  2. Then, I went to admin/config/user-interface/superfish (you may have to give yourself the right permissions to configure Superfish) and deleted these files from the “Path to Superfish library” text box:
        jquery.hoverIntent.minified.js
        jquery.bgiframe.min.js

Most of the other solutions out there say to change the delay or speed values to 1 in lines 86-99 of superfish.js, or to set disableHI to false, but none of those solutions worked (although I left those changes in the .js file because I was too lazy to change them back).

Note: We’re using Drupal v.7 and Superfish v.7.x-1.9. It’s faster for us to call the jQuery library from Google than from our own server. As of this blog post, you can see our menu in action at the top of our library website.

Other than the default delay in showing submenus on hover, Superfish is awesome.

Drupal’s WSoD (White Screen of Death)

5:50pm, Friday

One Friday night a few weeks ago, all was peaceful here in the library. Everyone else had left, the lights were dimmed, and I was wrapping up a few last things before heading out to my weekend. I had done a few tweaks to the site’s dropdown menu CSS, and as I put on my scarf and coat, I casually pushed them from our development server’s Git repository to our remote master repo, then pulled the commits down to our production server.

I reloaded the library webpage.

It had gone completely blank.

As the panic slowly seeped into my bloodstream, I reloaded again and again, even looked at the source code — nothing, not even a space or error message.

Reverting

I had never rolled back any changes before, and the Git cheat sheet I have tacked to my wall didn’t have enough information about undoing mistakes to make me comfortable about rushing off a command. I called our Drupal consultant, who answered his cell phone while driving and spoke in a calming voice about how this is why we use version control, just revert to a safe commit, and it will all be okay.

Our commits are logged and easily readable in an Unfuddle project, so I peered at that and picked out what I knew to be the previous, safe commit, and entered this command:

sudo git reset --hard 5154951c5a3a6a9211ba68268c6159c51cdb5f58

Every StackExchange thread featuring this command also included dire warnings that had previously frightened me away from using it, but if you really do want to wipe out changes in your local repository (in this case, whatever had just been pulled down to our production server), this is how you do it.

The site came back up as it had been before, after maybe five or ten minutes of downtime. I breathed a little easier and left for the weekend.

Investigating

But why had it gone blank? This was what I had to look into when I got back. If I pulled the most up-to-date commits down from the remote repo again, the site would still blank out. (I knew because I tried, hoping the WSoD had been a fluke. It wasn’t.) There were 60 changed files in the commit, mostly CSS, PDFs, and files for two non-essential modules. Even weirder, why was the up-to-date dev site totally fine? Until we fixed whatever was wrong on the production site, we’d have to pause development.

Drupal’s help pages have a list of common problems that cause the White Screen of Death. It’s thorough but not complete. We went about troubleshooting at times when site use was low, so a few seconds of downtime wouldn’t be too disruptive. We still couldn’t tell if it was a server problem or something in those 60 files, so we started with these:

  1. Out of PHP memory?
    • Already at 128 MB, double recommended amount
  2. No more space on virtual machine?
    • Have around 30 GB left, not the problem
  3. Restart production server?
    • Pulled from remote repo, then restarted server, still WSoD
  4. PHP execution time limit too low?
    • Dev is 600 seconds, production is 30; changed to 600s, still WSoD
  5. Change settings file to display error when site goes blank after pull
    • Message on blank site: Fatal error: require_once(): Failed opening required ‘…/sites/all/modules/admin_menu/admin_views/plugins/views_plugin_display_system.inc’ (include_path=’.:/usr/share/php:/usr/share/pear’) in …/includes/bootstrap.inc on line 3066

This error matched our server logs and Drupal error reports. The file that Drupal failed to open had been deleted in the toxic commit, but at first it didn’t seem like that would be the problem. The Admin Views module is only visible to logged-in administrative users who want a more tricked-out menu bar at the top of their screens. Why would it bring the site down?

In exasperation, I disabled the Admin Views module and tried again to pull down — and voilà, the site was still there, updated, and looked fine. Apparently, that was all I had to do: turn off the module causing problems so the site code wouldn’t quit out on me.

If it were a more essential module (not just one for a few admins’ convenience), we would have had to look into this issue further. For now, having caused enough headaches for myself, I’ll leave well enough alone.
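
If you do need to look further, one place to start is checking whether the site still expects files that a commit has removed, since that is what the fatal error above was complaining about. Here is a minimal Python sketch of that check. Everything in it is an assumption for illustration: the function name missing_files is made up, and the list of expected paths is something you would have to gather yourself (for instance, from the paths named in error messages or from wherever your site records the files it loads).

```python
from pathlib import Path


def missing_files(site_root, expected_paths):
    """Return the expected paths that do not exist under site_root.

    site_root: the directory the site is served from.
    expected_paths: relative paths the site is believed to load
    (a hypothetical, hand-collected list).
    """
    root = Path(site_root)
    return [path for path in expected_paths if not (root / path).is_file()]


if __name__ == "__main__":
    # Placeholder path; substitute the real relative paths from your site.
    expected = ["sites/all/modules/example_module/example.inc"]
    for path in missing_files(".", expected):
        print("missing:", path)
```

Running a check like this on the production checkout before and after a pull would show whether the pull removes anything the site still tries to open.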

Related post:
Using Drupal and Git for a library website

Using Drupal and Git for a library website

The Lloyd Sealy Library website uses Drupal 7 as its content management system and Git for version control. The tricky thing about this setup is that you can keep track of some parts of a Drupal site using Git, but not all. Code can be tracked in Git, but content can’t be.

Code

  • theme files (CSS, PHP, INC, etc.)
  • the out-of-the-box system
  • all modules
  • any core or module updates (do on dev, push to production)

Content

  • anything in the Drupal database:
    • written content (pages, blog posts, menus, etc.)
    • configurations (preferences, blocks, regions, etc.)

Here’s our workflow:

Git and Drupal workflow diagram

Code: Using Git to push code from dev to production is pretty straightforward. I was an SVN gal, so it took some time to get used to the extra steps in Git. I used video tutorials made by our consultants at Cherry Hill as well as Lynda.com videos. (For those new to using version control, it’s a mandatory practice if you manage institutional websites. Using version control between two servers lets you work on the same code simultaneously with other people and roll out changes in a deliberate manner. Version control keeps track of all the changes made over time, too, so if you mess up, you can easily revert your site back to a safe version.)

Content: Keeping the content up to date on both servers is a little hairier. We use the Backup and Migrate module to update our dev database on an irregular schedule with new content made on the production server. The only reason to update the dev database is so that our dev and production sites aren’t confusingly dissimilar. Additionally, some CSS might refer to classes newly specified in the database content. The schedule is irregular because the webmaster, Mandy, and I sometimes test out content on the dev side first (like a search box) before copying the content manually onto the production site.

Why have a two-way update scheme? Why not do everything on dev first, and restore the database from dev to production? We want most content changes to be publicly visible immediately. All of our librarians have editor access, which was one of the major appeals of using a CMS that allowed different roles. Every librarian can edit pages and write blog posts as they wish. It would be silly to embargo these content additions.

Help: A lot of workflow points are covered in Drupal’s help page, Building a Drupal site with Git. As with all Drupal help pages, though, parts of it are incomplete. The Drupal4Lib listserv is very active and helpful for both general and library-specific Drupal questions.

Non-Drupal files: Lastly, we have some online resources outside of Drupal that we don’t want clogging up our remote repository, like the hundreds of trial transcript PDFs. These aren’t going to be changing, and they’re not code. The trial transcript directory is therefore listed in our .gitignore file.

Any Drupal/Git workflow tips?

Building a database directory in Drupal 7

We use Drupal 7 as the CMS behind our library website. It’s robust and flexible, but has a notoriously steep learning curve on the back end. One thing we struggled with at first was how best to direct our users to the databases they have access to. (At this time, CUNY doesn’t have a discovery layer, so students must find articles by choosing a database first and searching within.)

Making a good directory for our 200+ databases required getting familiar with Content Types (specifying categories of content, like ‘database’ or ‘blog post’, and the fields each one has) and Views (how those fields are presented, like displaying a URL as a link). When you make a new content type, behind the scenes, Drupal adds more tables to its core database, where every field you pick is a column and every piece of content you add is a row.
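
To make the fields-as-columns, content-as-rows picture concrete, here is a small Python sketch using an in-memory SQLite table. It is a simplified model and not Drupal’s actual schema, which is more elaborate. The table name, column names, and example rows are all made up for illustration, and render_links is only a stand-in for what a view does when it displays a URL as a link.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical "database" content type: each field is a column.
conn.execute(
    "CREATE TABLE database_entry ("
    " title TEXT NOT NULL,"
    " url TEXT NOT NULL,"
    " description TEXT)"
)

# Each piece of content added is a row (placeholder data).
conn.executemany(
    "INSERT INTO database_entry VALUES (?, ?, ?)",
    [
        ("Example Database B", "https://example.org/b", None),
        ("Example Database A", "https://example.org/a", "Placeholder text."),
    ],
)


def render_links(connection):
    """Stand-in for a view: present each row's URL field as an HTML link."""
    rows = connection.execute(
        "SELECT title, url FROM database_entry ORDER BY title"
    )
    return [
        f'<a href="{html.escape(url, quote=True)}">{html.escape(title)}</a>'
        for title, url in rows
    ]


if __name__ == "__main__":
    print("\n".join(render_links(conn)))
```

The same stored rows could be presented in other ways (a table, a filtered list by subject) without changing the content itself, which is the separation that Content Types and Views give you.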

The full page of our database directory looks like this:

Screenshot of database directory

Each database we subscribe to has its own row of information and links, like this:

Info about one database
