Welcome to Journals Online

Written by codetalk

CGI-Less CGIs, and Why You Want Them. A technique to preserve the URL space while reaping the benefits of dynamic content

The CGI-Less CGI

Creating a dynamic site has many advantages: flexibility, efficiency, and so on. One way of creating a dynamic site is by using CGI programs, which produces URLs containing names like 'showcontent.cgi' or 'survey.pl.' This technique has two major problems. First, most search engines ignore URLs with '.cgi' in them, so your content doesn't get indexed. The second and more severe problem is that URLs that include details of how the content was generated clutter the URL namespace, and are just plain ugly.

Esthetics matter.

See "Cool URIs don't change" by Tim Berners-Lee (creator of the web, ya'know).

There are lots of ways around these problems. In this document I detail the technique that I have selected. Journals Online is about stories. I wanted to create a clean URL namespace that accentuates the primacy of story. My original URLs looked like:

http://www.journalsonline.com/code/showstory.cgi/users/rgibson/Mex2000-Story_020800.html
This is not a URL that supports the idea that story is king! This is a URL that says 'hey, look at all this cool geek stuff. Code is cool,' before dribbling off into an incoherent argument about Linux distributions or something.

I feel that the '/users/rgibson/Mex2000-Story_020800.html' portion of this URL is clean, or clean enough, but the '/code/showstory.cgi' portion includes entirely too much server-dependent information. This URL points to a story about travel in Mexico. What does 'code/showstory.cgi' have to do with travel in Mexico?

Creating cool URLs requires the same sort of thinking as does 'normalizing' a database. In a normalized database table, all attributes are about 'the key, the whole key, and nothing but the key.' In the same way, a cool URL is all about the content, the whole content, and nothing but the content. Each element in a hierarchy-based URL can be thought of as a database attribute.

When deconstructing a cool URL, we should be able to say 'this link is a...' about each level. In the example, this story is a story by 'rgibson' and rgibson is one of the 'users.' But 'users' are not anything to 'showstory.cgi.' At best we could assemble the sentence as 'this story is by rgibson, who is one of the users, and this thing is presented to us by showstory.cgi, which is a piece of code.'

Yuck!

Here is the new structure:

http://www.journalsonline.com/users/stories/rgibson/Mex2000-Story_020800.html

This is better. This content is by 'rgibson,' and it is a story, and these are all the users' stories, on Journals Online. This means that this entity, a 'user,' could have things other than 'stories,' such as pictures, maps, sound or video clips, or other Rich Content. There is still room for improvement in the namespace model, and I want to improve it, but I also want to sleep late and eat Ding Dongs. Sometimes you have to say 'that is enough!'

So how does this work, under the hood and all?

'users' is a directory, 'stories' is a Perl script, 'rgibson' is a directory, and 'Mex2000-Story_020800.html' is a file. I have used the .htaccess file to set a 'handler' so that the file 'stories' is treated as a script. The portion '/rgibson/Mex2000-Story_020800.html' is treated as 'Extra Path Information' (EPI), which the server passes on to the CGI program to process or ignore as it sees fit.
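
To make that split concrete, here is a minimal sketch of how the request path divides into the script name and the extra path information. The subroutine name split_request_path is made up for illustration, and the code only mimics the result a CGI program sees; it is not a description of how the web server does the work internally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of jo.pm or of the web server): given the
# path of a request and the path of the script that handles it, return the
# pair (SCRIPT_NAME, PATH_INFO) as a CGI program would see them.
sub split_request_path {
    my ($request_path, $script_path) = @_;

    # The script path must be a prefix of the request path...
    my $len = length $script_path;
    return unless substr($request_path, 0, $len) eq $script_path;

    # ...and the prefix must end on a path boundary: either nothing
    # follows it, or what follows starts with '/'.
    my $rest = substr($request_path, $len);
    return unless $rest eq '' || $rest =~ m{^/};

    return ($script_path, $rest);
}

my ($script, $epi) = split_request_path(
    '/users/stories/rgibson/Mex2000-Story_020800.html',
    '/users/stories',
);
print "SCRIPT_NAME: $script\n";
print "PATH_INFO:   $epi\n";
```

For the example URL this prints '/users/stories' as the script name and '/rgibson/Mex2000-Story_020800.html' as the extra path information, leading slash included.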

The directory /users contains an '.htaccess' file that contains the line 'SetHandler cgi-script'.

This means that all files in the directory are treated as scripts. Here is the whole of the 'stories' CGI:

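#!/usr/bin/perl

# stories (formerly showstory.cgi)
use strict;
use lib '/u/rgibson/vp/journalsonline.com/www/code';
use jo;     # my own module; provides showstory()
use CGI;

# Import the request parameters into the R:: namespace, so that a
# 'hide_footer' parameter shows up as $R::hide_footer.
my $query = CGI->new;
$query->import_names('R');

print "Content-type: text/html\n\n";

# Extra Path Information: whatever follows the script name in the URL path.
my $epi = $ENV{"PATH_INFO"};
# showstory is a subroutine in my module jo.pm.
showstory("/users/" . $epi, $R::hide_footer);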

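The subroutine showstory() takes a full filename and a 'hide footer' flag, and then displays that file within the standard Journals Online navigation scheme. In this case, the EPI will be '/rgibson/Mex2000-Story_020800.html' (the part of the URL that follows the script name 'stories'), so I prepend '/users/' to that, and that is the file to show. The doubled slash in the resulting '/users//rgibson/Mex2000-Story_020800.html' is treated as a single slash by Unix filesystems.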
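
Because the EPI comes straight from the URL, it may be worth checking it before it is turned into a filename. What follows is a minimal sketch of such a check, not part of the original script: the subroutine name story_path and the specific rules (the EPI must start with '/', must end in '.html', and must not contain a '..' segment) are my assumptions about what a story path should look like.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of jo.pm: turn extra path information into
# the filename to hand to showstory(), or return undef if it looks unsafe.
sub story_path {
    my ($epi) = @_;

    return unless defined $epi && $epi =~ m{^/};   # e.g. "/rgibson/x.html"
    return if $epi =~ /\0/;                        # no embedded NUL bytes
    return unless $epi =~ /\.html\z/;              # assumed: stories are .html

    # Reject any '..' segment so the path cannot climb out of /users.
    for my $segment (split m{/}, $epi) {
        return if $segment eq '..';
    }

    return '/users' . $epi;    # $epi already starts with '/'
}

for my $epi ('/rgibson/Mex2000-Story_020800.html', '/../../etc/passwd') {
    my $file = story_path($epi);
    print defined $file ? "show $file\n" : "refuse $epi\n";
}
```

In the 'stories' script, the result could then be passed to showstory() in place of the bare concatenation, with some kind of error page for the refused case. Since the EPI already begins with a slash, this version prepends '/users' rather than '/users/'.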