8.1.2010
The year I started blogging (blogware)
Although I haven't figured out the value of public personal blogs so far, I did recently have to admit that for Unix hackers and programmers, technical blogs do seem to fill a certain gap between forum discussions and HOWTOs/white papers.
So, I decided to join the movement and start a blog about hacks I wouldn't otherwise document, if only for my own future reference :-)
The first step was to find suitable blogware, but I was instantly confused by the number of choices.
It didn't take me long to figure out that I wasn't really after anything fancy or bloated with features, but I did have an existing web site and would have preferred something with the exact same look.
Also, I'm more comfortable creating the HTML content myself, as I can use the tools I'm already familiar with (e.g. for creating syntax highlighted code print-outs).
So I decided to write a few scripts myself to serve as the blogware, and make my first blog entry about them. :-)
I hardly think these scripts are of much interest to anyone, to be perfectly honest.
But if it so happens that you're in exactly the same situation as I was, and would like to save a day of quality time with your favourite scripting language, here you go.
Blog software
The idea is to have an existing HTML template tagged with anchor keywords, which the scripts then replace with blog content.
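To make the anchor idea concrete, here is a minimal sketch of the substitution on an in-memory template. The template text and the fill_template helper are made up for illustration; only the anchor keywords TITLEANCHOR and BLOGANCHOR come from the scripts in this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical in-memory template; the real one lives in src/root_template.html.
my @template = (
    "<html><head><title>My site TITLEANCHOR</title></head>\n",
    "<body>\n",
    "BLOGANCHOR\n",
    "</body></html>\n",
);

# Replace each anchor keyword with the supplied text, line by line.
sub fill_template {
    my ($lines, $title, $body) = @_;
    my $out = "";
    foreach my $line (@$lines) {
        my $copy = $line;                 # leave the template itself untouched
        $copy =~ s/TITLEANCHOR/$title/;
        $copy =~ s/BLOGANCHOR/$body/;
        $out .= $copy;
    }
    return $out;
}

print fill_template(\@template, " - First post", "<h1>8.1.2010</h1>");
```

Because the anchors are plain keywords, the same template can be reused for the index page (empty title suffix) and for every individual entry page.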
For commenting, a form is created for each blog entry from a template.
The form calls a CGI script, which sanitizes the input, appends the comment to the appropriate comment file, and calls a script to re-create the static HTML files.
Let's start off with the main script that actually generates the HTMLs from the templates.
When adding a new blog entry, just create a new directory under the entries directory with the following files: description (the date on the first line, the description on the second), content.html (the actual HTML content of the blog entry), and comments (empty at first).
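Creating that directory by hand is easy enough, but the layout can also be sketched as a small helper. The new_entry function and its arguments are hypothetical and not part of the package; the file names are the ones listed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Path qw(make_path);

# Create the three files an entry needs under "$root/$name".
# new_entry is a made-up helper, not something shipped with the scripts.
sub new_entry {
    my ($root, $name, $date, $desc) = @_;
    my $dir = "$root/$name";
    make_path($dir);

    # description: date on the first line, description on the second
    open(my $d, ">", "$dir/description") or die "Can't write $dir/description: $!";
    print $d "$date\n$desc\n";
    close($d);

    # Empty placeholders: content.html is edited by hand afterwards,
    # comments is appended to by comment.cgi.
    foreach my $file ("content.html", "comments") {
        open(my $fh, ">>", "$dir/$file") or die "Can't create $dir/$file: $!";
        close($fh);
    }
    return $dir;
}

# Usage: perl new_entry.pl <name> <date> <description>
new_entry("entries", @ARGV) if @ARGV == 3;
```

The entries are picked up with a glob, which returns names in sorted order, so a sortable directory name (for example one starting with the date) should keep the table of contents in order.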
Then call the src/genroot.pl script manually in the root directory of your blog to create the initial HTML files.
Make sure that the HTTP service is allowed to create or change the HTML files, and to append to the comment files.
This is the main script that creates the HTMLs:
#!/usr/bin/perl
use lib "src"; # This be the path where the modules are at
use gencontent;
# Whether to create individual pages for each blog entry, or to cram
# them all in one page.
my $page_separation = 1;
my $page_root = "."; # Put the entry.html files here
# This script finds an anchor from the specified root template HTML file,
# and replaces the anchor with the blog content.
my $root_template = "src/root_template.html";
my $blog_anchor = "BLOGANCHOR";
my $root_output_file = "index.html";
# We also set the title. For root, it is "", and
# for separate entries, it's " - <description>".
my $title_anchor = "TITLEANCHOR";
open(ROOT_TEMPLATE_FILE, "<$root_template") or die "Couldn't open template file $root_template";
open(OUTPUT_FILE, ">$root_output_file") or die "Couldn't open output file $root_output_file for writing";
# We're using the same root template for individual entries
my $root_content;
if ($page_separation) {
    $root_content = getTOC("null");
    foreach my $entry (@entries) {
        # Rewinding the root file for each entry
        seek(ROOT_TEMPLATE_FILE, 0, 0);
        # Entries start with a TOC where the current entry is not a link
        my $entry_content = getTOC($entry) . "<br/><hr/>\n";
        $entry_content .= getEntryContent($entry);
        my $entry_name = (split("/", $entry))[1];
        open(ENTRY_FILE, ">$page_root/$entry_name.html") or die "Couldn't open entry html $entry_name.html for writing";
        # Extracting the description (the first line is the date, so we skip it)
        open(ENTRY_DESC, "<$entry/description") or die "No description file for entry $entry";
        <ENTRY_DESC>; my $desc = <ENTRY_DESC>; chomp $desc;
        while (<ROOT_TEMPLATE_FILE>) {
            s/$title_anchor/ - $desc/;
            s/$blog_anchor/$entry_content/;
            print ENTRY_FILE $_;
        }
    }
    seek(ROOT_TEMPLATE_FILE, 0, 0);
} else {
    $root_content = getBlogContent();
}
while (<ROOT_TEMPLATE_FILE>) {
    s/$title_anchor//;
    s/$blog_anchor/$root_content/;
    print OUTPUT_FILE $_;
}
exit 0;
genroot.pl
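One detail worth knowing: the entry name is taken as the second component of the path, which only works while the entries directory is a single level deep. The following sketch compares that with basename from the core File::Basename module; the example paths are made up.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# (split("/", $entry))[1] picks the second path component, which is only
# the entry name when the entries directory is one level deep.
foreach my $entry ("entries/first_post", "blog/entries/first_post") {
    my $by_split    = (split("/", $entry))[1];
    my $by_basename = basename($entry);
    print "$entry: split gives '$by_split', basename gives '$by_basename'\n";
}
```

With the default $entry_path of "entries" both approaches agree, so this only matters if the entries are moved deeper into the tree.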
The above script delegates content creation to the following module:
# This module actually generates the blog content
my $entry_path = "entries";
# This is the template to the comment box, to be appended directly at the
# end of each entry. We're going to find the anchor from the file,
# and replace the anchor with the name of the entry to be commented.
# (We need this to distinguish comments of different entries in comment.cgi)
my $comment_anchor = "COMMANCHOR";
my $comment_box_src = "src/comment_box.html";
@entries = <$entry_path/*>;
sub getBlogContent {
    my $content;
    $content .= getTOC();
    $content .= "<br/><hr/>\n";
    foreach my $entry (@entries) {
        $content .= getEntryContent($entry);
    }
    return $content;
}
sub getEntryContent {
    my $entry = $_[0];
    my $content;
    open(ENTRY_DESC, "<$entry/description") or die "No description file for entry $entry";
    open(CONTENT, "<$entry/content.html") or die "No content.html file for entry $entry";
    my $date = <ENTRY_DESC>;
    my $desc = <ENTRY_DESC>;
    chomp $date;
    chomp $desc;
    my $entry_name = (split("/", $entry))[1];
    $content .= "<a id=$entry_name></a>";
    $content .= "<h1>$date</h1>";
    $content .= "<h2>$desc</h2>\n";
    # We read in the content data
    while (<CONTENT>) { $content .= $_; }
    $content .= "\n";
    # Adding comments
    $content .= getCommentSection($entry);
    # Returning explicitly instead of relying on the value of the last statement
    return $content;
}
sub getCommentSection {
    my $entry = $_[0];
    my $comment_section = "<h3>Comments</h3>\n";
    # Adding the actual comments, if comments file exists
    open(COMMENTS, "<$entry/comments") and $comment_section .= getComments(COMMENTS);
    # Adding the comment box
    $comment_section .= "<p>\n";
    open(COMMENT_BOX, "<$comment_box_src") or die "Couldn't load comment box src from $comment_box_src";
    while (<COMMENT_BOX>) { s/$comment_anchor/\"$entry\"/; $comment_section .= $_; }
    $comment_section .= "</p><br/><hr/>\n";
    return $comment_section;
}
sub getComments {
    my $FILE = $_[0];
    my $comments;
    my $fullfile;
    # Reading the comment file contents to memory at once. Newlines are kept
    # as they are, since the message is printed inside a <pre> block.
    while (<$FILE>) { $fullfile .= $_; }
    # Fields are separated by " #"
    my @comment_array = split(/ #/, $fullfile);
    for (my $i = 0; $i < int(@comment_array/4); $i++) {
        my $date = $comment_array[$i*4 + 0];
        my $nick = $comment_array[$i*4 + 1];
        my $e_mail = $comment_array[$i*4 + 2];
        # Let's remove any whitespaces from beginning and end of these..
        $date =~ s/^\s*//g; $date =~ s/\s*$//g;
        $nick =~ s/^\s*//g; $nick =~ s/\s*$//g;
        $e_mail =~ s/^\s*//g; $e_mail =~ s/\s*$//g;
        # Cutting the message off at 8k characters in case it's spam
        my $message = substr($comment_array[$i*4 + 3], 0, 8192);
        # We're restoring all escaped #s
        $message =~ s/\\#/#/g;
        $nick =~ s/\\#/#/g;
        $e_mail =~ s/\\#/#/g;
        $comments .= "<h1><small>$date</small></h1><p>";
        $comments .= "<div id=blog_comment><pre wrap>\n$message</pre>";
        # Adding the commenter's nick/e-mail
        if (length $nick || length $e_mail) {
            my $name;
            if (length $nick) { $name = $nick; }
            else { $name = $e_mail; }
            if (length $e_mail) {
                $comments .= "- <a href=\"mailto:$e_mail\">$name</a>";
            } else {
                $comments .= "- $name";
            }
            $comments .= "\n<br/>";
        }
        $comments .= "</div></p>\n";
    }
    return $comments;
}
sub getTOC {
    my $TOC = "<h1>Table of contents</h1>\n<p>\n";
    my $passive_entry = $_[0]; # Can be empty
    $TOC .= "<table>\n";
    foreach my $entry (@entries) {
        open(ENTRY_DESC, "<$entry/description") or die "No description file for entry $entry";
        my $date = <ENTRY_DESC>;
        my $desc = <ENTRY_DESC>;
        chomp $date;
        chomp $desc;
        my $entry_name = (split("/", $entry))[1];
        if ($entry eq $passive_entry) {
            $TOC .= "<tr><td>$date</td><td>$desc</td></tr>\n";
        } else {
            # Okay this is a kludge I added when I decided to include separate
            # entry pages also. passive_entry should be set to something non-existent
            # when root TOC is generated in "separate entries" mode.
            # This way it knows not to make the links aliases, but rather separate htmls.
            if ($passive_entry) {
                $TOC .= "<tr><td>$date</td><td><a href=\"$entry_name.html\">$desc</a></td></tr>\n";
            } else {
                $TOC .= "<tr><td>$date</td><td><a href=\"#$entry_name\">$desc</a></td></tr>\n";
            }
        }
    }
    $TOC .= "</table>\n</p>\n";
    return $TOC;
}
return 1;
gencontent.pm
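The comments file format is easiest to see in isolation: every comment is four fields (date, nick, e-mail, message), each followed by " #", with any "#" inside a field escaped as "\#". Below is a rough round-trip sketch of that format. The helper names and sample comments are made up for illustration, and the parsing is a simplified version of what getComments does.

```perl
use strict;
use warnings;

# Escape the separator character before a field is stored.
sub escape_field {
    my ($text) = @_;
    $text =~ s/#/\\#/g;
    return $text;
}

# Build one record: four fields, each followed by " #".
sub make_record {
    my @fields = map { escape_field($_) } @_;
    return join("", map { "$_ #" } @fields);
}

# Split a comments file body back into [date, nick, email, message] rows.
sub parse_records {
    my ($body) = @_;
    my @parts = split(/ #/, $body);
    my @rows;
    while (@parts >= 4) {
        my @row = splice(@parts, 0, 4);
        s/^\s+|\s+$//g for @row[0..2];   # trim date, nick and e-mail
        s/\\#/#/g for @row;              # restore escaped separators
        push @rows, \@row;
    }
    return @rows;
}

my $body = make_record("8.1.2010", "anon", "", "Try chmod 755 #1 first")
         . make_record("5.8.2010", "", "me\@example.com", "Thanks!");
foreach my $row (parse_records($body)) {
    print join(" | ", @$row), "\n";
}
```

Escaping "#" is what keeps a " #" typed by a commenter from being mistaken for a field separator, since it is stored as " \#" instead.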
This is the CGI script that processes comments posted via HTTP:
#!/usr/bin/perl
# This file takes a comment POST as input, updates
# comment files and finally the HTML(s)
my $refresh = "src/genroot.pl";
use CGI;
my $cgi = new CGI;
# Make sure that no one is sending malicious file paths here..
if ($cgi->param("src") =~ m/^\// || $cgi->param("src") =~ m/\.\./) {
    # The rejected path goes through clean() so it isn't echoed back as raw HTML
    die print "Content-type: text/html\n\nPath (" . clean($cgi->param("src")) . ") was rigged, refusing to write";
}
# We find the comments file from a path derived from "src" input
my $filepath = $cgi->param("src")."/comments";
open(COMMENTS, ">>$filepath") or die print "Content-type: text/html\n\nCouldn't open $filepath";
if (-s $filepath > 500000) {
    die print "Content-type: text/html\n\nComments exceed 500kB, suspecting spam.";
}
print COMMENTS getDate()." #".getSaneName()." #".getSaneEmail()." #".getSaneComment()." #";
# Closing the file so that the new comment is on disk before the HTMLs are regenerated
close(COMMENTS);
# We refresh the HTMLs and return back to the previous page
system($refresh);
print $cgi->redirect($ENV{HTTP_REFERER});
sub getDate {
    my @timedata = localtime(time);
    return "$timedata[3].".($timedata[4]+1).".".($timedata[5]+1900);
}
sub getSaneName {
    return clean($cgi->param("nick"));
}
sub getSaneEmail {
    return clean($cgi->param("email"));
}
sub getSaneComment {
    return clean($cgi->param("content"));
}
# As you can see, input sanitation is very crude.
# Plx comment if you see something especially worrying.
sub clean {
    return escapeSeparator(escapeAngleBrac($_[0]));
}
sub escapeAngleBrac {
    my $input = $_[0];
    $input =~ s/</&lt;/g;
    $input =~ s/>/&gt;/g;
    return $input;
}
sub escapeSeparator {
    my $input = $_[0];
    $input =~ s/#/\\#/g;
    return $input;
}
comment.cgi
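Since the sanitation above is admittedly crude, here is one possible way to tighten it, offered as a sketch rather than a drop-in fix. It swaps the "reject known-bad paths" check for a whitelist of the expected form and also escapes "&" and double quotes. The function names and the allowed character set are assumptions; the "entries" prefix matches $entry_path in gencontent.pm.

```perl
use strict;
use warnings;

# Accept only "entries/<name>", where <name> is a single path component
# made of safe characters. The character set is an assumption and may
# need widening to match your entry names.
sub is_valid_src {
    my ($src) = @_;
    return 0 unless defined $src;
    return $src =~ m{\Aentries/[A-Za-z0-9_][A-Za-z0-9_.-]*\z} ? 1 : 0;
}

# Escape the characters that are special in HTML text and attributes.
# "&" is handled first so the other entities are not double-escaped.
sub escape_html {
    my ($text) = @_;
    $text = "" unless defined $text;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

foreach my $src ("entries/first_post", "/etc/passwd", "entries/../src", "entries/a/b") {
    printf "%-20s %s\n", $src, is_valid_src($src) ? "accepted" : "rejected";
}
print escape_html(q{<a href="x">Tom & Jerry</a>}), "\n";
```

Escaping the double quote matters here mostly because the e-mail field ends up inside an href attribute in the generated HTML.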
Grab this package to get all the example templates and such as well.